Machine Translation and Natural Language Generation
Vision: Making Communication Easier among All Languages & Making
Techniques Available for All Languages
Course Description
The course is designed for beginners in Natural Language Generation,
Natural Language Processing and Machine Translation. The main aim is to
explore the theories and methods for automatically understanding and
generating natural language text, with a special focus on
multilingualism.
Participation requires basic knowledge of machine learning.
Please consider taking online courses (for example, on Coursera) or
watching online videos.
The course has been taught every Spring semester in the Department of
Computer Science and Technology, Nanjing University, since 2020. It was
first designed for graduate students, and was later opened to both
graduate and undergraduate students.
Objectives
- Understand the fundamental concepts, challenges and applications in
Natural Language Processing/Generation.
- Learn the evolution of research in multilingualism, including machine
translation, multilingual models, etc.
- Practice the design, training and application of natural language
generation models.
Outline
1. Introduction
- Problems in Natural Language Processing
- NLP as Classification
- NLP as Structured Prediction
- Natural Language Generation
2. Language Models
- Probabilistic Modeling of Natural Language
- Statistical Language Models
- Neural Language Models and Pretraining
- Large Language Models
3. Machine Translation
- Traditional Machine Translation (Rule-based Machine Translation,
Statistical Machine Translation)
- Deep Learning and Machine Translation
- *Machine Translation with Less Parallel Data (Low-resource,
Unsupervised Machine Translation)
- *Non-Autoregressive Machine Translation (Parallel Generation)
- *Interactive Machine Translation
- *Translation Quality Evaluation
4. Other Generation Tasks
- Summarization: Content Selection
- Paraphrase: Semantic Equivalence
- Style Transfer: Controlled Generation
- Image Captioning: Multi-modal Interaction
5. Multilingualism in Large Language Models
- Evaluation of Multilinguality
- Extending to New Languages
- Aligning Language Abilities
Assessments
- Homework 1: interacting with Language Models
- Homework 2: research on Large Language Models
- Homework 3: research on Machine Translation and Multilingualism
- Final Project: implementation, experiments and discussions on
self-selected topics
- Shujian Huang (homepage)
- Email: huangsj at nju dot edu dot cn
Acknowledgement
The course has been continually improved with the help of wonderful
teaching assistants: Zaixiang Zheng (2020), Yu Bao (2021), Jiahuan
Li (2022), Wenhao Zhu (2023), and Changjiang Gao (2024).