This course is about a variety of ways to represent human languages (like English and Chinese) as computational systems, and how to exploit those representations to write programs that do neat stuff with text and speech data.
This field is called Natural Language Processing or Computational Linguistics, and it is extremely multidisciplinary. This course will therefore include some ideas central to Machine Learning and to Linguistics.
We'll cover computational treatments of words, sounds, sentences, meanings, and conversations. We'll see how probabilities and real-world text data can help. We'll see how different levels interact in state-of-the-art approaches to applications like translation and information extraction.
From a software engineering perspective, there will be an emphasis on rapid prototyping, a useful skill in many other areas of Computer Science.
Prerequisites: CS courses on data structures and algorithms, and strong programming skills.
| # | Date | Topic | Readings | Assignments and Project milestones |
|---|---|---|---|---|
| 1 | Feb 2 | Course overview; What does it mean to know language? Slides Lecture Video | Chap 1 | |
| 2 | Feb 4 | Information extraction, question answering, and NLP in IR Slides Lecture Video | Chap 22.0-2, 23.0-2 | |
| 3 | Feb 9 | Project Slides Example Project Video Lecture Video | | |
| 4 | Feb 11 | Words, morphology, and lexicons Slides Lecture Video | Chap 3.1-3.9 | HW1 due |
| 5 | Feb 16 | Language models and smoothing Slides Lecture Video | Chap 4.3-8 | Project initial plan due |
| 6 | Feb 18 | Noisy channel models and edit distance Slides Lecture Video | Chap 3.10, 3.11, 5.9 | |
| 7 | Feb 25 | Part of speech tags Slides Lecture Video | Chap 5.0-3 | HW2 due |
| 8 | Mar 2 | Classification 1 Slides Lecture Video | | |
| 9 | Mar 4 | Classification 2 Slides Lecture Video | | HW3 due |
| 10 | Mar 9 | Sequence Tagging Slides Lecture Video | | |
| 11 | Mar 11 | Deep Learning Slides Lecture Video | | |
| 12 | Mar 16 | Syntactic representations of natural language Slides Lecture Video | Chap 12.0-3 | HW4 due |
| 13 | Mar 18 | Chomsky hierarchy and natural language Slides Lecture Video | Chap 15 | Project Progress Report due |
| - | Mar 23 | Midterm #1 | | |
| 14 | Mar 25 | Treebanks Slides Lecture Video | Chap 12.4, 14.7 | HW5 due |
| 15 | Mar 30 | Lexical semantics Slides Lecture Video | Chap 17.0-2, 19.0-3 | |
| 16 | Apr 1 | Word embeddings/vector semantics Slides Lecture Video | SLP3 Chap 6 | |
| 17 | Apr 6 | Contextualized representations (BERT) Slides Lecture Video | | |
| 18 | Apr 8 | Verb/sentence semantics Slides Lecture Video | Chap 17.2-4, Chap 19.4-6 | HW6 due |
| 19 | Apr 13 | Compositional semantics, semantic parsing Slides Lecture Video | Chap 18.1-3 | |
| 20 | Apr 20 | Discourse, entity linking, pragmatics Slides Lecture Video | Chap 20.0-6, 20.8-11 | |
| 21 | Apr 22 | Speech 1 Slides Lecture Video | | |
| 22 | Apr 27 | Speech 2 Slides Lecture Video | | Project dry run code due |
| 23 | Apr 29 | Natural Language Generation Slides Lecture Videos | | |
| 24 | May 4 | Machine Translation Slides Lecture Videos | Chap 25.0-1, 25.9 | Final project code due |
| - | May 6 | Midterm #2 | | Final project progress report due |
A major component will be the project: build a program that takes a web page P as input and outputs a set of questions about the content of P (questions a human could answer if she read P), and that can also, when given a question Q about the content of P, answer it intelligently. Projects will be pitted against each other in a competition at the end of the course.
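To make the two halves of the project concrete, here is a minimal sketch of what the simplest possible baseline might look like. Everything in it is an assumption made for illustration: the function names `ask` and `answer`, the use of plain text instead of a fetched web page, the naive regex sentence splitter, and the word-overlap heuristic are hypothetical and are not the course's required interface or method. A competitive system would draw on the techniques covered in the lectures.

```python
import re


def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text):
    """Return the set of lowercased alphanumeric word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ask(page_text, n):
    """Generate up to n questions from sentences of the form 'X is Y.'

    Hypothetical baseline: each matching sentence becomes 'What is X?'.
    """
    questions = []
    for sentence in split_sentences(page_text):
        if len(questions) >= n:
            break
        match = re.match(r"^(.+?) is (.+?)[.!?]?$", sentence)
        if match:
            questions.append(f"What is {match.group(1)}?")
    return questions


def answer(page_text, question):
    """Answer by returning the sentence sharing the most words with the question.

    Hypothetical baseline: ties go to the earliest sentence; an empty page
    yields an empty string.
    """
    sentences = split_sentences(page_text)
    if not sentences:
        return ""
    question_tokens = tokens(question)
    return max(sentences, key=lambda s: len(tokens(s) & question_tokens))


if __name__ == "__main__":
    page = (
        "Pittsburgh is a city in Pennsylvania. "
        "It has three rivers. "
        "The Steelers play football there."
    )
    print(ask(page, 2))
    print(answer(page, "How many rivers does it have?"))
```

Note that this baseline returns a whole sentence as its answer and only handles one question pattern; it is meant to show the input/output shape of the task, not a strong approach to it.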
Project Resources:
Students will be evaluated by exam (midterm and final, totaling 30%), regular short quizzes and weekly pencil-and-paper or small programming homework problems (40% together), and the group project (30%).
Should I take this course?
Yes, if:
University of California, Berkeley, Brown University, University of Colorado, Columbia University, Cornell University, University of Illinois at Urbana-Champaign, Johns Hopkins University, University of Maryland, New York University, University of Pennsylvania, Stanford University, University of Utah, University of Wisconsin-Madison