Natural Language Processing S21

11-411 for undergrads | 11-611 for grads

Overview

Syllabus
Lecture
Section A: Tuesdays and Thursdays, 4:00-5:20pm (Remote)
Instructors
Office Hours by appointment
Teaching Assistants
TA Office Hours
(Remote on Zoom)
Textbook
Speech and Language Processing (2nd Edition, 2007, Prentice-Hall), by Daniel Jurafsky and James Martin
Cheating Policy
Resources

Course Description

This course is about a variety of ways to represent human languages (like English and Chinese) as computational systems, and how to exploit those representations to write programs that do neat stuff with text and speech data.

This field is called Natural Language Processing or Computational Linguistics, and it is extremely multidisciplinary. This course will therefore include some ideas central to Machine Learning and to Linguistics.

We'll cover computational treatments of words, sounds, sentences, meanings, and conversations. We'll see how probabilities and real-world text data can help. We'll see how different levels interact in state-of-the-art approaches to applications like translation and information extraction.

From a software engineering perspective, there will be an emphasis on rapid prototyping, a useful skill in many other areas of Computer Science.

Course Prerequisites

CS courses on data structures and algorithms, and strong programming skills.

Schedule

# Date Topic Readings Assignments and project milestones
1 Feb 2 Course overview; What does it mean to know language?
Slides
Lecture Video
Chap 1
2 Feb 4 Information extraction, question answering, and NLP in IR
Slides
Lecture Video
Chap 22.0-2, 23.0-2
3 Feb 9 Project
Slides
Example Project Video
Lecture Video
4 Feb 11 Words, morphology, and lexicons
Slides
Lecture Video
Chap 3.1-3.9 HW1 due
5 Feb 16 Language models and smoothing
Slides
Lecture Video
Chap 4.3-8 Project initial plan due
6 Feb 18 Noisy channel models and edit distance
Slides
Lecture Video
Chap 3.10, 3.11, 5.9
7 Feb 25 Part of speech tags
Slides
Lecture Video
Chap 5.0-3 HW2 due
8 Mar 2 Sequence Tagging
9 Mar 4 Classification 1
HW3 due
10 Mar 9 Classification 2
11 Mar 11 Deep Learning
HW4 due
12 Mar 16 Syntactic representations of natural language
Chap 12.0-3
13 Mar 18 Chomsky hierarchy and natural language
Chap 15 Project progress report due
Mar 23 Midterm #1
Project Progress Report due 11:59pm
14 Mar 25 Treebanks
Chap 12.4, 14.7 HW5 due
15 Mar 30 Lexical semantics
Chap 17.0-2, 19.0-3
16 Apr 1 Word embeddings/vector semantics
SLP3 Chap 6
17 Apr 6 Contextualized representations (BERT)
18 Apr 8 Verb/sentence semantics
Chap 17.2-4, Chap 19.4-6 HW6 due
19 Apr 13 Compositional semantics, semantic parsing
Chap 18.1-3
20 Apr 20 Discourse, entity linking, pragmatics
Chap 20.0-6, 20.8-11
21 Apr 22 Speech 1
22 Apr 27 Speech 2
Project dry run code due
23 Apr 29 Sequence-to-sequence models
24 May 4 Machine Translation
Chap 25.0-1, 25.9 Final project code due
May 6 Midterm #2
Final project progress report due

Competitive Project

Project description

A major component will be the project: build a program that takes a web page P as input and outputs a set of questions about the content of P (questions a human could answer after reading P), and that can also, given a question Q about the content of P, answer Q intelligently. Projects will be pitted against each other in a competition at the end of the course.
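
To make the two halves of the task concrete, here is a minimal Python sketch of a deliberately naive baseline. It is not the course's reference solution or required interface: the function names `ask`, `answer`, `split_sentences`, and `tokens` are hypothetical, and the sketch assumes the web page has already been converted to plain text. Question generation only handles simple "X is Y." sentences, and answering just returns the sentence that shares the most words with the question.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., !, or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ask(page_text, n):
    """Return up to n yes/no questions built from 'X is/are/was/were Y.' sentences."""
    questions = []
    for sentence in split_sentences(page_text):
        match = re.match(r"^(.+?) (is|are|was|were) (.+?)[.!]?$", sentence)
        if match:
            subject, verb, rest = match.groups()
            # Crude inversion: move the verb to the front of the sentence.
            questions.append(f"{verb.capitalize()} {subject} {rest}?")
        if len(questions) == n:
            break
    return questions


def answer(page_text, question):
    """Return the sentence of the page with the largest word overlap with the question."""
    sentences = split_sentences(page_text)
    if not sentences:
        return ""
    question_words = tokens(question)
    return max(sentences, key=lambda s: len(question_words & tokens(s)))


# Hypothetical example page, already reduced to plain text.
page = (
    "Pittsburgh is a city in Pennsylvania. "
    "It has three rivers. "
    "The Steelers play football there."
)
print(ask(page, 2))
print(answer(page, "How many rivers does it have?"))
```

A competitive system would need much more than this (parsing, coreference, better answer extraction), but the sketch shows the shape of the input and output for both modes.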

Project Resources:

Grading

Students will be evaluated by exam (midterm and final, totaling 30%), regular short quizzes and weekly pencil-and-paper or small programming homework problems (40% together), and the group project (30%).
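
As a worked illustration of these weights, here is a short Python sketch. It assumes each component is scored as a percentage from 0 to 100 and that the components combine as a simple weighted sum; the exact formula is not spelled out here, so treat both the function name `course_score` and the component keys as hypothetical.

```python
# Weights taken from the grading breakdown above; the key names are illustrative.
WEIGHTS = {
    "exams": 0.30,
    "quizzes_and_homework": 0.40,
    "project": 0.30,
}


def course_score(scores):
    """Weighted sum of component percentages (each assumed to be 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 80% on exams, 90% on quizzes and homework, 100% on the project.
example = {"exams": 80, "quizzes_and_homework": 90, "project": 100}
print(round(course_score(example), 1))
```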

FAQ

Should I take this course?

Yes, if:

Related courses elsewhere (not exhaustive!)

University of California, Berkeley, Brown University, University of Colorado, Columbia University, Cornell University, University of Illinois at Urbana-Champaign, Johns Hopkins University, University of Maryland, New York University, University of Pennsylvania, Stanford University, University of Utah, University of Wisconsin-Madison