Natural Language Processing

11-411 for undergrads | 11-611 for grads

Course Description

This course is about a variety of ways to represent human languages (like English and Chinese) as computational systems, and how to exploit those representations to write programs that do neat stuff with text and speech data, like
  • translation,
  • summarization,
  • extracting information,
  • question answering,
  • natural interfaces to databases, and
  • conversational agents.

This field is called Natural Language Processing or Computational Linguistics, and it is extremely multidisciplinary. This course will therefore include some ideas central to Machine Learning and to Linguistics.

We'll cover computational treatments of words, sounds, sentences, meanings, and conversations. We'll see how probabilities and real-world text data can help. We'll see how different levels interact in state-of-the-art approaches to applications like translation and information extraction.

From a software engineering perspective, there will be an emphasis on rapid prototyping, a useful skill in many other areas of Computer Science.

Course Prerequisites

CS courses on data structures and algorithms, and strong programming skills.


Lecture | Date | Topic | Readings | Assignments
1 | Jan 17 | Course overview; What does it mean to know language? | Chap 1 |
2 | Jan 19 | Information extraction, question answering, and NLP in IR | Chap 22.0-2, 23.0-2 |
3 | Jan 24 | Project | Slides; Notes; Example video |
4 | Jan 26 | Words, morphology, and lexicons | Chap 3.1-3.9 |
5 | Jan 31 | Language models and smoothing | Chap 4.3-8 |
6 | Feb 2 | Noisy channel models and edit distance | Chap 3.10, 3.11, 5.9 | Assignment 1 due
7 | Feb 7 | Classification | |
8 | Feb 9 | Part of speech tags | Chap 5.0-3 | Assignment 2 due; Project Initial Report due
9 | Feb 14 | Hidden Markov models | Chap 6.0-4 |
10 | Feb 16 | Syntactic representations of natural language | Chap 12.0-3 | Assignment 3 due
11 | Feb 21 | Chomsky hierarchy and natural language | Chap 16 |
12 | Feb 23 | Context-free recognition, CKY | | Assignment 4 due
13 | Feb 28 | Parsing algorithms | Chap 13 |
14 | Mar 2 | Parsing algorithms contd. | Chap 12.7, Chap 14-14.2 | Assignment 5 due
15 | Mar 7 | Treebanks and PCFGs | Chap 12.4, 14.7 |
 | Mar 9 | Midterm | Practice Problems; Practice Solutions |
 | Mar 14, 16 | Spring Break | | Project Progress Report I due 19th March 11:59pm
16 | Mar 21 | Lexical semantics | Chap 17.0-2, 19.0-3 |
17 | Mar 23 | Word embeddings/vector semantics | JM v3 Chap 19 | Assignment 6 due
18 | Mar 28 | Verb/sentence semantics | Slides A; Slides B; Chap 17.2-4, Chap 19.4-6 |
19 | Mar 30 | Compositional semantics, semantic parsing | Chap 18.1-3 | Assignment 7 due
20 | Apr 4 | Discourse, entity linking, pragmatics | Chap 21 |
21 | Apr 6 | Word Sense Disambiguation and Semantic Role Labelling | Chap 20.0-6, 20.8-11 |
22 | Apr 11 | Speech 1 | |
23 | Apr 13 | Speech 2 | |
24 | Apr 18 | Interpreting Social Media | Slides A; Slides B; Slides C; Slides D |
 | Apr 20 | | | Final project submission due (on project server)
25 | Apr 25 | Deep Learning | | Final project report due (via YouTube)
26 | Apr 27 | Machine Translation | Chap 25.0-1, 25.9 | Question evaluations due
27 | May 2 | Non-English NLP | | Answer evaluations due
28 | May 4 | Conclusion | Slides I |
 | May 11 | Final exam | Practice problems; Practice problem solutions |

Competitive Project

A major component will be the project: build a program that takes a web page P as input and outputs a set of questions about the content of P (questions a human could answer after reading P), and that can also, given a question Q about the content of P, answer it intelligently. Projects will be pitted against each other in a competition at the end of the course.
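
As a rough illustration only, the two halves of the project could sit behind an interface like the one below. The function names `generate_questions` and `answer_question` and the naive pattern-matching and word-overlap heuristics are hypothetical placeholders, not the required design; the sketch also assumes the web page P has already been reduced to plain text.

```python
import re


def split_sentences(text):
    """Naively split plain text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def generate_questions(page_text, n=3):
    """Hypothetical asking step: turn 'X is/was ...' sentences into 'What is/was X?'."""
    questions = []
    for sentence in split_sentences(page_text):
        if len(questions) >= n:
            break
        match = re.match(r"^(.+?) (is|are|was|were) .+[.!]$", sentence)
        if match:
            questions.append(f"What {match.group(2)} {match.group(1)}?")
    return questions


def answer_question(page_text, question):
    """Hypothetical answering step: return the sentence sharing the most words with Q."""
    sentences = split_sentences(page_text)
    if not sentences:
        return ""
    q = tokens(question)
    return max(sentences, key=lambda s: len(q & tokens(s)))


if __name__ == "__main__":
    page = ("Pittsburgh is a city in Pennsylvania. It has three rivers. "
            "Carnegie Mellon University was founded in 1900.")
    print(generate_questions(page))
    print(answer_question(page, "When was Carnegie Mellon University founded?"))
```

A competitive system would replace both heuristics with real linguistic analysis (parsing, coreference, semantic matching); the sketch only shows the input/output contract described above.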


Students will be evaluated by exam (midterm and final, totaling 40%), regular short quizzes and weekly pencil-and-paper or small programming homework problems (30% together), and the group project (30%).


Should I take this course?

Yes, if:

  • you're a CS student interested in languages, language technology, or information processing
  • you're a CS student who needs an "applications" credit
  • you're a language technology minor (this course is an elective option)
  • you're a linguistics student who can write computer programs (this course is an elective option)
  • you always suspected natural language was kind of like Lisp (or Java or ...)
  • you want computers to take over the world
  • you don't want computers to take over the world, but if they do, you want to negotiate your release
  • you like AI, machine learning, and/or theoretical computer science, and want to apply them to a hard real-world problem

Related courses elsewhere (not exhaustive!)

University of California, Berkeley, Brown University, University of Colorado, Columbia University, Cornell University, University of Illinois at Urbana-Champaign, Johns Hopkins University, University of Maryland, New York University, University of Pennsylvania, Stanford University, University of Utah, University of Wisconsin-Madison