Spring 2020
T/Th 1:30-3 pm, Wean Hall 2302
Remote Classes until further notice
Yulia Tsvetkov (office hours: By appointment, GHC 6405), ytsvetko@cs.cmu.edu
David Mortensen (office hours: By appointment, GHC 5407), dmortens@cs.cmu.edu
Teaching Assistants:
Chan Young Park (office hours: Thu 5-6pm, Zoom), chanyoun@cs.cmu.edu
Lexi Luo (office hours: Wed 2-3pm, Zoom), xuechunl@cs.cmu.edu
Forum: Piazza
This course will explore current statistical techniques for the automatic analysis of natural (human) language data. The dominant modeling paradigm is corpus-driven statistical learning, with a split focus between supervised and unsupervised methods. This term we are making Algorithms for NLP a lab-based course. Instead of homework assignments and exams, you will complete two hands-on coding projects. This course assumes a good background in basic probability and Python programming. Prior experience with linguistics or natural languages is helpful, but not required. There will be a lot of statistics, algorithms, and coding in this class.
Slides, materials, and projects for this iteration of Algorithms for NLP are borrowed from Dan Jurafsky at Stanford, Dan Klein and David Bamman at UC Berkeley, and Nathan Schneider at Georgetown University.
The lecture plan is subject to change.
| Week | Date | Topics | Readings | HWs/Quizzes |
|---|---|---|---|---|
| 1 | Jan 14 | Course Introduction [slides] | | |
| | Jan 16 | Language Modeling [slides] | J+M III 3 | |
| 2 | Jan 21 | Morphology I [slides] | J+M II 3.1-3.9 | Quiz 1 |
| | Jan 23 | Morphology II | | |
| 3 | Jan 28 | Word Embeddings I [slides] | J+M III 6, Turney and Pantel'10, Brown | Quiz 2 |
| | Jan 30 | Word Embeddings II [slides] | FastText, ELMo | |
| 4 | Feb 4 | Lexical Semantics [slides] | J+M II 17.0-2, 19.0-3 | Quiz 3 |
| | Feb 6 | Parts of Speech [slides] | J+M II 5.0-3 | |
| 5 | Feb 11 | POS Tagging [slides] | J+M II 6.0-4 | Quiz 4 |
| | Feb 13 | HMMs, POS, NER [slides] | J+M 5, J+M III Appendix A, Collins notes, Brants, Toutanova & Manning | HW1 release |
| 6 | Feb 18 | Formal Grammars [slides] | J+M 12.0–3 | Quiz 5 |
| | Feb 20 | Parsing 1 [slides] | | |
| 7 | Feb 25 | Parsing 2 [slides] | J+M III Ch 14, Chen & Manning 2014, Dyer et al 2015 | Quiz 6 |
| | Feb 27 | Parsing 3 [slides] | Split, Lexicalized, K-Best A* | |
| 8 | Mar 3 | Semantics and Discourse 1 [slides] | J+M II 17.2-4, 19.4-6 | Quiz 7 |
| | Mar 5 | Semantics and Discourse 2 [slides] | J+M II 20 | HW1 deadline |
| 9 | Mar 10 | No Class (Spring Break) | | |
| | Mar 12 | No Class (Spring Break) | | |
| 10 | Mar 17 | Cancelled | J+M II 18.1-3 | |
| | Mar 19 | Semantics and Discourse 3 [slides] | | |
| 11 | Mar 24 | Semantics and Discourse 4 [slides] | | Quiz 8 |
| | Mar 26 | Pragmatics | J+M II 4, 21 | |
| 12 | Mar 31 | Sentiment Analysis [slides] | | Quiz 9, HW2 release |
| | Apr 2 | Intro to Neural Networks [slides] | | |
| 13 | Apr 7 | Sentiment Analysis 2 | | |
| | Apr 9 | Machine Translation [slides] | | Quiz 10 |
| 14 | Apr 14 | Machine Translation | | |
| | Apr 16 | Text Summarization [slides] | | |
| 15 | Apr 21 | Multilingual NLP [slides] | | |
| | Apr 23 | Low-resource NLP [slides] | | HW2 deadline |
| 16 | Apr 28 | Ethics [slides] | | |
| | Apr 30 | No Class | | |
The primary recommended texts for this course are:
Make sure you get the purple 2nd edition of J+M, not the white 1st edition.
Quiz. (50%; 5% each) We’ll have an in-class quiz each week, covering the material from the previous week. There will be 10 quizzes in total, and each quiz is worth 5% of the overall grade.
Project. (50%; 25% each) There will be two Python programming projects: one for POS tagging and one for sentiment analysis. Detailed submission instructions will be given when each project is released.
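To make the weighting above concrete, here is a minimal sketch of how the overall grade could be computed from the stated percentages. The function name `overall_grade` and the convention that each score is a fraction between 0 and 1 are assumptions for illustration only; the actual grade bookkeeping used by the course staff is not specified here.

```python
# Weights taken from the grading description above.
QUIZ_WEIGHT = 0.05      # each of the 10 quizzes is worth 5%
PROJECT_WEIGHT = 0.25   # each of the 2 projects is worth 25%
NUM_QUIZZES = 10
NUM_PROJECTS = 2


def overall_grade(quiz_scores, project_scores):
    """Return the overall grade as a percentage in [0, 100].

    Hypothetical helper: each score is assumed to be a fraction in [0, 1].
    """
    if len(quiz_scores) != NUM_QUIZZES:
        raise ValueError(f"expected {NUM_QUIZZES} quiz scores")
    if len(project_scores) != NUM_PROJECTS:
        raise ValueError(f"expected {NUM_PROJECTS} project scores")
    weighted = (QUIZ_WEIGHT * sum(quiz_scores)
                + PROJECT_WEIGHT * sum(project_scores))
    return 100 * weighted


if __name__ == "__main__":
    # Example: 80% on every quiz, 90% and 100% on the two projects.
    print(round(overall_grade([0.8] * 10, [0.9, 1.0]), 1))
```

Under this sketch, one missed quiz costs at most 5 points of the overall grade, while one missed project costs up to 25.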
Late policy. Each student will be granted 5 late days to use over the duration of the semester. There are no restrictions on how the late days can be used (e.g., all 5 could be used on one project). Using late days will not affect your grade. However, projects submitted late after all late days have been used will receive no credit. Be careful!
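The late-day budget can be sketched as simple bookkeeping. This is an illustration only: the function name `apply_late_days` is hypothetical, lateness is assumed to be counted in whole days, and a submission that exceeds the remaining budget is assumed to receive no credit without consuming any late days. Check with the instructors for how edge cases are actually handled.

```python
TOTAL_LATE_DAYS = 5  # budget per student for the whole semester


def apply_late_days(days_late_per_project, budget=TOTAL_LATE_DAYS):
    """Return (credit_flags, remaining_days) for projects in submission order.

    credit_flags[i] is True if project i still receives credit.
    Assumption: an over-budget submission gets no credit and uses no days.
    """
    remaining = budget
    credit_flags = []
    for days_late in days_late_per_project:
        if days_late < 0:
            raise ValueError("days late cannot be negative")
        if days_late <= remaining:
            remaining -= days_late   # covered by late days, no penalty
            credit_flags.append(True)
        else:
            credit_flags.append(False)  # late days exhausted: no credit
    return credit_flags, remaining


if __name__ == "__main__":
    # Using 3 days on the first project leaves 2, so 3 more days is too many.
    print(apply_late_days([3, 3]))
```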
Academic honesty. Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as well as re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. Suspected violations of academic integrity rules will be handled in accordance with the CMU guidelines on collaboration and cheating.
Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. CMU services are available, and treatment does work. You can learn more about confidential mental health services available on campus at: http://www.cmu.edu/counseling/. Support is always available (24/7) from Counseling and Psychological Services: 412-268-2922.
Accommodations for Students with Disabilities:
If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.