Computational Ethics for NLP

CMU CS 11830, Spring 2019

T/Th 10:30-11:50am, POS 146

Yulia Tsvetkov (office hours: Tuesday 12-1pm, GHC 6405), ytsvetko@cs.cmu.edu
Alan W Black (office hours: Wednesday 12-1pm, GHC 5701), awb@cs.cmu.edu
TA: Anjalie Field (office hours: Thursday 3-4pm, GHC 6609), anjalief@cs.cmu.edu

Summary

As language technologies have become increasingly prevalent, there is a growing awareness that the decisions we make about our data, methods, and tools are often tied up with their impact on people and societies. This course introduces students to real-world applications of language technologies and the potential ethical implications associated with them. We discuss philosophical foundations of ethical research along with advanced state-of-the-art techniques. Discussion topics include misrepresentation and bias, civility in communication, privacy and security, the language of manipulation, intellectual property, fairness in machine learning, and NLP for social good.


Announcements


Syllabus

The schedule is subject to change.

Week Date Theme Topics Readings Work due
1 1/15 Introduction Motivation, requirements, and overview. [slides] Hovy & Spruit (2016); Barocas & Selbst (2016); Gebru et al. (2018)
1/17 Foundations Philosophical foundations. [slides]
2 1/22 Foundations Philosophical foundations, history: medical, psychological experiments, IRB and human subjects [slides]
1/24 Misrepresentation and Bias Stereotypes, prejudice, and discrimination: background [slides] Dwork et al. (2011); Bolukbasi et al. (2016) HW1 released
3 1/29 Misrepresentation and Bias Stereotypes, prejudice, and discrimination. Debiasing. [slides] Zhao et al. (2017); Rudinger et al. (2017); Zhao et al. (2018)
1/31 [Class cancelled due to weather] Project proposal due
4 2/5 Misrepresentation and Bias Invited student speakers: Microaggressions (Luke, Aldrian, Emily) [slides]; Bias in news reports (Anjalie) [slides] Breitfeller et al. (2019) [email instructors to get a draft]; Field et al. (2019)
2/7 Civility in Communication Identification of trolling, hate speech, abusive language, toxic comments [bias slides, hate speech slides] Voigt et al. (2017); Sap et al. (2017); Joseph et al. (2017); Massaro (1990) HW1 due; HW2 released
5 2/12 Project proposal presentations Blitz proposal presentations by students
2/14 Misrepresentation and Bias Invited speaker: Kody Manke. Psychological perspective on bias, stereotype threat
6 2/19 Civility in Communication Identification of trolling, hate speech, abusive language, toxic comments [slides] Warner & Hirschberg (2012); Schmidt & Wiegand (2017); Karlekar & Bansal (2018); Cheng et al. (2017)
2/21 Civility in Communication Invited speaker Shrimai Prabhumoye: Hate speech and bias in conversational agents [slides]; Class discussion on OpenAI language model [slides] Fessler (2017); Henderson et al. (2017); Metz (2017) HW2 due
7 2/26 Privacy and Security Demographic inference techniques [slides] Bamman et al. (2012); Jurgens et al. (2017); Sec. 3 in Nguyen et al. (2016)
2/28 Privacy and Security Anonymization, differential privacy, obfuscation of text attributes [slides]
8 3/5 The Language of Manipulation Computational propaganda [slides]
3/7 The Language of Manipulation Computational solutions [slides] King et al. (2017); Field et al. (2018); Starbird (2018)
9 3/12 Spring Break [No class]
3/14 Spring Break [No class]
10 3/19 The Language of Manipulation Targeted ads, fake news, US elections [slides]
3/21 Class Discussion Case study: Cambridge Analytica article in The Guardian; "Brexit: The Uncivil War", Channel 4 film (2019) HW3 released
11 3/26 Midterm project presentations Student presentations
3/28 Midterm project presentations Student presentations Midterm report due
12 4/2 Intellectual Property Plagiarism and plagiarism detection. Patents [slides]
4/4 Ethical Research Practices. NLP for Social Good. Pitfalls in statistical analysis. Low-resource NLP [slides]. Lorelei [slides] HW3 due; HW4 released
13 4/9 NLP for Social Good Invited speaker: Alex Hauptman
4/11 No class [Spring Carnival]
14 4/16 NLP for Social Good Endangered Languages [slides]
4/18 Fairness in ML Invited speaker: Maria De-Arteaga [slides] Swinger et al. (2019); De-Arteaga et al. (2019); Romanov et al. (2019) HW4 due
15 4/23 Class Discussion Code of Ethics; Model Report Cards
4/25 [No class] Reserved for project preparations
16 4/30 Final Project Presentations
5/2 Final Project Presentations
5/8 Final Report due

Readings

TBD


Grading

Homework assignments (4 assignments; 15% each): annotation and coding assignments focusing on important subproblems from the topics discussed in class.

For example:

  1. annotation of bias in directed speech;
  2. classification of different types of bias or hate speech;
  3. generation of paraphrases that anonymize or obfuscate demographic properties of a writer;
  4. computational analysis of newspaper corpora.

In assignments, students will be given datasets, a baseline algorithm to implement, and code for automatic evaluation. Submissions will be evaluated on test data. A baseline solution will earn a passing grade (B); additional credit will be given for creative solutions that surpass the baseline.
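To give a concrete sense of what such a baseline might involve, below is a minimal sketch for an assignment in the style of example (2), classifying hate speech or toxic comments. It is illustrative only, not the course's actual starter code: the file name train.tsv, the two-column TSV format, and the use of scikit-learn are all assumptions.

```python
# Hypothetical baseline sketch for a toxic-comment classification assignment.
# File name, data format, and model choice are illustrative assumptions,
# not the course's provided starter code or evaluation setup.
import csv

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def load_tsv(path):
    """Read (text, label) pairs from a two-column tab-separated file."""
    texts, labels = [], []
    with open(path, encoding="utf-8") as f:
        for text, label in csv.reader(f, delimiter="\t"):
            texts.append(text)
            labels.append(label)
    return texts, labels


texts, labels = load_tsv("train.tsv")  # assumed training file
train_x, dev_x, train_y, dev_y = train_test_split(
    texts, labels, test_size=0.2, random_state=0
)

# Bag-of-words features plus logistic regression: a typical simple baseline.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(train_x), train_y)

preds = clf.predict(vectorizer.transform(dev_x))
print("dev macro-F1:", f1_score(dev_y, preds, average="macro"))
```

In the actual assignments, the provided evaluation code and held-out test data would replace the local dev split used here; credit beyond the baseline grade comes from solutions that improve on this kind of starting point.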

Project (30%): a semester-long team project, normally in groups of three (see below).

Participation in class (10%): classes will include discussions of reading assignments; students are expected to read the papers, participate in discussions, and lead one discussion.


Projects

A major component of the course is the team project: a substantial research effort carried out by groups of students (expected group size is 3; 2-4 is acceptable). See this document for possible project ideas. All assignments are due at 11:59pm on the specified date.

All project components should be submitted via email (ethicalnlpcmu@gmail.com). Each team should send one email, CC’ing all group members. The project requirements are:


Policies

Late policy. A penalty of 10% will be applied to homework assignments submitted up to 24 hours late; no credit will be given for homework submitted more than 24 hours after it is due.

Academic honesty. Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as is re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work; you must note the names of anyone you collaborated with on each problem and cite any resources you used to learn about the problem. The project is to be completed by a team. You are encouraged to use existing NLP components in your project; you must acknowledge these appropriately in the documentation. Suspected violations of academic integrity rules will be handled in accordance with the CMU guidelines on collaboration and cheating.


Note to Students

Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. CMU services are available, and treatment does work. You can learn more about confidential mental health services available on campus at: http://www.cmu.edu/counseling/. Support is always available (24/7) from Counseling and Psychological Services: 412-268-2922.

Accommodations for Students with Disabilities:

If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.