Computational Ethics for NLP

CMU CS 11830, Spring 2019

T/Th 10:30-11:50am, POS 146

Yulia Tsvetkov (office hours: Tuesday 12-1pm, GHC 6405), ytsvetko@cs.cmu.edu
Alan W Black (office hours: Wednesday 12-1pm, GHC 5701), awb@cs.cmu.edu
TA: Anjalie Field (office hours: Thursday 3-4pm, GHC 6609), anjalief@cs.cmu.edu

Summary

As language technologies have become increasingly prevalent, there is a growing awareness that the decisions we make about our data, methods, and tools are often tied up with their impact on people and societies. This course introduces students to real-world applications of language technologies and the potential ethical implications associated with them. We discuss philosophical foundations of ethical research along with advanced state-of-the-art techniques. Discussion topics include misrepresentation and bias, civility in communication, privacy and security, the language of manipulation, intellectual property, fairness in machine learning, and NLP for social good.


Announcements


Syllabus

The schedule is subject to change.

Week Date Theme Topics Readings Work due
1 1/15 Introduction Motivation, requirements, and overview. [slides] Hovy & Spruit (2016); Barocas & Selbst (2016); Gebru et al. (2018)
1/17 Foundations Philosophical foundations. [slides]
2 1/22 Foundations Philosophical foundations, history: medical, psychological experiments, IRB and human subjects [slides]
1/24 Misrepresentation and Bias Stereotypes, prejudice, and discrimination: background [slides] Dwork et al. (2011); Bolukbasi et al. (2016) HW1 released
3 1/29 Misrepresentation and Bias Stereotypes, prejudice, and discrimination. Debiasing. [slides] Zhao et al. (2017); Rudinger et al. (2017); Zhao et al. (2018)
1/31 [Class cancelled due to weather] Project proposal due
4 2/5 Misrepresentation and Bias Invited student speakers: Microaggressions (Luke, Aldrian, Emily) [slides]; Bias in news reports (Anjalie) [slides] Breitfeller et al. (2019) [email instructors to get a draft]; Field et al. (2019)
2/7 Civility in Communication Identification of trolling, hate speech, abusive language, toxic comments [bias slides, hate speech slides] Voigt et al. (2017); Sap et al. (2017); Joseph et al. (2017); Massaro (1990) HW1 due; HW2 released
5 2/12 Project proposal presentations Blitz proposal presentations by students
2/14 Misrepresentation and Bias Invited speaker: Kody Manke. Psychological perspective on bias, stereotype threat
6 2/19 Civility in Communication Identification of trolling, hate speech, abusive language, toxic comments [slides] Warner & Hirschberg (2012); Schmidt & Wiegand (2017); Karlekar & Bansal (2018); Cheng et al. (2017)
2/21 Civility in Communication Invited speaker Shrimai Prabhumoye: Hate speech and bias in conversational agents [slides]; Class discussion on OpenAI language model [slides] Fessler (2017); Henderson et al. (2017); Metz (2017) HW2 due
7 2/26 Privacy and Security Demographic inference techniques [slides] Bamman et al. (2012); Jurgens et al. (2017); Sec. 3 in Nguyen et al. (2016)
2/28 Privacy and Security Anonymization, differential privacy, obfuscation of text attributes [slides]
8 3/5 The Language of Manipulation Computational propaganda [slides]
3/7 The Language of Manipulation Computational solutions [slides] King et al. (2017); Field et al. (2018); Starbird (2018)
9 3/12 Spring Break [No class]
3/14 Spring Break [No class]
10 3/19 The Language of Manipulation Targeted ads, fake news, US elections [slides]
3/21 Class Discussion Case study: Cambridge Analytica article in The Guardian; "Brexit: The Uncivil War", Channel 4 film (2019) HW3 released
11 3/26 Midterm project presentations Student presentations
3/28 Midterm project presentations Student presentations Midterm report due
12 4/2 Intellectual Property Plagiarism and plagiarism detection. Patents [slides]
4/4 Ethical Research Practices. NLP for Social Good. Pitfalls in statistical analysis. Low-resource NLP [slides]. Lorelei [slides] HW3 due; HW4 released
13 4/9 NLP for Social Good Invited speaker: Alex Hauptman
4/11 No class [Spring Carnival]
14 4/16 NLP for Social Good Endangered Languages [slides]
4/18 Fairness in ML Invited speaker: Maria De-Arteaga [slides] Swinger et al. (2019); De-Arteaga et al. (2019); Romanov et al. (2019) HW4 due
15 4/23 Class Discussion Code of Ethics; Model Report Cards
4/25 [No class] Reserved for project preparations
16 4/30 Final Project Presentations
5/2 Final Project Presentations
5/8 Final Report due

Readings

TBD


Grading

Homework assignments (4 assignments; 15% each): annotation and coding assignments focusing on important subproblems from the topics discussed in class.

For example:

  1. annotation of bias in directed speech;
  2. classification of different types of bias or hate speech;
  3. generation of paraphrases that anonymize or obfuscate demographic properties of a writer;
  4. computational analysis of newspaper corpora.

In assignments, students will be given datasets, a baseline algorithm to implement, and code for automatic evaluation. Submissions will be evaluated on test data. A baseline solution will earn a passing grade (B); additional credit will be given for creative solutions that surpass the baseline.
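To give a concrete sense of what such a baseline might involve, below is a minimal sketch for an assignment in the style of example (2), classifying hate speech or toxic comments. It is illustrative only, not the course's actual starter code: the file name train.tsv, the two-column TSV format, and the use of scikit-learn are all assumptions.

```python
# Hypothetical baseline sketch for a toxic-comment classification assignment.
# File name, data format, and model choice are illustrative assumptions,
# not the course's provided starter code or evaluation setup.
import csv

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def load_tsv(path):
    """Read (text, label) pairs from a two-column tab-separated file."""
    texts, labels = [], []
    with open(path, encoding="utf-8") as f:
        for text, label in csv.reader(f, delimiter="\t"):
            texts.append(text)
            labels.append(label)
    return texts, labels


texts, labels = load_tsv("train.tsv")  # assumed training file
train_x, dev_x, train_y, dev_y = train_test_split(
    texts, labels, test_size=0.2, random_state=0
)

# Bag-of-words features plus logistic regression: a typical simple baseline.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(train_x), train_y)

preds = clf.predict(vectorizer.transform(dev_x))
print("dev macro-F1:", f1_score(dev_y, preds, average="macro"))
```

In the actual assignments, the provided evaluation code and held-out test data would replace the local dev split used here; credit beyond the baseline grade comes from solutions that improve on this kind of starting point.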

Project (30%): a semester-long team project, normally in groups of three (see below).

Participation in class (10%): classes will include discussions of reading assignments; students are expected to read the papers, participate in discussions, and lead one discussion.


Projects

A major component of the course is the team project: a substantial research effort carried out by groups of students (expected group size is 3; 2-4 is acceptable). See this document for possible project ideas. All assignments are due at 11:59pm on the specified date.

All project components should be submitted via email (ethicalnlpcmu@gmail.com). Each team should send one email, CC’ing all group members. The project requirements are:


Policies

Late policy. A penalty of 10% will be applied to homework assignments submitted up to 24 hours late; no credit will be given for homework submitted more than 24 hours after it is due.

Academic honesty. Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as is re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work; you must note the names of anyone you collaborated with on each problem and cite any resources you used to learn about the problem. The project is to be completed by a team. You are encouraged to use existing NLP components in your project; you must acknowledge these appropriately in the documentation. Suspected violations of academic integrity rules will be handled in accordance with the CMU guidelines on collaboration and cheating.


Note to Students

Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. CMU services are available, and treatment does work. You can learn more about confidential mental health services available on campus at: http://www.cmu.edu/counseling/. Support is always available (24/7) from Counseling and Psychological Services: 412-268-2922.

Accommodations for Students with Disabilities:

If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at access@andrew.cmu.edu.