Local Explanations for Deep Learning Models

CS 6966/5966, Fall 2023
MoWe / 4:35-5:55 PM, WEB 1230, Recordings

Instructor: Ana Marasović

Office Hours: Wed 12-1p (MEB 2166)

Today it is not unusual to ask chatbots such as ChatGPT to answer a question and explain the answer in plain English. In this course, we will review five types of methods for explaining individual predictions of deep learning models for NLP/computer vision tasks that preceded this remarkable achievement. We will then revisit evaluations of these methods and focus on how to develop and evaluate explainability methods that best accomplish a concrete, real-world utility. For more information, read the FAQ.

Through this website we will share:

Schedule & lecture materials (slides, recordings)
Info about paper discussions and projects
FAQ before enrolling

The syllabus, announcements, grades, and policies are shared on Canvas. Communication happens through Piazza. You will turn in your assignments on Gradescope.

Calendar

The calendar and readings are tentative and subject to change.

Date	Theme	Topic	Lecture Materials	Work due
8/21		Course Logistics	[recording] [slides] [readings]
8/23	Background	Transformer	[recording] [slides] [readings]
8/28	How certain is the model in its prediction?	Pretrain-Finetune; Uncertainty quantification	[recording] [slides] [readings]
8/30	In plain English, why is this input assigned this label?	Prompting; Instruction-finetuning; Chain-of-Thought	[recording] [slides] [readings]	HW1 (by 8/31)
9/4		Labor Day (no class)
9/6	cont.	Free-text explanations; Human evaluation	[recording] [slides] [readings]
9/11	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up
9/13	Which part of the input led to a prediction?	Attribution; Gradient-based; Select-then-predict	[recording] [slides] [readings]	HW2 (by 9/14)
9/18	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up
9/20	Which human-interpretable concept led to the prediction?	Pairwise interaction features (attention-based); Concept-based explanations	[recording] [slides] [readings]
9/25	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up
9/27	Which training examples caused the prediction?	Data influence; Influence functions	[recording] [slides] [readings]	HW3 (by 9/28)
10/2	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up
10/4	Which part of the input should be changed to change the prediction to a given label?	Contrastive explanations; Contrastive editing	[recording] [slides] [readings]	HW4 (by 10/5)
10/9		Fall Break (no class)
10/11		Fall Break (no class)
10/16	Application-grounded evaluations of explanations	Reliance and complementary human-AI team performance	[recording] [slides] [readings]
10/18	In-class group work	Project framing		Project proposal
10/23	cont.	Prerequisites, causes, & goals of human trust in AI	[recording] [slides] [readings]
10/25	cont.	Experimental design for human evaluations, use cases, and challenges (asynchronous learning)	[readings]
10/30	cont.	Challenges in fostering (dis)trust in AI	[recording] [slides] [readings]
11/1	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up; 2-week project report by 11/2
11/6	In-class group work	Peer feedback		Report of changes
11/8	Explainability as a dialog	Principles, roadmap, risks, & opportunities	[recording] [slides] [readings]
11/13	Asynchronous group work	Work on projects
11/15	cont.	In-class paper discussion	[slides] [readings]	Presentation / write-up
11/20	In-class group work	Prepare a user study	[slides]	User study v1
11/22	In-class group work	Peer user studies		User study w/ real values
11/27	Looking back	Looking back	[recording] [slides]	Poster v1 by 11/29
11/29	In-class group work	Finalizing posters		Poster final version
12/4	Exam	Exam
12/6		Poster presentations		Poster questionnaire

FAQ Before Enrolling

Is attending the class in person mandatory?

Six paper discussions, five in-class group activities, the exam, and the poster presentations must be attended in person. Given that this makes 13 class sessions in total, the instructor does not recommend enrolling in this class if you can’t attend it in person.

What will you learn in this course?

Train a transformer-based model for an NLP or computer vision application, obtain its predictions, and apply common explainability methods (covered in lectures) to explain the predictions. Four homework assignments are designed to work on this.
Define utility/function of an explanation for a given application. This course should teach you to ask: “What will best accomplish the explanatory functions [I hope to achieve in this case]?” (specific; application-based) instead of “Is this an appropriate explanation [in this case]?” (generic; very common in current ML research). A successful project proposal (a component of the grade) has to define an appropriate function.
Identify and apply a method that creates appropriate explanations for a desired utility/function. A successful intermediate project status report should identify such a method.
Conduct a user study to evaluate produced explanations based on their utility. The course has a dedicated session for preparing an interface for a user study and another session for conducting it with class peers. Having a user study ready for class is a component of the project grade, as is a successful final project presentation that shows how well the explanations accomplish the explanatory function according to the study.
Read cutting-edge research publications that require familiarity with the principles and concepts of explainable ML. There will be six class sessions devoted to paper discussions. This part of the course will be organized as a role-playing paper reading seminar with 5 regular roles: original author, scientific peer reviewer, archaeologist, imaginative researcher, and original author of a related paper. Those who are not assigned to be the official presenters will either play a wild card role (any role they’d like to be, such as an industry practitioner, a cranky researcher, etc.) or complete a written artifact (study notes, a blog post, an opinion piece, or a scribed summary of the class discussion). This format requires everyone to actually read the papers and engage in the discussion about them. The paper discussion is one of the main components of the grade.
Follow an *ACL/EMNLP (leading NLP conferences) review style to properly review a research paper on explainable ML. One of the roles in the role-playing paper discussion is “scientific peer reviewer”. You will take that role once and your review will contribute to your paper discussion grade.

How will you be evaluated?

Your performance in this course will be evaluated by:

Which modalities does this course cover?

We will almost exclusively talk about applications in NLP (so text) and in computer vision with static images. Inputs in these domains are represented with embeddings: high-dimensional vectors of floating-point numbers whose individual dimensions are not interpretable. If you are interested in applications that fall under data science, you will instead likely work with “meaningful” features such as a person’s income or the zip code of a location. This course is not about such applications. We hope you find the course useful even if the data you work with is handled differently from text or images. It can be inspiring to think about whether these methods can be applied to a different domain, and even realizing that they cannot can be useful.

Which machine learning models does this course cover?

We focus on deep learning models (deep neural networks) and we will almost solely talk about transformer-based models.

Local vs. global explanations?

This course will not focus on global methods, such as probing, that analyze models’ behavior and internals. We focus on methods that answer questions such as:

Which part of the input led to assigning this label?
How should the input be edited to change the model’s answer to something else?
In plain English, why is this input assigned this label?
Which training examples caused the prediction?
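
As a rough illustration of the first question above, one simple attribution idea is to delete one token at a time and measure how the prediction changes (often called leave-one-out or occlusion). The sketch below is not taken from the course materials: the toy bag-of-words scorer, its weights, and the function names are all hypothetical stand-ins for a real transformer-based model.

```python
import math

# Hypothetical toy "model": a bag-of-words logistic scorer for sentiment.
# A real course project would use a trained transformer here instead.
TOY_WEIGHTS = {"great": 2.0, "boring": -2.5, "not": -1.0, "movie": 0.1}


def predict_positive_prob(tokens):
    """Return P(positive) for a list of tokens under the toy model."""
    score = sum(TOY_WEIGHTS.get(tok, 0.0) for tok in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def leave_one_out_attribution(tokens):
    """Score each token by how much removing it lowers P(positive).

    A positive score means the token pushed the prediction toward
    "positive"; a negative score means it pushed away from it.
    """
    base = predict_positive_prob(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        occluded = tokens[:i] + tokens[i + 1:]  # input without token i
        attributions.append((tok, base - predict_positive_prob(occluded)))
    return attributions


tokens = "a great but boring movie".split()
for tok, score in leave_one_out_attribution(tokens):
    print(f"{tok:>8s} {score:+.3f}")
```

This explains a single prediction for a single input, which is what makes it a local method; it says nothing about the model’s behavior overall.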

Formal pre-requisites? How to prepare?

This course doesn’t have formal pre-requisites because these days one can learn about machine learning and adjacent topics in many different ways, but we expect that you…

…are experienced with programming in Python,
…are comfortable with basic calculus, probability, and linear algebra,
…have solid machine learning foundations,
…have some familiarity with PyTorch,
…are acquainted with Deep Learning 101.

If you completed CS 5353/6353 (Deep Learning) or CS 5340/6340 (Natural Language Processing) or CS 5350/6350 (Machine Learning), we expect you will be able to keep up.

My advice: If you are interested in the course, give it a try. We will spend the first two weeks going over the background and have a graded programming assignment about it. If you struggle with the background concepts and the first homework, you can withdraw—students may drop a course within the first two weeks of a given semester without any penalties.

Revisiting/polishing your knowledge. You can prepare with the following resources:

There are a ton of Python resources for people with some programming experience. Check them out here. My colleagues suggest these: 1, 2, 3, and 4.
Math and machine learning basics are nicely covered in the first part of the Deep Learning book. You can use the same book to familiarize yourself with deep learning, especially Chapter 6 and Chapter 8, which are a must for this course.
Deep Learning with PyTorch: A 60 Minute Blitz (highly recommended)
Practical Deep Learning for Coders by Fast.ai (3: Neural net foundations; 5: From-scratch model; 13: Backpropagation & MLP; 14: Backpropagation)
Getting started with NLP for absolute beginners is a good way to familiarize yourself with NLP. During the semester we will use Hugging Face