Project
The centerpiece of this class is a group project in which your team of 5 students will take on the role of a Trust and Safety team at a corporate AI or social media company. Your group will pick a real safety problem on a real platform, study it, speak to victims (if possible), design a mitigation for the issue, implement it, test it, and then present your solution to real professionals at our final poster session on Tuesday, June 2nd from 5–7pm in the CoDa Sunken Courtyard.
Topics from previous years have included:
- Online cryptocurrency scams
- AI nudifiers and deepfakes
- AI chatbots being inappropriate with children
- Live streaming of terrorist attacks
- Government propaganda against domestic minorities
- Disinformation in online ads
- Coordinated harassment of journalists
- Sextortion
- Trading of child sexual abuse materials
- Hate speech on a streaming platform
- Terrorist recruitment
This year, students will be allowed to pick their teammates, provided each team makes a reasonable attempt to balance students from the CS and non-CS listings of the class (final team size and ratio will be set after the add/drop deadline). Groups will be finalized the week of April 20th. This year, we are adding substantially more structure to the software engineering components, as the goal is to simulate the workings of a real tech company. You will be setting up a CI/CD pipeline, writing a PRD in GitHub, using pull requests and code reviews, filing tickets, and using AI coding tools (Claude Code, GitHub Copilot, Cursor, or similar) like a real startup. Working software engineers — several of whom are alumni of this course — will run evening Zoom sessions to assist with this setup and provide mentorship as needed.
Two supported project paths
This year we support two first-class paths for your abuse topic:
- Human-abuser: humans harming humans online — harassment, scams, CSAM, disinformation, hate speech, etc.
- AI-as-abuser: an AI model itself behaves harmfully — a chatbot that grooms minors, a model that gives dangerous medical advice, a deepfake generator, an AI-powered scam bot. Teams on this path stand up a harmful model (e.g., an open-weight model from HuggingFace with safety tuning removed, or a standard open model with an adversarial system prompt) and build the guardrail / classifier / moderation layer that sits in front of it.
Both paths share the same deliverables and grading.
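For teams on the AI-as-abuser path, the guardrail layer can be thought of as a wrapper that screens both the user's input and the model's output. A minimal sketch is below; the "model" and the keyword classifier are hypothetical stand-ins (in the real project, the model would be your deployed open-weight LLM and the classifier a trained moderation model):

```python
# Minimal sketch of a guardrail layer in front of a harmful model.
# harmful_model() and is_harmful() are placeholders for illustration only.

BLOCKED_MESSAGE = "This response was blocked by the safety layer."

def harmful_model(prompt: str) -> str:
    # Placeholder "harmful model": echoes the prompt. A real team would
    # call their deployed open-weight model here.
    return f"model reply to: {prompt}"

def is_harmful(text: str) -> bool:
    # Placeholder classifier: a keyword list standing in for a trained
    # traditional-ML or LLM-based moderation classifier.
    flagged_terms = {"scam", "nude", "weapon"}
    return any(term in text.lower() for term in flagged_terms)

def guarded_chat(prompt: str) -> str:
    # Screen the user input before it reaches the model...
    if is_harmful(prompt):
        return BLOCKED_MESSAGE
    reply = harmful_model(prompt)
    # ...and screen the model output before it reaches the user.
    if is_harmful(reply):
        return BLOCKED_MESSAGE
    return reply
```

The key design point is that the moderation layer is a separate component from the model, so the same guardrail can be evaluated and swapped independently of the model it protects.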
Milestone structure
The group project has three graded milestones. Milestone 1 is individual and is posted during the first week of class; you pick an area of interest, research it, make a slide deck, and record a 5–8 minute pitch video. Upload the PDF and video to Canvas — no links.
Milestone 2 is the group’s PRD, GitHub infrastructure, CI/CD pipeline, and MVP prototype. This year we grade on real software engineering practices: all five teammates commit, branch protection and PRs on main, issues tracked in GitHub, and automatic deployment to Google Cloud, Vercel, or Cloudflare on merge. Teams building on Google Cloud should claim the Google Cloud Education credits we’ve provided.
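As a concrete illustration of the deploy-on-merge requirement, here is a hypothetical GitHub Actions workflow (`.github/workflows/deploy.yml`). The Node.js toolchain, test command, and Vercel deploy step are assumptions; substitute your team's actual build, test, and deployment commands and platform:

```yaml
# Hypothetical CI/CD sketch: run tests on every PR, deploy on merge to main.
name: ci-cd
on:
  push:
    branches: [main]
  pull_request:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
  deploy:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Deploy step depends on your platform; this assumes the Vercel CLI
      # with a VERCEL_TOKEN repository secret configured.
      - run: npx vercel --prod --token "$VERCEL_TOKEN"
        env:
          VERCEL_TOKEN: ${{ secrets.VERCEL_TOKEN }}
```

Because the `deploy` job is gated on `needs: test` and on the `main` ref, merging a reviewed PR is what triggers a production deploy, which is the behavior we grade for.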
Milestone 3 is the working abuse prevention system — both a user-facing UI and a moderator UI, deployed — plus a classifier evaluation across traditional ML, LLM, and hybrid approaches, and the poster for the final session. There is no separate writeup for Milestone 3; the poster is the writeup.
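The classifier evaluation in Milestone 3 amounts to scoring each approach on the same labeled set and comparing metrics such as precision and recall. A minimal sketch, in which the three classifiers are keyword placeholders and the labeled examples are invented for illustration:

```python
# Sketch of the Milestone 3 classifier comparison: evaluate traditional ML,
# LLM, and hybrid approaches on one shared labeled set.

def precision_recall(preds, labels):
    # Compute precision and recall from boolean predictions and labels.
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# (text, is_abusive) pairs -- a real evaluation set would be far larger.
EVAL_SET = [
    ("send me crypto and I'll double it", True),
    ("great stream today, thanks!", False),
    ("click this link to claim your prize", True),
    ("what time is the game?", False),
]

def traditional_ml(text):   # stand-in for e.g. a logistic regression model
    return "crypto" in text

def llm_classifier(text):   # stand-in for a moderation-prompted LLM call
    return "double it" in text or "claim" in text

def hybrid(text):           # flag if either approach flags
    return traditional_ml(text) or llm_classifier(text)

for name, clf in [("traditional", traditional_ml),
                  ("llm", llm_classifier),
                  ("hybrid", hybrid)]:
    preds = [clf(t) for t, _ in EVAL_SET]
    labels = [y for _, y in EVAL_SET]
    p, r = precision_recall(preds, labels)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On your poster, this comparison is typically presented as a small table of precision/recall (or F1) per approach, so judges can see the tradeoffs at a glance.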
Final poster session — CoDa Sunken Courtyard
The final session is held at the CoDa Sunken Courtyard (the outdoor courtyard at Stanford’s Computing and Data Science building) on Tuesday, June 2 from 5:00–7:00 PM. The session will have ~30 guest judges from industry who rotate through posters and ask questions.
⚡ Important: the courtyard does not have accessible power outlets. Bring a tablet or laptop that can run off battery for the full two hours; consider a charged battery pack as backup.
Attendance is required. Each team member who misses the session without an approved excuse costs the team 5% off its Milestone 3 grade — so a team with one absentee loses 5%, two absentees lose 10%, etc.
In previous years, our guest judges have come from:
- Meta
- OpenAI
- Anthropic
- TikTok
- Discord
- Match Group
- Uber
- Sony
- Microsoft
- xAI
- Adobe
- Apple
- Many startups and NGOs
Grading Policies
This class includes students with a wide range of backgrounds, areas of study, and class years, and we have designed the project so that all students can contribute while earning the same grade for Milestones 2 and 3. We do not expect every student to contribute identically, but we do expect every student to contribute fully within their capabilities.
However, there will be exceptions to every student earning the same grade:
- If a student does not attend the final presentation, 5% will be deducted from the team’s grade.
- If the GitHub history shows that a student did not contribute, that student’s individual grade will be reduced.
- We will send out a survey to every student at the end of the class to ask about contribution levels. If the rest of the team agrees that a student did not contribute, that will be taken into account.
If personal circumstances are keeping you from fully contributing this quarter, we recommend that you withdraw from the class or take an Incomplete. We will adjust grading for teams with fewer than five members.
Due Dates
| Milestone | Due Date |
|---|---|
| Milestone 1 (Individual) | Friday, April 17 at 11:59 PM |
| Milestone 2 (PRD, Infrastructure, MVP) | Friday, May 8 at 11:59 PM |
| Milestone 3 (Final Project) | Poster session Tuesday, June 2, 5:00–7:00 PM, CoDa Sunken Courtyard. Final code on main by 11:59 PM the same day; no upload required, we access GitHub directly. |
