Week 1: Course Overview & Threat Landscape
Week 1: Course Overview & Threat Landscape for Emerging Systems
Module: Foundations
Duration: 140-150 minutes
Instructor: Dr. Zhengxiong Li
Table of Contents
- Course Introduction & Logistics
- Overview of Emerging Systems
- Current Threat Landscape
- Security Challenges in AI-Enabled Systems
- Case Studies: Recent Security Incidents
- Wrap-up & Next Steps
Part 1: Course Introduction & Logistics (20 minutes)
Welcome to CSCI 5773! 🎯
This course focuses on the security of emerging systems in Computer Science—specifically, rapidly evolving AI-enabled systems that are being deployed at scale and having considerable societal impact.
Why This Course Matters
Key Question for Students: How many of you use AI tools daily? ChatGPT? GitHub Copilot? Smart home devices?
The systems you interact with daily face unprecedented security challenges:
- AI/ML systems are being deployed without adequate security testing
- LLMs can be manipulated to leak sensitive information or generate harmful content
- Edge AI devices in your homes and cars are potential attack vectors
- Multimodal AI systems can be fooled across different input modalities
Course Structure Overview
Modules (16 Weeks)
- Foundations (Weeks 1-2): Security fundamentals for emerging systems
- Adversarial Machine Learning (Weeks 3-5): Attacks on ML models
- LLM Security (Weeks 6-7, 9-11): Large language model vulnerabilities
- AI in Security (Week 12): Offensive and defensive applications
- Emerging Systems (Weeks 13-15): Edge, IoT, multimodal, and embodied AI
Assessment Breakdown
- 60% - Assignments and quizzes (hands-on security labs)
- 15% - Midterm exam (Week 8)
- 15% - Final project (research or product demo)
- 10% - Attendance and participation
Course Philosophy
Three Pillars:
- Hands-On Learning: You'll implement attacks and defenses yourself
- Current Relevance: We'll study incidents from the last 12-24 months
- Ethical Awareness: Understanding attacks to build better defenses
Logistics & Expectations
Class Format
- Monday: In-person (NORTH 1608)
- Wednesday: Zoom (flexible for conferences/travel)
- Office Hours: Tuesday/Thursday, 1:00-3:30 PM via Zoom
Resources
- No textbook required
- Materials on Canvas: lecture slides, research papers, tools
- Recommended reading: Top security conferences (Oakland, CCS, USENIX Security)
Ground Rules
- Ethical Use: All attack techniques taught are for defensive purposes only
- Academic Honesty: Zero tolerance for plagiarism or cheating
- Collaboration: Encouraged for learning, but submit individual work
- Responsible Disclosure: If you find vulnerabilities, report them properly
Part 2: Overview of Emerging Systems (35 minutes)
What Are "Emerging Systems"?
Definition: Computing systems that are:
- Rapidly evolving (monthly updates/improvements)
- Deployed at scale (millions of users)
- AI-enabled or AI-adjacent
- Creating new security paradigms
2.1 Machine Learning & AI Systems (15 minutes)
Traditional vs. ML-Based Systems
Traditional Software:
Machine Learning:
Security Implication: In ML systems, the "rules" (model weights) are learned from data, not explicitly programmed. This creates new attack surfaces!
The ML Pipeline: Where Security Matters
Example: Image Classification System
Let's consider a self-driving car's pedestrian detection system:
- Data Collection: Cameras capture street scenes
- Attack Vector: Can attackers poison the training data?
- Data Preprocessing: Images are labeled, augmented, normalized
- Attack Vector: Can malicious labels corrupt the model?
- Model Training: Neural network learns to recognize pedestrians
- Attack Vector: Can backdoors be inserted during training?
- Model Deployment: Model runs on the car's edge computer
- Attack Vector: Can the model be stolen or reverse-engineered?
- Inference: Real-time pedestrian detection
- Attack Vector: Can adversarial patches fool the detector?
Types of ML Systems in Production
| System Type | Example | Security Concern |
|---|---|---|
| Image Classification | Face recognition | Adversarial examples, bias |
| NLP/Text | Spam filters, chatbots | Prompt injection, toxic output |
| Recommendation | Netflix, YouTube | Data poisoning, manipulation |
| Autonomous Systems | Self-driving cars | Safety-critical failures |
| Speech Recognition | Alexa, Siri | Audio adversarial examples |
2.2 Large Language Models (10 minutes)
What Makes LLMs Different?
Traditional ML Models:
- Task-specific (e.g., cat vs. dog classifier)
- Fixed input/output formats
- Limited reasoning capabilities
Large Language Models:
- General-purpose (can perform many tasks)
- Flexible natural language interface
- Emergent abilities (reasoning, coding, math)
- Much larger attack surface!
LLM Architecture Basics
Demo Concept: Interactive LLM Behavior
Show students a simple interaction:
Key LLM Capabilities That Create Security Challenges
- Tool Use: LLMs can call external APIs, execute code
- Context Window: Can process large amounts of data (potential for data leakage)
- Reasoning: Can be manipulated to bypass safety guardrails
- Code Generation: Can generate malicious code if prompted cleverly
2.3 IoT & Edge Computing (10 minutes)
The IoT Explosion
Statistics (as of 2024-2025):
- 15+ billion connected IoT devices globally
- Expected 30+ billion by 2030
- Most have weak security by default
What is Edge Computing?
Cloud Computing:
Edge Computing:
Example: Smart Security Camera
- Cloud Approach: Streams video to cloud, processes there
- Edge Approach: Runs AI model locally, only sends alerts
Edge AI: The Intersection
Edge AI = Running AI/ML models directly on IoT devices or edge servers
Benefits:
- Low latency (real-time response)
- Privacy (data stays local)
- Reduced bandwidth
Security Challenges:
- Resource constraints (limited CPU/memory for security)
- Physical access (devices can be tampered with)
- Update mechanisms (how to patch millions of devices?)
Real-World Edge AI Examples
| Device | AI Function | Security Concern |
|---|---|---|
| Smart Doorbell | Face recognition | Model extraction, privacy |
| Autonomous Drone | Obstacle detection | Sensor spoofing, hijacking |
| Industrial Robot | Object manipulation | Safety attacks, sabotage |
| Medical Wearable | Health monitoring | Data privacy, false alarms |
Interactive Question: What IoT devices do you have at home? What data do they collect? Who has access to that data?
Part 3: Current Threat Landscape (30 minutes)
3.1 The Expanding Attack Surface (10 minutes)
Traditional Computing vs. AI-Enabled Systems
Traditional Attack Surface:
AI-Enabled System Attack Surface:
The AI Supply Chain Problem
Question: If the pre-trained model was backdoored, will fine-tuning remove it? (Spoiler: Usually not!)
3.2 Attack Taxonomy for AI/ML Systems (15 minutes)
Attack Dimensions
- Attack Goal:
- Confidentiality: Extract model or data
- Integrity: Corrupt model behavior
- Availability: Cause model failure
- Adversarial Knowledge:
- White-box: Full model access
- Gray-box: Partial knowledge
- Black-box: Query access only
- Attack Stage:
- Training-time: Data poisoning, backdoors
- Inference-time: Adversarial examples, prompt injection
Major Attack Categories
1. Evasion Attacks (Adversarial Examples)
Concept: Slightly modify input to fool the model
Visual Example:
Real-World Impact:
- Stop signs modified to be misclassified by autonomous vehicles
- Face recognition systems fooled by adversarial glasses
- Malware that evades ML-based detectors
2. Poisoning Attacks
Concept: Corrupt training data to influence model behavior
Example Scenario:
Backdoor Variant:
3. Privacy Attacks
Membership Inference:
- Goal: Determine if specific data was in training set
- Risk: Violates privacy (e.g., medical record exposure)
Model Inversion:
- Goal: Reconstruct training data from model
- Example: Recover face images from face recognition model
Model Extraction:
- Goal: Steal the model by querying it
- Impact: Intellectual property theft, enables white-box attacks
4. LLM-Specific Attacks
Prompt Injection:
Jailbreaking:
3.3 Threat Actors & Motivations (5 minutes)
Who Attacks AI/ML Systems?
| Actor Type | Motivation | Example Attack |
|---|---|---|
| Cybercriminals | Financial gain | Evade fraud detection systems |
| Competitors | Business advantage | Steal proprietary models |
| Nation-States | Espionage, sabotage | Backdoor military AI systems |
| Activists | Political statement | Expose bias in AI systems |
| Researchers | Knowledge, CVEs | Discover vulnerabilities |
| Insiders | Various | Data poisoning, sabotage |
Cost-Benefit Analysis
Traditional Software Bug:
- Find vulnerability → Exploit it → Patch released → Exploit no longer works
ML Model Vulnerability:
- Find attack technique → Often applies to entire model class
- Transferable across different models
- Harder to patch (retraining is expensive)
Part 4: Security Challenges in AI-Enabled Systems (35 minutes)
4.1 Unique Characteristics of ML Security (10 minutes)
Challenge 1: Lack of Formal Verification
Traditional Software:
- We can prove this function is correct
- Unit tests provide guarantees
Neural Network:
- No way to formally verify behavior for all inputs
- Testing is statistical, not exhaustive
Challenge 2: Brittleness vs. Robustness
Human Vision: Robust to variations
ML Model: Can be surprisingly brittle
Demonstration Concept:
Show two images side-by-side:
- Original image: Correctly classified
- Adversarial image: Visually identical to humans, completely misclassified
Code Example (Conceptual):
Challenge 3: Data Dependency
Key Insight: ML models are only as good as their training data
Problems:
- Training-Serving Skew: Model trained on ImageNet, deployed on security cameras
- Data Poisoning: Malicious samples in training set
- Bias & Fairness: Unrepresentative training data leads to biased models
Example: Face Recognition Bias
4.2 The CIA Triad in AI/ML Context (10 minutes)
Traditional CIA Triad
- Confidentiality: Prevent unauthorized information disclosure
- Integrity: Prevent unauthorized modification
- Availability: Ensure service accessibility
AI/ML-Specific Interpretations
Confidentiality Threats
Model Confidentiality:
- Model extraction attacks → Steal intellectual property
- Models cost millions to train (e.g., GPT-4)
Data Confidentiality:
- Training data leakage via model inversion
- Example: Language model memorizes and leaks training data
Demo Concept: Model Memorization
Show example of LLM reciting verbatim text:
Privacy Attack Example:
Integrity Threats
Model Integrity:
- Backdoor attacks: Model behaves normally except for specific triggers
- Data poisoning: Corrupt model during training
Prediction Integrity:
- Adversarial examples: Wrong predictions at inference time
- Prompt injection: Manipulate LLM behavior
Example: Backdoored Model
Real-World Scenario:
Availability Threats
Denial of Service:
- Sponge examples: Inputs that cause excessive computation
- Resource exhaustion: Queries that maximize model inference time
Example: Sponge Example for NLP
Model Degradation:
- Continuous poisoning in online learning systems
- Feedback loop attacks
4.3 Trust & Transparency Challenges (8 minutes)
The Black Box Problem
Question for Students: Would you trust a medical diagnosis from an AI you can't understand?
Explainability vs. Security:
- More explainable models → Easier to attack
- Black box models → Harder to trust
- Dilemma: We want both explainability AND security
Supply Chain Trust
Pre-trained Model Risk:
Real Scenario:
- Popular pre-trained model on model hub
- Attacker backdoors it and uploads
- Thousands download and use it
- Backdoor persists even after fine-tuning
Emergent Behaviors
Large Models Develop Unexpected Capabilities:
Example with LLMs:
4.4 Regulatory & Ethical Challenges (7 minutes)
Current Regulatory Landscape
EU AI Act (2024):
- Categorizes AI systems by risk level
- Banned applications (e.g., social scoring)
- High-risk systems require safety assessments
Executive Orders (US):
- Standards for AI safety and security
- Reporting requirements for large models
- Funding for AI security research
Ethical Considerations in Security Research
The Dual-Use Dilemma:
Questions to Consider:
- Should we publish attack methods before defenses exist?
- How do we balance transparency with security?
- Who is responsible when AI systems fail?
Bias & Fairness as Security Issues
Example: Facial Recognition in Law Enforcement
Discussion Point: Is a biased AI system a security vulnerability? Why or why not?
Part 5: Case Studies - Recent Security Incidents (25 minutes)
Case Study 1: The ChatGPT Data Leak Incident (7 minutes)
Background (2023-2024)
System: OpenAI's ChatGPT Vulnerability: Training data memorization and prompt injection Impact: Potential exposure of private information
What Happened?
- Researchers discovered ChatGPT could regurgitate training data verbatim
- Users found ways to extract personal information via clever prompts
- Bug in ChatGPT allowed users to see others' conversation histories
Technical Details
Training Data Leakage:
Why This Works:
- Large language models memorize parts of training data
- Adversarial prompts can trigger memorized content
- Especially problematic for rare/unique strings
Attack Demonstration (Conceptual):
Lessons Learned
- Data Sanitization: Training data must be carefully filtered
- Output Filtering: Need guardrails against regurgitation
- Privacy by Design: PII should not be in training data
- Prompt Injection Defenses: Input validation is critical
Mitigation Strategies
- Differential privacy during training
- Output filtering for known PII patterns
- Rate limiting on repetitive prompts
- User consent and data opt-out mechanisms
Case Study 2: The Autonomous Vehicle Stop Sign Attack (8 minutes)
Background (2018-2023)
System: Computer vision for traffic sign recognition Vulnerability: Adversarial perturbations on physical objects Impact: Safety-critical misclassification
The Attack
Physical Adversarial Examples:
Technical Breakdown
Step 1: Digital Attack Development
Step 2: Physical Realization
- Convert digital perturbation to physical stickers
- Account for viewing angles, lighting, distance
- Test in real-world conditions
Why This Is Particularly Dangerous
- Physically Realizable: Unlike digital-only attacks, anyone can print stickers
- Transferable: Works across different model architectures
- Persistent: Physical modification stays in place
- Safety-Critical: Directly impacts human safety
Real-World Experiments
Researchers showed:
- 100% attack success rate in controlled conditions
- Worked from various angles and distances
- Stickers cost < $5 to produce
- Difficult for humans to notice
Defense Mechanisms Proposed
- Robust Training:
- Ensemble Methods:
- Use multiple models with different architectures
- Require consensus for critical decisions
- Sensor Fusion:
- Don't rely on vision alone
- Combine camera, LIDAR, radar
- Cross-validate detections
- Anomaly Detection:
- Monitor for unusual confidence patterns
- Flag suspicious predictions for human review
Case Study 3: Microsoft's Tay Chatbot Incident (5 minutes)
Background (2016, still relevant)
System: Microsoft Tay - Twitter chatbot using ML Vulnerability: Lack of input filtering and online learning without safeguards Impact: Offensive outputs, PR disaster
What Happened?
Timeline:
The Attack Mechanism
Exploit: Unfiltered Online Learning
Example Interaction:
Lessons for Modern LLM Security
Even though this was 2016, the lessons apply to today's systems:
- Input Validation: Filter harmful content before processing
- Output Filtering: Check responses before posting
- Controlled Learning: Don't let models learn from every interaction
- Red Teaming: Test adversarial scenarios before deployment
- Kill Switch: Have ability to shut down quickly
Modern Parallels
ChatGPT Jailbreaking (2023-2024):
Case Study 4: GitHub Copilot Code Leakage (5 minutes)
Background (2021-2024)
System: GitHub Copilot - AI coding assistant Vulnerability: Training data leakage via code suggestions Impact: Potential copyright and security issues
The Issue
Problem: Copilot sometimes suggests code that is verbatim from training data
Example Scenario:
Security Implications
- Copyright Violation: Reproducing licensed code without attribution
- Credential Leakage: Training data included hardcoded API keys/passwords
- Vulnerable Code: Suggesting known-vulnerable code patterns
Actual Example (Simplified):
Broader Implications for AI-Generated Content
Questions Raised:
- Who owns AI-generated code?
- Is it plagiarism if AI memorized and reproduced training data?
- How do we handle AI suggesting vulnerable code?
Current Mitigations
- Duplicate Detection: Filter suggestions that match training data exactly
- User Warnings: Alert when suggestion might match existing code
- License Information: Show potential license conflicts
- Security Scanning: Check suggestions for known vulnerabilities
Wrap-up & Next Steps (5 minutes)
Key Takeaways from Week 1
- Emerging systems (AI/ML, IoT, Edge AI, LLMs) represent a paradigm shift in computing
- Attack surface is vastly larger than traditional systems
- New attack categories specifically target ML model behavior
- Real incidents demonstrate these aren't just theoretical concerns
- Ethical considerations are paramount in security research
Looking Ahead: Week 2 Preview
Next Week: Security Fundamentals for ML/AI Systems
We'll dive deeper into:
- ML system architecture and components
- Threat modeling specifically for ML pipelines
- Understanding the ML lifecycle security touchpoints
- Introduction to adversarial machine learning concepts
Action Items for Students
Before Next Class:
- Read: Skim 1-2 papers from top security conferences on ML security
- Suggested: "Intriguing Properties of Neural Networks" (adversarial examples)
- Setup: Prepare development environment
- Python 3.8+
- Install: TensorFlow/PyTorch, NumPy, scikit-learn
- Explore: Try interacting with an LLM (ChatGPT, Claude, etc.)
- Think about potential security issues
- Try to identify the system prompt or boundaries
- Reflect: Write 2-3 sentences answering:
- What emerging system security issue concerns you most?
- Why did you enroll in this course?
Discussion Questions for Reflection
- Ethics: Is it ethical to publish adversarial attack methods? What's the trade-off between disclosure and enabling malicious actors?
- Responsibility: If an AI system causes harm (e.g., autonomous vehicle crash), who is liable? The developer? The user? The AI itself?
- Privacy vs. Utility: How do we balance the benefits of training on large datasets with privacy concerns?
- Future Threats: What new attack vectors might emerge as AI systems become more capable?
Resources
Recommended Reading
- Papers:
- Goodfellow et al., "Explaining and Harnessing Adversarial Examples" (2015)
- Carlini & Wagner, "Towards Evaluating the Robustness of Neural Networks" (2017)
- Wallace et al., "Universal Adversarial Triggers for Attacking and Analyzing NLP" (2019)
- Websites:
- OWASP Top 10 for LLM Applications
- NIST AI Risk Management Framework
- Hugging Face Security Documentation
Tools to Explore
- Adversarial Robustness Toolbox (ART) by IBM
- CleverHans by Google Brain
- TextAttack for NLP adversarial examples
Appendix: Additional Examples & Demos
Demo 1: Simple Adversarial Example
Concept: Show how small changes fool models
Demo 2: Prompt Injection Simulation
Concept: Show how LLM prompts can be manipulated
Demo 3: Data Poisoning Visualization
Concept: Show how poisoned data affects model
Assessment Alignment
This Week's Content Aligns With:
Learning Objectives:
- ✓ Understand course structure and assessment methods
- ✓ Identify key security challenges in emerging systems
- ✓ Recognize the expanding attack surface of AI/ML systems
Course Goals:
- ✓ Introduction to fundamental challenges in emerging systems security
- ✓ Awareness of state-of-the-art solutions and ongoing research
- ✓ Foundation for hands-on security analysis in future weeks
Preparation for Week 2:
Students should now be able to:
- Explain what makes AI/ML security different from traditional cybersecurity
- Identify the main attack categories for ML systems
- Describe real-world security incidents involving AI/ML systems
- Understand the structure and expectations of the course
End of Week 1 Tutorial
Questions? Join office hours Tuesday/Thursday 1:00-3:30 PM
Next class: Security Fundamentals for ML/AI Systems