12. Retrieval-Augmented Generation (RAG) Pipelines

12.1 What Is Retrieval-Augmented Generation (RAG)?
The Core RAG Workflow

Why Organizations Use RAG
Common RAG Use Cases
12.2 RAG Architecture and Components
Vector Databases and Embedding Stores
Retrieval Mechanisms
Document Processing Pipeline
LLM Integration Layer
Orchestration and Control
12.3 RAG System Data Flow
End-to-End RAG Data Flow
Critical Security Checkpoints
12.4 Why RAG Systems Are High-Value Targets
Access to Sensitive Enterprise Data
Expanded Attack Surface
Trust Boundary Violations
Integration Complexity
12.5 RAG-Specific Attack Surfaces
12.5.1 Retrieval Manipulation
Techniques
Example
Query Type
Query Content
12.5.2 Embedding Poisoning
Example Trojan Document
12.5.3 Context Injection via Retrieved Content

Impact
12.5.4 Metadata Exploitation
Vulnerable Metadata Fields
Example Attack
Leakage Type
Information Revealed
12.5.5 Cross-Document Leakage
Common Causes
12.5.6 Retrieval Bypasses
Techniques
12.6 Common RAG Vulnerabilities
12.6.1 Inadequate Access Control
Vulnerability Pattern
Description
Impact
Example Scenario
12.6.2 Prompt Injection via Retrieved Content
Attack Flow
Example Malicious Document
Impact
12.6.3 Data Leakage Through Similarity Search
Attack Methodology
Example
Step
Attacker Query
Outcome
12.6.4 Chunking and Context Window Exploits
Chunking Vulnerabilities
Example Scenario
12.7 Red Teaming RAG Systems: Testing Approach
12.7.1 Reconnaissance
Information Gathering
Reconnaissance Techniques
12.7.2 Retrieval Testing
Test Cases
Test Scenario
Test Input / Action
Expected Behavior
Vulnerability Indicator
Systematic Testing Process
12.7.3 Injection and Poisoning
Test Approaches
A. Document Injection Testing (if authorized and in-scope)
B. Testing Existing Documents for Injections
C. Indirect Prompt Injection
12.7.4 Data Exfiltration Scenarios
Attack Scenarios
Scenario 1: Iterative Narrowing
Scenario 2: Batch Extraction
Scenario 3: Metadata Enumeration
Inference Category
Malicious Query
Scenario 4: Chunk Reconstruction
Step
Attack Action / Query
12.8 RAG Pipeline Supply Chain Risks
Vector Database Vulnerabilities
Security Concerns
Embedding Model Risks
Security Concerns
Third-Party Embedding Services
Document Processing Library Risks
Common Libraries and Their Risks
Library
Purpose
Security Risks
Attack Scenario
Data Provenance and Integrity
Questions to Investigate
Provenance Attack Example
Step
Action
Result/Impact
12.9 Real-World RAG Attack Examples
Scenario 1: Accessing HR Documents Through Query Rephrasing
Setup (Case Study 1)
Attack (Case Study 1)
User/Role
Interaction
System Outcome
Scenario 2: Extracting Competitor Research via Semantic Similarity
Setup (Case Study 2)
Attack (Case Study 2)
Step
Competitor Query
System Response
Scenario 3: Trojan Document Triggering Unintended Actions
Setup (Case Study 3)
Attack (Case Study 3)
Trigger
System Behavior
Scenario 4: Metadata Exploitation Revealing Confidential Project Names
Setup (Case Study 4)
Attack (Case Study 4)
12.10 Defensive Considerations for RAG Systems
Document-Level Access Controls
Implementation Approaches
Input Validation and Query Sanitization
Defensive Measures
Retrieved Content Filtering
Safety Measures Before LLM Processing
Monitoring and Anomaly Detection
Key Metrics to Track
Metric
Purpose
Alert Threshold (Example)
Logging Best Practices
Secure Document Ingestion Pipeline
Ingestion Security Checklist
Example Secure Ingestion Flow

Regular Security Audits
Audit Activities
12.11 RAG Red Team Testing Checklist
Pre-Engagement
Retrieval and Access Control Testing
Injection and Content Security
Data Extraction and Leakage
Supply Chain and Infrastructure
Monitoring and Detection
Documentation and Reporting
12.12 Tools and Techniques for RAG Testing
Custom Query Crafting
Manual Testing Tools
Vector Similarity Analysis
Understanding Embedding Space
Applications
Document Embedding and Comparison
Probing Document Space
RAG-Specific Fuzzing Frameworks
Emerging Tools
Example Custom Fuzzer Structure
Access Control Testing Scripts
Automated Permission Testing
12.13 Conclusion
Chapter Takeaways
Recommendations for Red Teamers
Recommendations for Defenders
Future Considerations
Next Steps