17.4 Plugin Vulnerabilities
Understanding Plugin Vulnerabilities
Plugins extend LLM capabilities but introduce numerous security risks. Unlike the LLM itself (which is stateless), plugins interact with external systems, execute code, and manage stateful operations. Every plugin is a potential attack vector that can compromise the entire system.
Why Plugins are High-Risk
Direct System Access: Plugins often run with elevated privileges.
Complex Attack Surface: Each plugin adds new code paths to exploit.
Third-Party Code: Many plugins come from untrusted sources.
Input/Output Handling: Plugins process LLM-generated data (which is potentially malicious).
State Management: Bugs in stateful operations lead to vulnerabilities.
Common Vulnerability Categories
Injection Attacks: Command, SQL, path traversal.
Authentication Bypass: Broken access controls.
Information Disclosure: Leaking sensitive data.
Logic Flaws: Business logic vulnerabilities.
Resource Exhaustion: DoS via plugin abuse.
17.4.1 Command Injection
What is Command Injection?
Command injection happens when a plugin executes system commands using unsanitized user input. Since LLMs generate text based on user prompts, attackers can craft prompts that force the LLM to generate malicious commands, which the plugin then blindly executes.
Attack Chain
User sends a malicious prompt.
LLM generates text containing the attack payload.
Plugin uses the LLM output in a system command.
OS executes the attacker's command.
System is compromised.
Real-World Risk
Full system compromise (RCE).
Data exfiltration.
Lateral movement.
Persistence mechanisms.
Vulnerable Code Example
Command injection via plugin inputs
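The sketch below illustrates the vulnerable pattern this section analyzes; the function name, API URL, and parameters are illustrative assumptions, not a real plugin.

```python
import os

def get_weather(location: str) -> None:
    """VULNERABLE: builds a shell command from LLM-generated text.

    'location' comes straight from the model's output, so an attacker
    who controls the prompt controls this string.
    """
    # String concatenation + os.system == command injection:
    # location = "Paris; rm -rf /" runs curl, then rm -rf /.
    os.system("curl 'https://api.weather.com/v1/current?location=" + location + "'")
```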
Understanding Command Injection:
Command injection is the most dangerous plugin vulnerability. It allows attackers to execute arbitrary operating system commands. If a plugin uses functions like `os.system()` or `subprocess` calls with `shell=True` on unsanitized LLM-generated input, attackers can inject shell metacharacters to run whatever they want.
Why This Vulnerability Exists:
LLMs generate text based on user prompts. If an attacker crafts a prompt like "What's the weather in Paris; rm -rf /", the LLM might include that entire string in its output. The vulnerable plugin then executes it as a shell command.
Attack Mechanism (Vulnerable Code):
1. User sends prompt: `"What's the weather in Paris; rm -rf /"`
2. LLM extracts location: `"Paris; rm -rf /"` (it's just text to the LLM).
3. Plugin constructs command: `curl 'https://api.weather.com/...?location=Paris; rm -rf /'`
4. `os.system()` executes two commands: `curl '...'` (the intended command) and `rm -rf /` (the attack payload, due to the `;` separator).
Shell Metacharacters Used in Attacks:
- `;`: Separator (runs multiple commands).
- `&&`: Runs the second command if the first succeeds.
- `||`: Runs the second command if the first fails.
- `|`: Pipes output to another command.
- `` `command` ``: Command substitution.
- `$(command)`: Command substitution.
- `&`: Background execution.
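A hardened version of the weather plugin might look like the following sketch; the validation rules and endpoint are assumptions consistent with the description below.

```python
import re
import requests

def is_valid_location(location: str) -> bool:
    # Whitelist: letters, digits, spaces, and hyphens only.
    # Shell metacharacters like ; | & $ ` can never pass this check.
    return bool(re.fullmatch(r"[A-Za-z0-9 \-]{1,64}", location))

def get_weather(location: str) -> dict:
    if not is_valid_location(location):
        raise ValueError("Invalid location")
    # requests builds the HTTP request directly (no shell involved) and
    # URL-encodes the parameter, so "Paris; rm -rf /" would be sent as
    # literal text even if validation were somehow bypassed.
    response = requests.get(
        "https://api.weather.com/v1/current",
        params={"location": location},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()
```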
Why the Secure Version Works:
- Input Validation (`is_valid_location`): Uses a regex to enforce a whitelist of allowed characters (usually just letters, numbers, and spaces). It rejects shell metacharacters like `;`, `|`, and `&`.
- API Library Instead of Shell: Uses `requests.get()`, which makes an HTTP request directly without invoking a shell. Parameters are passed as dictionary arguments, not string concatenation.
- No Shell Parsing: The `requests` library URL-encodes parameters automatically. Even if someone passes `"Paris; rm -rf /"`, it becomes `Paris%3B%20rm%20-rf%20%2F` in the HTTP request, treated as literal text by the API, not as commands.
Defense Strategy:
- Never use `os.system()` or `subprocess` calls with `shell=True` on user-controlled input.
- Always validate input with whitelists (regex patterns for allowed characters).
- Use library functions (like `requests`) that don't invoke shells.
- If shell execution is required, use `subprocess.run()` with `shell=False` and pass commands as lists (see the sketch below).
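For the last point, a minimal sketch of the list form; the curl invocation is illustrative.

```python
import subprocess

# Arguments are passed as a list, so even a hostile value is delivered
# to curl as one literal argument -- no shell ever parses the string.
result = subprocess.run(
    ["curl", "https://api.weather.com/v1/current?location=Paris"],
    shell=False,          # the default, stated here for emphasis
    capture_output=True,
    text=True,
    timeout=10,
)
```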
Real-World Impact:
Remote Code Execution (RCE).
Full system compromise.
Data exfiltration.
Ransomware deployment.
Backdoor installation.
Testing Tips:
To test if your plugin is vulnerable:
- Try `location = "Paris; echo VULNERABLE"`. If the output contains "VULNERABLE", command injection exists.
- Try `location = "Paris$(whoami)"`. If the output shows a username, command substitution works.
SQL injection through plugins
Understanding SQL Injection in LLM Plugins:
SQL injection happens when user-controlled data (from LLM output) is concatenated directly into SQL queries instead of using parameterized queries. This lets attackers manipulate the query's logic, bypass authentication, extract data, or modify the database.
Why LLM Plugins are Vulnerable:
The LLM generates the query parameter based on user prompts. If a prompt says "Show me users named ' OR '1'='1", the LLM might pass that exact string to the plugin, which then runs a malicious SQL query.
Attack Mechanism (Vulnerable Code):
1. User prompt: `"Search for user named ' OR '1'='1"`
2. LLM extracts: `query = "' OR '1'='1"`
3. Plugin constructs SQL: `SELECT * FROM users WHERE name LIKE '%' OR '1'='1%'`
4. SQL logic breakdown: `name LIKE '%'` matches all names, and the injected `OR` condition only widens the match. Result: the query returns ALL users.
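A minimal sketch of this vulnerable pattern; the table, column, and function names are illustrative.

```python
import sqlite3

def search_users(query: str) -> list:
    """VULNERABLE: concatenates LLM output into the SQL string."""
    conn = sqlite3.connect("app.db")
    # query = "' OR '1'='1" turns the WHERE clause into a tautology.
    sql = "SELECT * FROM users WHERE name LIKE '%" + query + "%'"
    return conn.execute(sql).fetchall()
```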
Common SQL Injection Techniques:
- Authentication Bypass: `admin' --` (comments out the password check).
- Data Extraction: `' UNION SELECT username, password FROM users --`
- Boolean Blind: `' AND 1=1 --` vs `' AND 1=2 --` (leaks data bit by bit).
- Time-Based Blind: `' AND IF(condition, SLEEP(5), 0) --`
- Stacked Queries: `'; DROP TABLE users; --`
Why Parameterized Queries Prevent SQL Injection:
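As a sketch, using the same illustrative table as above:

```python
import sqlite3

def search_users(query: str) -> list:
    conn = sqlite3.connect("app.db")
    # The ? placeholder keeps the query structure fixed; the driver
    # sends the search term separately, purely as data.
    sql = "SELECT * FROM users WHERE name LIKE ?"
    return conn.execute(sql, (f"%{query}%",)).fetchall()
```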
In the secure version:
- The `?` is a parameter placeholder, not a string concatenation point.
- The database driver separates the SQL structure (the query pattern) from the data (the user input).
- When `query = "' OR '1'='1"`, the database treats it as literal text to search for, not SQL code.
- The query looks for users whose name contains the characters `' OR '1'='1` (which won't exist).
- No SQL injection is possible because user input never enters the SQL parsing phase as code.
How Parameterization Works (Database Level):
1. The SQL query structure is sent to the database first: `SELECT * FROM users WHERE name LIKE :param1`
2. The database compiles and prepares this query structure.
3. The user data (the search term) is sent separately as a parameter value.
4. The database engine knows this is data, not code, and treats it as a string.
Defense Best Practices:
Always use parameterized queries (prepared statements).
Never concatenate user input into SQL strings.
Use ORM frameworks (like SQLAlchemy or Django ORM) which parameterize by default.
Validate input types (ensure strings are strings, numbers are numbers).
Principle of least privilege: Database users should have minimal permissions.
Never expose detailed SQL errors to users (they reveal database structure).
Real-World Impact:
Complete database compromise.
Credential theft (password hashes).
PII exfiltration.
Data deletion or corruption.
Privilege escalation.
Testing for SQL Injection:
Try these payloads:
query = "test' OR '1'='1"(should not return all users).query = "test'; DROP TABLE users; --"(should not delete table).query = "test' UNION SELECT @@version --"(should not reveal database version).
Type confusion attacks
Understanding Type Confusion and eval() Exploitation:
Type confusion occurs when a plugin expects input of a particular kind (like a math expression) but doesn't validate that the input actually matches that expectation. The eval() function is the quintessential dangerous function in Python because it executes arbitrary Python code, not just math.
Why eval() is Catastrophic:
eval() takes a string and executes it as Python code. While this works for math expressions like "2 + 2", it also works for:
- `__import__('os').system('rm -rf /')`: Execute shell commands.
- `open('/etc/passwd').read()`: Read sensitive files.
- `[x for x in ().__class__.__bases__[0].__subclasses__() if x.__name__ == 'Popen'][0]('id', shell=True)`: Escape sandboxes.
Attack Mechanism (Vulnerable Code):
1. User prompt: `"Calculate __import__('os').system('whoami')"`
2. LLM extracts: `expression = "__import__('os').system('whoami')"`
3. Plugin executes: `eval(expression)`
4. Python's `eval` runs arbitrary code.
5. Result: the `whoami` command executes, revealing the username (proof of RCE).
Real Attack Example:
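A sketch of a vulnerable calculator plugin and the payload that compromises it; the function name is an illustrative assumption.

```python
def calculate(expression: str) -> float:
    """VULNERABLE: eval() executes arbitrary Python, not just math."""
    return eval(expression)

calculate("2 + 2")                              # 4 -- the intended use
calculate("__import__('os').system('whoami')")  # runs whoami: RCE
```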
Why the Secure Version (AST) is Safe:
The Abstract Syntax Tree (AST) approach parses the expression into a tree structure and validates each node:
1. Parse Expression: `ast.parse(expression)` converts the string to a syntax tree.
2. Whitelist Validation: Only specifically allowed node types (`ast.Constant`, `ast.BinOp`) are permitted.
3. Operator Restriction: Only mathematical operators in the `ALLOWED_OPERATORS` dictionary are allowed.
4. Recursive Evaluation: `_eval_node()` traverses the tree, evaluating only safe nodes.
5. Rejection of Dangerous Nodes: Function calls (`ast.Call`), imports, and attribute access are all rejected.
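A sketch of such an evaluator, using the names referenced above (`ALLOWED_OPERATORS`, `_eval_node`, `InvalidNodeError`); these are assumed from this section's description, not a canonical implementation.

```python
import ast
import operator

# Each AST operator node maps to a safe function from the operator
# module -- pure arithmetic, no code execution.
ALLOWED_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

class InvalidNodeError(Exception):
    pass

def _eval_node(node: ast.AST) -> float:
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_OPERATORS:
        return ALLOWED_OPERATORS[type(node.op)](
            _eval_node(node.left), _eval_node(node.right)
        )
    if isinstance(node, ast.UnaryOp) and type(node.op) in ALLOWED_OPERATORS:
        return ALLOWED_OPERATORS[type(node.op)](_eval_node(node.operand))
    # Everything else -- ast.Call, ast.Name, ast.Attribute, ... -- is rejected.
    raise InvalidNodeError(f"Disallowed node: {type(node).__name__}")

def safe_calculate(expression: str) -> float:
    tree = ast.parse(expression, mode="eval")  # SyntaxError on "2+2; import os"
    return _eval_node(tree.body)
```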
How It Prevents Attacks:
If an attacker tries "__import__('os').system('whoami')":
1. AST parses it and finds an `ast.Call` node (function call).
2. `_eval_node()` raises `InvalidNodeError` because `ast.Call` isn't in the whitelist.
3. Attack blocked: no code execution.
Even simpler attacks fail:
"2 + 2; import os"→ Syntax error (can't parse)."exec('malicious code')"→ast.Callrejected."__builtins__"→ast.Namewith non-numeric value rejected.
Allowed Operations Breakdown:
Each operator maps to a safe function from Python's operator module (the `ALLOWED_OPERATORS` dictionary in the sketch above), ensuring no code execution.
Defense Strategy:
Never use eval() with user input—this is a universal security principle.
Whitelist approach: Define exactly what's allowed (numbers and specific operators).
AST parsing: Validate input structurally before execution.
Sandboxing: Even "safe" code should run in an isolated environment.
Timeout limits: Prevent `1000**100000`-style DoS attacks.
Real-World Impact:
Remote Code Execution (RCE).
Full system compromise.
Data exfiltration.
Lateral movement to internal systems.
Crypto mining or botnet deployment.
Prerequisites:
Understanding of Python's AST module.
Knowledge of Python's operator module.
Awareness of Python introspection risks (`__import__`, `__builtins__`).
Alternative Safe Solutions:
- sympy library: `sympy.sympify(expression)` evaluates mathematical expressions (note that SymPy's own documentation warns that `sympify` uses `eval` internally, so it should not receive unsanitized input either).
- numexpr library: Fast, type-safe numerical expression evaluation.
- Restricted eval: Use `ast.literal_eval()` for literals only (no operators).
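For instance, `ast.literal_eval()` accepts literal structures but rejects operators and calls; a quick demonstration:

```python
import ast

print(ast.literal_eval("[1, 2, 3]"))  # [1, 2, 3] -- literal structures are fine

for payload in ("2 + 2", "__import__('os').system('id')"):
    try:
        ast.literal_eval(payload)
    except (ValueError, SyntaxError) as exc:
        # Operators and function calls are rejected as malformed nodes.
        print(f"rejected {payload!r}: {exc}")
```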
Testing Tips:
Test with these payloads:
expression = "__import__('os').system('echo PWNED')"(should raise InvalidNodeError).expression = "exec('print(123)')"(should fail).expression = "2 + 2"(should return 4 safely).
17.4.2 Logic Flaws
Race conditions in plugin execution
Understanding Race Conditions:
Race conditions happen when multiple threads or processes access shared resources—like account balances or database records—simultaneously without proper synchronization. The outcome depends on who wins the unpredictable "race", leading to data corruption or vulnerabilities.
Why Race Conditions are Dangerous in LLM Systems:
LLM plugins often handle multiple requests at once. If an attacker can trick the LLM into invoking a plugin function multiple times simultaneously (via parallel prompts or rapid requests), they can exploit race conditions to:
Bypass balance checks.
Duplicate transactions.
Corrupt data integrity.
Escalate privileges.
The Vulnerability: Time-of-Check-Time-of-Use (TOCTOU)
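A minimal sketch of the vulnerable check-then-act pattern; the function name is illustrative, and `sleep(0.1)` stands in for real processing latency that widens the race window.

```python
import time

balance = 1000  # shared state across plugin invocations

def withdraw(amount: int) -> bool:
    """VULNERABLE: check and update are not atomic."""
    global balance
    if balance >= amount:           # time-of-check
        time.sleep(0.1)             # processing delay widens the window
        balance = balance - amount  # time-of-use: may act on a stale read
        return True
    return False
```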
Attack Timeline:

| Time | Thread 1              | Thread 2              | Balance |
|------|-----------------------|-----------------------|---------|
| T0   | Start withdraw(500)   |                       | 1000    |
| T1   | Check: 1000 >= 500 ✓  |                       | 1000    |
| T2   |                       | Start withdraw(500)   | 1000    |
| T3   |                       | Check: 1000 >= 500 ✓  | 1000    |
| T4   | sleep(0.1)...         | sleep(0.1)...         | 1000    |
| T5   | balance = 1000 - 500  |                       | 500     |
| T6   |                       | balance = 1000 - 500  | 500     |
| T7   | Return True           | Return True           | 500     |
The Problem:
Both threads checked the balance when it was 1000.
Both passed the check.
Both withdrew 500.
Result: The system dispensed $1,000 but recorded only a $500 debit. Each thread computed 1000 - 500 from a stale read, so the second write simply overwrote the first (a lost update).
Real-World Exploitation:
Attacker sends two simultaneous prompts (for example, "Withdraw $500 from my account", submitted twice in parallel).
Both execute in parallel:
Both check the balance (1000) and pass.
Both withdraw 500, but each computes 1000 - 500 from a stale read.
The attacker receives $1,000 while the account is only debited $500.
The Solution: Threading Lock
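A sketch of the same function with a lock protecting the critical section:

```python
import threading

balance = 1000
balance_lock = threading.Lock()

def withdraw(amount: int) -> bool:
    global balance
    with balance_lock:          # only one thread inside at a time
        if balance >= amount:   # check ...
            balance -= amount   # ... and modify as one atomic unit
            return True
        return False
```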
How Locking Prevents the Attack:
| Time | Thread 1                  | Thread 2                  | Balance |
|------|---------------------------|---------------------------|---------|
| T0   | Acquire lock ✓            |                           | 1000    |
| T1   | Check: 1000 >= 500 ✓      | Waiting for lock...       | 1000    |
| T2   | balance = 500             | Waiting for lock...       | 500     |
| T3   | Release lock, Return True | Acquire lock ✓            | 500     |
| T4   |                           | Check: 500 >= 500 ✓       | 500     |
| T5   |                           | balance = 0               | 0       |
| T6   |                           | Release lock, Return True | 0       |
Result: Correct behavior; both withdrawals succeed because there was enough money.
With withdrawals of $600 each:
Thread 1 withdraws $600 (balance = $400).
Thread 2 tries to withdraw $600, check fails (400 < 600).
Second withdrawal correctly rejected.
Critical Section Principle:
The lock creates a "critical section":
Only one thread can be inside at a time.
Check and modify operations are atomic (indivisible).
No race condition possible.
Other Race Condition Examples:
1. Privilege Escalation: an admin check passes, then the user's role changes before the privileged action runs.
2. File Overwrite: a file-existence check passes, then the file is swapped before the write.
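Hedged sketches of both patterns; all names are illustrative.

```python
import os
from dataclasses import dataclass

@dataclass
class User:
    name: str
    role: str

# 1. Privilege escalation: the role is checked, then used later; a
#    concurrent request that changes user.role in between wins the race.
def perform_admin_action(user: User) -> None:
    if user.role == "admin":                    # time-of-check
        # ... other processing; user.role may change concurrently ...
        print(f"running privileged action as {user.name}")  # time-of-use

# 2. File overwrite (filesystem TOCTOU): the file that exists at check
#    time is not necessarily the file open() touches; an attacker can
#    swap in a symlink between the two calls.
def save_report(path: str, data: str) -> None:
    if not os.path.exists(path):                # time-of-check
        with open(path, "w") as f:              # time-of-use
            f.write(data)
```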
Best Practices:
- Use Locks: `threading.Lock()` for thread safety.
- Atomic Operations: Use database transactions, not separate read-then-write steps.
- Optimistic Locking: Use version numbers to detect concurrent modifications.
- Pessimistic Locking: Lock resources before access (like `SELECT ... FOR UPDATE`).
- Idempotency: Design operations so they can be safely retried.
Database-Level Solution:
Instead of application-level locks, use database transactions:
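A sketch assuming PostgreSQL and the psycopg2 driver; the accounts table and column names are illustrative.

```python
import psycopg2

conn = psycopg2.connect("dbname=bank")
with conn:  # transaction scope: commits on success, rolls back on error
    with conn.cursor() as cur:
        # FOR UPDATE locks the row until the transaction ends.
        cur.execute(
            "SELECT balance FROM accounts WHERE id = %s FOR UPDATE", (42,)
        )
        (current,) = cur.fetchone()
        if current >= 500:
            cur.execute(
                "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                (500, 42),
            )
```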
The FOR UPDATE clause locks the selected row, preventing other transactions from locking or modifying it until the transaction commits.
Testing for Race Conditions:
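A sketch of a concurrency test, reusing the `withdraw()` function and `balance` from the sketches above; run against the unlocked version it fails intermittently, revealing the race.

```python
import threading

def test_concurrent_withdrawals() -> None:
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(withdraw(500)))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Every successful withdrawal must be reflected in the balance.
    assert balance == 1000 - 500 * results.count(True), "race detected"
```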
Prerequisites:
Understanding of multithreading concepts.
Knowledge of critical sections and mutual exclusion.
Familiarity with Python's threading module.
Real-World Impact:
2012 - Citibank: Race condition allowed double withdrawals from ATMs.
2016 - E-commerce: Concurrent coupon use drained promotional budgets.
2019 - Binance: $41M stolen via coordinated attack exploiting multiple security weaknesses.
Key Takeaway:
In concurrent systems (like LLM plugins handling multiple requests), check-then-act patterns are inherently unsafe without synchronization. Always protect shared state with locks, transactions, or atomic operations.
17.4.3 Information Disclosure
Excessive data exposure
Error message leakage
17.4.4 Privilege Escalation
Horizontal privilege escalation
Vertical privilege escalation