OWASP Top 10 for LLM Applications

OWASP Top 10 for LLM Applications

Demystifying the OWASP Top 10: A Comprehensive Guide for LLM (Language Learning Models)

The rush of interest in Large Language Models (LLMs) following the release of mass-market pre-trained chatbots in late 2022 has been astounding—businesses seeking to capitalise on LLMs' promise increasingly integrate them into their operations and client-facing solutions. However, the rapid use of LLMs has overtaken the development of robust security protocols, leaving many applications vulnerable to high-risk vulnerabilities.

There was no centralised resource addressing these security risks in LLMs. Developers were left with dispersed resources since they were unfamiliar with the specific hazards connected with LLMs, and OWASP's purpose seemed a logical fit to assist in pushing for safer use of this technology.

LLM01: Prompt Injection

LLMs can be manipulated by attackers via forged inputs, causing them to carry out the attacker's plans. This can be accomplished directly by adversarially pressing the system prompt or indirectly by altered external inputs, which could result in data exfiltration, social engineering, and other concerns.

Examples

  • Direct prompt injections overwrite system prompts

  • Indirect prompt injections hijack the conversation context

  • A user employs an LLM to summarize a webpage containing an indirect prompt injection

Prevention

  • Enforce privilege control on LLM access to backend systems

  • Implement humans in the loop for extensible functionality

  • Segregate external content from user prompts

  • Establish trust boundaries between the LLM, external sources, and extensible functionality.

Attack Scenarios

  • An attacker provides a direct prompt injection to an LLM-based support chatbot

  • An attacker embeds an indirect prompt injection in a webpage

  • A user employs an LLM to summarize a webpage containing an indirect prompt injection.

LLM02: Insecure Output Handling

Insecure Output Handling is a vulnerability that occurs when a downstream component takes large language model (LLM) output without properly scrutinising it. This can result in XSS and CSRF attacks in web browsers, as well as SSRF, privilege escalation, and remote code execution on backend systems.

Examples

  • LLM output is entered directly into a system shell or similar function, resulting in remote code execution

  • JavaScript or Markdown is generated by the LLM and returned to a user, resulting in XSS.

Prevention

  • Apply proper input validation on responses coming from the model to backend functions

  • Encode output coming from the model back to users to mitigate undesired code interpretations

Attack Scenarios

  • An application directly passes the LLM-generated response into an internal function responsible for executing system commands without proper validation

  • A user utilizes a website summarizer tool powered by an LLM to generate a concise summary of an article, which includes a prompt injection

  • An LLM allows users to craft SQL queries for a backend database through a chat-like feature.

LLM03: Training Data Poisoning

Manipulation of the data or the fine-tuning process to introduce vulnerabilities, backdoors, or biases that could undermine the model's security, effectiveness, or ethical behaviour is referred to as training data poisoning. This increases the likelihood of performance deterioration, downstream software exploitation, and reputational damage.

Examples

  • A malicious actor creates inaccurate or malicious documents targeted at a model’s training data

  • The model trains using falsified information or unverified data which is reflected in output.

Prevention

  • Verify the legitimacy of targeted data sources during both the training and fine-tuning stages

  • Craft different models via separate training data for different use-cases

  • Use strict vetting or input filters for specific training data or categories of data sources

Attack Scenarios

  • Output can mislead users of the application leading to biased opinions

  • A malicious user of the application may try to influence and inject toxic data into the model

  • A malicious actor or competitor creates inaccurate or falsified information targeted at a model’s training data

  • The vulnerability Prompt Injection could be an attack vector to this vulnerability if insufficient sanitization and filtering are performed

LLM04: Model Denial of Service

Manipulation of the data or the fine-tuning process to introduce vulnerabilities, backdoors, or biases that could undermine the model's security, effectiveness, or ethical behaviour is referred to as training data poisoning. This increases the likelihood of performance deterioration, downstream software exploitation, and reputational damage.

Examples

  • Posing queries that lead to recurring resource usage through high volume generation of tasks in a queue

  • ending unusually resource-consuming queries

  • Continuous input overflow: An attacker sends a stream of input to the LLM that exceeds its context window

Prevention

  • Implement input validation and sanitization to ensure input adheres to defined limits, and cap resource use per request or step.

  • Enforce API rate limits to restrict the number of requests an individual user or IP can make

Attack Scenarios

  • Attackers send multiple requests to a hosted model that are difficult and costly for it to process

  • A piece of text on a webpage is encountered while an LLM-drive tool is collecting information to respond to benign a query.

  • The attacker overwhelms the LLM with input that exceeds its context window.

LLM05: Supply Chain Vulnerabilities

Supply chain flaws in LLMs can jeopardise training data, ML models, and deployment platforms, resulting in skewed findings, security breaches, and total system failures. Such flaws might be caused by outdated software, vulnerable pre-trained models, tainted training data, and insecure plugin designs.

Examples

  • Using outdated third-party packages

  • Fine-tuning with a vulnerable pre-trained model

  • Training using poisoned crowd-sourced data

  • Utilizing deprecated, unmaintained models

  • Lack of visibility into the supply chain is.

Prevention

  • Vet data sources and use independently-audited security systems

  • Use trusted plugins tested for your requirements

  • Apply MLOps best practices for own models

  • Use model and code signing for external models

  • Implement monitoring for vulnerabilities and maintain a patching policy

  • Regularly review supplier security and access

Attack Scenarios

  • Attackers exploit a vulnerable Python library

  • An attacker tricks developers via a compromised PyPi package

  • Publicly available models are poisoned to spread misinformation

  • A compromised supplier employee steals IP

  • An LLM operator changes T&Cs to misuse application data.

LLM06: Sensitive Information Disclosure

LLM apps may mistakenly reveal sensitive information, proprietary algorithms, or confidential data, resulting in unauthorised access, intellectual property theft, and privacy violations. LLM applications should use data sanitization, create proper usage controls, and limit the types of data returned by the LLM to reduce these risks.

Examples

  • Incomplete filtering of sensitive data in responses

  • Overfitting or memorizing sensitive data during training

  • Unintended disclosure of confidential information due to errors

Prevention

  • Use data sanitization and scrubbing techniques

  • Implement robust input validation and sanitization

  • Limit access to external data sources

  • Apply the rule of least privilege when training models

  • Maintain a secure supply chain and strict access control.

Attack Scenarios

  • Legitimate users exposed to other user data via LLM

  • Crafted prompts used to bypass input filters and reveal sensitive data

  • Personal data leaked into the model via training data increases risk.

LLM07: Insecure Plugin Design

Due to weak access constraints and faulty input validation, plugins might be vulnerable to malicious requests, which can result in negative outcomes such as data exfiltration, remote code execution, and privilege escalation. To prevent exploitation, developers must use strong security techniques such as rigorous parameterized inputs and safe access control principles.

Examples

  • Plugins accepting all parameters in a single text field or raw SQL or programming statements;

  • Authentication without explicit authorization to a particular plugin;

  • Plugins treating all LLM content as user-created and performing actions without additional authorization.

Prevention

  • Enforce strict parameterized input and perform type and range checks;

  • Conduct thorough inspections and tests including SAST, DAST, and IAST;

  • Use appropriate authentication identities and API Keys for authorization and access control;

  • Require manual user authorization for actions taken by sensitive plugins

Attack Scenarios

  • Attackers craft requests to inject their own content with controlled domains;

  • The attacker exploits a plugin accepting free-form input to perform data exfiltration or privilege escalation;

  • The attacker stages a SQL attack via a plugin accepting SQL WHERE clauses as advanced filters.

LLM08: Excessive Agency

Excessive Agency is a vulnerability in LLM-based systems caused by over-functionality, excessive permissions, or too much autonomy. To avoid this, developers must restrict plugin functionality, rights, and autonomy to the absolute minimum, log user authorisation, demand human approval for all operations, and implement authorization in downstream systems.

Examples

  • An LLM agent accesses unnecessary functions from a plugin

  • An LLM plugin fails to filter unnecessary input instructions

  • A plugin possesses unneeded permissions on other systems

  • An LLM plugin accesses downstream systems with high-privileged identities.

Prevention

  • An LLM agent accesses unnecessary functions from a plugin

  • An LLM plugin fails to filter unnecessary input instructions

  • A plugin possesses unneeded permissions on other systems

  • An LLM plugin accesses downstream systems with high-privileged identities.

Attack Scenarios

  • An LLM-based personal assistant app with excessive permissions and autonomy is tricked by a malicious email into sending spam. This could be prevented by limiting functionality, and permissions, requiring user approval, or implementing rate limiting.

LLM09: Overreliance

Overreliance on LLMs can have major repercussions, including disinformation, legal concerns, and security flaws.
It happens when an LLM is trusted to make crucial decisions or create information without proper scrutiny or confirmation.

Examples

  • LLM provides incorrect information

  • LLM generates nonsensical text

  • LLM suggests insecure code

  • Inadequate risk communication from LLM providers

Prevention

  • Regular monitoring and review of LLM outputs

  • Cross-check LLM output with trusted sources

  • Enhance the model with fine-tuning or embeddings

  • Implement automatic validation mechanisms

  • Break tasks into manageable subtasks

  • Clearly communicate LLM risks and limitations

  • Establish secure coding practices in development environments.

Attack Scenarios

  • AI fed misleading info leading to disinformation

  • AI's code suggestions introduce security vulnerabilities

  • The developer unknowingly integrates a malicious package suggested by AI.

LLM10: Training Data Poisoning

Unauthorised access to and exfiltration of LLM models is LLM model theft, putting economic loss, reputation damage, and unauthorised access to sensitive data at risk. To safeguard these models, strong security measures are required.

Examples

  • The attacker gains unauthorized access to the LLM model

  • Disgruntled employee leaks model artefacts

  • Attacker crafts input to collect model outputs

  • Side-channel attack to extract model info

  • Use of stolen model for adversarial attacks.

Prevention

  • Implement strong access controls, authentication, and monitor/audit access logs regularly

  • Implement rate limiting of API calls

  • Watermarking framework in LLM lifecycle

  • Automate MLOps deployment with governance.

Attack Scenarios

  • Unauthorized access to LLM repository for data theft

  • Leaked model artefacts by disgruntled employee

  • Creation of a shadow model through API queries

  • Data leaks due to supply-chain control failure7 B Side-channel attack to retrieve model information.