SAIF Risk Map
AI risks are everywhere. Take the tour to see how different risks are introduced, exploited, and mitigated throughout the AI development process.
For each AI risk, this tour shows:
-
Introduced
The components where systems, processes, or people could introduce a risk into the model development lifecycle.
-
Exposed
The components where security practitioners, systems, or users can recognize or encounter risks that have been introduced.
-
Mitigated
The components where organizations can take steps to mitigate or remediate a risk.
Use the “next” button to take the entire tour, or jump to a specific risk with the top navigation bar.
Data Poisoning
Introduced
Data Poisoning poses a risk throughout the data lifecycle. Data can be poisoned before it is ingested, during processing or training, or while the data is in storage. This makes it a critical concern across all data handling systems.
Exposed
Data Poisoning is exposed during development in the data filtering and processing steps or the training, tuning, and evaluation stages. It’s also exposed in the model itself, when it produces inaccurate results, malicious outputs, or unexpected behavior.
Mitigated
Proactive mitigation against Data Poisoning happens early in development. This includes data sanitization, secure systems and access controls, and mechanisms to ensure data and model integrity.
Unauthorized Training Data
Introduced
Unauthorized Training Data is introduced early in development if not properly filtered out during data ingestion, data processing, and model evaluation during training.
Exposed
The risk is exposed during development, through data filtering and processing steps or training, tuning, and evaluation. It is also exposed during model use, when the model may produce inferences based on data it shouldn’t have access to.
Mitigated
Mitigations for this risk start early, with careful data selection, filtering, and evaluation during training to catch any lingering issues.
Model Source Tampering
Introduced
Model Source Tampering is a risk that’s introduced when model code, training frameworks, or model weights are not hardened against supply chain attacks and tampering.
Exposed
This risk is exposed in the model frameworks and code components, if the tampering is discovered at the source. Otherwise, the risk is exposed in the model, through its modified behavior during use.
Mitigated
Safeguard against this risk by employing robust access controls and integrity management for model code and weights, comprehensive inventory tracking to monitor and verify models and code throughout systems, and secure-by-default infrastructure tools.
Excessive Data Handling
Introduced
The risk of Excessive Data Handling is introduced when data sources lack proper metadata tagging for effective management or when model and data storage infrastructure isn't designed to address data lifecycle concerns.
Exposed
This risk is exposed in both the model and in storage components, leading to data retention or usage beyond permissible limits.
Mitigated
Mitigate this risk with data filtering and processing, along with automation for data archiving, deletion, or issuing alerts for models trained with outdated data.
Model Exfiltration
Introduced
Model Exfiltration is introduced when storage or serving infrastructure lacks adequate security against attacks.
Exposed
This risk is exposed if attackers target vulnerabilities in serving or storage systems to steal model code or weights.
Mitigated
Mitigate this risk by hardening both storage and serving systems to prevent unauthorized access and protect against model theft.
Model Deployment Tampering
Introduced
The risk of Model Deployment Tampering is introduced within the model serving components, specifically when the serving infrastructure is vulnerable to manipulation.
Exposed
This risk is exposed if attackers tamper with production models within the model serving component.
Mitigated
Mitigation focuses on hardening the model serving infrastructure with secure-by-default tooling.
Denial of ML Service
Introduced
The risk of Denial of ML Service arises in the application component when a model is exposed to excessive access. Additionally, some types of Denial of ML Service (such as energy-latency attacks) stem from the fundamental functioning of the model itself.
Exposed
This risk is exposed during application use, when attackers either overwhelm the model with excessive calls or use carefully crafted "sponge examples" that take advantage of model weaknesses to degrade performance.
Mitigated
Mitigation occurs at the application level, using input filtering and employing rate limiting and load balancing to control the volume of calls to the model.
Model Reverse Engineering
Introduced
The risk of Model Reverse Engineering arises within the application component when excessive access to the model is granted for queries.
Exposed
This risk is exposed if attackers send excessive queries to the model and leverage the responses to reverse engineer its weights.
Mitigated
Mitigate this risk with rate limiting within the application API or using other protective measures at the application level to prevent excessive model access.
Insecure Integrated Component
Introduced
The risk of Insecure Integrated Components is introduced in the application and agent/plugin components, specifically through integrations that permit manipulation of inputs or outputs.
Exposed
This risk is exposed within the application or agent/plugin components, if attackers exploit the security vulnerability to gain unauthorized model access, insert malicious code, or compromise systems.
Mitigated
Mitigate this risk by addressing vulnerabilities directly within the application and agent/plugin components, and by enforcing strict permissions for agents and plugins.
Prompt Injection
Introduced
Prompt Injection is an inherent risk in AI models, because of the potential confusion between instructions and input data.
Exposed
This risk is exposed during model usage, specifically within the model input handling and model components. Attackers may inject commands within prompts, potentially causing unintended model actions.
Mitigated
Mitigation involves robust filtering and processing of inputs and outputs. Additionally, thorough training, tuning, and evaluation processes help fortify the model against prompt injection attacks.
Model Evasion
Introduced
Model Evasion is an inherent risk in AI models, as their core functionality relies on distinguishing between inputs to trigger specific inferences.
Exposed
This risk is exposed within the model component itself during its usage.
Mitigated
Mitigation occurs in the training, tuning, and evaluation phases, where robust models can be developed using extensive and diverse data to better withstand such attacks.
Sensitive Data Disclosure
Introduced
The risk of Sensitive Data Disclosure is introduced in several components. It can also be inherent to models due to their non-deterministic nature. This risk is amplified by data handling practices that fail to filter sensitive information, or by training processes that neglect to evaluate the model's potential for disclosure.
Exposed
This risk is exposed within the model itself, when it inadvertently reveals sensitive data it shouldn't.
Mitigated
Mitigate sensitive data disclosure by: filtering model outputs, rigorously testing the model during training, tuning, and evaluation, and removing or labeling sensitive data during sourcing, filtering, and processing before it's used for training.
Inferred Sensitive Data
Introduced
The risk of Inferred Sensitive Data is introduced in several components. It's inherent to models due to their non-deterministic nature and is amplified by inadequate data handling practices that fail to filter sensitive information. It can also be due to training processes that neglect to evaluate the model's potential for sensitive inferences.
Exposed
This risk is exposed within the model when it generates a response containing inferred sensitive data that it shouldn't.
Mitigated
Mitigation is multi-pronged: filtering model outputs to prevent revealing inferred sensitive data, rigorously testing the model during training, tuning, and evaluation to prevent sensitive inferences, and proactively removing or labeling data that could lead to such inferences during sourcing, filtering, and processing before training.
Insecure Model Output
Introduced
The risk of Insecure Model Output is inherent to AI models due to their non-deterministic nature, which can lead to unexpected and potentially harmful outputs.
Exposed
This risk is exposed within the model itself during usage, either through accidental triggers or deliberate exploitation.
Mitigated
Mitigation includes robust model validation and sanitization processes within the model output handling component to screen and filter for insecure responses.
Rogue Actions
Introduced
The risk of Rogue Actions is introduced when agents or plugins are integrated into an AI system, expanding the potential scope of actions that model output can trigger.
Exposed
This vulnerability is exposed during application usage, when model outputs inadvertently trigger unintended actions in another extension.
Mitigated
Mitigation involves model output handling and granting minimal permissions to agents and plugins. Involving humans in the scoping process may be necessary for added oversight and control.
Next Steps
Now that you understand the basics of the SAIF Risk Map, explore more detailed information about Risks, Controls, and Components.