As we march toward building advanced AI systems such as Agentic AI, which will primarily be used for autonomous decision-making, we must also consider the new and growing cybersecurity threats and risks. The traditional approach to protecting an LLM centers on output evaluation, authorization, remediation of LLM vulnerabilities, and management of access controls, but the emergence of Agentic AI changes everything, and cybersecurity needs to evolve accordingly.
This paper lays the building blocks of a security and governance framework for Agentic AI, including a kill switch for Agentic AI systems. It explains agentic AI systems and their modules, and defines the architectural components involved in building an Agentic AI system. It also lays out the security risks and the ways to mitigate them. Alongside an optimum security framework, the paper examines the role of human operational and governance oversight in managing these security challenges.
Introduction
Agentic AI refers to the next evolution of AI software, in which systems move beyond being traditional static entities to dynamic ones that possess a degree of agency or autonomy and the capability to reason, plan, and make decisions independently, without human intervention. Agentic AI extends AI's capability to learn and adapt using new data, constantly iterating and improving through feedback loops.
Agentic AI not only has the autonomy to make decisions, take actions, and solve challenging problems, but it also interacts with external environments and applications. Think of it as the next evolution of RPA (Robotic Process Automation), with the difference that the processes are dynamic, changing and evolving as needed. These agents learn not only from data in databases but also from previous user behaviour, building on those experiences.
When a traditional LLM like ChatGPT or Claude is asked to predict Amazon's stock price over the next 6 months, the output is based on the data it was trained on. But stock prediction doesn't need price data frozen in the past; it needs real-time data.
Agentic AI would take the above stock-prediction request and build a workflow to fetch real-time data. After reasoning and planning, it makes a price prediction for the Amazon stock ticker. Theoretically, if the business context input also includes making a profit, it will even buy the stock for you with the given payment information.
Agentic AI can be broken into two categories: single agent-based systems (SAS) and multi agent-based systems (MAS).
SAS typically involves one LLM and usually a single task using function calling. SAS is outside the scope of this document.
MAS contains multiple autonomous agents, each with a distinct role and task, along with typically dedicated LLMs giving them the required autonomy. This document focuses on MAS.
Background
In the last few years, enterprise LLMs like ChatGPT and Claude became the new normal, with wide-scale adoption. This adoption and usage brought new cybersecurity threats: prompt injection, trust-boundary violations, information leaks at both the LLM and application levels, privacy concerns, and data reliability issues.
Organizations rushed to secure their enterprise LLMs: adversarial tuning and validation to protect the LLMs from hostile input, model evaluation to guard against prompt injection, access control and authentication to establish trust, and content and data filtering to prevent leaks and increase reliability.
As Agentic AI starts dominating the AI space with reasoning, planning, autonomous decision-making, and learning, we need to be prepared for evolving cybersecurity threats, ranging from adversarial attacks, data poisoning, and model theft via reverse engineering to, most importantly, decision-making risks. The expanded attack surface adds another layer that further complicates the threat landscape.
The Agentic AI security framework lays out an approach to implementing security and governance mechanisms that address Agentic AI's security challenges.
Agentic AI Modules
- Perception: The perception module receives the business context input through various sensing channels such as sensors, text, audio, and video. This data is processed to extract relevant information, such as visual features, textual information, or numerical values.
- Reasoning and Goals Representation: This is the reasoning module of the agent, used to interpret the perception module's data and define goals and objectives. These goals can be very specific, like “Write a research paper for me involving Agentic AI”, or broader, like “Write a research paper for me”.
- Planning: Once the goals are laid out, the planning module generates a plan of action, devising strategies to achieve the defined goals while considering the agent's capabilities and environmental constraints.
- Decision-Making: Once the planning module has generated the plan, the decision-making module selects the most appropriate course of action based on the current situation, goals, and available options.
- Action: The action module executes the selected action laid out by the planning and decision-making modules. This includes taking action in real-world applications.
- Learning: The agent's learning module continuously updates its memory and shared memory with the outcomes of the above actions, learning and adapting to improve over time.
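To make the interaction between these modules concrete, below is a minimal sketch of how they might be wired together in code. All class and method names are illustrative assumptions, not a standard API.

```python
# A minimal sketch of the six modules wired into one loop.
# All class and method names are illustrative, not a standard API.

from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Shared memory the learning module updates after each run."""
    experiences: list = field(default_factory=list)


class Agent:
    def __init__(self, memory: AgentMemory):
        self.memory = memory

    def perceive(self, raw_input: str) -> dict:
        # Perception: extract structured context from the raw input.
        return {"text": raw_input.strip()}

    def reason(self, observation: dict) -> str:
        # Reasoning and goal representation: derive a goal.
        return f"goal: {observation['text']}"

    def plan(self, goal: str) -> list[str]:
        # Planning: break the goal into candidate steps.
        return [f"research ({goal})", f"draft ({goal})", f"review ({goal})"]

    def decide(self, steps: list[str]) -> str:
        # Decision-making: choose the most appropriate step (here, the first).
        return steps[0]

    def act(self, action: str) -> str:
        # Action: execute against the external environment (stubbed).
        return f"executed: {action}"

    def learn(self, outcome: str) -> None:
        # Learning: persist the outcome so future runs can adapt.
        self.memory.experiences.append(outcome)

    def run(self, raw_input: str) -> str:
        observation = self.perceive(raw_input)
        action = self.decide(self.plan(self.reason(observation)))
        outcome = self.act(action)
        self.learn(outcome)
        return outcome


agent = Agent(AgentMemory())
print(agent.run("Write a research paper involving Agentic AI"))
```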
Agentic AI Technical Architectural Layers
The Agentic AI technical architecture consists of three core layers: the Supervision plane, the Control plane, and the Data plane.
Security Challenges of Agentic AI
Adversarial Attacks
Adversarial attacks refer to manipulating or tricking models with carefully crafted input data, with the intent of causing harm. Such attacks lead the model to make incorrect decisions, compromising the effectiveness of the AI system. An attacker might use such attacks to:
- Trick the model, using carefully crafted inputs, into revealing an organization's intellectual property
- Add markings to stop signs that trick the model into reading them as speed limit signs
Adversarial attacks have high visibility in the AI security world. These risks are even more significant in industries like finance, healthcare, and aerospace, where a single wrong decision can have life-changing consequences.
Data Poisoning
Data poisoning, as the name suggests, means corruption of the training data set. The attack involves injecting malicious data, compromising the agent's ability to make sound, correct decisions. Typical outcomes of data poisoning include:
- Performance degradation of the model
- Financial losses due to incorrect decisions
- Brand damage due to agents making unethical decisions
This is one of the most important security threats, since most renowned models like ChatGPT and Claude are trained on publicly available data sets.
Model Theft and Reverse Engineering
As we move towards creating custom models using enterprise private data, these custom models carry high intellectual-property value, and with it a higher risk of attacks aimed at stealing them. Attackers might try to:
- Expose vulnerabilities by reverse engineering
- Steal private AI models to gain competitive advantages
- Clone stolen models to create hostile agents that trick other agents into using them in valid business processes
Model theft significantly impacts an organization, especially one that benefits heavily from its AI advantage. Model extraction attacks are common: attackers issue repeated queries to reconstruct or clone the model's behavior.
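To illustrate the mechanics of a query-based extraction attack, the sketch below trains a surrogate model purely on labels returned by a victim model. The victim stands in for a remote prediction API; scikit-learn and the synthetic data are illustrative assumptions, not a real target.

```python
# Minimal sketch of query-based model extraction (scikit-learn, synthetic data).
# The "victim" stands in for a remote prediction API; all names are illustrative.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Victim: a proprietary model the attacker can only query.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X, y)

# Attacker: issue queries drawn from a plausible input distribution
# and record only the victim's predicted labels.
queries = np.random.RandomState(1).normal(size=(2000, 10))
stolen_labels = victim.predict(queries)

# Train a surrogate on the query/label pairs alone.
surrogate = DecisionTreeClassifier(random_state=0).fit(queries, stolen_labels)

# Agreement between surrogate and victim on held-out inputs approximates
# how much of the victim's behavior was cloned.
X_test, _ = make_classification(n_samples=500, n_features=10, random_state=2)
agreement = (surrogate.predict(X_test) == victim.predict(X_test)).mean()
print(f"surrogate/victim agreement: {agreement:.2%}")
```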
Expanded Attack Surface
The spread of Agentic AI across critical business processes expands the attack surface that attackers can exploit. Typical vulnerabilities can be found in:
- Creation of insecure workflows by Agentic AI
- Usage of open-source AI frameworks and libraries
- Lateral movement between agents if any one agent is compromised
The expanded attack surface also extends to cloud providers, partners, and third-party integrations.
Privacy and Data Protection Concerns
Agentic AI often requires access to large volumes of data that are usually private and confidential, which raises substantial regulatory and compliance-related data protection concerns.
Any violation of strict regulatory frameworks like GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act) usually involves fines that may cause significant financial and brand-value loss. These risks include:
- Agents accessing PII (Personally Identifiable Information) and RPII (Restricted Personally Identifiable Information) data
- Local compliance challenges regarding PII data
- AI revealing PII and RPII through its outputs
- Tracking and adhering to new data-related compliance and regulations across jurisdictions
Agentic AI needs to adhere to regulatory requirements, especially when dealing with data from multiple jurisdictions. This becomes more challenging when regulations in those jurisdictions are nascent and still evolving.
Autonomous Decision-Making Risks
Agentic AI’s capability of making independent autonomous decisions introduces new risks such as:
- Agent decisions being manipulated to benefit the attacker
- Lack of visibility into the decision-making process or its parameters raises auditing and oversight concerns, especially during an anomaly.
- No human oversight on critical business processes could lead to significant disruption during failures.
Autonomous decision-making becomes counterproductive if outcomes start being unpredictable.
Agentic AI Security Strategies & Framework
Strategies to Address Security Challenges:
Agentic AI-related security risks can be addressed using the strategies below:
- Adversarial Training: This is a defensive technique where models are trained on adversarial examples to strengthen them against manipulation (a minimal training sketch follows this list). Two related defensive methods are:
- Feature Squeezing: Reduces the adversary's search space by squeezing unnecessary features out of the input data
- Defensive Distillation: Smooths the model's decision boundaries, making them harder to exploit
- Data Governance: Establishing strict data governance and quality-control measures to mitigate risks related to Agentic AI. A few ways to do so:
- Audit control: Maintaining regular audits of the data helps build trust and compliance
- Adaptable Governance: Regularly updating governance frameworks to stay in sync with evolving regulatory frameworks
- Encryption and Access Controls: Implementing encryption techniques such as homomorphic encryption to safeguard the model and the data
- Usage of Explainable AI methods: Explainable AI helps us understand and explain machine learning (ML) algorithms, deep learning, and neural networks instead of trusting them blindly:
- LIME (Local Interpretable Model-agnostic Explanations): Generates local approximations of a model's predictions, highlighting which features matter most for a specific prediction
- SHAP (SHapley Additive exPlanations): Uses game theory concepts (SHapley values) to calculate the contribution of each feature to a model’s prediction, providing a more comprehensive understanding.
- Monitoring, Oversight & Audits: Monitoring critical processes for anomalies, maintaining regular audits, and applying human oversight where possible to flag deviations.
- AI Monitoring & Oversight with Kill Switch: The ability to monitor workflows with AI monitoring tools, with a kill switch option to prevent further harm (see the sketch after this list):
- AI Monitoring Tool: Implementing AI monitoring tools to flag any deviation in a critical business function
- Kill Switch: The ability of the AI monitoring layer to kill agentic workflows and shut them down when a large enough deviation or anomaly is detected, protecting the system from causing more harm
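As a concrete illustration of adversarial training, here is a minimal sketch that uses FGSM (Fast Gradient Sign Method) to craft adversarial examples on the fly and mixes them into training, assuming PyTorch; the model, data, and perturbation budget are toy placeholders.

```python
# Minimal sketch of adversarial training with FGSM (PyTorch).
# Model, data, and epsilon are toy placeholders, not a production setup.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epsilon = 0.1  # L-infinity perturbation budget

# Toy data standing in for a real training set.
X = torch.randn(256, 20)
y = torch.randint(0, 2, (256,))

for epoch in range(5):
    # 1. Craft adversarial examples with FGSM: perturb each input in the
    #    direction of the sign of the loss gradient.
    X_adv = X.clone().requires_grad_(True)
    loss_fn(model(X_adv), y).backward()
    X_adv = (X_adv + epsilon * X_adv.grad.sign()).detach()

    # 2. Train on clean and adversarial examples together.
    optimizer.zero_grad()
    train_loss = loss_fn(model(torch.cat([X, X_adv])), torch.cat([y, y]))
    train_loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {train_loss.item():.4f}")
```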
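And here is a minimal sketch of the monitoring-plus-kill-switch idea: a monitor polls a deviation score and halts the workflow once it crosses a threshold. The deviation metric, threshold, and workflow interface are illustrative assumptions, not a specific product's API.

```python
# Minimal sketch of an AI monitoring loop with a kill switch.
# Deviation metric, threshold, and workflow interface are illustrative.

import random
import time


class AgenticWorkflow:
    """Stand-in for a running agentic workflow."""

    def __init__(self):
        self.running = True

    def deviation_score(self) -> float:
        # In a real system this would compare current agent behavior
        # against a learned baseline (e.g., AUBA output). Stubbed here.
        return random.random()

    def shutdown(self) -> None:
        self.running = False
        print("KILL SWITCH: agentic workflow halted")


def monitor(workflow: AgenticWorkflow,
            threshold: float = 0.95,
            interval_s: float = 0.1) -> None:
    """Poll the workflow and trigger the kill switch on a large deviation."""
    while workflow.running:
        if workflow.deviation_score() > threshold:
            workflow.shutdown()  # stop before more harm is done
            break
        time.sleep(interval_s)


monitor(AgenticWorkflow())
```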
Agentic AI Security Framework:
The Agentic AI security framework is built on top of the core architectural layers: the Supervision plane, the Control plane, and the Data plane. Each layer has its own security components, listed as follows.
The Supervision plane security layer consists primarily of identity management and user behaviour analytics (UBA).
- Identity Management: Sometimes referred to as identity and access management (IAM), a framework in which user access to resources (in this case, the Agentic AI interaction layer) is managed through processes and policies. Azure AD is a common identity management service.
- User behaviour analytics (UBA): UBA collects logs and alerts from all user activity and builds a baseline profile from them. The logs contain user activity details such as IPs, host details, and the applications being accessed. AI algorithms flag any user-behaviour anomaly for security assessment. In the context of Agentic AI, any compromised user must be flagged as an anomaly so corrective actions can be taken.
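A minimal sketch of the UBA baselining idea follows, assuming simple z-score deviation over numeric session features; real UBA products use richer models, and the feature names here are illustrative.

```python
# Minimal sketch of UBA-style anomaly flagging: build a per-user baseline
# from activity logs and flag deviations. Feature names are illustrative.

import numpy as np

# Historical activity features per session for one user,
# e.g., [login_hour, requests_per_minute, distinct_hosts].
baseline_sessions = np.array([
    [9, 12, 2], [10, 15, 2], [9, 11, 3], [11, 14, 2], [10, 13, 2],
])
mean = baseline_sessions.mean(axis=0)
std = baseline_sessions.std(axis=0) + 1e-9  # avoid division by zero


def is_anomalous(session: np.ndarray, z_threshold: float = 3.0) -> bool:
    """Flag the session if any feature deviates strongly from the baseline."""
    z_scores = np.abs((session - mean) / std)
    return bool((z_scores > z_threshold).any())


# A 3 a.m. login with a burst of requests across many hosts gets flagged.
print(is_anomalous(np.array([3, 90, 25])))  # True
```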
The Control plane security layer consists of context-based analytics and explainable AI analytics.
- Context-Based Analytics (CBA): CBA is a security framework that manages agents' roles and permissions based on real-time context such as the requester's source IP location, device health, time of day, VPN usage, and other relevant contextual information. This framework usually requires an ML algorithm to make decisions from that context (a simplified sketch follows these components).
- Explainable AI Analytics (EAIA): EAIA is a security framework in which Agentic AI decisions are tracked and analyzed from a compliance and governance standpoint. Decisions are compared to the business context of the user's input, and any large deviation is flagged for human review. For example, if the user submitted a mortgage application but the agent also returns an insurance quote along with the mortgage quote, that is a deviation and warrants a thorough review of the model.
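Below is a simplified sketch of a CBA-style decision, using hand-written rules in place of the ML algorithm a production system would use; the context fields and permission names are illustrative assumptions.

```python
# Minimal sketch of context-based access decisions for an agent session.
# Context fields and policy rules are illustrative; a production CBA
# would typically score context with an ML model instead of fixed rules.

from dataclasses import dataclass


@dataclass
class RequestContext:
    source_country: str
    device_healthy: bool
    on_vpn: bool
    hour_of_day: int


def grant_permissions(ctx: RequestContext) -> set[str]:
    """Assign agent permissions for this session from real-time context."""
    if not ctx.device_healthy:
        return set()  # unhealthy device: deny everything
    perms = {"read:public"}
    if ctx.on_vpn and ctx.source_country == "US":
        perms.add("read:internal")
        # High-risk actions only during business hours.
        if 9 <= ctx.hour_of_day <= 17:
            perms.add("execute:payments")
    return perms


# Grants read:public, read:internal, and execute:payments.
print(grant_permissions(RequestContext("US", True, True, 14)))
```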
The Data plane security layer consists of agent user behavior analytics and agent identity management.
- Agent User Behavior Analytics (AUBA): AUBA is a security framework that sits on the data plane, where agents perform tasks in external environments. AUBA collects agent execution logs for each user request and builds a baseline profile from them. Deviations are tracked and flagged by an AI algorithm for analysis.
- Agent Identity Management: Dedicated identity and access management for agents, where roles and permissions are dynamically assigned by the control plane's CBA on a per-session basis.
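A minimal sketch of per-session agent identity follows: permissions decided upstream (e.g., by CBA) are bound to a short-lived token that expires with the session. The token scheme and all names are illustrative assumptions.

```python
# Minimal sketch of per-session agent identity: the control plane decides
# permissions, the data plane issues a short-lived scoped credential.
# The token scheme and all names are illustrative assumptions.

import secrets
import time


class AgentIdentityManager:
    def __init__(self, ttl_seconds: int = 300):
        self.ttl = ttl_seconds
        self.sessions: dict[str, dict] = {}

    def issue_session(self, agent_id: str, permissions: set[str]) -> str:
        """Mint a short-lived, per-session token with CBA-scoped permissions."""
        token = secrets.token_urlsafe(16)
        self.sessions[token] = {
            "agent": agent_id,
            "permissions": permissions,
            "expires_at": time.time() + self.ttl,
        }
        return token

    def authorize(self, token: str, permission: str) -> bool:
        """Check the token exists, has not expired, and grants the permission."""
        session = self.sessions.get(token)
        if session is None or time.time() > session["expires_at"]:
            return False
        return permission in session["permissions"]


iam = AgentIdentityManager()
token = iam.issue_session("mortgage-agent", {"read:applications"})
print(iam.authorize(token, "read:applications"))  # True
print(iam.authorize(token, "execute:payments"))   # False
```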
Agentic AI Security Framework Maturity Model – Analytics
Conclusion
As organizations rush to implement generative AI solutions, they are still trying to ascertain the value of Gen AI, which so far has acted more as an assistant. LLMs have been great at making organizations more efficient, but questions remain about how to drive revenue cost-effectively.
Agentic AI might be the magic bullet organizations are waiting for to drive large-scale adoption and revenue growth. It is well positioned to harness the power of LLMs and scale them efficiently, and it is transforming industries by introducing dynamic, autonomous systems capable of executing complex tasks independently.
Scaling these systems requires a careful balance between innovation and security. Harnessing the full potential of Agentic AI requires implementing a strong cybersecurity framework, which will build trust and drive adoption. Fostering a culture of innovation while keeping a close eye on the security framework and best practices will ensure that agentic AI is secure, reliable, and, most importantly, creates immense value for both organizations and society.