Rethinking AI Security: Focus Shifts from Model Defense to Workflow Protection Amid New Threats

Rethinking AI Security: Prioritizing Workflow Protection Over Model Defense

As artificial intelligence (AI) assistants and copilots become integral to daily operations, the focus of security teams has predominantly been on safeguarding the AI models themselves. However, recent incidents highlight that the more significant vulnerabilities lie within the workflows surrounding these models.

For instance, two Chrome extensions masquerading as AI helpers were discovered stealing chat data from ChatGPT and DeepSeek, compromising over 900,000 users. In another case, researchers demonstrated how prompt injections embedded in code repositories could deceive IBM’s AI coding assistant into executing malware on a developer’s machine. Notably, these attacks did not compromise the AI algorithms directly but exploited the contexts in which the AI operates.

AI Models as Workflow Engines

AI is increasingly utilized to automate tasks and integrate applications that were traditionally managed manually. For example, an AI writing assistant might retrieve a confidential document from SharePoint to draft an email, or a sales chatbot could access internal CRM records to respond to customer inquiries. These scenarios blur the lines between applications, creating dynamic integration pathways.

The inherent risk lies in the probabilistic nature of AI decision-making. Unlike deterministic software, AI generates outputs based on patterns and context, lacking a native understanding of trust boundaries. Consequently, a carefully crafted input can prompt an AI to perform unintended actions, expanding the attack surface to include every input, output, and integration point the model interacts with.

In such environments, compromising the model’s code becomes unnecessary for adversaries. Manipulating the context perceived by the AI or the channels it utilizes suffices. The aforementioned incidents exemplify this: prompt injections in repositories can hijack AI behavior during routine tasks, while malicious extensions can extract data from AI conversations without directly accessing the model.

Limitations of Traditional Security Controls

These workflow-centric threats reveal significant gaps in conventional security measures, which were designed for deterministic software, stable user roles, and well-defined perimeters. AI-driven workflows challenge these foundational assumptions:

– Indistinguishable Inputs: Traditional applications differentiate between trusted code and untrusted input. AI models, however, process all inputs as text, making it difficult to discern malicious instructions hidden within seemingly benign documents. Standard input validation is ineffective because the payload appears as natural language rather than explicit malicious code.

– Subtle Anomalies: Conventional monitoring systems detect overt anomalies like mass data downloads or unusual login attempts. However, an AI accessing extensive records as part of a routine query may appear as normal service-to-service traffic. If this data is then summarized and transmitted to an attacker, no explicit security rule has been violated.

– Context-Dependent Behavior: Traditional security policies define explicit permissions, such as restricting user access to specific files or blocking traffic to certain servers. AI behavior, however, is context-dependent, making it challenging to establish rules like never disclose customer data in outputs.

– Dynamic Workflows: Security programs often rely on periodic reviews and static configurations, such as quarterly audits or fixed firewall rules. AI workflows are inherently dynamic; integrations may acquire new capabilities or connect to additional data sources over time. By the time a scheduled review occurs, sensitive information may have already been compromised.

Enhancing Security in AI-Driven Workflows

To effectively secure AI-driven workflows, organizations should adopt a holistic approach that encompasses the entire workflow, not just the AI model:

1. Comprehensive AI Usage Assessment: Identify all instances of AI utilization within the organization, including official tools like Microsoft 365 Copilot and unauthorized browser extensions installed by employees. Understand the data each system can access and the actions it can perform. Many organizations are surprised to discover numerous unapproved AI services operating within their infrastructure.

2. Implement Robust Guardrails: If an AI assistant is intended solely for internal summarization, restrict its ability to send external communications. Monitor outputs for sensitive data before they leave the organizational environment. These safeguards should be implemented outside the AI model itself, within middleware that evaluates actions prior to execution.

3. Apply Principle of Least Privilege: Treat AI agents as you would any other user or service. If an AI requires read access to a specific system, do not grant it unrestricted access to all systems. Limit OAuth tokens to the minimum permissions necessary and monitor for anomalies, such as an AI accessing data it has not previously interacted with.

4. Educate Employees on AI Risks: Inform users about the dangers associated with unvetted browser extensions and the risks of copying prompts from unknown sources. Evaluate third-party plugins before deployment and consider any tool that interacts with AI inputs or outputs as part of the security perimeter.

By shifting the security focus from solely protecting AI models to securing the entire workflow, organizations can better address the real risks associated with AI integration. This comprehensive approach ensures that both the AI systems and the contexts in which they operate are safeguarded against potential threats.