Mythos AI Exposes Weak DevOps Pipelines

The much-trumpeted arrival of Claude Mythos, a new generation of Large Language Model (LLM), marks a turning point in the cyber security landscape. This latest wave of frontier models has become so capable at coding that, as a by-product, it has made it possible to identify and exploit vulnerabilities at a scale, sophistication and speed that fundamentally exceed human-paced security models.

A vulnerability lay unnoticed in OpenBSD, one of the most hardened operating systems available, for 27 years. Mythos found it.

Claude Mythos is not an anomaly – it’s an early signal of a broader shift towards agentic vulnerability discovery and autonomous security testing emerging across the industry. This has fundamentally changed what organisations need from their delivery infrastructure: the uncomfortable reality emerging is that most are not ready. Not because they lack ambition, but because the pipelines they rely on to build, test and deploy software were designed for a world that no longer exists.

In this post, written in collaboration with Simon Harrison, Senior Consulting Engineer, we explore what Mythos means for DevOps pipelines, why traditional security models are falling behind, and what organisations can do to build agent-ready, security-first delivery infrastructure.

Most pipelines are already brittle and AI compounds this

According to Harness’s State of DevOps Modernization 2026, based on a survey of 700 engineering practitioners across five countries, 69% of respondents say they waste time due to slow or unreliable CI/CD pipelines. Among the heaviest AI coding users, that figure rises to 79%. These are not immature organisations finding their feet with DevOps – these are experienced engineering teams whose delivery infrastructure is already struggling to keep pace, before the full weight of agentic development lands on it.

The reason is straightforward. Pipelines get built during a project’s initial phase, often years ago, and are then largely left alone. Teams run them when there is a change to deploy, but routine maintenance, tooling updates and security standard refreshes rarely happen with any consistency.

Ask most Heads of IT how much visibility they have into the health of their applications via their DevOps tooling, and they’ll tell you the honest answer is ‘not much’. There is rarely a top-level view showing the status of pipelines: where technical debt is high, where components are running out of supported versions, or where vulnerabilities are quietly accumulating. Without that visibility, the risk is invisible until something breaks.
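
To make the idea of a top-level view concrete, here is a minimal Python sketch of how pipeline health might be rolled up into a short list of risk flags. Every name in it (PipelineStatus, flag_risks, the field names and the thresholds) is a hypothetical chosen for illustration, not taken from any particular DevOps tool, and a real estate would feed it from its own inventory data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PipelineStatus:
    """Hypothetical record describing one pipeline in the estate."""
    name: str
    last_maintained: date          # last time the pipeline itself was updated
    runtime_end_of_support: date   # when its oldest component leaves support
    open_vulnerabilities: int      # known, unresolved findings


def flag_risks(pipeline, today, stale_after_days=365, eol_warning_days=90):
    """Return the risk flags that apply to one pipeline (thresholds are assumptions)."""
    flags = []
    if (today - pipeline.last_maintained).days > stale_after_days:
        flags.append("stale")
    if (pipeline.runtime_end_of_support - today).days < eol_warning_days:
        flags.append("end-of-support")
    if pipeline.open_vulnerabilities > 0:
        flags.append("vulnerabilities")
    return flags


def estate_summary(pipelines, today):
    """Map each pipeline name to its flags, keeping only pipelines with at least one."""
    summary = {}
    for pipeline in pipelines:
        flags = flag_risks(pipeline, today)
        if flags:
            summary[pipeline.name] = flags
    return summary
```

Calling estate_summary over the whole estate gives a single structure that a dashboard or weekly report could render, which is the kind of visibility the paragraph above describes as usually missing.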

They were built before AI. They are running in an AI world.

Those pipelines were designed for human-paced delivery in a lower-threat environment. Automated security testing was often optional in the pipeline. Penetration testing was periodic rather than continuous. That was a reasonable set of trade-offs at the time. It is not reasonable now.

AI has changed two things simultaneously:

It has dramatically increased the volume and speed of code being produced
It has exponentially increased the sophistication of the attacks that code needs to defend against

Agents writing code at volume – without the quality controls to validate what they produce – introduce bugs, technical debt and security vulnerabilities at a rate no human reviewer can match unaided. The organisations most exposed are not the ones with no DevOps capability. They are the ones that built something reasonably robust several years ago, assumed the job was done, and have no real visibility into what has drifted since.

DevSecOps is the response. For most pipelines, it is not yet the reality

The remediation required is not a wholesale rebuild. It is a targeted strengthening of what already exists, with security shifting left into the pipelines.

In an agentic threat environment, a minimum viable pipeline must assume a continuous attack posture. That means mandatory security controls on every change, policy‑as‑code enforcement, automated penetration testing, dependency scanning, secrets management, and supply chain integrity checks that run inside CI/CD as standard practice – not quarterly reviews or point‑in‑time audits.
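
As a rough illustration of policy-as-code, the Python sketch below treats the mandatory controls as data and blocks any change that has not passed all of them. The control names, the ChangeReport type and the evaluate_policy function are assumptions made for this example; a real pipeline would map them onto its own scanners and policy engine.

```python
from dataclasses import dataclass, field

# Hypothetical control names; a real pipeline would map these to its own tooling.
REQUIRED_CONTROLS = {
    "dependency_scan",
    "secrets_scan",
    "static_analysis",
    "supply_chain_check",
}


@dataclass
class ChangeReport:
    """Results gathered by earlier pipeline stages for a single change."""
    change_id: str
    passed_controls: set = field(default_factory=set)
    critical_findings: int = 0


def evaluate_policy(report):
    """Return a list of violations; an empty list means the change may proceed."""
    violations = []
    # Every required control must have run and passed for this change.
    for control in sorted(REQUIRED_CONTROLS - report.passed_controls):
        violations.append(f"required control not passed: {control}")
    # Any unresolved critical finding blocks the change outright.
    if report.critical_findings > 0:
        violations.append(f"{report.critical_findings} critical finding(s) unresolved")
    return violations


def gate(report):
    """Return True if the change is allowed through, False if it must be blocked."""
    return not evaluate_policy(report)
```

Because the policy lives in version-controlled code rather than in a review checklist, it is applied to every change in the same way, and tightening it is itself a reviewed change.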

Vulnerabilities need to be caught at the point of code generation, not discovered in production. Security that sits outside the pipeline is security that does not keep up.

This is the core of DevSecOps: treating security as a first-class engineering concern rather than a downstream review process. For organisations still running pipelines built before AI, closing that gap is the most urgent item on the DevOps agenda. Not because it is theoretically good practice, but because the threat environment Mythos has exposed makes it a practical necessity.

It’s impossible to fix what you cannot see

Many organisations don’t have the internal skills to audit their own pipelines objectively, can’t identify what is insufficient and are unable to build a credible roadmap to close the gaps. They know something needs to change, but they are far less certain about where they currently stand.

A credible DevOps partner starts by making that visible. It benchmarks current capability across people, process and technology against frameworks like DORA. It surfaces application health, debt concentration and security exposure across the estate. It builds a remediation roadmap that addresses culture and ways of working alongside the toolchain, because the technical changes are often the straightforward part compared to the behavioural ones.
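
For a sense of what benchmarking against DORA involves, the Python sketch below computes three of the measures DORA is known for (deployment frequency, change failure rate and lead time for changes) from a list of deployment records. The Deployment type and function names are hypothetical, and this is a simplified illustration rather than a description of any specific assessment method.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime   # when the change was committed
    deployed_at: datetime    # when it reached production
    caused_failure: bool     # whether it led to a failure in production


def deployment_frequency_per_week(deployments, period_days):
    """Average number of deployments per week over the observed period."""
    return len(deployments) / (period_days / 7)


def change_failure_rate(deployments):
    """Share of deployments that caused a failure (0.0 if there were none)."""
    if not deployments:
        return 0.0
    return sum(d.caused_failure for d in deployments) / len(deployments)


def median_lead_time_hours(deployments):
    """Median hours from commit to production, or None with no deployments."""
    if not deployments:
        return None
    hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    return median(hours)
```

Tracking these figures per team over time gives a baseline, so that later changes to the toolchain or ways of working can be judged against evidence.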

A leading utility provider came to us facing exactly this situation. Deployment processes were largely manual, DevOps practices were inconsistent across teams, and there was limited automation in either infrastructure provisioning or application deployment. Critically, IT operations had drifted out of alignment with business objectives in ways that nobody had a clear view of.

We conducted a comprehensive DevOps maturity assessment, established CI/CD pipelines, implemented Infrastructure as Code using Terraform, and created reusable patterns and guardrails to ensure consistency across teams. The outcome was a 20% reduction in hosting costs (achieved by decommissioning over 200 redundant virtual machines), a significant reduction in manual tasks, and measurably faster response to business demand. The starting point was not an absence of DevOps. It was years of accumulated drift that had never been made visible.

We have been doing this work for nearly three decades across public and private sector organisations. With approximately 700 cloud and DevOps platform engineers, over 50 AI specialists, and ISO 42001 certification, our DevOps Maturity Assessment provides a structured, evidence-based starting point: a clear picture of where the gaps are, a prioritised remediation roadmap, and a practical view of what an agent-ready, security-first pipeline looks like in your specific environment.

The threat is live. Is your pipeline ready?

Mythos-class capabilities exist today, and they fundamentally change the threat model for a world not built to withstand them. Every day that an organisation runs a pipeline that has not been assessed, updated and hardened for the current threat environment is another day that exposure goes unmeasured.

Our DevOps Maturity Assessment gives you a clear picture of where the gaps are, a prioritised remediation roadmap and a practical view of what an agent-ready, security-first pipeline looks like in your specific environment.

The cost of finding out the hard way is considerably higher than the cost of finding out now. Contact us to get started.

There is a second challenge sitting just beneath the surface of all of this, and it deserves its own discussion. As agents take on more of the work, developers naturally review less carefully. Over time, teams can lose genuine understanding of what they have built. The codebase becomes a black box. That is not a tooling problem; it is a human one, and it has serious implications for how DevOps platforms need to evolve.

In our next post, we will look at cognitive debt and the human factors that agentic DevOps creates, and what organisations and partners need to do differently to address it before it becomes a systemic risk.