Lawrence Jones on Fighting AI with AI - StartupHub.ai

Lawrence Jones, Founding Engineer at incident.io, recently presented "Fighting AI with AI" at AI Engineer Europe. The talk focused on using AI systems to manage the complexity of AI products, particularly for debugging and analysis. Jones described the difficulty of working with AI systems whose output grows faster than humans can analyze it, and argued that tooling has to keep pace with AI development.


Visual TL;DR. AI Complexity Challenge leads to Fighting AI with AI. Lawrence Jones presents Fighting AI with AI. Fighting AI with AI enables Leveraging AI for Evals. Leveraging AI for Evals requires Structured Data. Structured Data powers Analysis Pipelines. Analysis Pipelines informs Key Development Patterns.

AI Complexity Challenge: AI systems rapidly outpace human analytical capabilities
Lawrence Jones: Founding Engineer at incident.io, leading AI efforts
Fighting AI with AI: Using AI systems to manage complexity of AI products
Leveraging AI for Evals: AI assists in debugging and analysis of AI systems
Structured Data: Essential for automated analysis pipelines
Analysis Pipelines: Automated systems to keep pace with AI development
Key Development Patterns: Strategies for building and managing AI effectively


Who is Lawrence Jones?

Lawrence Jones is the Founding Engineer at incident.io, a company that builds an incident response management platform. He joined incident.io as its first hire and has been instrumental in leading its AI efforts. The company has grown to 200 people with dual headquarters in London and San Francisco. At incident.io, Jones develops AI-driven tools that help companies manage and communicate during incidents, work that has given him practical experience of applying AI in complex operational environments.

The Challenge of AI Complexity

Jones highlighted that AI systems are hard for humans to debug. As models and their interactions scale, the volume of data and the interdependencies between components make it nearly impossible to track and understand system behavior by hand. He presented a scenario in which a single AI system, through its large number of prompts and tool calls, generates more data than a person can review, so the analysis itself has to be automated. That is the case for building AI-driven tools to manage and debug these systems.

Leveraging AI for Evals and Debugging

The presentation detailed how incident.io uses AI to manage its own AI systems, a meta-approach to AI development. Jones described "evals" as unit tests for AI: a prompt is run with specific input data, and the output is evaluated against predefined criteria. He explained that these evaluations are stored in YAML files, which can grow very large and become difficult for engineers to manage. To address this, incident.io developed a CLI tool called "eval-tool" that lets agents interact with these evaluation files, so that they can analyze and debug AI behavior programmatically.
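The idea of an eval as a unit test for a prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EvalCase` fields, the substring-based criteria, and the `fake_summarise` stand-in for a model call are all hypothetical, and none of it reflects incident.io's actual YAML format or the interface of "eval-tool".

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EvalCase:
    """One test case: an input for the prompt plus criteria for the output."""
    name: str
    input_text: str
    must_contain: list[str] = field(default_factory=list)
    must_not_contain: list[str] = field(default_factory=list)


@dataclass
class EvalResult:
    name: str
    passed: bool
    failures: list[str]


def run_eval(prompt_fn: Callable[[str], str], case: EvalCase) -> EvalResult:
    """Run the prompt on the case input and check the output against the criteria."""
    output = prompt_fn(case.input_text).lower()
    failures = [f"missing: {s}" for s in case.must_contain if s.lower() not in output]
    failures += [f"forbidden: {s}" for s in case.must_not_contain if s.lower() in output]
    return EvalResult(case.name, not failures, failures)


def fake_summarise(text: str) -> str:
    """Hypothetical stand-in for a real model call: keeps only the first sentence."""
    return f"Summary: {text.split('.')[0]}."
```

A case then reads much like a unit test: construct an `EvalCase`, pass it to `run_eval` together with the prompt function, and inspect `passed` and `failures`. Real evals often rely on richer criteria, such as a second model grading the output, which this sketch leaves out.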

Jones demonstrated how this tool can be used to "steal" an eval from production, that is, to take a real-world incident investigation and turn it into a test case for the AI system. This creates a feedback loop in which AI systems are continuously tested against real-world scenarios. The process involves downloading the investigation data, running it through a sandboxed AI environment, and then analyzing the results to understand where the AI system failed or succeeded. This cycle of testing, analyzing, and refining prompts is what improves the reliability and accuracy of the system over time.
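A minimal sketch of what "stealing" an eval could look like follows. Everything here is an assumption made for illustration: the investigation record shape (`id`, `input`, `output`, `reviewer_verdict`), the `steal_eval` name, and the use of JSON rather than YAML (chosen only to stay within the Python standard library).

```python
import json
from pathlib import Path


def steal_eval(investigation: dict, evals_dir: Path) -> Path:
    """Freeze a production investigation into a replayable eval case file."""
    case = {
        "name": f"stolen-{investigation['id']}",
        # The exact input the production system saw, so the run can be replayed.
        "input": investigation["input"],
        # What production actually produced: kept for comparison, not as ground truth.
        "observed_output": investigation["output"],
        # Filled in only if a human has already judged the investigation.
        "expected": investigation.get("reviewer_verdict"),
    }
    evals_dir.mkdir(parents=True, exist_ok=True)
    path = evals_dir / f"{case['name']}.json"
    path.write_text(json.dumps(case, indent=2))
    return path
```

Keeping the observed output separate from the expected result matters here: a stolen case often starts life as a record of a failure, and the expected behavior is added once someone has worked out what the system should have done.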

The Role of File Systems and Analysis Pipelines

A key takeaway from Jones's presentation was the importance of file systems as a way to provide context to AI agents. By structuring and storing data in a clear, organized manner, file systems can serve as a rich source of information for AI systems. incident.io has developed a detailed process flow for its AI investigations, breaking down the analysis into several stages: preflight, per-investigation analysis, cohort clustering, synthesis, and finalization. Each stage involves specific AI agents and tools designed to extract and analyze data from various sources.
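The five stages named above can be pictured as an ordered pipeline over shared state. The sketch below is an assumption-heavy illustration: the stage names come from the talk, but the function bodies are trivial placeholders for what would be agent calls in a real system, and the `failure` label on each investigation is hypothetical.

```python
from collections import defaultdict
from typing import Callable

Context = dict  # shared state handed from one stage to the next


def preflight(ctx: Context) -> Context:
    # Validate inputs before spending any model calls.
    if not ctx["investigations"]:
        raise ValueError("no investigations to analyse")
    return ctx


def analyse_each(ctx: Context) -> Context:
    # Placeholder for a per-investigation agent: here it only copies a label.
    ctx["findings"] = [
        {"id": inv["id"], "failure": inv.get("failure", "none")}
        for inv in ctx["investigations"]
    ]
    return ctx


def cluster_cohorts(ctx: Context) -> Context:
    # Group investigations that share the same failure label.
    cohorts = defaultdict(list)
    for finding in ctx["findings"]:
        cohorts[finding["failure"]].append(finding["id"])
    ctx["cohorts"] = dict(cohorts)
    return ctx


def synthesise(ctx: Context) -> Context:
    # Reduce each cohort to a count; a real stage would summarise causes.
    ctx["summary"] = {name: len(ids) for name, ids in ctx["cohorts"].items()}
    return ctx


def finalise(ctx: Context) -> Context:
    lines = [f"{name}: {count}" for name, count in sorted(ctx["summary"].items())]
    ctx["report"] = "\n".join(lines)
    return ctx


STAGES: list[Callable[[Context], Context]] = [
    preflight, analyse_each, cluster_cohorts, synthesise, finalise,
]


def run_pipeline(investigations: list[dict]) -> Context:
    ctx: Context = {"investigations": investigations}
    for stage in STAGES:
        ctx = stage(ctx)
    return ctx
```

Separating the per-investigation stage from clustering and synthesis means the expensive step can run independently for each investigation, while later stages only see its compact findings.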

Jones emphasized that the output of these AI agents is stored in incremental files within the filesystem, allowing for a granular and traceable analysis. This approach enables the AI agents to build upon previous findings and create a comprehensive understanding of the incident. Furthermore, by combining this analysis with code, engineers can directly link system behavior to specific code changes, facilitating faster debugging and issue resolution.
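One way to picture incremental files as agent context is a working directory where each step appends a numbered file and the next step reads everything written so far. The helper names, the numbering scheme, and the `.md` extension below are hypothetical choices for this sketch, not a description of incident.io's layout.

```python
from pathlib import Path


def write_step(workdir: Path, step: str, content: str) -> Path:
    """Write one step's output to its own numbered file, leaving earlier files alone."""
    workdir.mkdir(parents=True, exist_ok=True)
    index = len(list(workdir.glob("*.md"))) + 1
    path = workdir / f"{index:02d}-{step}.md"
    path.write_text(content)
    return path


def read_prior(workdir: Path) -> str:
    """Concatenate earlier outputs in order, as context for the next agent."""
    parts = []
    for path in sorted(workdir.glob("*.md")):
        parts.append(f"## {path.name}\n{path.read_text()}")
    return "\n\n".join(parts)
```

Because every step leaves its own file behind, a failed run can be inspected step by step, and both an engineer and an agent can open exactly the file where the analysis went wrong.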

Key Patterns for AI Development

Jones concluded by summarizing several key patterns that generalize from incident.io's experience:

AI systems quickly outrun human analysis, making AI-driven debugging essential.
Debugging should go through AI tools first, rather than relying on manual inspection.
Filesystems are exceptionally good agent context, providing a structured way for AI to access information.
Complex analysis can be transformed into AI runbooks, enabling agents to follow a structured process for debugging and problem-solving.
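The last pattern, turning an analysis into a runbook an agent can follow, can be sketched as plain data rendered into numbered instructions. The `Step` fields and the rendering format are assumptions made for this example; the talk as summarized here does not specify how incident.io writes its runbooks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    title: str
    instruction: str
    output_file: str  # where the agent should record its findings


def render_runbook(name: str, steps: list[Step]) -> str:
    """Render steps as numbered plain-text instructions an agent can follow."""
    lines = [f"Runbook: {name}", ""]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.title}")
        lines.append(f"   Do: {step.instruction}")
        lines.append(f"   Write findings to: {step.output_file}")
    return "\n".join(lines)
```

Writing the runbook as data rather than free text keeps the steps reviewable and easy to diff, and pairs each instruction with the file where its result is expected to land.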

He encouraged attendees to apply these patterns to build more robust and efficient AI systems, arguing that the ability to fight AI with AI is becoming increasingly important as the field evolves.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.