Octopus Protocol: One-Shot Hardware Discovery and Control for AI Agents via Infrastructure-as-Prompts

View PDF HTML (experimental)

Abstract:Recent agentic-robotics systems, from Code-asPolicies to modern vision-language-action (VLA) foundation models, presuppose that drivers, SDKs, or ROS-style primitives for the target hardware already exist. Writing those primitives is the dominant engineering cost of bringing up new hardware for agent control. We present Octopus Protocol, a system that collapses that cost to a single shell command. Given only raw OS access and a language-model API key, a coding agent executes a five-stage pipeline--PROBE, IDENTIFY, INTERFACE, SERVE, DEPLOY--to discover connected devices, infer their capabilities, generate a Model Context Protocol (MCP) server with typed tools, and deploy it as a live HTTP endpoint. A persistent daemon then monitors the system, heals broken code, and perceives physical state through the camera tools it generated for itself. Two architectural principles make this work: protocols are prompts, not code, and the coding agent is the runtime. We validate the system on three heterogeneous platforms (PC/WSL, Apple Silicon macOS, Raspberry Pi 4) and on a commercial 6-DOF robotic arm with USB camera feedback. One command onboards the hardware in ~10-15 minutes and exposes up to 30 MCP tools; an MCP-compliant client then performs closed-loop visual-motor control through tools no human wrote.

Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Cite as:	arXiv:2605.09055 [cs.RO]
	(or arXiv:2605.09055v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2605.09055 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Quilee Simeon [view email]
[v1] Sat, 9 May 2026 16:57:11 UTC (488 KB)