Abstract: Enterprise AI backends increasingly admit heterogeneous execution requests across model deployment, inference, evaluation, data movement, and agentic workflows. In many systems, those requests arrive in service-specific shapes, which makes it difficult to attach shared admission-time behavior such as logging, governance hints, resource accounting, authorization-aware policy hooks, and later runtime review without rebuilding the same contract in each subsystem. This paper introduces the execution envelope, a normalized internal admission object that records who is asking for what kind of execution, what resources were requested, what policy-relevant scope accompanied the request, and what the backend ultimately granted. The proposal is intentionally narrow. It does not replace service-specific request models, perform scheduling, or introduce a new authority token. Instead, it defines a descriptive admission seam that can be threaded through real backend paths before backend-specific resolution begins. I formalize the distinction between requested and granted resources, specify the field families, invariants, and lifecycle of the envelope, work through POST /serving/deploy_model as an initial proving ground, and position the design relative to usage control, analyzable authorization, admission control, and cluster scheduling. The central claim is that a shared execution-admission contract is a useful missing primitive for modern AI backends because it creates one place to attach governance and observability without pretending to solve placement, policy, and runtime execution in a single step.
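As a rough illustration of the idea summarized above, the envelope can be pictured as a small immutable record that keeps requested and granted resources apart. This is a minimal sketch and not the paper's specification: the class name `ExecutionEnvelope`, every field name, the helper `admit_deploy_model`, the request-body shape, and the "a grant is recorded once" rule are all assumptions made for illustration. Only the route POST /serving/deploy_model and the requested/granted distinction come from the abstract.

```python
from dataclasses import dataclass, field, replace
from typing import Mapping, Optional


@dataclass(frozen=True)
class ExecutionEnvelope:
    """Hypothetical descriptive admission record; it grants no authority itself."""
    principal: str                               # who is asking
    execution_kind: str                          # what kind of execution
    requested_resources: Mapping[str, float]     # what the caller asked for
    policy_scope: Mapping[str, str] = field(default_factory=dict)
    granted_resources: Optional[Mapping[str, float]] = None  # unset until resolved

    def with_grant(self, granted: Mapping[str, float]) -> "ExecutionEnvelope":
        # Illustrative invariant (an assumption): a grant is recorded exactly once.
        if self.granted_resources is not None:
            raise ValueError("grant already recorded on this envelope")
        # Return a new envelope so the original request stays reviewable.
        return replace(self, granted_resources=dict(granted))


def admit_deploy_model(principal: str, body: Mapping[str, object]) -> ExecutionEnvelope:
    """Normalize a service-specific deploy request (assumed shape) into an envelope."""
    resources = body.get("resources", {})
    return ExecutionEnvelope(
        principal=principal,
        execution_kind="model_deployment",
        requested_resources=dict(resources),  # copy so later mutation cannot leak in
        policy_scope={"route": "POST /serving/deploy_model"},
    )


# Usage: build the envelope at admission time, record the grant after resolution.
envelope = admit_deploy_model("alice", {"resources": {"gpu": 2}})
resolved = envelope.with_grant({"gpu": 1})
```

Keeping the record frozen and returning a copy from `with_grant` is one way to preserve the original request for later runtime review; scheduling and policy decisions stay outside the object, consistent with the narrow scope the abstract describes.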
| Comments: | Systems paper on backend admission contracts, 12 pages, 4 tables |
| Subjects: | Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Emerging Technologies (cs.ET) |
| Cite as: | arXiv:2605.08267 [cs.SE] |
| | (or arXiv:2605.08267v1 [cs.SE] for this version) |
| | https://doi.org/10.48550/arXiv.2605.08267 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Krti Tallam
[v1]
Fri, 8 May 2026 03:02:38 UTC (12 KB)
