Polyhedral Instability Governs Regret in Online Learning

Abstract: Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization with piecewise linear objectives and induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of changes of the active region. Under full-information feedback and fixed-partition assumptions, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\mathrm{Regret}_T = \Theta(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$, interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, this reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\mathrm{Regret}_T = \Theta(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.
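As a rough illustration of the quantities in the abstract, the sketch below counts region switches in a sequence of active-region labels and evaluates the scaling term $\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}}$. The function names, the label-sequence input format, and the use of the natural logarithm are assumptions made for illustration and do not come from the paper. The constants hidden by $\Theta(\cdot)$ are ignored, so the output shows only how the rate scales and is not a regret value.

```python
import math
from typing import Hashable, Sequence


def count_region_switches(active_regions: Sequence[Hashable]) -> int:
    """Count rounds whose active region differs from the previous round's.

    `active_regions` is a hypothetical per-round list of region labels.
    """
    pairs = zip(active_regions, active_regions[1:])
    return sum(1 for prev, cur in pairs if prev != cur)


def regret_scale(num_rounds: int, region_switches: int, max_vertices: int) -> float:
    """Evaluate sqrt((1 + RS_T) * T * log V_max), without Theta constants."""
    return math.sqrt((1 + region_switches) * num_rounds * math.log(max_vertices))


if __name__ == "__main__":
    regions = ["A", "A", "B", "B", "B", "A"]  # illustrative labels only
    rs = count_region_switches(regions)       # two switches: A->B and B->A
    print(rs, regret_scale(len(regions), rs, max_vertices=4))
```

With no switches the term reduces to $\sqrt{T \log V_{\max}}$, the experts-like end of the interpolation. Each additional switch enlarges the factor under the square root by $T \log V_{\max}$.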

Subjects:	Machine Learning (cs.LG); Computational Complexity (cs.CC)
Cite as:	arXiv:2605.13692 [cs.LG]
	(or arXiv:2605.13692v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2605.13692 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Luyao Niu
[v1] Wed, 13 May 2026 15:45:44 UTC (502 KB)