Abstract:We design the first regret guarantees for robust dynamic pricing that decouple the dependence on the corruption $C$ and the time horizon $T$. In dynamic pricing, a seller with unlimited supply of a good interacts with a stream of buyers over \( T \) rounds, with the goal of maximizing revenue. At each round $t$, the seller posts a price $p_t$, and the buyer purchases the good only if their unknown valuation $v^\star$ exceeds this price. The seller observes only the binary feedback $\mathbb{I} \left\{ p_t \leq v^\star \right\}$, indicating whether a sale occurred. In the \emph{robust} pricing setting, a malicious adversary is allowed to corrupt this feedback in at most $C$ rounds. Even if the learner knows the corruption $C$, the best known regret bound is $\mathcal{O}(C\log\log T)$ by Gupta et al. [2025]. This leaves as an open problem to ``decouple'' the dependence on $C$ and $T$. In this work, we resolve this open problem. In particular, we develop a robust variant of binary search that achieves regret $\mathcal{O}(C+\log T)$ when the corruption $C$ is known and $\mathcal{O}(C+\log^2 T)$ when the corruption is unknown.
| Subjects: | Machine Learning (cs.LG); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.08290 [cs.LG] |
| (or arXiv:2605.08290v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2605.08290 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Francesco Emanuele Stradi [view email]
[v1]
Fri, 8 May 2026 08:22:43 UTC (36 KB)
