Abstract: While synthetic data generation with large language models (LLMs) is widely used in post-training pipelines, existing approaches typically generate full outputs before applying quality filters, leading to substantial token waste on samples that are ultimately discarded. To address this, we propose Multi-Stage In-Flight Rejection (MSIFR), a lightweight, training-free framework that detects and terminates low-quality generation trajectories at intermediate checkpoints before they reach full completion. MSIFR decomposes the generation process into sequential stages and applies fast rule-based validators to identify arithmetic inconsistencies, hallucination patterns, and formatting violations, enabling early rejection of faulty samples. We formalize in-flight rejection as a sequential decision process and show that any non-trivial discard policy reduces expected token consumption, with stage-wise savings increasing when rejection occurs earlier in the generation pipeline. We further demonstrate that conditional utility estimates form a martingale, ensuring that early, in-flight rejection does not bias the expected utility of retained samples. Across five instruction-tuned models and seven reasoning benchmarks, MSIFR reduces token consumption by 11% to 77% as a standalone method, and by up to 78.2% when combined with early-exit methods, while preserving or improving evaluation accuracy. These results confirm that MSIFR provides a practical mechanism for improving the efficiency of LLM-based synthetic data generation without additional training or architectural changes.
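The staged rejection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`no_bad_arithmetic`, `generate_with_rejection`), the toy arithmetic rule, the use of pre-written text chunks in place of model output, and whitespace token counting are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical validator type: returns True when the partial text passes.
Validator = Callable[[str], bool]


def no_bad_arithmetic(text: str) -> bool:
    """Toy rule (an assumption): fail if any 'a + b = c' statement is wrong."""
    for a, b, c in re.findall(r"(\d+)\s*\+\s*(\d+)\s*=\s*(\d+)", text):
        if int(a) + int(b) != int(c):
            return False
    return True


def generate_with_rejection(
    stages: List[str], validators: List[Validator]
) -> Tuple[Optional[str], int]:
    """Emit stages one at a time and stop at the first failed checkpoint.

    `stages` stands in for the successive chunks a model would generate.
    Returns (accepted_text_or_None, tokens_consumed), where tokens are
    counted as whitespace-separated words for simplicity.
    """
    text = ""
    tokens = 0
    for chunk in stages:
        text += chunk
        tokens += len(chunk.split())
        # Checkpoint: run every rule-based validator on the partial output.
        if not all(check(text) for check in validators):
            return None, tokens  # rejected in flight; later stages never run
    return text, tokens


faulty = ["Step 1: 2 + 2 = 5. ", "Step 2: so the answer is 5. ", "Final answer: 5"]
valid = ["Step 1: 2 + 2 = 4. ", "Final answer: 4"]

rejected_text, rejected_tokens = generate_with_rejection(faulty, [no_bad_arithmetic])
kept_text, kept_tokens = generate_with_rejection(valid, [no_bad_arithmetic])
full_cost = sum(len(chunk.split()) for chunk in faulty)
```

In this toy run the faulty sample is discarded after its first stage, so it consumes fewer tokens than generating all three stages and filtering afterwards, while the valid sample passes every checkpoint and is kept unchanged.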
| Comments: | 17 pages, 4 figures, 7 tables |
| Subjects: | Artificial Intelligence (cs.AI); Computation and Language (cs.CL) |
| Cite as: | arXiv:2605.14062 [cs.AI] |
| | (or arXiv:2605.14062v1 [cs.AI] for this version) |
| | https://doi.org/10.48550/arXiv.2605.14062 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Anjir Ahmed Chowdhury
[v1] Wed, 13 May 2026 19:35:49 UTC (1,546 KB)
