STOP: Structured On-Policy Pruning of Long-Form Reasoning in Low-Data Regimes — AI News