Sharpness-Guided Group Relative Policy Optimization via Probability Shaping — AI News