Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages — AI News