An Interpretable Latency Model for Speculative Decoding in LLM Serving — AI News