Abstract: LLMs have shown immense potential for code translation, yet they often struggle to ensure both syntactic correctness and semantic consistency. While preference-based learning offers a promising alignment strategy, it is hindered by unreliable semantic rewards derived from sparse test cases or restrictive reference translations. We argue that a robust semantic reward for code translation must be derived directly from the source code. In this paper, we propose CTO to improve code translation with syntax-guided and semantic-aware preference optimization. Through contrastive learning, we train a cross-lingual semantic model to directly assess functional equivalence between source and translated code. By formulating code translation as a multi-objective optimization problem, this robust semantic signal is seamlessly unified with compiler-based syntactic feedback within the direct preference optimization framework. Extensive experiments on C++, Java, and Python translations demonstrate that CTO significantly outperforms existing baselines and alternative preference optimization strategies.
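To make the idea in the abstract concrete, the following is a minimal sketch of how a compiler signal and a semantic score could select a preference pair that is then scored with the standard direct preference optimization (DPO) loss. It is not the paper's implementation: the `Candidate` fields, the weighted-sum `combined_reward`, the `syntax_weight` and `beta` values, and the best-versus-worst pairing in `pick_pair` are all assumptions made for illustration. In particular, the abstract does not say how the two objectives are combined.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate translation with hypothetical, precomputed signals."""
    code: str
    compiles: bool          # assumed syntactic feedback from a compiler
    semantic_score: float   # assumed source-to-translation equivalence score in [0, 1]
    policy_logp: float      # log-probability under the model being trained
    ref_logp: float         # log-probability under the frozen reference model


def combined_reward(c: Candidate, syntax_weight: float = 0.5) -> float:
    # Assumption: a simple weighted sum stands in for the multi-objective
    # formulation, which the abstract does not spell out.
    return syntax_weight * float(c.compiles) + (1.0 - syntax_weight) * c.semantic_score


def pick_pair(cands: list[Candidate]) -> tuple[Candidate, Candidate]:
    # Assumption: prefer the highest-reward candidate over the lowest-reward one.
    ranked = sorted(cands, key=combined_reward, reverse=True)
    return ranked[0], ranked[-1]


def dpo_loss(chosen: Candidate, rejected: Candidate, beta: float = 0.1) -> float:
    # Standard DPO loss for one pair: -log(sigmoid(margin)), where the margin
    # compares policy-vs-reference log-ratios of the chosen and rejected outputs.
    margin = beta * (
        (chosen.policy_logp - chosen.ref_logp)
        - (rejected.policy_logp - rejected.ref_logp)
    )
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

In this sketch the loss falls below log 2 once the policy assigns a larger log-ratio to the preferred translation than to the rejected one, which is the behavior DPO training pushes toward.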
| Comments: | Accepted at the 35th International Joint Conference on Artificial Intelligence (IJCAI 2026) |
| Subjects: | Artificial Intelligence (cs.AI); Software Engineering (cs.SE) |
| Cite as: | arXiv:2605.13229 [cs.AI] |
| (or arXiv:2605.13229v1 [cs.AI] for this version) | |
| https://doi.org/10.48550/arXiv.2605.13229 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Wei Hu
[v1]
Wed, 13 May 2026 09:19:39 UTC (889 KB)
