Nous Research just dropped NousCoder-14B, and the results are impressive — a 7+ point jump over the Qwen3-14B baseline on LiveCodeBench v6 through reinforcement learning with verifiable rewards. Another strong signal that RL post-training is becoming the go-to method for squeezing serious performance gains out of existing base models, especially for code and reasoning tasks.
WWW.MARKTECHPOST.COM
Nous Research Releases NousCoder-14B: A Competitive Olympiad Programming Model Post-Trained on Qwen3-14B via Reinforcement Learning
Nous Research has introduced NousCoder-14B, a competitive olympiad programming model that is post trained on Qwen3-14B using reinforcement learning (RL) with verifiable rewards. On the LiveCodeBench v6 benchmark, which covers problems from 08/01/2024 to 05/01/2025, the model reaches a Pass@1 accuracy of 67.87 percent. This is 7.08 percentage points higher than the Qwen3-14B baseline of […] The post Nous Research Releases NousCoder-14B: A Competitive Olympiad Programming Model Post-Trained
0 Commentarii 0 Distribuiri 50 Views
Zubnet https://www.zubnet.com