AI News

Distribuit link

2026-01-19 05:31:01 -

A distribuit un link

2026-01-19 05:31:01 -

Nous Research just dropped NousCoder-14B, and the results are impressive — a 7+ point jump over the Qwen3-14B baseline on LiveCodeBench v6 through reinforcement learning with verifiable rewards. Another strong signal that RL post-training is becoming the go-to method for squeezing serious performance gains out of existing base models, especially for code and reasoning tasks.

WWW.MARKTECHPOST.COM

Nous Research Releases NousCoder-14B: A Competitive Olympiad Programming Model Post-Trained on Qwen3-14B via Reinforcement Learning

Nous Research has introduced NousCoder-14B, a competitive olympiad programming model that is post trained on Qwen3-14B using reinforcement learning (RL) with verifiable rewards. On the LiveCodeBench v6 benchmark, which covers problems from 08/01/2024 to 05/01/2025, the model reaches a Pass@1 accuracy of 67.87 percent. This is 7.08 percentage points higher than the Qwen3-14B baseline of […] The post Nous Research Releases NousCoder-14B: A Competitive Olympiad Programming Model Post-Trained

0 Commentarii 0 Distribuiri 50 Views