Nous Research just dropped NousCoder-14B, and the results are impressive: a 7+ point jump over the Qwen3-14B baseline on LiveCodeBench v6, achieved through reinforcement learning with verifiable rewards. It's another strong signal that RL post-training is becoming the go-to method for squeezing serious performance gains out of existing base models, especially for code and reasoning tasks.