Moonshot AI has released Kimi K2.7 Code, an open-weights model tailored for complex programming tasks and agent-based workflows. The weights are available on Hugging Face, and the model is priced at $0.95 per million input tokens and $4.00 per million output tokens, significantly less than what Western competitors charge. That pricing makes it an attractive option for cost-sensitive applications. According to Moonshot AI, K2.7 Code is designed to outperform its predecessor on long-running, complex software engineering tasks. For general tasks outside of coding, the company still recommends K2.6. Kimi is also the model that coding tool provider Cursor resells in a modified form.
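To make the per-token prices concrete, here is a minimal Python sketch that turns them into a cost estimate for a single workload. The two prices come from the figures above; the function name `estimate_cost` and the example token counts are hypothetical and chosen purely for illustration.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = 0.95
OUTPUT_PRICE_PER_MILLION = 4.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one workload."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical agent session: 2M tokens read, 300k tokens generated.
    print(f"${estimate_cost(2_000_000, 300_000):.2f}")  # prints $3.10
```

Because output tokens cost roughly four times as much as input tokens, workloads that generate a lot of text are dominated by the output price.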
K2.7 Code shows improvements over its predecessor, K2.6, on several benchmarks. On Moonshot's in-house Kimi Code Bench v2, performance jumps from 50.9 to 62.0. On Program Bench, it climbs from 48.3 to 53.6, and on MLS Bench Lite, it rises from 26.7 to 35.1. The model also improves on agentic benchmarks, hitting 76.0 on MCP Atlas (up from 69.4) and 81.1 on MCPMark Verified (up from 72.8). However, in a head-to-head comparison with GPT-5.5 and Claude Opus 4.8, K2.7 Code trails on most coding benchmarks. GPT-5.5 scores 69.1 on Program Bench versus 53.6 for K2.7 Code, and 69.0 on Kimi Code Bench v2 versus 62.0. Program Bench is a particularly tough test: agents have to reproduce a program's behavior using only a compiled binary and its documentation, without source code access, decompilation, or internet access.
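The size of these jumps is easier to compare when expressed as both point gains and relative gains. The following Python sketch does that arithmetic using only the K2.6 and K2.7 Code scores quoted above; the helper name `gain` is hypothetical.

```python
# (K2.6 score, K2.7 Code score) as quoted in the article.
SCORES = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
    "MCP Atlas": (69.4, 76.0),
    "MCPMark Verified": (72.8, 81.1),
}


def gain(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - old
    relative = absolute / old * 100
    return absolute, relative


if __name__ == "__main__":
    for name, (old, new) in SCORES.items():
        absolute, relative = gain(old, new)
        print(f"{name}: +{absolute:.1f} points ({relative:.1f}% relative)")
```

Relative gains depend on the starting score, so a benchmark with a low baseline such as MLS Bench Lite shows a larger percentage jump than its point gain alone suggests.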
K2.7 Code uses a Mixture-of-Experts (MoE) architecture with one trillion total parameters, according to its model card. Only 32 billion of those are active per token. The model has 384 experts, with eight selected per token. Context length is 256,000 tokens. The model is multimodal and can process images and video alongside text. It uses a custom vision encoder called MoonViT with 400 million parameters. The architecture is identical to K2.5 and K2.6, so existing deployment configs can be reused directly. One key improvement, according to Moonshot AI, is more efficient reasoning. K2.7 Code uses about 30 percent fewer thinking tokens than K2.6, which means less 'overthinking.' The model enforces thinking mode as well as a 'preserve_thinking' mode, which keeps the full reasoning content across multiple conversation turns to boost performance in agent-based coding scenarios.
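For readers unfamiliar with sparse MoE models, the Python sketch below illustrates the general idea of selecting eight out of 384 experts per token. It is a generic top-k gating example, not Moonshot AI's actual router: the article does not describe how routing is implemented, so the softmax over the selected logits, the random router scores, and the function name `route_token` are all assumptions made for illustration. Only the expert counts and parameter totals come from the figures above.

```python
import math
import random

NUM_EXPERTS = 384                   # from the article
TOP_K = 8                           # from the article
TOTAL_PARAMS = 1_000_000_000_000    # one trillion, from the article
ACTIVE_PARAMS = 32_000_000_000      # 32 billion, from the article


def route_token(router_logits: list[float], top_k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the top_k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    Assumption: weights are a softmax over the selected logits only.
    """
    if top_k > len(router_logits):
        raise ValueError("top_k cannot exceed the number of experts")
    # Indices of the top_k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    max_logit = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - max_logit) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Random stand-in for the scores a learned router would produce.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    chosen = route_token(logits)
    print(f"{len(chosen)} of {NUM_EXPERTS} experts selected")
    print(f"active parameter share: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

The point of the design is that compute per token scales with the active parameters, about 3.2 percent of the total here, rather than with the full trillion-parameter model.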
Source: thedecoder