Microsoft's SkillOpt Boosts GPT-5.5 With Markdown File

Microsoft's SkillOpt improves GPT-5.5 by over 20 points on procedural tasks using a trained Markdown file, according to a new paper.

Microsoft and three Chinese universities have developed SkillOpt, a method that improves AI models such as GPT-5.5 by training their instruction documents, known as 'skills,' much as model weights are trained. The approach helps AI agents perform better on tasks that require specific procedures and tool use. Each skill document is treated as external, trainable state, and a second language model acts as the optimizer, proposing edits that improve performance.
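To make the idea of a skill as external state concrete, here is a minimal sketch. It assumes the target model is any function from prompt text to response text. The names `run_with_skill` and `toy_model`, and the prompt layout, are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Callable

# Hypothetical stand-in for a frozen target model: prompt text in, answer text out.
Model = Callable[[str], str]


def run_with_skill(model: Model, skill_markdown: str, task: str) -> str:
    """Call the unchanged model with the skill document prepended as context.

    The prompt layout below is an assumption made for illustration.
    """
    prompt = f"{skill_markdown}\n\n# Task\n{task}"
    return model(prompt)


def toy_model(prompt: str) -> str:
    """Toy model: applies an uppercase rule only if the prompt states it."""
    task = prompt.rsplit("# Task\n", 1)[-1]
    if "answer in uppercase" in prompt.lower():
        return task.upper()
    return task


skill_v1 = "# Skill\n- Repeat the task text."
skill_v2 = "# Skill\n- Repeat the task text.\n- Answer in uppercase."

# Same model, same task: only the skill text differs between the two calls.
print(run_with_skill(toy_model, skill_v1, "hello"))
print(run_with_skill(toy_model, skill_v2, "hello"))
```

The model function never changes between the two calls. Any difference in behavior comes from the text of the skill, which is the only thing being edited.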

Training accepts an edit only if it yields a measurable improvement on a held-out validation set. A learning rate caps the number of edits per step, and a scheduler shrinks that step size over time. Rejected edits are stored as negative examples for later steps, and a slow update at the end of each training round keeps the process stable. The target model stays frozen throughout, and the optimizer model runs only during training. At inference time, the target model receives the trained Markdown file as context.
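The loop described above can be sketched roughly as follows. This is a toy illustration under several assumptions, not the authors' implementation: edits are limited to appending a line, each edit is scored on its own, the schedule in `max_edits` is invented, and the slow update is left out because its form is not specified here. All function names, the candidate pool, and the scoring rule are hypothetical.

```python
from typing import Callable, List, Set, Tuple

Skill = Tuple[str, ...]  # skill document as an ordered tuple of Markdown lines


def max_edits(step: int, base_rate: int = 3) -> int:
    """Assumed schedule: the per-step edit budget shrinks by one each step, floor of 1."""
    return max(1, base_rate - step)


def train_skill(
    skill: Skill,
    propose: Callable[[Skill, Set[str], int], List[str]],
    score: Callable[[Skill], float],
    steps: int = 4,
) -> Tuple[Skill, Set[str]]:
    """Keep an edit only if the validation score strictly improves."""
    rejected: Set[str] = set()  # negative examples shown to the proposer later
    best = score(skill)
    for step in range(steps):
        budget = max_edits(step)
        # Slice again so the cap holds even if the proposer ignores the budget.
        for line in propose(skill, rejected, budget)[:budget]:
            candidate = skill + (line,)
            new_score = score(candidate)
            if new_score > best:
                skill, best = candidate, new_score
            else:
                rejected.add(line)
    return skill, rejected


# Toy stand-ins for the optimizer model and the validation run.
POOL = [
    "- Always reply in French.",
    "- Return answers as JSON.",
    "- Check cell ranges before writing.",
    "- Add a friendly greeting.",
]
HELPFUL = {"- Return answers as JSON.", "- Check cell ranges before writing."}


def toy_propose(skill: Skill, rejected: Set[str], budget: int) -> List[str]:
    """Suggest pool lines that are neither in the skill nor already rejected."""
    fresh = [line for line in POOL if line not in skill and line not in rejected]
    return fresh[:budget]


def toy_score(skill: Skill) -> float:
    """Pretend validation score: +1 per helpful rule, -0.5 per other rule."""
    rules = [line for line in skill if line.startswith("- ")]
    helpful = sum(1 for line in rules if line in HELPFUL)
    return helpful - 0.5 * (len(rules) - helpful)


final_skill, rejected = train_skill(("# Spreadsheet skill",), toy_propose, toy_score)
print("\n".join(final_skill))
```

In a real setup, `score` would presumably run the frozen target model with the candidate skill on validation tasks, and `propose` would call the optimizer model with the rejected edits included in its prompt. Here both are replaced with deterministic toys so the accept and reject logic is easy to follow.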

The authors tested SkillOpt on six benchmarks covering search, spreadsheets, document analysis, math, and embodied action. It consistently matched or outperformed existing approaches, including handwritten skills and specialized methods. With GPT-5.5 in direct chat, average performance across all benchmarks rose by about 23 points. The largest gains came on tasks with strict format requirements and tool use, such as spreadsheet editing. Smaller models also benefited, suggesting that a well-trained skill can supply procedural knowledge these models lack.

Source: The Decoder

Key points

Microsoft's SkillOpt improves GPT-5.5 by over 20 points on procedural tasks using a trained Markdown file.
SkillOpt treats skill documents as trainable state, with a second language model acting as the optimizer.
The method accepts edits only if they improve performance on a held-out validation set.
SkillOpt was tested on six benchmarks and matched or outperformed existing approaches.
The average performance of GPT-5.5 improved by about 23 points across all benchmarks.
Smaller models also benefited, suggesting that a well-trained skill can supply procedural knowledge they lack.
The method combines bounded edits, validation, negative feedback, and slow updates for stable optimization.

WRITTEN BY

Maya Chen

AI Research & Breakthroughs

Maya breaks down the latest AI research papers, benchmarks, and technical breakthroughs into plain language.