Microsoft and three Chinese universities have developed SkillOpt, a method that improves AI models such as GPT-5.5 by training their instruction documents, known as 'skills,' in much the same way model weights are trained. The skill document is treated as external, trainable state, and a second language model acts as an optimizer that proposes edits to improve performance. The approach helps AI agents perform better on tasks that require specific procedures and tool use.

SkillOpt borrows its mechanics from weight training. An edit to the skill document is accepted only if it yields a measurable improvement. A learning rate limits the number of edits per step, and a scheduler reduces that step size over time. Rejected edits are stored as negative examples for later steps, and a slow update at the end of each training round keeps the document stable. The target model stays frozen throughout, and the optimizer model runs only during training. At inference time, the target model simply receives the skill as a Markdown file in its context.
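The loop described above can be illustrated with a toy sketch. This is not the authors' implementation: the names `train_skill`, `toy_propose`, and `toy_score` are hypothetical, an "edit" is simplified to appending one line, the optimizer model is replaced by a fixed pool of candidate lines, and the "slow update" is interpreted here as committing a round's working copy only if it beats the stable copy. A real setup would score a candidate by running the frozen target model on benchmark tasks.

```python
from typing import Callable, List, Set, Tuple


def train_skill(
    skill: List[str],
    propose: Callable[[List[str], Set[str], int], List[str]],
    score: Callable[[List[str]], float],
    rounds: int = 3,
    steps_per_round: int = 2,
    initial_lr: int = 4,
) -> Tuple[List[str], Set[str]]:
    """Toy skill training: the skill is a list of Markdown lines."""
    slow = list(skill)          # stable copy, changed only at the end of a round
    rejected: Set[str] = set()  # negative examples passed back to the optimizer
    lr = initial_lr             # "learning rate": max edits tried per step

    for _ in range(rounds):
        fast = list(slow)       # working copy for this round
        for _ in range(steps_per_round):
            for edit in propose(fast, rejected, lr)[:lr]:
                candidate = fast + [edit]
                if score(candidate) > score(fast):
                    fast = candidate    # accept only measurable improvements
                else:
                    rejected.add(edit)  # remember edits that did not help
        if score(fast) > score(slow):
            slow = fast                 # assumed form of the slow update
        lr = max(1, lr // 2)            # scheduler: shrink the step size
    return slow, rejected


# Stand-ins for the optimizer model and the benchmark (assumptions).
USEFUL = {"Return answers as CSV.", "Call the search tool before answering."}
POOL = [
    "Return answers as CSV.",
    "Be creative.",
    "Call the search tool before answering.",
    "Use emojis.",
]


def toy_score(doc: List[str]) -> float:
    # Reward useful rules, lightly penalize every other line.
    useful = sum(1 for line in doc if line in USEFUL)
    return useful - 0.1 * (len(doc) - useful)


def toy_propose(doc: List[str], rejected: Set[str], k: int) -> List[str]:
    # Skip lines already present or previously rejected.
    return [e for e in POOL if e not in doc and e not in rejected][:k]


skill, rejected = train_skill(["# Skill"], toy_propose, toy_score)
skill_markdown = "\n".join(skill)  # what the frozen model would receive as context
print(skill_markdown)
```

In this toy run the two useful rules end up in the document while the other two candidates land in the rejected set, so they are never proposed again. The joined `skill_markdown` string stands in for the Markdown file handed to the target model at inference time.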

The authors tested SkillOpt on six benchmarks covering search, spreadsheets, document analysis, math, and embodied action. The method consistently matched or outperformed existing approaches, including handwritten skills and specialized methods. With GPT-5.5 in direct chat, average performance across all benchmarks improved by about 23 points. The biggest gains came on tasks with strict format requirements and tool use, such as spreadsheet editing. Smaller models also benefited, suggesting that a well-trained skill can supply procedural knowledge these models lack.

Source: thedecoder