Amazon SageMaker AI Enhances Tool Calling with SFT and DPO

Amazon SageMaker AI can be used to improve tool-calling accuracy in small language models with SFT and DPO. The worked example trains on the When2Call dataset, whose SFT split contains 15,000 samples.

Amazon SageMaker AI offers a way to improve tool-calling accuracy in small language models (SLMs) by combining Supervised Fine-Tuning (SFT) with Direct Preference Optimization (DPO). Together, these techniques help AI agents select the correct tool for a task, which reduces errors and makes automation more reliable. SFT supplies explicit examples of the desired behavior, and DPO then refines that behavior with preference feedback. The worked example uses the When2Call dataset, which includes scenarios for evaluating tool-calling decisions. Because SageMaker AI manages the underlying training resources, developers can focus on training code rather than infrastructure. Organizations can use the combined approach to build AI systems that interact more dependably with external applications in both consumer and enterprise environments. Source: awsml

Supervised Fine-Tuning (SFT) trains the model on a curated dataset that matches its intended function, with explicit examples of how the model should interact with specific tools. This helps the model learn tool-specific language, commands, and constraints. Direct Preference Optimization (DPO) then refines these interactions using human feedback or predefined objectives. DPO training data consists of 'like this, not like that' preference pairs, and training on them pursues the same goal as reinforcement learning without requiring a separate reward function or reward model. This reduces training time and resource requirements while maintaining model quality. Source: awsml
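To make the "no reward model" point concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not taken from the AWS example; the function names, argument names, and the default `beta` value are assumptions chosen for illustration. The inputs are the summed log-probabilities that the model being trained (the policy) and a frozen reference copy assign to the preferred and dispreferred responses.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (illustrative sketch; names are hypothetical).

    Each argument is the summed log-probability of a full response.
    beta controls how strongly the policy is pushed away from the reference.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)


# The loss falls below log(2) once the policy prefers the chosen response
# more strongly than the reference does.
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0)
```

The preference pair itself supplies the training signal: the loss only compares log-probabilities from the two models, so no separate reward model has to be trained or queried.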

The example uses the When2Call dataset published by NVIDIA, which evaluates tool-calling decisions for foundation models. It includes scenarios for generating tool calls, asking follow-up questions, and indicating when a question cannot be answered with the available tools. The dataset has three parts: an SFT set with 15,000 samples, a preference set for DPO with 9,000 samples, and a test set with two evaluation files, one for Multi-Choice Question evaluation and one for LLM-as-a-judge evaluation. The training code and synthetic data generation scripts are available in NVIDIA’s GitHub repository. Source: awsml
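The difference between an SFT sample and a DPO sample can be sketched roughly as follows. The field names, the prompt layout, and the `get_weather` tool are all hypothetical and are not the actual When2Call schema; the sketch only illustrates that an SFT record pairs a prompt with one desired response, while a preference record pairs a prompt with a preferred and a dispreferred response.

```python
import json
from dataclasses import dataclass

# Hypothetical tool description; not part of When2Call.
WEATHER_TOOL = {"name": "get_weather", "parameters": {"city": "string"}}


@dataclass
class PreferencePair:
    """One 'like this, not like that' example (hypothetical field names)."""
    prompt: str
    chosen: str
    rejected: str


def format_prompt(question, tools):
    """Render the available tools and the user question as one prompt string."""
    return f"Available tools: {json.dumps(tools)}\nUser: {question}"


# SFT record: the request has everything the tool needs, so the target
# response is a tool call.
sft_record = {
    "prompt": format_prompt("What's the weather in Lisbon?", [WEATHER_TOOL]),
    "response": json.dumps(
        {"tool": "get_weather", "arguments": {"city": "Lisbon"}}
    ),
}

# Preference record: the city is missing, so a follow-up question is
# preferred over a tool call with a guessed argument.
follow_up_pair = PreferencePair(
    prompt=format_prompt("What's the weather like?", [WEATHER_TOOL]),
    chosen="Which city would you like the weather for?",
    rejected=json.dumps(
        {"tool": "get_weather", "arguments": {"city": "Paris"}}
    ),
)

# Preference record: no suitable tool exists, so saying so is preferred
# over calling an unrelated tool.
no_tool_pair = PreferencePair(
    prompt=format_prompt("Book me a flight to Rome.", [WEATHER_TOOL]),
    chosen="I can't book flights with the tools available to me.",
    rejected=json.dumps(
        {"tool": "get_weather", "arguments": {"city": "Rome"}}
    ),
)
```

The three records mirror the three scenario types described above: make a tool call, ask a follow-up question, or state that the request cannot be handled with the available tools.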

Key points

Amazon SageMaker AI improves tool-calling accuracy for small language models using SFT and DPO.
The When2Call dataset includes 15,000 samples for supervised fine-tuning.
The dataset for preference alignment contains 9,000 samples for DPO.
The test dataset includes two files for evaluation: Multi-Choice Question and LLM-as-a-judge.
SageMaker AI handles training infrastructure, allowing developers to focus on training code.
Direct Preference Optimization uses 'like this, not like that' preferences to guide model training.
The solution enables AI agents to autonomously interact with external applications for complex tasks.

Source: AWS Machine Learning

WRITTEN BY

Theo Almeida

AI Software & Developer Tools

Theo covers AI software, developer tools, frameworks, and the platforms builders use every day.
