Amazon SageMaker AI can be used to improve the tool-calling accuracy of small language models (SLMs) by combining Supervised Fine-Tuning (SFT) with Direct Preference Optimization (DPO). Together, these techniques help AI agents select the correct tool for a task, which reduces errors and makes automation more reliable. SFT teaches the model from curated examples, and DPO then refines its behavior using preference feedback. The example uses the When2Call dataset, which includes scenarios for evaluating tool-calling decisions. Because SageMaker AI manages the underlying resources, developers can focus on training code rather than on infrastructure. By integrating SFT and DPO, organizations can build more effective AI systems that interact with external applications, expanding AI's utility in both consumer and enterprise environments. Source: awsml
Supervised Fine-Tuning (SFT) involves curating a dataset that aligns with the model’s intended function and provides explicit examples of how the model should interact with specific tools. This helps the model recognize tool-specific language, commands, and constraints. Direct Preference Optimization (DPO) further refines these interactions by incorporating human feedback or predefined objectives into the training loop. DPO training data consists of 'like this, not like that' preferences: for each prompt, a preferred response is paired with a rejected one. Training on these pairs optimizes the same goals as reinforcement learning without requiring a separate reward function or reward model, which reduces training time and resource requirements while maintaining model quality. Source: awsml
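To make the 'like this, not like that' idea concrete, here is a minimal sketch of the standard DPO objective for a single preference pair, written in plain Python. It is an illustration, not the training code used in the example: the field names in `pair`, the log-probability values, and the `beta` default are all assumptions chosen for demonstration.

```python
import math

# Hypothetical preference pair; the field names are illustrative only
# and are not taken from the When2Call schema.
pair = {
    "prompt": "What is the weather in Paris tomorrow?",
    "chosen": "call get_weather(city='Paris', day='tomorrow')",  # like this
    "rejected": "It will be sunny and 25 degrees.",              # not like that
}

def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one pair, given summed token log-probabilities.

    The policy is the model being trained; the reference is a frozen copy
    (typically the SFT checkpoint). No reward model is involved: the
    log-probability ratios act as an implicit reward.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))

# Made-up log-probabilities: the policy equals the reference in the first
# case, and has shifted toward the chosen response in the second.
loss_at_start = dpo_loss(-5.0, -7.0, -5.0, -7.0)
loss_after_shift = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

When the policy matches the reference, the loss equals log 2. It falls as the policy assigns relatively more probability to the chosen response than the reference does, which is how a preference pair steers the model without an explicit reward function.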
The example uses the When2Call dataset published by NVIDIA, a benchmark for evaluating the tool-calling decisions of foundation models. It includes scenarios in which the model should generate a tool call, ask a follow-up question, or indicate that the question cannot be answered with the available tools. The dataset has three parts: an SFT set with 15,000 samples, a DPO set with 9,000 samples, and a test set consisting of two files for evaluation. The training code and synthetic data generation scripts are available in NVIDIA’s GitHub repository. Source: awsml
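As a rough illustration of how decisions of this kind might be scored, the sketch below computes accuracy separately for each of the three behaviors named above. The `Decision` labels and the `per_decision_accuracy` helper are hypothetical; they do not reflect the dataset's actual schema or NVIDIA's evaluation scripts.

```python
from collections import Counter
from enum import Enum

class Decision(Enum):
    # Hypothetical labels for the three behaviors described in the text.
    TOOL_CALL = "tool_call"   # generate a tool call
    FOLLOW_UP = "follow_up"   # ask a follow-up question
    UNABLE = "unable"         # say the available tools cannot answer

def per_decision_accuracy(expected: list[Decision],
                          predicted: list[Decision]) -> dict[Decision, float]:
    """Return the fraction of correct predictions for each expected decision."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    totals = Counter(expected)
    correct = Counter(e for e, p in zip(expected, predicted) if e == p)
    return {decision: correct[decision] / totals[decision] for decision in totals}

# Toy example with made-up labels (not real evaluation results).
expected = [Decision.TOOL_CALL, Decision.TOOL_CALL,
            Decision.FOLLOW_UP, Decision.UNABLE]
predicted = [Decision.TOOL_CALL, Decision.UNABLE,
             Decision.FOLLOW_UP, Decision.UNABLE]
scores = per_decision_accuracy(expected, predicted)
```

Reporting accuracy per decision type, rather than a single overall number, shows whether a model over-calls tools or refuses too often, which is the kind of failure the fine-tuning is meant to reduce.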