We’ve published a dataset of 13,454 tool definitions, each with a JSON Schema for both its input parameters and its output structure, intended for tool-learning work.
The dataset is derived from Team-ACE/ToolACE, the output of an automatic agentic pipeline that generates accurate, complex, and diverse tool-learning data.
Starting from the ToolACE dataset, we extracted tool definitions from the conversation messages, filtered and normalized the extractions, then validated each tool with an LLM and generated an output schema for it.
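The extraction, filtering, and normalization steps could look roughly like the sketch below. It is not the actual pipeline code. It assumes that a message embeds its tool definitions as a JSON array of objects with `name`, `description`, and `parameters` keys, and every function name here (`extract_tools`, `normalize_tool`, `collect_tools`) is hypothetical. The LLM validation and output-schema generation steps are left out.

```python
import json
import re
from typing import Any, Optional

Tool = dict[str, Any]


def extract_tools(message: str) -> list[Tool]:
    """Return the first JSON array of objects embedded in a message string.

    Assumption: tool definitions appear as a JSON array inside free text.
    """
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\[", message):
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever text follows it.
            value, _ = decoder.raw_decode(message, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(value, list) and any(isinstance(v, dict) for v in value):
            return [v for v in value if isinstance(v, dict)]
    return []


def normalize_tool(raw: Tool) -> Optional[Tool]:
    """Filter out unusable entries and coerce the rest into one shape."""
    # Collapse runs of characters outside [0-9A-Za-z_] into one underscore.
    name = re.sub(r"[^0-9A-Za-z_]+", "_", str(raw.get("name", ""))).strip("_")
    params = raw.get("parameters")
    if not name or not isinstance(params, dict):
        return None  # filtered: no usable name or no input schema
    schema = dict(params)
    schema.setdefault("type", "object")
    schema.setdefault("properties", {})
    return {
        "name": name,
        "description": str(raw.get("description", "")).strip(),
        "parameters": schema,
    }


def collect_tools(messages: list[str]) -> list[Tool]:
    """Extract, normalize, and de-duplicate tools by normalized name."""
    seen: dict[str, Tool] = {}
    for message in messages:
        for raw in extract_tools(message):
            tool = normalize_tool(raw)
            if tool is not None and tool["name"] not in seen:
                seen[tool["name"]] = tool
    return list(seen.values())
```

De-duplicating on the normalized name keeps the first definition seen for each tool. A real pipeline might instead compare full schemas, since two conversations can define different tools under the same name.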