Seeking Help with SmolAgents Integration for Trademark Predictor App 🚀

jpbrown96 · January 5, 2025, 6:54pm

Hi Hugging Face community!

I’m a developer who is still learning, working on a project that uses Hugging Face’s new SmolAgents library, and I’m running into problems getting the pieces to work together. I’d greatly appreciate your advice and expertise!

Here is the repo on GitHub.

Project Overview

The app processes semi-structured trademark dispute case outcomes to generate predictive insights. Here’s the current tech stack:

Text Extraction: Google Document AI for parsing PDF case files.
Embedding Creation: Legal-BERT for generating vector embeddings from the case data.
Search & Storage: Pinecone for vector storage and efficient semantic search.
Task Orchestration: SmolAgents, which I’m aiming to use to streamline workflows.

Goal

I’m trying to use SmolAgents to:

Automate the orchestration of text extraction, embedding creation, and storage.
Set up custom tools for agent interactions with Pinecone and Document AI.
Fine-tune agent behaviours to ensure robust and reliable task execution.
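
To make the goal concrete, here is a minimal sketch of the flow I have in mind. It is not the code from my repo: every function name is hypothetical, the Document AI, Legal-BERT, and Pinecone calls are replaced by stand-ins (a placeholder string, a hash-based toy vector, and an in-memory dict), and the 8-dimension vector size is only for illustration. Each step is a plain function with type hints and a docstring that has an Args section, which is the shape I understand the SmolAgents tool decorator expects.

```python
import hashlib
import math

# Hypothetical in-memory stand-in for a Pinecone index: id -> (vector, metadata).
FAKE_INDEX: dict[str, tuple[list[float], dict]] = {}

EMBEDDING_DIM = 8  # toy size for illustration only, not the real model's size


def extract_text(pdf_path: str) -> str:
    """Return the text of a case file.

    Args:
        pdf_path: Path to the PDF case file.
    """
    # Stand-in: the real step would call Google Document AI.
    return f"placeholder text extracted from {pdf_path}"


def embed_text(text: str) -> list[float]:
    """Turn text into a fixed-length unit vector.

    Args:
        text: The extracted case text.
    """
    # Stand-in: the real step would run Legal-BERT. Here the text is hashed
    # so that the same input always yields the same toy vector.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [b / 255.0 for b in digest[:EMBEDDING_DIM]]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]


def store_embedding(case_id: str, vector: list[float], source: str) -> str:
    """Store a vector with its metadata and return a short status message.

    Args:
        case_id: Identifier to store the vector under.
        vector: The embedding to store.
        source: Where the text came from, kept as metadata.
    """
    # Stand-in: the real step would upsert into Pinecone.
    FAKE_INDEX[case_id] = (vector, {"source": source})
    return f"stored {case_id} ({len(vector)} dims)"


def process_case(case_id: str, pdf_path: str) -> str:
    """Chain the three steps by hand, in the order the agent should follow."""
    text = extract_text(pdf_path)
    vector = embed_text(text)
    return store_embedding(case_id, vector, pdf_path)


if __name__ == "__main__":
    print(process_case("case-001", "cases/case-001.pdf"))
```

What I would like is for the agent to pick these steps and pass the output of one into the next, instead of me hard-coding the order in something like process_case.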

Challenges

Tool Definition: Struggling to define and implement custom tools for SmolAgents to interact with the existing pipeline components.
Agent Debugging: Facing errors when agents attempt to connect or pass data between Pinecone, Document AI, and Legal-BERT.
Prompt Engineering: Finding it difficult to design effective prompts for SmolAgents to handle multi-step workflows.
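
For the debugging challenge, the kind of check I have been thinking about adding looks roughly like the sketch below. It is plain Python with no SmolAgents calls: a decorator that logs the type and length of whatever crosses each step boundary, plus a dimension check before storage. The names traced, describe, check_dimension, and store are hypothetical, and the default of 768 dimensions is my assumption for a BERT-base sized model, so it would need to match the actual index.

```python
import functools
import logging

logger = logging.getLogger("pipeline_trace")


def describe(value: object) -> str:
    """Summarize a value as its type plus length, without dumping its contents."""
    size = f", len={len(value)}" if hasattr(value, "__len__") else ""
    return f"{type(value).__name__}{size}"


def traced(func):
    """Log what goes into and comes out of a step; re-raise failures with context."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        inputs = [describe(a) for a in args]
        inputs += [f"{key}={describe(val)}" for key, val in kwargs.items()]
        logger.info("%s <- %s", func.__name__, inputs)
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            # Keep the original error as the cause so the traceback shows both.
            raise RuntimeError(f"{func.__name__} failed with inputs {inputs}") from exc
        logger.info("%s -> %s", func.__name__, describe(result))
        return result

    return wrapper


def check_dimension(vector: list[float], expected_dim: int) -> list[float]:
    """Fail early if the embedding size does not match what the index expects."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dims, got {len(vector)}")
    return vector


@traced
def store(vector: list[float], expected_dim: int = 768) -> str:
    """Hypothetical storage step; the real one would write to Pinecone."""
    check_dimension(vector, expected_dim)
    return "ok"


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(store([0.0] * 768))
```

With this in place, a vector of the wrong size fails at the boundary with the step name and input summary in the message, rather than somewhere inside the agent run.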

What I’ve Tried

Followed the SmolAgents documentation to set up basic agents.
Tested individual pipeline components (e.g., Pinecone queries, Legal-BERT embeddings), which work fine in isolation.
Experimented with a few basic prompts, but the agents don’t seem to chain tasks as expected.

What I Need

Guidance on implementing custom tools within SmolAgents to interact with third-party services (e.g., Pinecone, Google APIs).
Best practices for debugging multi-step workflows with SmolAgents.
Examples or insights on crafting effective prompts for SmolAgents in complex use cases.

Environment Details

SmolAgents Version: Latest (as of January 2025)
Hugging Face Transformers Version: 4.x
Python Version: 3.x
Running on a local machine (Linux-based) and testing some components in a cloud environment.

I’m happy to share more specifics about my setup or the code if it helps! If anyone has experience working with SmolAgents or similar workflows, your input would be incredibly valuable. Thanks in advance!
