Error Using Pydantic with LangChain and a Local Hugging Face Model for Structured Output

Hello everyone,

I’m integrating Pydantic with LangChain and Hugging Face Transformers to generate structured question-answer outputs from a local language model, specifically Meta-Llama-3-8B. My goal is to have the model produce output that conforms to a Pydantic model, but I encounter an error during the output parsing phase. Below is a simplified script along with the error message I receive. I would greatly appreciate any insights or suggestions on how to resolve this issue.

Simplified Script:

from pydantic import BaseModel, Field, validator
from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_huggingface.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
import torch

class QuestionAnswerPair(BaseModel):
    question: str = Field(..., description="The question text.")
    answer: int = Field(..., description="The integer answer, produced as output.")

    @validator('answer')
    def answer_must_be_int(cls, value):
        if not isinstance(value, int):
            raise ValueError("The 'answer' must be an integer.")
        return value

class QuestionAnswer(BaseModel):
    pairs: List[QuestionAnswerPair] = Field(..., description="A list of question-answer pairs, structured as expected by the LLM output.")

def load_llama3_hf_pipeline():
    # 4-bit NF4 quantization with bfloat16 compute
    config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B", quantization_config=config, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
    transformers_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=500)
    return HuggingFacePipeline(pipeline=transformers_pipeline)

def run_model(text):
    hf_pipeline = load_llama3_hf_pipeline()
    output_parser = PydanticOutputParser(pydantic_object=QuestionAnswer)
    prompt_template = "Analyze and generate a structured response as a list of question-answer pairs based on the provided text."
    prompt = ChatPromptTemplate.from_template(prompt_template)
    chain = prompt | hf_pipeline | output_parser

    try:
        # The chain already ends with output_parser, so invoke() returns a parsed QuestionAnswer.
        output = chain.invoke({'text': text})
        print(output)
    except Exception as e:
        print(f'Error parsing output: {e}')

# Example usage
run_model("Provide a comprehensive analysis of the situation.")

Error received:

langchain_core.exceptions.OutputParserException: Invalid json output: [Human: ...rest of the error message]

The error indicates that the parser is receiving the full prompt (note the leading "Human:") rather than just the newly generated tokens. That could mean the model generated nothing new, or simply that the pipeline is returning the prompt together with the generation. Either way, it points to an issue in my code, most likely in the generation settings or in how the pipeline and parser are wired together.
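The first thing I plan to check is the pipeline configuration: as far as I understand, the Transformers text-generation pipeline returns the prompt concatenated with the generated continuation by default, so the parser downstream sees the prompt even when new tokens were produced. Inside load_llama3_hf_pipeline(), the pipeline call would become something like this (a sketch, not yet tested):

    # return_full_text=False makes the pipeline emit only the newly generated text,
    # so the prompt no longer reaches the output parser.
    transformers_pipeline = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=500,
        return_full_text=False,
    )
    return HuggingFacePipeline(pipeline=transformers_pipeline)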

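The other suspect is the prompt itself: the template has no {text} placeholder and never includes the parser's format instructions, so the model is never told to emit JSON matching QuestionAnswer. Here is a sketch of how I intend to wire the prompt, chain, and parser together (the prompt wording is just illustrative):

output_parser = PydanticOutputParser(pydantic_object=QuestionAnswer)
hf_pipeline = load_llama3_hf_pipeline()

# Inject the parser's JSON schema instructions and leave {text} for the input.
prompt = ChatPromptTemplate.from_template(
    "Analyze the following text and respond as a list of question-answer pairs.\n"
    "{format_instructions}\n"
    "Text: {text}"
).partial(format_instructions=output_parser.get_format_instructions())

chain = prompt | hf_pipeline | output_parser

# The parser is the last step of the chain, so invoke() should return a QuestionAnswer instance.
result = chain.invoke({"text": "Provide a comprehensive analysis of the situation."})
print(result)

I am also not sure whether the base Meta-Llama-3-8B checkpoint will follow these instructions reliably; an instruction-tuned variant such as Meta-Llama-3-8B-Instruct may behave better here.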

Hi,
Did the error get resolved? I am trying to use a Hugging Face model with a chain and a Pydantic parser, and I receive the same OutputParserException when running the chain. Could you please share the solution if you found one?
