Hey there! I'm a beginner using Argilla for labeling, with Argilla deployed in a HF Space for my startup. I'm using the Python API, and am `log()`ing dicts rather than creating `rg.Record` objects.
I'm a bit confused about how to use `Suggestion`s so that I can have both AI-generated responses to my `Question`s as well as human-labeled responses.
My schema looks like this:
```python
settings = rg.Settings(
    guidelines="""Given the Attribute we're trying to evaluate, and with the given examples \
and the Behavior Interview Question in mind, score the candidate's response on a scale of 0 to 10 and \
identify quotes from their response which provide evidence to back up your score.""",
    fields=[
        rg.TextField(
            name="candidate",
            title="Candidate",
            description="The candidate we are trying to judge.",
            use_markdown=False,
        ),
        rg.TextField(
            name="interview_id",
            title="Interview ID",
            description="The internal ID for the interview.",
            use_markdown=False,
        ),
        rg.TextField(
            name="attribute",
            title="Attribute",
            description="The attribute we are trying to judge in the candidate's response to the Behavioral Interview Question.",
            use_markdown=True,
        ),
        rg.TextField(
            name="attribute_definition",
            title="Definition",
            description="Definition of this Attribute.",
            use_markdown=True,
        ),
        rg.TextField(
            name="examples",
            title="Examples",
            description="Example rating and quotes for this Attribute.",
            use_markdown=True,
        ),
        rg.TextField(
            name="biq",
            title="BIQ",
            description="The Behavioral Interview Question we have asked in order to judge the candidate's fit with the given Attribute.",
            use_markdown=True,
        ),
        rg.TextField(
            name="response",
            title="Candidate Response",
            description="The Candidate's response to the Behavioral Interview Question.",
            use_markdown=True,
        ),
    ],
    questions=[
        rg.RatingQuestion(
            name="rating",
            title="numeric rating",
            description="What is the candidate's score for this attribute, from 0-10, using the examples as guidance?",
            values=list(range(11)),  # 0..10
        ),
        rg.SpanQuestion(
            name="quotes",
            title="Quotes",
            description="Candidate quotes that provide evidence for the score.",
            field="response",
            allow_overlapping=True,
            labels=["evidence"],
        ),
        rg.TextQuestion(
            name="reasoning",
            title="Reasoning",
            description="LLM's explanation of its rating.",
        ),
    ],
)
```
I'm using a Pydantic model for a bit of runtime type checking, but then convert the data into a simple `dict` before logging it:
```python
record = ArgillaValuesModel(
    candidate=interview_info["candidateName"],
    interview_id=interview_config.interview_id(),
    attribute=attr_name,
    attribute_definition=attr.description,
    examples="<br>".join(
        f"Text: {example.text}\nScore: {example.score}" for example in attr.examples
    ),
    biq=q_text,
    response=response,
    quotes=quote_ranges,
    rating=score,
    reasoning=reasoning,
)
pprint.pp(record.dict())
argilla_records.append(record.dict())
```
I'm then calling `log()` like this:

```python
dataset.records.log(argilla_records)
```
OK, so… in this case, the answers I'm inserting for the `Question` fields come from the LLM, so they should be `Suggestion`s, right? What's the simplest way to tweak my code to do that?
The example at Add, update, and delete records - Argilla Docs doesn't quite match my situation; it's geared toward adding suggestions for a single label. In my case, I have three questions: `rating`, `quotes`, and `reasoning`.
I want to keep both the AI-generated answers and the human-generated labels in my Argilla dataset. What's the easiest way to do this?
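For context, here's a rough sketch (plain Python, no Argilla calls) of how I imagine my flat dict would need to be split: field values on one side, and per-question suggestion payloads on the other. The `suggestions` shape, the `agent` key, and the `to_record_payload` helper are just my guesses from skimming the docs, so please correct me if this is the wrong direction:

```python
# Hypothetical helper: split my flat record dict into Argilla-style
# fields vs. suggestion entries. The question names match my schema;
# the payload structure is my guess, not verified against the API.
QUESTION_NAMES = {"rating", "quotes", "reasoning"}

def to_record_payload(flat: dict, question_names: set[str], agent: str) -> dict:
    """Separate field values from question answers (to become suggestions)."""
    return {
        "fields": {k: v for k, v in flat.items() if k not in question_names},
        "suggestions": [
            {"question_name": k, "value": flat[k], "agent": agent}
            for k in question_names
            if k in flat
        ],
    }

payload = to_record_payload(
    {
        "candidate": "Jane Doe",          # made-up example data
        "response": "I led the team…",
        "rating": 8,
        "quotes": [],
        "reasoning": "Clear ownership.",
    },
    question_names=QUESTION_NAMES,
    agent="gpt-4o",                        # placeholder model name
)
```

Is that roughly the split Argilla wants, so that human annotators can then add their own responses on top of the LLM's suggestions?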