Tokenizer from a GGUF file in Python?

I am running Llama 7B locally with LM Studio. For some generations I'd like to set logit biases to prefer certain tokens over others, but doing that requires access to the bare tokenizer, and the provided API only seems to support invoking completions. Is there a way to access the tokenizer from a GGUF file using any of the Hugging Face Python libraries, or do I need to use llama.cpp for this?
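
For context, this is roughly what I was hoping would work with `transformers`, which I believe has accepted a `gguf_file` argument since v4.41 (it needs the `gguf` package installed, and I'm not sure it covers every GGUF tokenizer variant). The repo and file names below are just placeholders:

```python
from transformers import AutoTokenizer

# Placeholder repo/file names -- substitute your own GGUF source.
repo_id = "TheBloke/Llama-2-7B-GGUF"
gguf_file = "llama-2-7b.Q4_K_M.gguf"

# Load the tokenizer directly from the GGUF file (transformers >= 4.41).
tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)

# With the tokenizer in hand I could build the logit-bias map I'm after:
token_ids = tokenizer.encode("hello", add_special_tokens=False)
logit_bias = {tid: 5.0 for tid in token_ids}  # favor these tokens
print(logit_bias)
```

If that doesn't pan out, a sketch of what I assume the llama-cpp-python fallback would look like, on the assumption that `vocab_only=True` loads just the vocab/tokenizer and not the weights:

```python
from llama_cpp import Llama

# Load only the vocabulary from the GGUF file (no model weights).
llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", vocab_only=True)

tokens = llm.tokenize(b"hello")  # tokenize takes bytes, returns token ids
print(tokens, llm.detokenize(tokens))
```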
