Tokenizer from a GGUF file in Python?

I am running Llama 7B locally with LM Studio. For some generations I'd like to set logit biases to prefer certain tokens over others, but doing that requires access to the bare tokenizer, and the provided API only seems to support invoking completions. Is there a way to access the tokenizer from a GGUF file using any of the Hugging Face Python libraries, or do I need to use llama.cpp for this?
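
For context, this is roughly what I was hoping would work with `transformers`, which I believe has accepted a `gguf_file` argument since v4.41 (it needs the `gguf` package installed, and I'm not sure it covers every GGUF tokenizer variant). The repo and file names below are just placeholders:

```python
from transformers import AutoTokenizer

# Placeholder repo/file names -- substitute your own GGUF source.
repo_id = "TheBloke/Llama-2-7B-GGUF"
gguf_file = "llama-2-7b.Q4_K_M.gguf"

# Load the tokenizer directly from the GGUF file (transformers >= 4.41).
tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)

# With the tokenizer in hand I could build the logit-bias map I'm after:
token_ids = tokenizer.encode("hello", add_special_tokens=False)
logit_bias = {tid: 5.0 for tid in token_ids}  # favor these tokens
print(logit_bias)
```

If that doesn't pan out, a sketch of what I assume the llama-cpp-python fallback would look like, on the assumption that `vocab_only=True` loads just the vocab/tokenizer and not the weights:

```python
from llama_cpp import Llama

# Load only the vocabulary from the GGUF file (no model weights).
llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", vocab_only=True)

tokens = llm.tokenize(b"hello")  # tokenize takes bytes, returns token ids
print(tokens, llm.detokenize(tokens))
```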
