Running and testing BharatGPT-3B-Indic

I tried out BharatGPT-3B-Indic in the following ways:

  1. On my CPU-only laptop, with the model loaded from the Hugging Face Hub
  2. On my laptop, with the model saved in my local file system (see the snippet below)
  3. From Google Colab, with the model pulled from Hugging Face; this worked, albeit slowly
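For option 2, here is a minimal sketch of caching the model locally and reloading it from disk; the local directory path is a hypothetical placeholder:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CoRover/BharatGPT-3B-Indic"
local_dir = "./bharatgpt-3b-indic"  # hypothetical local path

# First run: download from the Hub and save to disk
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer.save_pretrained(local_dir)
model.save_pretrained(local_dir)

# Later runs: load entirely from the local directory, no network needed
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForCausalLM.from_pretrained(local_dir)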

I used a test script with a Gradio UI; I had to turn bitsandbytes (8-bit loading) off:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import gradio as gr
import huggingface_hub

huggingface_hub.login("huggingface access token")  # replace with your HF access token
print(huggingface_hub.whoami())

model_id = "CoRover/BharatGPT-3B-Indic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=False,  # bitsandbytes 8-bit loading turned off (CPU-only run)
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

def generate_response(message):
    messages = [
        {"role": "system", "content": "You are a helpful assistant who responds in Hindi or English."},
        {"role": "user", "content": message},
    ]
    output = pipe(messages, max_new_tokens=256)
    return output[0]["generated_text"]

gr.Interface(
    fn=generate_response,
    inputs="text",
    outputs="text",
    title="Chat with BharatGPT-3B-Indic",
    description="Runs locally (bitsandbytes 8-bit disabled)",
).launch()
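Note that when the pipeline receives a list of chat messages, output[0]["generated_text"] is the whole conversation (system, user, and assistant turns), which is why the responses below print as full message lists. If only the model's reply is wanted, I believe the last message can be extracted like this:

def generate_response(message):
    messages = [
        {"role": "system", "content": "You are a helpful assistant who responds in Hindi or English."},
        {"role": "user", "content": message},
    ]
    output = pipe(messages, max_new_tokens=256)
    # The final entry in the returned conversation is the assistant's turn
    return output[0]["generated_text"][-1]["content"]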

I got the following chat responses.

Input: please translate following quoted sentence in gujarati "I am english speaking"

[{'role': 'system', 'content': 'You are a helpful assistant who responds in Hindi or English.'}, {'role': 'user', 'content': 'please translate following quoted sentence in gujarati "I am english speaking"\n'}, {'role': 'assistant', 'content': 'મારું અંગ્રેજી બોલવાનું છે.'}]

Input: please translate following quoted sentence in gujarati "I am a gujarati"

[{'role': 'system', 'content': 'You are a helpful assistant who responds in Hindi or English.'}, {'role': 'user', 'content': 'please translate following quoted sentence in gujarati "I am a gujarati"\n'}, {'role': 'assistant', 'content': 'મારું નામ ગુજરાતી છે.'}]

Analysis

Input: "I am a Gujarati"
Response: મારું નામ ગુજરાતી છે.

  • The response translates back as "My name is Gujarati.", which is incorrect.
  • The correct translation is હું ગુજરાતી છું. ("I am Gujarati.")

Why this incorrect translation may be happening:

  1. The model may not have been instructed explicitly enough to translate into Gujarati: the system message limits it to responding in Hindi or English, which can bias the output.
  2. BharatGPT, while trained on multiple Indian languages, may not have strong enough grounding in Gujarati, or its chat template may not fully support instruction-following for translation.
  3. The "quoted sentence" format I used in the input may be adding confusion.

I am going to add Gujarati to the system prompt and test again. If anyone has tried this or has test-case code, I would appreciate your input.

Other aspects

I was also thinking of converting the model to GGUF format and running it with llama.cpp, to see if I can run it locally without a GPU. A quantized GGUF should work well even on low RAM (4–6 GB), without the Python/transformers overhead.
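As a rough sketch of that plan: conversion would go through the convert_hf_to_gguf.py script in the llama.cpp repo, and the quantized file could then be loaded via the llama-cpp-python bindings. The file names below are hypothetical and I have not yet run this against this model:

# Conversion (run from a llama.cpp checkout; paths are hypothetical):
#   python convert_hf_to_gguf.py ./bharatgpt-3b-indic --outfile bharatgpt-3b.gguf
#   ./llama-quantize bharatgpt-3b.gguf bharatgpt-3b-Q4_K_M.gguf Q4_K_M

from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="bharatgpt-3b-Q4_K_M.gguf",  # hypothetical quantized file
    n_ctx=2048,
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant who responds in Gujarati or English."},
        {"role": "user", "content": 'please translate following quoted sentence in gujarati "I am a gujarati"'},
    ],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])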

Thank you

Rashmikant Dave


To make sure it was not the system message limiting the output, I changed it to include Gujarati. Here is the code:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import gradio as gr
import huggingface_hub

huggingface_hub.login("")  # fill in your HF access token
print(huggingface_hub.whoami())

model_id = "CoRover/BharatGPT-3B-Indic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=False,  # bitsandbytes 8-bit loading turned off
)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

def generate_response(message):
    messages = [
        {"role": "system", "content": "You are a helpful assistant who responds in Gujarati or English."},
        {"role": "user", "content": message},
    ]
    output = pipe(messages, max_new_tokens=256)
    return output[0]["generated_text"]

gr.Interface(
    fn=generate_response,
    inputs="text",
    outputs="text",
    title="Chat with BharatGPT-3B-Indic",
    description="Runs locally (bitsandbytes 8-bit disabled)",
).launch()

I am getting the same result, which is:

[{'role': 'system', 'content': 'You are a helpful assistant who responds in Gujarati or English.'}, {'role': 'user', 'content': 'please translate following quoted sentence in gujarati "I am a gujarati"'}, {'role': 'assistant', 'content': 'મારું નામ ગુજરાતી છે.'}]

In Gujarati it should be હું ગુજરાતી છું., so the problem has nothing to do with the system message I used earlier.
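One more check that might help: printing the exact prompt string the chat template produces, to confirm the instruction actually reaches the model as expected. This reuses the tokenizer from the script above:

messages = [
    {"role": "system", "content": "You are a helpful assistant who responds in Gujarati or English."},
    {"role": "user", "content": 'please translate following quoted sentence in gujarati "I am a gujarati"'},
]

# Render the chat template without tokenizing, so the raw prompt is visible
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)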


Ollama is easy to install and uses llama.cpp as its backend, so I think it's convenient for testing.
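For anyone going that route, a minimal sketch using the ollama Python client; the model name is hypothetical and assumes the GGUF has already been registered with ollama create:

import ollama  # pip install ollama; assumes the Ollama server is running

# "bharatgpt-3b" is a hypothetical name, created beforehand with:
#   ollama create bharatgpt-3b -f Modelfile   (Modelfile: FROM ./bharatgpt-3b-Q4_K_M.gguf)
response = ollama.chat(
    model="bharatgpt-3b",
    messages=[
        {"role": "user", "content": 'please translate following quoted sentence in gujarati "I am a gujarati"'},
    ],
)
print(response["message"]["content"])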


I think it’s worth trying to come up with a better prompt.
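For example, making the translation task explicit in the system prompt and dropping the "quoted sentence" phrasing, something along these lines (an untested guess, reusing the pipe from the script above):

def generate_response(message):
    # More explicit translation instruction; the wording here is an untested guess
    messages = [
        {"role": "system", "content": "You are a translator. Translate the user's sentence from English into Gujarati. Reply with only the Gujarati translation."},
        {"role": "user", "content": message},
    ]
    output = pipe(messages, max_new_tokens=256)
    return output[0]["generated_text"][-1]["content"]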


Thank you, I will walk through the steps and run them. This exercise will be helpful in applying to this and other models of interest.


Thank you. Using and learning more about Llama 3.2 prompts may help me get better prompts.
