I am trying to build a text-to-SQL app with the Hugging Face chatdb/natural-sql-7b model, but it gets stuck every time and never generates a result. Here is my code. Another problem is that it's not working with "cuda": it shows "Torch not compiled with CUDA enabled".

import torch
from db import get_schema
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("chatdb/natural-sql-7b")
model = AutoModelForCausalLM.from_pretrained(
    "chatdb/natural-sql-7b",
    device_map="auto",
    torch_dtype=torch.float16,
)

question = 'How many employees are there?'

prompt = f"""
### Task 

Generate a SQL query to answer the following question: `{question}` 

### PostgreSQL Database Schema 
The query will run on a database with the following schema: 

{get_schema()}


### Answer 
Here is the SQL query that answers the question: `{question}` 
```sql
"""

print ("Question: " + question)
print ("SQL: ")

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

generated_ids = model.generate(
    **inputs,
    num_return_sequences=1,
    eos_token_id=100001,
    pad_token_id=100001,
    max_new_tokens=400,
    do_sample=False,
    num_beams=1,
)

outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
print(outputs[0].split("```sql")[-1])

Output:

Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 39.13it/s]
Some parameters are on the meta device because they were offloaded to the disk and cpu.
Question: How many employees are there?
SQL:


PyTorch will not work with CUDA unless you install a CUDA-enabled build following the instructions on the official site.
Also, a 7B model needs tens of GB of VRAM, so I suspect it is running out of memory (OOM) and stalling.
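
A quick way to confirm whether your installed PyTorch build actually has CUDA support (a minimal check, nothing specific to this model):

```python
import torch

# If this prints False (or torch.version.cuda is None), the installed wheel
# is a CPU-only build and you need to reinstall following pytorch.org.
print(torch.__version__)          # a "+cpu" suffix indicates a CPU-only build
print(torch.version.cuda)         # CUDA version the wheel was compiled against
print(torch.cuda.is_available())  # should be True on a working CUDA setup
```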

Hey John, thanks for the response. I have resolved the CUDA problem and I am getting results now, but it takes too long (about 8 minutes). How can I make the model run faster? Could you please suggest the specs it would need to run really fast?
My machine specs:
i7 10th gen, 16 GB RAM, NVIDIA GTX 1660 Ti 6 GB, 2 TB SSD

This is a VRAM requirement calculator for GGUF; if you multiply the Q4_K_M result by about 4, you get roughly the VRAM required without quantization.
About 25 GB of VRAM would be enough… that's too expensive for a GeForce card…
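
If upgrading the GPU isn't an option, quantizing the model at load time is the usual workaround. Below is a minimal sketch, assuming the bitsandbytes package is installed and the GPU supports it: 4-bit NF4 brings the roughly 14 GB of float16 weights down to around 4 GB, which may just fit in 6 GB of VRAM (the KV cache still needs some headroom, so this is not guaranteed).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Back-of-envelope: 7B params x 2 bytes (float16) ~= 14 GB of weights;
# 4-bit NF4 ~= 0.5 bytes/param ~= 4 GB.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("chatdb/natural-sql-7b")
model = AutoModelForCausalLM.from_pretrained(
    "chatdb/natural-sql-7b",
    quantization_config=bnb_config,
    device_map="auto",  # keep everything on the GPU if it fits
)
```

Keeping the whole model on the GPU (no disk/CPU offload) is what removes the multi-minute stalls; if the 4-bit model still doesn't fit, a smaller model or a GGUF runtime like llama.cpp is the next thing to try.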