I am trying to fine-tune Falcon-7B on a series of essays I wrote, to see how well it can generalize my writing style to new essay prompts. Doing this requires a fairly lengthy system prompt, plus a ~300-400 word essay completion for each prompt. Using a T4 GPU and QLoRA, I can train a model to a good loss in about 3 hours.

However, when I try to generate a whole essay with the fine-tuned model by setting it up in a pipeline, I can wait 20 minutes and get no output at all.

Is it typical to do inference with models like this in Colab? I can share code if that helps, but my question is more theoretical: is this even feasible? If not, I was also considering running the model locally, since I have an Apple M1 chip and I have seen people get fast inference with Falcon-7B on that hardware.

Also, this is my first time posting here, so apologies for any formatting issues or norms I haven't followed.
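For context, my inference setup looks roughly like the sketch below. This is a simplified placeholder, not my exact code: the adapter path and the generation parameters (`max_new_tokens`, `temperature`) are assumptions I've filled in for illustration.

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, pipeline)
from peft import PeftModel

base_model = "tiiuae/falcon-7b"
adapter_path = "my-qlora-adapter"  # placeholder path to my saved LoRA weights

# Load the base model in 4-bit, matching the QLoRA training setup
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model,
                                             quantization_config=bnb_config,
                                             device_map="auto")
# Attach the fine-tuned LoRA adapter on top of the quantized base
model = PeftModel.from_pretrained(model, adapter_path)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "SYSTEM PROMPT...\n\nEssay prompt: ..."  # placeholder
output = generator(prompt,
                   max_new_tokens=512,   # bound the generation length
                   do_sample=True,
                   temperature=0.7,
                   return_full_text=False)
```

(One thing I'm unsure about is whether my generation-length settings are part of the slowness, which is why I've shown explicit values here.)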