Model crashing with a 1.6 MB txt file?

Hi,

When I run the model below with a very small dataset (a 600 KB .txt file), it works fine.

But when I load the dataset that contains the 1.6 MB .txt file ("Aurelie123/chatbotdatxt"), the model crashes on Google Colab with an out-of-RAM error, even though the file is only 1.6 MB!
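For reference, here is a small check I was thinking of running (it reuses the docs list built in the snippet at the end of this post) to see how many Documents actually get created and how much text they hold; if one example contains the whole file, the reader would be running the model over a single very long document:

# Quick sanity check on the loaded documents (uses the docs list from the snippet below)
print(f"Number of documents: {len(docs)}")
print(f"Total characters: {sum(len(d.content) for d in docs)}")
print(f"Longest document: {max(len(d.content) for d in docs)} characters")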

Has anyone come across this before?

Thanks


from datasets import load_dataset
from haystack import Document
from haystack.components.readers import ExtractiveReader

# Trying on a 4-page .txt dataset, as the full file crashes due to lack of RAM
dataset = load_dataset("Aurelie123/data2")

# Convert the dataset into a list of Documents, each with a string content
docs = [Document(content=example["text"]) for example in dataset["train"]]

reader = ExtractiveReader(model="deepset/roberta-base-squad2")
reader.warm_up()

question = "Can I get more information about computer vision?"
result = reader.run(query=question, documents=docs)
print(result)
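One thing I was also wondering about is whether splitting the documents into smaller chunks before passing them to the reader would avoid the RAM spike, so the model never sees one very long document at once. A minimal sketch of what I mean (the split_length / split_overlap / top_k values are just guesses on my part):

from haystack.components.preprocessors import DocumentSplitter

# Split each long Document into smaller overlapping word chunks
# (split_length and split_overlap values here are only guesses)
splitter = DocumentSplitter(split_by="word", split_length=200, split_overlap=20)
split_docs = splitter.run(documents=docs)["documents"]

# Run the reader on the smaller chunks instead of the full documents
result = reader.run(query=question, documents=split_docs, top_k=3)
print(result)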
