I am running inference using the pipeline API, but I get the following warning, which recommends using the Dataset API. How can I do so?
UserWarning: You seem to be using the pipelines sequentially on GPU.
In order to maximize efficiency please use a dataset
sentiment_analysis = pipeline("sentiment-analysis", model="siebert/sentiment-roberta-large-english", device=0)

def roberta_sent(verb):
    result = sentiment_analysis(verb)[0]  # the pipeline returns a list of dicts
    return result["label"], result["score"]
df = pd.read_csv(file)
df["roberta_sentiment"] = df["sentences"].apply(roberta_sent)
Hi, I came across this warning as well. By any chance did you find a solution?
I found the following help in the documentation, but I haven’t implemented it yet because I’m just doing a POC with 50 documents. Sharing now in case it helps you get started!
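The relevant part of the docs is about batching: instead of calling the pipeline once per row via `.apply`, pass the whole column (or a `datasets.Dataset` / generator) in a single call and set `batch_size`, so the forward passes are batched on the GPU. A minimal sketch of that idea (the helper name `batch_classify` and `batch_size=16` are my own choices, not from the docs):

```python
def batch_classify(texts, clf, batch_size=16):
    """Run the pipeline once over all texts and return (labels, scores).

    Passing the full list lets the pipeline batch inputs on the GPU,
    instead of doing one forward pass per row.
    """
    results = clf(list(texts), batch_size=batch_size)
    return [r["label"] for r in results], [r["score"] for r in results]

# Usage sketch with the pipeline from the original post:
# clf = pipeline("sentiment-analysis",
#                model="siebert/sentiment-roberta-large-english", device=0)
# df = pd.read_csv(file)
# labels, scores = batch_classify(df["sentences"], clf)
# df["roberta_sentiment"], df["roberta_score"] = labels, scores
```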
I’m having the same issue while using T5-base-grammar-correction for grammar correction on my dataframe with a text column.
from happytransformer import HappyTextToText
from happytransformer import TTSettings
from tqdm.notebook import tqdm

tqdm.pandas()  # register progress_apply on pandas

happy_tt = HappyTextToText("T5", "./t5-base-grammar-correction")
beam_settings = TTSettings(num_beams=5, min_length=1, max_length=30)

def grammer_pipeline(text):
    text = "gec: " + text  # task prefix the model expects
    result = happy_tt.generate_text(text, args=beam_settings)
    return result.text

df['new_text'] = df['original_text'].progress_apply(grammer_pipeline)
It runs very slowly and raises the following UserWarning:
/home/.local/lib/python3.6/site-packages/transformers/pipelines/base.py:908: UserWarning: You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
How can I implement this to utilise the GPU efficiently?
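One option, as a sketch only: happytransformer doesn’t expose dataset batching, but a T5 checkpoint like this one can also be driven through the plain Transformers `text2text-generation` pipeline, which accepts a whole list and batches it on the GPU. The helper name `make_prompts` and `batch_size=8` are my own assumptions, not from either library’s docs:

```python
def make_prompts(texts, prefix="gec: "):
    """Prepend the task prefix the grammar-correction model expects."""
    return [prefix + t for t in texts]

# Usage sketch (swaps happytransformer for the plain pipeline):
# from transformers import pipeline
# corrector = pipeline("text2text-generation",
#                      model="./t5-base-grammar-correction", device=0)
# prompts = make_prompts(df["original_text"].tolist())
# outputs = corrector(prompts, num_beams=5, min_length=1,
#                     max_length=30, batch_size=8)
# df["new_text"] = [o["generated_text"] for o in outputs]
```

Batching the generation this way avoids the per-row calls that trigger the warning, at the cost of leaving the happytransformer wrapper behind.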