Run models on a desktop computer?


I’ve been using some Hugging Face models in notebooks on SageMaker, and I wonder if it’s possible to run these models directly on my own PC? I’m mainly interested in Named Entity Recognition models at this point.

I assume it’d be slower than using SageMaker, but how much slower? Like… infeasibly slow?

I’m a software engineer and longtime Linux user, but fairly new to AI/ML.

Also, I browsed through the docs here a little bit, but didn’t see a basic “Getting Started” type of page – does that exist?

Thanks for any advice.

Hello @antcodes,

Yes, you can run all models from the Hub locally.
A good place to start is the Installation guide: set up a local Python environment and install the required packages.

For example, if you run the code below, taken from the dslim/bert-base-NER model page, it will download the model to your local cache.

You can read more about pipelines in the Transformers documentation.

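# The first call to from_pretrained downloads the files to the local cache
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")

# Build an NER pipeline from the loaded model and tokenizer
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"

# Run the pipeline and show the recognized entities
ner_results = nlp(example)
print(ner_results)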
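
On the "how much slower" question, the simplest answer is to time the pipeline on your own machine. Here is a minimal sketch, assuming `nlp` and `example` from the code above. `time_calls` is a hypothetical helper written for this post, not part of transformers.

```python
import time
from statistics import median
from typing import Callable


def time_calls(fn: Callable[[str], object], text: str,
               repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock seconds for one call of fn(text)."""
    # Warm-up calls are not timed, so one-off setup cost is excluded
    for _ in range(warmup):
        fn(text)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(text)
        timings.append(time.perf_counter() - start)
    return median(timings)


# Usage with the pipeline above (not run here, since it needs the model):
#   seconds = time_calls(nlp, example)
#   print(f"median latency: {seconds * 1000:.1f} ms per sentence")
```

Running the same helper in your SageMaker notebook gives a like-for-like comparison for your own inputs.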

Thanks so much for the quick reply. That’s really helpful! I’m going to get started setting up a Python virtual environment.

I tried installing “HuggingFaceH4/starchat-beta” but it did not work. The code above worked, but with the StarChat model I am getting multiple memory issues. I have one 15GB GPU and 64GB of RAM, running Ubuntu.
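
A back-of-the-envelope estimate may explain the memory issues: the weights alone need roughly (number of parameters) × (bytes per parameter). The sketch below assumes about 16 billion parameters for this model, which should be checked against the model card. `weight_memory_gib` is a hypothetical helper, and the figures ignore activations and other runtime overhead.

```python
# Bytes needed to store one parameter in common precisions
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_params: float, dtype: str = "float16") -> float:
    """Estimate the memory for the weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# ASSUMPTION: about 16e9 parameters; confirm on the model card
ASSUMED_PARAMS = 16e9

for dtype in BYTES_PER_PARAM:
    gib = weight_memory_gib(ASSUMED_PARAMS, dtype)
    print(f"{dtype:>8}: {gib:6.1f} GiB for weights")
```

Under that assumption the 16-bit weights come to roughly 30 GiB, about twice what a 15GB GPU holds, so the model would need quantization or offloading to system RAM to load at all.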

Hi - I’ve tried doing it this way, and also with git and huggingface-cli on Windows, all to no effect. The model I’m trying to download, facebook/nllb-moe-54b, is quite large, and I encounter errors every time I try to download it. Is there a way to ensure that the download completes correctly?

Is there a simple way to assess whether a local GPU can handle a target model that runs via pipeline?
I believe most individual developers use Nvidia RTX series GPUs, so I just want to make sure my GPU can run the model before downloading it from HF.
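
One rough pre-download check is to add up the sizes of the weight files listed on the Hub and compare the total with the GPU's memory. This is only a sketch: `total_weight_bytes`, `fits_in_vram`, and `list_repo_files_with_sizes` are hypothetical helpers, the 1.2 headroom factor is an assumed allowance for activations, and the size on disk reflects the stored precision, which may differ from the precision you load in.

```python
from typing import Iterable, List, Tuple


def total_weight_bytes(files: Iterable[Tuple[str, int]]) -> int:
    """Sum the weight-file sizes, counting only one storage format."""
    files = list(files)
    # Repos often ship the same weights as both .safetensors and .bin,
    # so prefer .safetensors and fall back to .bin to avoid double counting
    for suffix in (".safetensors", ".bin"):
        sizes = [size for name, size in files if name.endswith(suffix)]
        if sizes:
            return sum(sizes)
    return 0


def fits_in_vram(weight_bytes: int, vram_bytes: int,
                 headroom: float = 1.2) -> bool:
    """Check the weights plus an ASSUMED 20% overhead against VRAM."""
    return weight_bytes * headroom <= vram_bytes


def list_repo_files_with_sizes(repo_id: str) -> List[Tuple[str, int]]:
    """Fetch file names and sizes from the Hub without downloading weights."""
    from huggingface_hub import HfApi  # needs network access

    info = HfApi().model_info(repo_id, files_metadata=True)
    return [(s.rfilename, s.size or 0) for s in info.siblings]


# Usage (not run here, since it needs network access):
#   files = list_repo_files_with_sizes("dslim/bert-base-NER")
#   print(fits_in_vram(total_weight_bytes(files), 15 * 1024**3))
```

The VRAM figure can be read from `nvidia-smi`, or from `torch.cuda.get_device_properties(0).total_memory` if PyTorch is installed. A passing check only says the weights should load; long inputs or large batches can still run out of memory.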

