Where does .tokens() come from (or inherit from) in Hugging Face?

I have been reading NLP with Transformers, and there I saw the method .tokens(). Where is .tokens() actually defined in the library? I want to know how I can navigate the Hugging Face libraries and use functions on my own.
text = "Jack Sparrow loves New York!"

bert_tokens = bert_tokenizer(text).tokens()

Here’s a “teach a man to fish” answer. I don’t intend it to be snarky, though it might come across that way:

  • Go to the Hugging Face transformers repository on GitHub.
  • In the search bar at the top right, enter .tokens() or def tokens. Since what you’re asking about is written .tokens(), it is a function or method, so it is defined somewhere in the code as def tokens(...).
  • Then read the code and learn.

In this case we can see that bert_tokenizer(text) returns a BatchEncoding object and tokens is a method of the BatchEncoding class. So it’s in the code here.

Another option: if you use an IDE like VS Code, you can right-click something like “.tokens()” in your own code and then click “Go to Definition”, and it will take you to where it is defined in the Hugging Face code.
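If you are not in an IDE, you can do roughly the same lookup from Python itself with the standard inspect module. This is only a minimal sketch: find_definition is a hypothetical helper name (not part of transformers), and it assumes the method is written in Python rather than C. The transformers import at the bottom is optional, so the snippet still runs without the library installed.

```python
import inspect


def find_definition(obj, method_name):
    """Return (defining class name, source file, first line) for a method.

    Hypothetical helper for illustration only. It walks the method
    resolution order of the object's class and reports the first class
    whose own namespace contains the name. It only works for methods
    written in Python, because inspect cannot read C-level source.
    """
    cls = obj if inspect.isclass(obj) else type(obj)
    for klass in inspect.getmro(cls):
        if method_name in vars(klass):
            member = getattr(klass, method_name)
            source_file = inspect.getsourcefile(member)
            _, first_line = inspect.getsourcelines(member)
            return klass.__qualname__, source_file, first_line
    raise AttributeError(f"{cls.__name__} has no attribute {method_name!r}")


if __name__ == "__main__":
    try:
        # Optional: only runs if transformers is installed.
        from transformers import BatchEncoding

        print(find_definition(BatchEncoding, "tokens"))
    except ImportError:
        print("transformers is not installed; try the helper on another class")
```

You can also pass an instance instead of a class, for example find_definition(bert_tokenizer(text), "tokens"), which saves you from having to know the return type in advance.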

Either way, the best way to learn a library is to use it a lot and read the code when you don’t understand something.


Thank you so much, @dblakely. I had gone through so much of their documentation but could not find it anywhere. This is one of the great things I have learned today. Thanks for helping me out and teaching me how to catch a fish.

