How to replicate the best model from AutoNLP?

ssam9 · November 15, 2021, 11:44am

Hi,

First of all, a big thanks to the Hugging Face team for bringing AutoNLP to us. It is a really amazing tool. I had the chance to try it out today and it looks very promising. I was trying to find the best model for a binary text classification task, and the config of the best model is shown below:

{
  "_name_or_path": "AutoNLP",
  "_num_labels": 2,
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "id2label": {
    "0": "negative",
    "1": "non-negative"
  },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
    "negative": 0,
    "non-negative": 1
  },
  "layer_norm_eps": 1e-12,
  "max_length": 96,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "padding": "max_length",
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "transformers_version": "4.8.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

If I start from scratch and try to replicate this best model, will I get the same results? From the config file, it seems the best model is a BERT-large model with a dropout of 0.1 and GELU activation. Where can I find the optimizer, learning rate, batch size, and other settings that were used, so that I can replicate the training?
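For what it's worth, the BERT-large reading can be sanity-checked from the config alone. The sketch below is only an illustration: it copies the size-related fields from the config above and counts parameters assuming the standard BERT layout (embeddings, encoder layers, pooler, one linear classification head). The function name `count_bert_classifier_params` is made up for this example, and nothing here says anything about how AutoNLP trained the model.

```python
import json

# Size-related fields copied from the config posted above.
CONFIG_JSON = """
{
  "hidden_size": 1024,
  "intermediate_size": 4096,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "type_vocab_size": 2,
  "vocab_size": 30522,
  "id2label": {"0": "negative", "1": "non-negative"}
}
"""


def count_bert_classifier_params(cfg):
    """Count weights and biases, assuming the standard BERT layout (hypothetical helper)."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    num_labels = len(cfg["id2label"])
    layer_norm = 2 * h  # scale + bias

    # Word, position and token-type embeddings, followed by one LayerNorm.
    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm

    # Self-attention: query, key, value and output projections, plus LayerNorm.
    attention = 4 * (h * h + h) + layer_norm
    # Feed-forward: hidden -> intermediate -> hidden, plus LayerNorm.
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)

    pooler = h * h + h                      # dense layer applied to the [CLS] token
    classifier = h * num_labels + num_labels  # linear classification head

    return embeddings + encoder + pooler + classifier


cfg = json.loads(CONFIG_JSON)
total = count_bert_classifier_params(cfg)
print(f"layers={cfg['num_hidden_layers']}, hidden={cfg['hidden_size']}, "
      f"heads={cfg['num_attention_heads']}, params={total:,}")
```

24 layers, a hidden size of 1024, 16 attention heads and roughly 335 million parameters are the BERT-large dimensions, so the config does pin down the architecture. It says nothing about the training recipe.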

Say, for example, I am using the code from this Colab notebook. How can I replicate the best model from scratch?

Thanks again!!

abhishek · November 15, 2021, 12:25pm

It would be hard to replicate as there are several things about training that are not open to the end-user: hyperparameters, optimizer, scheduler, etc. (Un)fortunately, this information will remain closed. You can call it the “secret sauce” of AutoNLP. The final model is, however, open to the end-user and you are free to do whatever you want with it.
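As a rough sketch of using the final model directly: the snippet below assumes the downloaded model and tokenizer files sit in a local folder (`model_dir` is a hypothetical path) and that `torch` and `transformers` are installed. It reuses `max_length`, `padding` and `id2label` from the config in the question. The helper names `softmax`, `label_from_logits` and `classify` are made up for illustration and are not part of AutoNLP.

```python
import math

# Values taken from the config posted in the question.
ID2LABEL = {0: "negative", 1: "non-negative"}
MAX_LENGTH = 96


def softmax(logits):
    """Turn raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text, model_dir):
    """Run one text through the saved model; model_dir is a hypothetical local path."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    # Same padding strategy and maximum length as recorded in the config.
    inputs = tokenizer(
        text,
        padding="max_length",
        truncation=True,
        max_length=MAX_LENGTH,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Something like `classify("I did not enjoy this at all", "path/to/downloaded-model")` would then return a label and its probability. The same `from_pretrained` call is also the starting point if you want to fine-tune the model further with your own hyperparameters.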

ssam9 · November 15, 2021, 1:42pm

Thanks, Abhishek. That's a bit sad, but understandable. I'm not sure about the reason behind the names given to these models, like "runny fox" or "tubby snail". It makes AutoNLP look a bit less credible if we were to recommend the use of these models in a company's project. Nonetheless, it's an amazing tool.

(P.S.: Your videos are super helpful. Thanks for making them available on your channel. If I may suggest, it would be very interesting to see a layer-wise visualization of one of these BERT models to really understand what is happening at each stage.)

Thanks for your quick response.

abhishek · November 15, 2021, 2:00pm

The model names have nothing to do with how they are trained. It’s similar to the names of docker containers. Something similar to: https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go

Unfortunately, I have to disagree. Unlike many other AutoML tools, AutoNLP provides you with all the trained models in the end. You have the weights, you have the tokenizer. You are free to use these models for many different purposes: using them directly with Hugging Face's API Inference (thus saving several days and sometimes even months' worth of engineering time) or for further fine-tuning of the models (still saving several days of work). AutoNLP provides you with the best possible models so that you don't have to dig in yourself. I'm personally not aware of any other tools that can train tens or even hundreds of SOTA transformer models which are production-ready. And that's why many enterprises already use and love AutoNLP. You can read some of the testimonials here: AutoTrain – Hugging Face.

If you have any further concerns about how AutoNLP can be used in your industry, please feel free to write to autonlp [at] huggingface [dot] co and we can discuss further.

ssam9 · November 15, 2021, 3:07pm

Sorry if I’ve offended. It was just a small suggestion.

Thanks for the explanation though.

abhishek · November 15, 2021, 3:18pm

Hey! Not at all!!! I was just explaining.

ssam9 · November 15, 2021, 3:20pm

Thanks for your understanding and for the support.
