TF transformers model inputs and outputs showing None?

I am trying to train a TF Hugging Face model:

from transformers import TFDistilBertForTokenClassification

model = TFDistilBertForTokenClassification.from_pretrained('distilbert-base-uncased', num_labels=3)

model.summary(), model.inputs, model.outputs

The summary indicates the model has "multiple" input/output shapes. Normally, calling .inputs and .outputs on a tf.keras model returns the input/output tensors with their shapes (even if the model expects a dict as input/output). However, for Hugging Face models these all return None.
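
For comparison, here is a minimal sketch of a plain functional tf.keras model (the layer sizes are made up) where these attributes are populated as expected:

import tensorflow as tf

inputs = tf.keras.Input(shape=(128,), dtype=tf.int32, name='input_ids')
x = tf.keras.layers.Embedding(30522, 768)(inputs)
outputs = tf.keras.layers.Dense(3)(x)
functional_model = tf.keras.Model(inputs, outputs)

print(functional_model.inputs)   # [<KerasTensor: shape=(None, 128) dtype=int32 ...>]
print(functional_model.outputs)  # [<KerasTensor: shape=(None, 128, 3) dtype=float32 ...>]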

Is there a quick way to figure out the input/output spec? I know you can look up documentation/tutorials for that particular model, but that is not as direct and convenient as interrogating the model itself.

Hello :wave:

When you print your model summary, you will see:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 distilbert (TFDistilBertMai  multiple                 66362880  
 nLayer)                                                         
                                                                 
 dropout_19 (Dropout)        multiple                  0         
                                                                 
 classifier (Dense)          multiple                  2307      
                                                                 
=================================================================
Total params: 66,365,187
Trainable params: 66,365,187
Non-trainable params: 0
_________________________________________________________________
(None, None, None)

These Hugging Face models are subclassed tf.keras.Model instances, so Keras does not know any concrete tensor shapes up front: that is why the summary shows "multiple" for every layer and why .inputs and .outputs are None. The layers above do, however, inherit from tf.keras.layers.Layer, which has a super cool method called get_config() that returns a layer's configuration.

All you have to do is call:

model.classifier.get_config()

and you will get:

{'activation': 'linear',
 'activity_regularizer': None,
 'bias_constraint': None,
 'bias_initializer': {'class_name': 'Zeros', 'config': {}},
 'bias_regularizer': None,
 'dtype': 'float32',
 'kernel_constraint': None,
 'kernel_initializer': {'class_name': 'TruncatedNormal',
  'config': {'mean': 0.0, 'seed': None, 'stddev': 0.02}},
 'kernel_regularizer': None,
 'name': 'classifier',
 'trainable': True,
 'units': 3,
 'use_bias': True}

Note how the number of labels you passed shows up as 'units': 3. The same works for model.distilbert.get_config():

{'config': {'_name_or_path': 'distilbert-base-uncased',
  'activation': 'gelu',
  'add_cross_attention': False,
  'architectures': ['DistilBertForMaskedLM'],
  'attention_dropout': 0.1,
  'bad_words_ids': None,
  'bos_token_id': None,
  'chunk_size_feed_forward': 0,
  'cross_attention_hidden_size': None,
  'decoder_start_token_id': None,
  'dim': 768,
  'diversity_penalty': 0.0,
  'do_sample': False,
  'dropout': 0.1,
  'early_stopping': False,
  'encoder_no_repeat_ngram_size': 0,
  'eos_token_id': None,
  'exponential_decay_length_penalty': None,
  'finetuning_task': None,
  'forced_bos_token_id': None,
  'forced_eos_token_id': None,
  'hidden_dim': 3072,
  'id2label': {0: 'LABEL_0', 1: 'LABEL_1', 2: 'LABEL_2'},
  'initializer_range': 0.02,
  'is_decoder': False,
  'is_encoder_decoder': False,
  'label2id': {'LABEL_0': 0, 'LABEL_1': 1, 'LABEL_2': 2},
  'length_penalty': 1.0,
  'max_length': 20,
  'max_position_embeddings': 512,
  'min_length': 0,
  'model_type': 'distilbert',
  'n_heads': 12,
  'n_layers': 6,
  'no_repeat_ngram_size': 0,
  'num_beam_groups': 1,
  'num_beams': 1,
  'num_return_sequences': 1,
  'output_attentions': False,
  'output_hidden_states': False,
  'output_scores': False,
  'pad_token_id': 0,
  'prefix': None,
  'problem_type': None,
  'pruned_heads': {},
  'qa_dropout': 0.1,
  'remove_invalid_values': False,
  'repetition_penalty': 1.0,
  'return_dict': True,
  'return_dict_in_generate': False,
  'sep_token_id': None,
  'seq_classif_dropout': 0.2,
  'sinusoidal_pos_embds': False,
  'task_specific_params': None,
  'temperature': 1.0,
  'tie_encoder_decoder': False,
  'tie_weights_': True,
  'tie_word_embeddings': True,
  'tokenizer_class': None,
  'top_k': 50,
  'top_p': 1.0,
  'torch_dtype': None,
  'torchscript': False,
  'transformers_version': '4.18.0',
  'typical_p': 1.0,
  'use_bfloat16': False,
  'vocab_size': 30522},
 'dtype': 'float32',
 'name': 'distilbert',
 'trainable': True}

That way you can interrogate the model itself instead of digging through the documentation.
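
If you also want concrete input/output tensor shapes, one quick option (a sketch, relying on the dummy_inputs property that TF transformers models expose) is to run the model once and inspect what comes back:

from transformers import TFDistilBertForTokenClassification

model = TFDistilBertForTokenClassification.from_pretrained('distilbert-base-uncased', num_labels=3)

# every TF transformers model ships a dummy_inputs dict, e.g. {'input_ids': <tf.Tensor shape=(3, 5)>}
print(model.dummy_inputs)

# calling the model builds it and reveals the output spec
outputs = model(model.dummy_inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, num_labels), here (3, 5, 3)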
