I am trying out the text-classification example with pytorch-lightning (run_pl.sh), but it throws an exception.

First it was giving me an error about a missing `gpus` argument. I fixed that by adding a `gpus` argument to the parser and setting the `gpus` parameter in the run_pl.sh file. After doing so I ran into the error below. I understand this error is raised by PL, but it is a MisconfigurationException, which means we should be able to fix it ourselves in our code.
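For context, the workaround I applied for the missing `gpus` argument looked roughly like this (a sketch only; the actual parser lives in lightning_base.py and its defaults may differ):

```python
import argparse

# Hypothetical sketch: add a --gpus argument to the shared argument
# parser, mirroring the fix described above. The real parser is built
# in examples/lightning_base.py.
parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=1,
                    help="number of GPUs to train on")

# run_pl.sh then passes the value on the command line, e.g. --gpus 8
args = parser.parse_args(["--gpus", "8"])
print(args.gpus)
```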
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "run_pl_glue.py", line 186, in <module>
    trainer = generic_train(model, args)
  File "/lvol/bhashithe/transformers/examples/lightning_base.py", line 299, in generic_train
    **train_params,
  File "/lvol/bhashithe/env/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 853, in from_argparse_args
    return cls(**trainer_kwargs)
  File "/lvol/bhashithe/env/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 468, in __init__
    self.tpu_cores = _parse_tpu_cores(tpu_cores)
  File "/lvol/bhashithe/env/lib/python3.6/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 526, in _parse_tpu_cores
    raise MisconfigurationException("`tpu_cores` can only be 1, 8 or [<1-8>]")
pytorch_lightning.utilities.exceptions.MisconfigurationException: `tpu_cores` can only be 1, 8 or [<1-8>]
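From the traceback, the failure happens while the Trainer validates `tpu_cores`. A simplified stand-in for that check (not PL's actual code, just to illustrate which values pass) suggests the example is forwarding some non-None `tpu_cores` value (e.g. a default of 0) even though no TPUs are requested:

```python
class MisconfigurationException(Exception):
    pass

def parse_tpu_cores(tpu_cores):
    # Simplified illustration of PL's _parse_tpu_cores: it accepts
    # None (no TPU), exactly 1 or 8 cores, or a one-element list
    # like [3] selecting a single core index; anything else raises.
    if tpu_cores is None or tpu_cores in (1, 8):
        return tpu_cores
    if isinstance(tpu_cores, (list, tuple)) and len(tpu_cores) == 1:
        return list(tpu_cores)
    raise MisconfigurationException("`tpu_cores` can only be 1, 8 or [<1-8>]")

parse_tpu_cores(None)  # fine: no TPUs requested
parse_tpu_cores(8)     # fine: all 8 cores
# parse_tpu_cores(0) raises MisconfigurationException, matching the
# error above, which is why a forwarded 0 (instead of None) would fail.
```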
The environment is fully up to date; the machine has 8 GPUs (V100) and no TPUs.