TorchScript Example for BERT

Hey @hikushalhere, thanks for raising this issue! This looks like an error in the guide: the code only runs because the tensor passed as segment_ids is similar to what attention_mask should be.

The torchscript=True flag ensures the model returns plain tuples instead of ModelOutput objects, which cause JIT errors.
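
For reference, here is a rough sketch of what a corrected example might look like. It is not the guide's actual code: the tiny config sizes and the token ids are made up so the model is randomly initialised and nothing needs to be downloaded. The point is only the positional order BertModel.forward expects (input_ids, attention_mask, token_type_ids) and the tuple output.

```python
import torch
from transformers import BertConfig, BertModel

# Hypothetical tiny config (sizes chosen only to keep the example fast).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    torchscript=True,  # return tuples rather than ModelOutput
)
model = BertModel(config)
model.eval()

# Made-up inputs: one sequence of 8 tokens, second segment starts at index 5.
input_ids = torch.tensor([[1, 5, 6, 7, 2, 8, 9, 2]])
attention_mask = torch.ones_like(input_ids)  # no padding, so all ones
token_type_ids = torch.tensor([[0, 0, 0, 0, 0, 1, 1, 1]])  # segment ids

# Positional order matters: attention_mask comes before token_type_ids.
traced = torch.jit.trace(model, (input_ids, attention_mask, token_type_ids))

with torch.no_grad():
    outputs = traced(input_ids, attention_mask, token_type_ids)

sequence_output, pooled_output = outputs
```

Passing only (input_ids, token_type_ids) would silently bind the segment ids to attention_mask, which is the mix-up described above.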

Would you like to open a PR to fix the guide?