Why aren't new lines generated?


I’m currently trying to fine-tune DistilGPT-2 with PyTorch for a code completion task. My corpus is arranged like the following example:

public class FindCityByIdService {
    private CityRepository cityRepository = ...

My first attempt was to run the following command:

python run_clm.py \
     --model_type=gpt2 \
     --model_name_or_path distilgpt2 \
     --do_train \
     --train_file $TRAIN_FILE \
     --num_train_epochs 100 \
     --output_dir $OUTPUT_DIR \
     --overwrite_output_dir \
     --save_steps 20000 \
     --per_device_train_batch_size 4

After running some generation tests, I realized that the model never predicts `\n` for any given context. I imagine some preprocessing step (or something similar) is missing. What should I do so that `\n` is predicted as expected?
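My suspicion about the preprocessing is this: if the training file is read line by line before tokenization (the hypothetical `corpus` string below just mirrors my example above), the newline characters are stripped out, so the model would never see a `\n` token during training. A minimal sketch of that suspicion:

```python
# Two lines of the corpus, as they appear in the training file
corpus = (
    "public class FindCityByIdService {\n"
    "    private CityRepository cityRepository = ...\n"
)

# If preprocessing splits the file into lines, the "\n" characters are dropped:
lines = corpus.splitlines()
print(lines)

# None of the resulting strings contain "\n", so a model trained on them
# would have no newline tokens to learn from.
assert all("\n" not in line for line in lines)
```

Is this what happens inside `run_clm.py`, and if so, how do I keep the newlines?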