Trying to reproduce Hugging Face results

I posted these two questions in the Beginners category, but perhaps that was the wrong place.

I thought I was just doing something stupid, but I would greatly appreciate any advice on how the Hugging Face code behaves behind the scenes.

I have tried two methods of my own, one now and one in the past, and in neither case can I replicate the HF results.

Hugging Face Trainer behaving differently ("What does hugging face trainer do special?")

difference of training and inference

Overall, I just want to build my own methods that perform as well as the Hugging Face ones. I feel like I am missing something obvious or doing something dumb, but I would really appreciate it if a pro pointed me in the right direction. I’ve tried reading the source code behind these, and nothing really stood out.

Is there some advanced guide to Hugging Face training and inference I can read?