Currently, codecov on PRs reports pretty random information most of the time. If you look at this analysis by Thomas Hu, he suggests that the transformers test suite is not idempotent, i.e. each test run produces a coverage report that differs from the previous run, even when there were no changes in the code. (Idempotence: definition)
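One way to check this locally is to run the same deterministic workload twice and compare the raw line counts. Here is a minimal sketch using Python's stdlib `trace` module as a stand-in for coverage.py; `workload` is just a hypothetical placeholder for running a slice of the test suite:

```python
import trace

def workload():
    # stand-in for a deterministic piece of the test suite
    return sum(i * i for i in range(10))

def line_counts():
    # count executed lines -- the raw data a coverage report is built from
    tracer = trace.Trace(count=True, trace=False)
    tracer.runfunc(workload)
    return dict(tracer.results().counts)

run1, run2 = line_counts(), line_counts()
# for an idempotent suite, the two runs should be identical
print("identical" if run1 == run2 else "different")
```

If two back-to-back runs of the real suite yield different counts like this, the problem is in the test suite (test ordering, randomness, skipped tests) rather than in codecov itself.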
There are several examples already in that ticket, but here is a recent “outrageous” report where a change to a pure documentation file, with no code change, gets reported as a 2.17% decrease in coverage.
If you look at the per-file percentages in that report, none of them make sense. On top of that, the overall coverage reported from running the tests varies wildly from run to run.
What I observed is that most of the time it’s the *_tf_* modules that sit at the top of the impacted-files list, so I suspected it might be something specific to TF, but that isn’t always the case.
If you have experience with this kind of situation, please share your insights and how this can be fixed.