How to fix the non-idempotency issue for the transformers test suite (codecov issue)

Currently, codecov on PRs reports pretty random information most of the time. If you look at this analysis by Thomas Hu, he suggests that the transformers test suite is not idempotent, i.e. each test run produces a coverage report that differs from the previous run, even if there were no changes in the code. (Idempotence: definition)
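For concreteness, here is a minimal sketch of what "idempotent coverage" would mean in practice: two runs of the same suite on the same code should yield identical per-file percentages. The helper below is hypothetical (not part of transformers or codecov); the input dicts stand in for the per-file percentages you'd get from two coverage reports.

```python
# Compare per-file coverage percentages from two runs of the same test
# suite on unchanged code. If the suite is idempotent, the diff should
# be empty. (Hypothetical helper for illustration only.)

def coverage_diff(run_a, run_b, tolerance=0.0):
    """Return files whose coverage percentage differs between two runs.

    run_a / run_b: dicts mapping file path -> coverage percent,
    e.g. parsed out of two separately generated coverage reports.
    """
    diffs = {}
    for path in sorted(set(run_a) | set(run_b)):
        a = run_a.get(path, 0.0)
        b = run_b.get(path, 0.0)
        if abs(a - b) > tolerance:
            diffs[path] = (a, b)
    return diffs

# Example: one file drifts between runs -> the suite is not idempotent.
first = {"src/modeling_tf_bert.py": 91.2, "src/tokenization.py": 100.0}
second = {"src/modeling_tf_bert.py": 74.5, "src/tokenization.py": 100.0}
print(coverage_diff(first, second))
# {'src/modeling_tf_bert.py': (91.2, 74.5)}
```

A non-empty diff on a no-op change is exactly the symptom the codecov reports are showing.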

There are several examples already in that ticket, but here is a recent “outrageous” report where a change to a pure doc file, with no code change at all, gets reported as a 2.17% decrease in coverage.

If you look at the reported percentages for the individual files, none of them make sense. On top of that, the overall coverage reported from running the tests varies wildly from run to run.

What I observed is that most of the time it’s the *_tf_* modules that appear at the top of the impacted-files list, so I was thinking it might have something to do specifically with TF, but that’s not always the case.

If you have experience with this kind of situation, please, kindly share your insights and how this can be fixed.

Thank you.

Well, after extensive testing I couldn’t find any symptoms of non-idempotency. That doesn’t mean it doesn’t exist; I just couldn’t reproduce it on my machine.

There was one sub-issue: codecov generates an invalid report when it fails to find a coverage report for the “base” commit it checks against. When that happens, it goes looking for the nearest hash that does have a coverage report, which often yields a report that doesn’t represent the true impact of the proposed PR, since it is no longer comparing the proposed code changes against the base they would actually be applied to.
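The fallback behavior described above can be sketched as follows. This is an assumption about what codecov does, not its actual implementation; the function and data names are hypothetical.

```python
# Sketch of the base-commit fallback described above: when the PR's true
# base has no coverage report, fall back to the nearest ancestor commit
# that has one -- silently changing what the PR is compared against.
# (Assumed behavior, illustrated with made-up data; not codecov's code.)

def pick_comparison_base(base_sha, ancestors, reports):
    """Return the commit actually used for the coverage comparison.

    base_sha:  the PR's true base commit
    ancestors: base_sha's ancestors, nearest first
    reports:   dict mapping commit sha -> coverage report
    """
    if base_sha in reports:
        return base_sha  # the honest comparison
    for sha in ancestors:
        if sha in reports:
            return sha   # nearest hash with a report: possibly stale
    return None          # no usable base; no report should be generated

reports = {"c3": {"total": 78.1}}
# The true base "c1" has no report, so "c3" (two commits back) is used,
# and any coverage drift between c1 and c3 shows up as a PR "change".
print(pick_comparison_base("c1", ["c2", "c3"], reports))
# c3
```

The `None` branch is what the fix mentioned below enforces: rather than comparing against a stale ancestor, no report is produced when the base is missing.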

A fix has been applied that prevents the generation of invalid reports when the base coverage is missing:

But looking at recent PRs with no code changes, the problem is still there. We still get coverage changes reported when there are none:
e.g. Codecov, Codecov