Ablation monotonicity: can we get a stable non-monotone curve under a fixed model?

Tiny, synthetic bench. Keeping the model/hyperparameters fixed and only expanding the observed information, I see monotone curves:
• Forecast (AR(3)): MSE ↓ with more lags
• Naive Bayes: accuracy ↑ with more clues
• Planning (Dijkstra): regret ↓ with more revealed edges
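For concreteness, here is a minimal self-contained sketch of the Naive-Bayes-style ablation. This is not the repo's actual harness: the clue reliability p = 0.7, the trial count, and the tie-breaking convention are my assumptions.

```python
import random

def nb_accuracy(num_clues, trials=20000, p=0.7, seed=0):
    """Monte-Carlo accuracy of a FIXED Naive Bayes rule that observes
    only the first `num_clues` binary clues. Each clue independently
    equals the true label with probability p (assumed, not from the repo)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        y = rng.random() < 0.5                       # uniform binary class
        # Each observed clue matches y with probability p.
        x = [(rng.random() < p) if y else (rng.random() >= p)
             for _ in range(num_clues)]
        # With equal, known clue reliabilities the NB posterior reduces
        # to a vote count; ties are broken toward class False.
        pred = 2 * sum(x) > num_clues
        correct += (pred == y)
    return correct / trials

# Expanding observed information (1 -> 3 -> 5 clues) while the rule
# stays fixed traces out the accuracy-vs-information curve.
curve = [nb_accuracy(k) for k in (1, 3, 5)]
```

In this symmetric setup the expected accuracy is monotone in the number of clues, which is the shape of the curves described above.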

Question: beyond finite-sample wiggles, is there a STABLE counterexample where the EXPECTED metric strictly worsens when moving from an information set F to a superset G (F ⊂ G), with the same loss and the same eval window?
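For precision, the condition I'm asking about can be written as follows (my notation, not from the repo; here F ⊂ G are nested information sets, ℓ is the shared loss, and the hatted rule is the fixed model refit on each information set):

```latex
\mathbb{E}\!\left[\ell\!\left(\hat{f}_G(X_G),\, Y\right)\right]
  \;>\;
\mathbb{E}\!\left[\ell\!\left(\hat{f}_F(X_F),\, Y\right)\right]
\quad\text{for some } F \subset G,
```

i.e. the population-level metric, not a finite-sample estimate, gets strictly worse with strictly more observed information.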

Repo (prereg + CSV + charts): GitHub - jacopo992010-oss/ablation-monotony-bench
Release v1 (single ZIP): see Releases
Live demo: AXIOM — Access ⇒ Performance

If you have a counterexample, I'll replicate it (same seeds/window) and add it to the README.
