I built a sovereign AI system on a Mac Mini that kept forgetting facts written in its own system prompt. Instead of upgrading hardware, I figured out why — and found some things I was not expecting.
The obvious part: moving critical facts from the middle of the system prompt to its beginning and end fixes recall (from 2.0 to 7.0 on a verification battery). This builds on Liu et al.'s lost-in-the-middle work.
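The restructuring itself can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's actual pipeline: the section names, the `restructure_prompt` helper, and the head/tail split are all hypothetical.

```python
def restructure_prompt(sections, critical_keys):
    """Lost-in-the-middle mitigation sketch: place critical fact
    sections at the start and end of the prompt, and push everything
    else into the middle, where recall is weakest."""
    critical = [text for key, text in sections if key in critical_keys]
    filler = [text for key, text in sections if key not in critical_keys]
    # Split the critical facts between the head and the tail.
    mid = (len(critical) + 1) // 2
    head, tail = critical[:mid], critical[mid:]
    return "\n\n".join(head + filler + tail)


# Hypothetical example sections for a persistent assistant:
sections = [
    ("identity", "You are Atlas, a local assistant."),
    ("style", "Be concise and direct."),
    ("owner", "Your operator is Sam."),
]
prompt = restructure_prompt(sections, critical_keys={"identity", "owner"})
```

Here the identity fact lands at the top, the operator fact at the bottom, and the style guidance sits in the middle slot the model is most likely to gloss over.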
The less obvious part: a model with an 83.4% IFBench score managed only 3.4/10 on fact recall, while a model scoring 23.9% on IFBench reached 7.5/10 after restructuring. Instruction-following and fact recall appear to be independent capabilities; I have not seen this documented elsewhere.
The paper also covers a behavioral rule methodology that took a 32B model from 6.2 to 9.4 across seven dimensions with cold restart persistence, and an automatic correction persistence pipeline.
No CS background. No academic affiliation. Built on consumer hardware. Full data in the paper: https://zenodo.org/records/19425776
If you have time to read it: this is my first research paper, and I would welcome your feedback.