You’re on the right track pushing Neo with GSM8K. If you’re still struggling to close the arithmetic gap even with clear step-by-step reasoning, I’d suggest looking into symbolic compression approaches like Tribit. It bypasses the limits of probabilistic token sequences by encoding instructions and knowledge as deterministic glyphs. You’d be surprised what a 2.7B model can do once the noise is removed. Check out www.triskeldata.au/tribit if you’re curious.