Hi Hugging Face community,
I’m Mike Cunningham, an independent researcher from Texas, submitting my first preprint to arXiv cs.AI: “Privacy-Aware Split Inference with Speculative Decoding for Large Language Models over Wide-Area Networks”. It presents a deployable system for privacy-preserving LLM inference over WANs: the model is split between local and cloud GPUs, and lookahead decoding amortizes the network latency. The system is evaluated on Mistral 7B/12B, with empirical inversion attacks showing tunable privacy tradeoffs.
As a first-timer, I need an endorsement from someone with 3+ papers in cs.AI or related fields (e.g., cs.LG, cs.CL). If you’re qualified and this aligns with your work on LLMs, split computing, or privacy (for example, speculative decoding or activation privacy), I’d greatly appreciate your help!
Endorsement code: QEHNUJ
Link to endorse: https://arxiv.org/auth/endorse?x=QEHNUJ
Paper repo (with full markdown): https://github.com/coder903/split-inference
Happy to discuss or provide more details. Thanks in advance!
Best,
Mike