Artificial Gladiator League — a platform where anyone can compete with their HF models in Chess and Breakthrough

Hi HF community,

My name is Chaim Duchovny. I am a 47-year-old math teacher from Israel — and before that, I spent 15 years as an insurance agent. I have no formal background in computer science or research.

First and most importantly: thank you. What you are building at Hugging Face — open, accessible AI for everyone — is exactly the vision that inspired me to build what I built.

Over the last three years, working completely alone, I did two things simultaneously:
I wrote an independent academic research paper in the field of General Game Playing, proposing a new algorithm for building AI gaming agents. You can find it here:
https://doi.org/10.13140/RG.2.2.18795.09764

And I built a startup — Artificial Gladiator League — launching on April 26th at agladiator.com.

The idea of my startup is simple: a platform where anyone can build their own AI agent and compete in skill-based games like Chess and Breakthrough. The vision is to eventually let anyone create and publish their own games, build communities and markets around them, and earn from their ideas.

I especially want my platform to speak to young people aged 12 to 18. Instead of spending hours on TikTok, they can come to Artificial Gladiator League — build something, compete, create something meaningful, and learn AI and science along the way. I wrote most of the code using GitHub Copilot. I mention this proudly — because it proves exactly what I want my platform to prove: that anyone, with enough determination, can build AI agents and compete with them.

Why am I posting here specifically? Because Hugging Face is at the heart of how Artificial Gladiator League works.

Users do not upload their AI models to my server. Instead they connect their Hugging Face repository. At registration, they provide their HF API key — we use it once to fetch their model’s commit SHA and store it as their permanent baseline fingerprint. The token is never stored.
Every single day, before accessing any part of the platform, users must provide their HF read token again. We fetch the current commit SHA and compare it to their original baseline. If the model has changed — they are suspended from rated tournaments until they complete 30 new rated games with their updated model.

I know this sounds strict. But think of it like airport security: slightly inconvenient for honest people, but it protects everyone from cheating. The people who find it annoying are exactly the people it is designed to catch.
Our system — our entire competitive integrity guarantee — is only possible because of Hugging Face commit SHAs. Hugging Face is not just a tool we use. Hugging Face will be the foundation our fairness is built on.

Three things I am hoping for from this community:

Feedback on how we are using the HF Hub API — are there better approaches? Things I should know?
If this aligns with HF’s mission, any mention to the community would mean the world to us.
Developers, researchers, hobbyists who want to be among the first to compete on Artificial Gladiator League — you are all welcome!

I am not asking for a partnership on day one. I am asking for a conversation. Everything else can grow from there.
In the end, Artificial Gladiator League and Hugging Face share the same belief:
Artificial intelligence should belong to everyone. Let’s gladiate ⚔️

— Chaim Duchovny
Founder, Artificial Gladiator League


Feedback on how we are using the HF Hub API — are there better approaches? Things I should know?

You’re likely to encounter 429 errors when making frequent API calls or polling repositories. There are also some nuances regarding authentication (if applicable, I recommend using OAuth with a Read or Fine-grained token).


Your current approach is directionally good, but I would change the design in a few important ways.

What you are getting right

Using the Hub as a provenance layer is the right instinct. Hugging Face repos are Git-backed repositories, and the Hub API exposes the pieces you need to reason about a submission as a concrete artifact: repo metadata, refs, commits, and revision-pinned downloads. That is a strong foundation for a competition platform.

Your idea of tying a competitor to a specific Hub repo and refusing to treat silent model changes as “the same entrant” is also sound. The main improvement is to move from change detection to exact revision execution. HF supports loading by revision, and that revision can be a branch, tag, or full commit hash.

The first thing I would change

Stop making daily token pasting the main user ritual.

Hugging Face supports OAuth / OpenID Connect for “Sign in with Hugging Face,” including public OAuth apps without a client secret. It also exposes scopes such as openid, profile, read-repos, gated-repos, and webhooks, and the docs explicitly say to follow least privilege and request only the scopes you actually need.

So the better pattern is:

  • use HF OAuth for login and identity
  • request read-repos only if you need access to personal repos
  • request gated-repos only if you want to support public gated repos the user already has access to
  • request webhooks only if your app will manage HF webhooks on the user’s behalf.

That gives you a much better user experience and a cleaner security story than repeated raw-token entry.
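A minimal sketch of that sign-in step, assuming a registered HF OAuth app (the client ID and redirect URI below are placeholders); the authorize endpoint and scope names follow HF’s standard OAuth 2.0 / OIDC flow:

```python
from urllib.parse import urlencode

# Placeholders -- register your own OAuth app in your HF settings to get real values.
CLIENT_ID = "your-oauth-app-client-id"
REDIRECT_URI = "https://agladiator.com/auth/callback"

def build_hf_authorize_url(state: str, scopes=("openid", "profile", "read-repos")) -> str:
    """Build the 'Sign in with Hugging Face' authorization URL (standard OAuth 2.0)."""
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "response_type": "code",
        "scope": " ".join(scopes),
        "state": state,  # random per-session value; verify it on the callback (CSRF)
    }
    return "https://huggingface.co/oauth/authorize?" + urlencode(params)
```

On the callback you exchange the returned code at the token endpoint and never have to ask the user for a raw token again.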

The second thing I would change

Do not anchor the system only to a “baseline SHA.” Store a fuller submission identity:

  • repo_id
  • repo_type
  • submitted_ref such as a release tag or competition branch
  • approved_full_sha
  • submitted_by_user
  • pinned_at
  • runtime_profile

The reason is simple. Branches and tags are useful for human workflow. The full SHA is useful for immutability. HF’s list_repo_refs() returns branches and tags with their target_commit, and snapshot_download() supports loading the exact revision you approved.
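A sketch of that submission record, with the ref-to-SHA resolution factored out as a pure function over (name, target_commit) pairs such as you would build from list_repo_refs(); the runtime_profile field and example ref names are AGL-specific assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubmissionIdentity:
    """Immutable record of exactly what was approved for rated play."""
    repo_id: str            # e.g. "someuser/some-agent" (placeholder)
    repo_type: str          # "model", "dataset", or "space"
    submitted_ref: str      # human-readable tag/branch, e.g. "agl-season-1"
    approved_full_sha: str  # full 40-char commit hash -- the real identity
    submitted_by_user: str
    pinned_at: str          # ISO-8601 timestamp
    runtime_profile: str    # AGL-specific assumption, e.g. "cpu-small"

def resolve_ref_to_sha(refs, wanted: str) -> str:
    """Turn a human-readable ref into its full SHA. `refs` is an iterable of
    (name, target_commit) pairs, as you would collect from the branches and
    tags returned by HfApi.list_repo_refs()."""
    for name, target_commit in refs:
        if name == wanted:
            return target_commit
    raise KeyError(f"ref {wanted!r} not found")
```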

The third thing I would change

Treat every rated submission as a pinned release artifact, not as “whatever is currently in the repo.”

A stronger flow is:

  1. user links a repo
  2. user selects a release tag or competition branch
  3. you resolve that to an exact full SHA
  4. every rated match runs from that exact revision
  5. if the repo changes later, that creates a new revision candidate, not a silent modification of the old one.

That makes your fairness claim much stronger. Instead of saying “we check if the model changed,” you can say “this match was run from this exact immutable revision.”

The specific HF API calls I would use

For repo access validation, use auth_check() first. It is built for exactly this and distinguishes between a repo that is missing or inaccessible and a gated repo the user is not authorized to access. That is much cleaner than building custom guesswork around 404s.

For resolving tags and branches, use list_repo_refs(). It returns both branches and tags, and each ref includes a target_commit. That is the cleanest way to turn a human-readable release marker into a concrete SHA.

For revision history, use list_repo_commits(). It returns commit objects, supports a revision argument, and is useful for audit trails, “what changed since last approved revision,” and moderation tooling.
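As a sketch, the “what changed since the approved revision” check is a pure function over the commit ids (newest first) that you would collect from the commit_id fields of list_repo_commits():

```python
def commits_since(commit_ids: list, approved_sha: str) -> list:
    """Commits made after the approved revision. `commit_ids` is newest-first,
    as you would collect from HfApi.list_repo_commits() results."""
    if approved_sha in commit_ids:
        return commit_ids[: commit_ids.index(approved_sha)]
    # Approved commit unreachable from this ref (e.g. force-push): flag everything.
    return commit_ids
```

The force-push case is worth handling explicitly: if the approved SHA is no longer reachable, the whole ref history is suspect and the agent should be re-qualified.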

For repo metadata at a specific revision, use model_info(repo_id, revision=..., securityStatus=True). That gives you per-revision metadata and lets you ask for security status information.

For materializing the exact artifact for rated play, use snapshot_download(repo_id, revision=approved_full_sha, ...) or hf_hub_download() for specific files. That is the point where your competition layer becomes reproducible.
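Putting those calls together, a hedged sketch of the link-and-pin path (the network calls make it illustrative rather than runnable offline; repo and ref names are placeholders):

```python
import re

def is_full_sha(rev: str) -> bool:
    """Only a full 40-hex-char (lowercase) commit hash is immutable enough for rated play."""
    return re.fullmatch(r"[0-9a-f]{40}", rev) is not None

def link_and_pin(repo_id: str, wanted_ref: str, token: str, dest: str) -> str:
    """Validate access, resolve the chosen tag/branch to an exact SHA, and
    materialize that revision. Imports are local so the sketch stays lazy."""
    from huggingface_hub import HfApi, snapshot_download

    api = HfApi(token=token)
    api.auth_check(repo_id, repo_type="model")  # raises if missing, private, or gated
    refs = api.list_repo_refs(repo_id)
    by_name = {r.name: r.target_commit for r in list(refs.branches) + list(refs.tags)}
    sha = by_name[wanted_ref]                   # human-readable ref -> immutable SHA
    if not is_full_sha(sha):
        raise ValueError(f"unexpected target_commit for {wanted_ref!r}: {sha}")
    api.model_info(repo_id, revision=sha, securityStatus=True)  # per-revision metadata
    return snapshot_download(repo_id, revision=sha, local_dir=dest, token=token)
```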

One terminology fix

I would stop calling it an API key and call it a User Access Token. That is HF’s terminology, and the token docs are explicit that user access tokens are the preferred way to authenticate an application or notebook to Hugging Face services. They also distinguish fine-grained, read, and write tokens.

For your use case, the default should be fine-grained or read-only, not broad write access. HF explicitly says fine-grained tokens are useful in production environments, and read tokens are for reading Hub content such as private repos or inference.

What to do about gated and private repos

Support them carefully.

HF’s gated-model docs say gated access is granted to individual users, not whole organizations by default. That means your system should treat access as something the competitor personally proves, not something your platform broadly inherits.

So the clean rule is:

  • the user authenticates
  • you validate that this user can access this repo
  • you run only what that user is entitled to submit
  • you do not silently assume org-wide access semantics.

Use webhooks instead of aggressive polling

If you are currently thinking in terms of repeated checks before every interaction, I would reduce that and move to event-driven change detection.

HF webhooks are publicly available and are meant to react to changes in user or organization repos. For AGL, that is a better fit than constant polling.

A better pattern is:

  • check once when the repo is linked
  • pin the approved revision
  • subscribe to repo-change events
  • mark the linked agent as “dirty” or “new revision available” when the repo changes
  • re-check right before a rated match starts.

That also helps operationally because HF documents Hub rate limits and recommends replacing Hub API calls with Resolver calls where possible, since resolver rate limits are higher and better optimized. The huggingface_hub library also has retry handling for 429s.
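A sketch of the receiving side, kept as a pure function over the parsed JSON payload; the field names (event.scope, event.action) and the X-Webhook-Secret header follow HF’s webhook docs, but verify the exact shape against live deliveries before relying on it:

```python
def classify_webhook(payload: dict, secret_header: str, expected_secret: str) -> str:
    """Decide what AGL should do with an incoming HF webhook delivery."""
    if secret_header != expected_secret:  # HF sends your secret in X-Webhook-Secret
        return "reject"
    event = payload.get("event", {})
    # repo.content updates mean new commits: flag the agent as 'new revision available'.
    if event.get("scope") == "repo.content" and event.get("action") == "update":
        return "mark-dirty"
    return "ignore"
```

Your web framework of choice just parses the request body, reads the secret header, and dispatches on the returned string.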

Separate the person from the model revision

This is a product suggestion, but it matters technically.

You should model:

  • user
  • agent
  • agent revision

That way, when the repo changes, you are not saying “the player is suspended.” You are saying “revision A is historical, revision B is provisional until requalified.” That is cleaner for ratings, clearer for users, and a better match for Git-style repo history.
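A minimal sketch of that separation; the status names and the qualification rule are AGL-specific assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRevision:
    full_sha: str
    status: str = "provisional"  # -> "active" after the 30 qualification games

@dataclass
class Agent:
    repo_id: str
    revisions: list = field(default_factory=list)  # append-only history

def supersede(agent: Agent, new_sha: str) -> AgentRevision:
    """Upstream repo changed: retire the old revisions, open a provisional one.
    The player is never 'suspended'; only revision statuses change."""
    for rev in agent.revisions:
        rev.status = "historical"
    new = AgentRevision(full_sha=new_sha)
    agent.revisions.append(new)
    return new
```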

Require stronger metadata

I would make model cards part of the competition contract.

HF’s docs are explicit that model cards are essential for discoverability, reproducibility, and sharing, and that they should describe the model, intended uses, and limitations.

For AGL, every official submission should have at least:

  • game supported
  • interface version
  • intended use
  • limitations
  • training method
  • release notes for this revision.
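A model card sketch covering those fields; the YAML tags and the AGL interface version below are hypothetical, not an HF or AGL standard:

```markdown
---
license: mit
tags:
  - breakthrough        # game supported
  - agl-agent           # hypothetical AGL tag
---

# Breakthrough 8x8 agent

**Game supported:** Breakthrough 8x8
**Interface version:** agl-v1 (hypothetical)
**Intended use:** rated AGL play
**Limitations:** ...
**Training method:** ...
**Release notes (this revision):** ...
```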

That will make the platform look much more serious to technical users.

Add a benchmark layer early

HF’s Evaluation Results feature is useful for your case. HF describes it as a decentralized system where benchmark datasets host leaderboards and model repos store evaluation scores in .eval_results/, which then appear on both the model page and the benchmark leaderboard. HF also notes that this feature is still a work in progress.

That suggests a good split for AGL:

  • live ladder for excitement
  • benchmark suite for reproducibility

That combination is much stronger than a live ladder alone.

My blunt suggestion on the integrity claim

Do not say the entire integrity guarantee comes from commit SHAs alone.

Say this instead, in substance:

  • Hugging Face gives you the version identity
  • AGL enforces the competition identity
  • rated results are tied to an exact approved revision
  • execution happens under your own fixed competition rules

That is a more accurate and more defensible claim. The Hub gives you strong repo and revision primitives. Your platform still has to supply the rules, runtime control, and rating logic.

My recommended final architecture

I would implement the flow like this:

  1. Sign in with HF via OAuth. Request only openid, profile, and the minimum repo scopes you actually need.
  2. On repo link, call auth_check() and then model_info().
  3. Resolve the user’s chosen release tag or branch with list_repo_refs() and store the resulting full SHA.
  4. Run every rated match from snapshot_download(... revision=approved_full_sha).
  5. Use webhooks to detect upstream changes and mark the agent as having a new revision candidate.
  6. Require a model card and later add evaluation results for benchmark visibility.

Bottom line

My answer is:

Yes, your HF Hub API usage is based on the right idea.
The biggest improvements are:

  • switch from repeated raw-token entry to OAuth + minimal scopes
  • switch from “baseline SHA check” to exact revision pinning
  • use auth_check(), list_repo_refs(), list_repo_commits(), model_info(), and snapshot_download() as the core API set
  • use webhooks for repo-change detection
  • require model cards
  • add benchmark publishing alongside the live ladder.

The single most important conceptual upgrade is this:

Do not rate “a repo.”
Rate an exact approved revision of a repo.

Hi John,
Thanks a lot for your reply. I really appreciate it. I’ll go through it carefully, and if I have any questions, I won’t hesitate to ask.
Thanks again, this is very helpful.

Best,
Chaim


Hi John,

First, I want to thank you very much for your response. To avoid confusion, I changed my username to my full name. I implemented some of the things you suggested, and they helped me a lot. My plan is for my site to be open source, and in that spirit I invite you and everyone else to participate in it. In the meantime, I’m publishing an artificial intelligence model for the game Breakthrough 8x8. You and everyone else are welcome to use it. It doesn’t play perfectly, but it definitely performed well against an AI that uses UCT. The model and the dataset are available in my repo (chaim-duchovny). Should you have any questions, feel free to address them to me.

Again many thanks for your help.

Best

Chaim Duchovny
