I’m not very familiar with JavaScript, but could it be that the front-end events aren’t being triggered properly? Since Gradio allows you to call JavaScript from event handlers, why not try using that?
Short version
Your Reversi AI probably does make the move. The board and status update because Python/Gradio returns the new game state.
The missing part is probably this:
The browser is not reliably receiving, detecting, or accepting the delayed “AI moved, now play sound” event.
Your app currently has two different sound paths:
| Situation | How sound is triggered | Why it behaves differently |
| --- | --- | --- |
| Your move | JavaScript handles the board click and plays sound immediately | This happens close to a real user gesture, so browser audio policy is friendlier |
| AI move | Python computes the AI move, returns hidden metadata HTML, and JavaScript must notice a DOM mutation later | This is delayed, backend-driven, and depends on a fragile hidden-HTML/MutationObserver bridge |
That difference explains the symptom very well: your move sound works, but the AI move sound does not.
The strongest fix is:
Stop using hidden HTML plus MutationObserver as the AI-sound event bus. Return plain JSON metadata from Python, place it in a real Gradio component such as a Textbox, and call your browser audio code through a documented Gradio .change(..., js=...) listener.
1. What your app is doing now
Your project is a Gradio/Hugging Face Space Reversi game. It has:
- a Python backend that owns the game state and AI move;
- a Gradio UI with 64 board buttons;
- local WAV files under sounds/;
- JavaScript that loads sounds and plays spatialized move/flip/error/pass sounds;
- screen reader text and ARIA/live-region support;
- a hidden metadata channel intended to tell JavaScript what happened after Python processes a move.
In app.py, your UI has this hidden metadata output:
move_metadata_view = gr.HTML(
    visible=False,
    elem_id="move-metadata-container"
)
Then _build_ui_payload() builds metadata like this:
payload = {
    "moves": moves_metadata,
    "ts": time.time()
}
escaped_metadata = json.dumps(payload).replace("'", "&#39;")
metadata_html = f"<div id='move-metadata' data-payload='{escaped_metadata}'></div>"
That hidden HTML is returned as one of the Gradio outputs.
In logic.py, the backend creates metadata for things like:
- valid human move;
- invalid move;
- pass;
- AI move.
For AI moves, the backend does roughly this:
ai_move = ai_player.choose_move(board, ai_color)
if ai_move is not None:
    before_ai = board.copy()
    board.apply_move(ai_move[0], ai_move[1])
    ai_flips = _changed_to_player(before_ai, board, ai_color)
    moves_metadata.append({
        "type": "move",
        "player": ai_color,
        "r": ai_move[0],
        "c": ai_move[1],
        "flips": ai_flips,
    })
So the backend is not silent. It is producing a description of the AI move.
The weak part is not “does Python know the AI moved?” The weak part is:
Does the browser reliably notice that returned metadata and play audio after the Gradio update?
Right now, probably not.
2. Why your move sound works
Your human move sound is played optimistically in JavaScript.
That means the browser does not wait for Python first. When you click a board square, your frontend:
- finds the clicked square;
- checks whether it looks locally valid;
- updates the local visual board;
- immediately calls the audio engine;
- only then waits for Gradio/Python to return the authoritative state.
That is a good design for responsive game feel.
It also happens near a real user gesture: a click, keypress, or touch. Browser audio systems are much more willing to allow sound when it follows a user interaction.
This path is direct:
user click
↓
JavaScript click handler
↓
AudioEngine.playMoveSequence(...)
↓
sound plays
This is why your move sound can work even while the AI sound fails.
3. Why the AI move sound fails
The AI move sound uses a different path:
user click
↓
Gradio sends event to Python
↓
Python applies human move
↓
Python computes AI move
↓
Python returns hidden metadata HTML
↓
Gradio updates hidden HTML component
↓
MutationObserver should notice metadata
↓
handleMetadataUpdate(...) should run
↓
AudioEngine should play AI move sound
That chain has several fragile links.
The biggest one is the MutationObserver.
Your current observer does this:
const metaContainer =
  document.getElementById('move-metadata') ||
  document.getElementById('move-metadata-container');
if (metaContainer) {
  observer.observe(metaContainer, {
    attributes: true,
    attributeFilter: ['data-payload']
  });
}
This watches only for data-payload attribute changes on the observed node.
But Gradio may update the gr.HTML component by replacing its child HTML, not by mutating the same existing node’s data-payload attribute.
In other words, your code is good at detecting this:
<div id="move-metadata" data-payload="old"></div>
changing into this on the same DOM node:
<div id="move-metadata" data-payload="new"></div>
But Gradio may effectively do this instead:
<div id="move-metadata-container">
  <div id="move-metadata" data-payload="old"></div>
</div>
becoming:
<div id="move-metadata-container">
  <div id="move-metadata" data-payload="new"></div>
</div>
where the inner node is replaced.
That kind of update is a child-list/subtree mutation, not necessarily an attribute mutation on the exact node your observer is watching.
MDN’s MutationObserver.observe() docs explain the relevant options:
- attributes: true watches attribute changes;
- childList: true watches child nodes being added or removed;
- subtree: true watches descendants too;
- characterData: true watches text node changes.
Your board observer watches child changes, but your metadata observer only watches attributes. That makes the AI metadata trigger easy to miss.
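To make the difference concrete, here is a minimal sketch of a helper that classifies the records an observer callback receives. The name summarizeMutations is hypothetical and not part of your app; it reads only the standard type, attributeName, and addedNodes fields of a MutationRecord, so it can be tried with plain objects outside the browser.

```javascript
// Hypothetical helper: reports which kinds of change a batch of
// MutationRecord-like objects contains.
function summarizeMutations(records) {
  const summary = { payloadAttributeChanged: false, childrenReplaced: false };
  for (const record of records) {
    if (record.type === "attributes" && record.attributeName === "data-payload") {
      summary.payloadAttributeChanged = true;
    }
    if (record.type === "childList" && record.addedNodes.length > 0) {
      summary.childrenReplaced = true;
    }
  }
  return summary;
}

// An in-place attribute edit, as seen by an observer.
const attributeEdit = [
  { type: "attributes", attributeName: "data-payload", addedNodes: [] },
];
// A re-rendered component: a new child node appears instead.
const childSwap = [
  { type: "childList", attributeName: null, addedNodes: [{ id: "move-metadata" }] },
];

console.log(summarizeMutations(attributeEdit));
console.log(summarizeMutations(childSwap));
```

An observer registered with only attributes: true is never handed records of the second kind, which is consistent with the AI metadata going unnoticed.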
4. Second major issue: browser autoplay policy
Even if the metadata detection is fixed, browser audio policy still matters.
Modern browsers do not let pages freely play sound at arbitrary times. Chrome’s autoplay policy applies to Web Audio. Chrome says that if an AudioContext is created before the page receives a user gesture, it may start in the "suspended" state and must be resumed after the user interacts with the page.
The key practical rule is:
Python cannot play sound in the user’s browser. Python can only return data. Browser JavaScript must receive that data and play sound from an already-unlocked audio context.
Your human move sound is close to the click. Your AI move sound happens later, after a backend round trip. That delayed playback is more likely to fail unless the audio context has already been unlocked.
Your JavaScript already tries to resume the context in several places, which is good. But your AudioEngine.play() currently logs a warning if the context is not running and then continues toward creating a source node.
This part should be stricter. If the context is not running, return false and log a clear failure.
5. Third issue: the file URL should be simpler
Your app uses local WAV files from sounds/, and app.py launches with something like:
demo.launch(css=APP_CSS, js=APP_JS, allowed_paths=[sounds_path])
That general idea is correct.
Gradio’s file access docs say exposed files are available through:
/gradio_api/file=<local-file-path>
and allowed_paths can expose additional files or directories.
So your JavaScript sound loading should use the documented route directly:
const url = `/gradio_api/file=sounds/${sound}`;
const response = await fetch(url);
This is simpler and less fragile than constructing a URL from window.location.pathname.
This may not be the main bug because your human move sound works, but it is still a good cleanup.
6. Things that are probably not the main cause
Not likely: “the AI did not move”
If the board visibly updates after the AI turn, the AI computation path is working. Your backend also explicitly appends AI move metadata after applying the AI move.
So the issue is probably not:
AI failed to choose a move
It is more likely:
AI move metadata was returned, but frontend did not reliably react to it
Not likely: “the WAV files are missing”
If your human move sound works, at least some sound files are loaded and decodable.
You should still test file serving directly, but missing WAV files are unlikely to be the core cause.
Not useful: time.sleep() in Python
A backend delay like this does not help the browser play sound:
if DEBUG:
    time.sleep(0.8)
That sleep happens before Python returns the response. The browser cannot play metadata it has not received yet.
Timing between human and AI sounds belongs in JavaScript, after the metadata arrives.
7. The architecture I recommend
Use this boundary:
Python owns:
- authoritative board state
- legal move validation
- AI move choice
- move metadata
- status text
- screen reader text
JavaScript owns:
- audio unlocking
- sound-file loading
- sound playback
- immediate human move feedback
- replaying returned AI metadata
- keyboard navigation
- animation timing
The bridge should be:
Python returns JSON metadata
↓
Gradio Textbox value changes
↓
Textbox.change JS listener fires
↓
window.handleReversiMetadata(payload)
↓
AudioEngine plays AI sound
Avoid this as the main event bridge:
Python returns hidden HTML
↓
Gradio updates internal DOM
↓
MutationObserver guesses what changed
↓
JavaScript maybe finds data-payload
↓
AI sound maybe plays
Hidden HTML is markup. Your move metadata is data. Send it as data.
8. Main fix: replace hidden HTML metadata with JSON metadata
Step 1 — return plain JSON from _build_ui_payload()
Replace the hidden-HTML metadata construction with a plain JSON string.
import json
import time
def _build_ui_payload(board: Board, status_text: str, moves_metadata=None):
    if moves_metadata is None:
        moves_metadata = []
    final_status = logic.compose_status(board, status_text)
    adv_html, _ = board.get_advantage_info()
    legal_html, _ = board.get_legal_moves_info(board.turn)
    metadata_json = json.dumps({
        "moves": moves_metadata,
        "ts": time.time(),
    })
    return [
        board,
        final_status,
        metadata_json,
        final_status,
        board.get_screenreader_text(final_status),
        logic.announce_to_screenreader(final_status),
        gr.update(value=adv_html),
        gr.update(value=legal_html),
        *[gr.update(value=lbl) for lbl in board.get_button_labels()],
    ]
Benefits:
- no HTML escaping;
- no JSON inside an HTML attribute;
- no &#39; conversion;
- no dependency on a specific hidden <div>;
- no dependency on Gradio’s internal DOM replacement behavior;
- easier to inspect in DevTools.
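On the browser side, the plain JSON string can then be parsed defensively before any sound is played. This is a minimal sketch, assuming the payload shape shown in this answer (a moves array plus a ts field); parseMovePayload is a hypothetical name, not something that exists in your script.

```javascript
// Hypothetical helper: turns the Textbox value into a list of moves and
// returns an empty list for anything that is not the expected shape.
function parseMovePayload(payloadText) {
  if (typeof payloadText !== "string" || payloadText.trim() === "") return [];
  let data;
  try {
    data = JSON.parse(payloadText);
  } catch (err) {
    console.warn("Move metadata is not valid JSON:", err.message);
    return [];
  }
  if (!data || !Array.isArray(data.moves)) return [];
  // Keep only entries that carry a type; other fields depend on the entry kind.
  return data.moves.filter((m) => m && typeof m.type === "string");
}

const sample = JSON.stringify({
  moves: [{ type: "move", player: "W", r: 2, c: 2, flips: [[3, 3]] }],
  ts: 1760000000.123,
});
console.log(parseMovePayload(sample).length);
console.log(parseMovePayload("<div>not json</div>").length);
```

Returning an empty list instead of throwing keeps a malformed payload from breaking the rest of the handler.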
Step 2 — replace hidden gr.HTML with CSS-hidden gr.Textbox
Replace:
move_metadata_view = gr.HTML(
    visible=False,
    elem_id="move-metadata-container"
)
with:
move_metadata_view = gr.Textbox(
    value="",
    label="Move metadata",
    interactive=False,
    elem_id="move-metadata-json",
    elem_classes=["visually-hidden-metadata"],
)
Then add CSS:
.visually-hidden-metadata {
  position: absolute !important;
  left: -10000px !important;
  top: auto !important;
  width: 1px !important;
  height: 1px !important;
  overflow: hidden !important;
}
Why a Textbox?
Because Gradio documents Textbox.change, and the docs say it fires when the value changes either because of user input or because of a backend function update. That is exactly what you need here.
Step 3 — attach a Gradio .change(..., js=...) listener
Add this after move_metadata_view is created:
move_metadata_view.change(
    fn=None,
    inputs=move_metadata_view,
    outputs=[],
    js="""
    async (payload) => {
      if (!payload) return;
      if (window.handleReversiMetadata) {
        await window.handleReversiMetadata(payload);
      } else {
        console.warn("handleReversiMetadata is not available");
      }
    }
    """,
    # Note: newer Gradio releases use api_visibility instead of show_api;
    # neither is needed for a js-only listener, so it is left out here.
)
This uses Gradio’s documented event system instead of DOM guessing.
Step 4 — expose your metadata handler globally
At the end of assets/script.js, add:
window.handleReversiMetadata = handleMetadataUpdate;
window.AudioEngine = AudioEngine;
window.Reversi = Reversi;
Your file already exposes window.AudioEngine and window.Reversi. Add the metadata handler too.
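Because every payload carries a ts field, the handler can also guard against processing the same payload twice, for example while a temporary observer and the .change listener are both active. The sketch below is one possible way to do that; createOnceHandler is a hypothetical wrapper, and it passes the parsed object to the wrapped function, so adapt the argument to whatever your real handler expects.

```javascript
// Hypothetical wrapper: calls `handler` only for payloads whose `ts`
// differs from the previously processed one.
function createOnceHandler(handler) {
  let lastTs = null;
  return function (payloadText) {
    let data;
    try {
      data = JSON.parse(payloadText);
    } catch (err) {
      return false; // not JSON, nothing to do
    }
    if (!data || data.ts === lastTs) return false;
    lastTs = data.ts;
    handler(data);
    return true;
  };
}

const seen = [];
const handleOnce = createOnceHandler((data) => seen.push(data.ts));
handleOnce('{"moves": [], "ts": 1}'); // processed
handleOnce('{"moves": [], "ts": 1}'); // ignored: same timestamp
handleOnce('{"moves": [], "ts": 2}'); // processed
console.log(seen);
```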
9. Make the audio engine stricter
Add a helper like this:
async ensureRunning() {
  if (!this.ctx || this.ctx.state === "closed") {
    await this.init();
  }
  if (!this.ctx) {
    console.warn("AudioContext missing");
    return false;
  }
  if (this.ctx.state === "suspended") {
    try {
      await this.ctx.resume();
    } catch (e) {
      console.warn("AudioContext resume failed:", e);
    }
  }
  if (this.ctx.state !== "running") {
    console.warn("AudioContext is not running:", this.ctx.state);
    return false;
  }
  return true;
}
Then start play() like this:
async play(soundName, r, c) {
  console.log(`AudioEngine.play: ${soundName} at (${r}, ${c})`);
  try {
    const ok = await this.ensureRunning();
    if (!ok) return false;
    if (!this.buffers[soundName]) {
      console.warn(`Sound ${soundName} not loaded.`);
      return false;
    }
    const source = this.ctx.createBufferSource();
    source.buffer = this.buffers[soundName];
    if (r !== undefined) {
      const targetFreq = this.BASE_FREQ + r * this.STEP;
      source.playbackRate.value = targetFreq / this.DEFAULT_SAMPLE_RATE;
    }
    const panner = this.ctx.createStereoPanner();
    panner.pan.value = c !== undefined ? (2 * c / 7) - 1.0 : 0;
    source.connect(panner).connect(this.ctx.destination);
    source.start();
    return true;
  } catch (e) {
    console.error("Error playing sound", e);
    return false;
  }
}
This makes debugging much clearer:
| Symptom | Likely meaning |
| --- | --- |
| no metadata log | Gradio event bridge problem |
| metadata log appears, but AudioContext is suspended | browser audio unlock problem |
| metadata log appears, context is running, but buffer missing | sound loading problem |
| all checks pass | sound should play |
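The same decision order can be written down as a small function, which is handy if you want the console to tell you which layer failed. This is only a sketch: diagnoseSilentMove and its three inputs are hypothetical names for values you would read off the console yourself.

```javascript
// Hypothetical helper mirroring the table above. Checks are ordered from the
// earliest link in the chain (metadata arrival) to the latest (buffer ready).
function diagnoseSilentMove({ metadataLogged, contextState, bufferLoaded }) {
  if (!metadataLogged) return "Gradio event bridge problem";
  if (contextState !== "running") return "browser audio unlock problem";
  if (!bufferLoaded) return "sound loading problem";
  return "sound should play";
}

console.log(
  diagnoseSilentMove({ metadataLogged: true, contextState: "suspended", bufferLoaded: true })
);
```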
10. Add an explicit “Enable sound” button
For accessibility and browser reliability, add a button:
enable_audio_btn = gr.Button("Enable sound", elem_id="enable-audio-btn")
Then in JavaScript:
document.addEventListener("click", async (e) => {
  const btn = e.target.closest("#enable-audio-btn");
  if (!btn) return;
  await AudioEngine.init();
  if (AudioEngine.ensureRunning) {
    const ok = await AudioEngine.ensureRunning();
    if (!ok) {
      console.warn("Could not enable audio");
      return;
    }
  }
  await AudioEngine.play("disk.wav", 3, 3);
  console.log("Audio enabled");
});
This gives the browser a clear user gesture for unlocking Web Audio.
It also helps screen reader users because sound activation becomes explicit instead of hidden behind the first board click.
11. Simplify sound-file loading
Use Gradio’s documented file route:
const url = `/gradio_api/file=sounds/${sound}`;
const response = await fetch(url);
Then test in the browser console:
await fetch("/gradio_api/file=sounds/disk.wav")
  .then(r => [r.status, r.headers.get("content-type"), r.url]);
Expected result:
[200, "audio/wav", ".../gradio_api/file=sounds/disk.wav"]
The exact content type may vary slightly, but it should be audio-like.
Bad signs:
[404, "...", "..."]
or:
[200, "text/html", "..."]
200 with text/html usually means you fetched the Gradio app page, not the WAV file.
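If you would rather have the loader report these cases itself, the check can be folded into a small function. This is a sketch under the assumption that a correctly served WAV arrives with an audio/* content type; classifySoundResponse is a hypothetical name.

```javascript
// Hypothetical helper: interprets the status and content type of a sound fetch.
function classifySoundResponse(status, contentType) {
  const type = (contentType || "").toLowerCase();
  if (status !== 200) return `request failed with status ${status}`;
  if (type.startsWith("text/html")) {
    return "got an HTML page, so the path is probably wrong";
  }
  if (type.startsWith("audio/")) return "looks like a sound file";
  return `unexpected content type: ${type || "none"}`;
}

console.log(classifySoundResponse(200, "audio/wav"));
console.log(classifySoundResponse(200, "text/html; charset=utf-8"));
console.log(classifySoundResponse(404, "application/json"));
```

Inside the loader you would call it with response.status and response.headers.get("content-type") before attempting to decode the audio data.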
12. Remove backend sleep
If you have:
DEBUG = True
and later:
if DEBUG:
    time.sleep(0.8)
remove it or set:
DEBUG = False
The browser cannot play AI-move metadata until Python returns it. Waiting inside Python does not give the browser time to play anything.
If you want timing between sounds, do that in JavaScript after metadata arrives:
await AudioEngine.playMoveSequence(move.player, move.r, move.c, move.flips);
await new Promise(resolve => setTimeout(resolve, 300));
That is the right layer for audio timing.
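Generalized to a whole list of returned moves, the same idea might look like the sketch below. playMovesInOrder and wait are hypothetical helpers, and the playMove argument stands in for a call such as AudioEngine.playMoveSequence, so the example runs without any audio.

```javascript
// Small promise-based pause.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: plays each move in order, pausing between moves
// but not after the last one.
async function playMovesInOrder(moves, playMove, gapMs = 300) {
  for (let i = 0; i < moves.length; i++) {
    await playMove(moves[i]);
    if (i < moves.length - 1) await wait(gapMs);
  }
}

// Stand-in player that only records the order of calls.
const played = [];
const done = playMovesInOrder(
  [{ player: "B" }, { player: "W" }],
  async (move) => { played.push(move.player); },
  10
);
done.then(() => console.log(played)); // B first, then W
```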
13. Pin Gradio exactly while debugging
Your Space metadata uses a specific Gradio SDK version, but the Python requirement is loose if it says something like:
gradio >= 6.13.0
For a custom-JS-heavy app, avoid loose Gradio versions while debugging.
Use one of these approaches:
gradio==6.13.0
or rely on the Space README sdk_version and remove the duplicate loose Gradio requirement.
Hugging Face Spaces are configured through the YAML block in README.md, including sdk, sdk_version, python_version, and app_file.
Why this matters:
Custom CSS/JS is more sensitive to Gradio version changes than plain Python-only demos. If the DOM shape or event timing changes, a hidden-HTML/observer trick can break even though the Python code is unchanged.
14. Optional quick diagnostic: repair the MutationObserver
I would not make this the final architecture, but it can confirm the diagnosis.
If you keep the observer temporarily, watch child replacement too:
function attachMetadataObserver() {
  const container =
    document.getElementById("move-metadata-container") ||
    document.getElementById("move-metadata");
  if (!container) {
    console.warn("[META] Metadata container not found; retrying");
    setTimeout(attachMetadataObserver, 250);
    return;
  }
  const processMetadata = () => {
    const metaElem = document.getElementById("move-metadata");
    if (!metaElem) return;
    const currentMeta = metaElem.getAttribute("data-payload");
    if (currentMeta && currentMeta !== window.__lastMeta) {
      window.__lastMeta = currentMeta;
      console.log("[OBSERVER] Metadata found; dispatching");
      handleMetadataUpdate(currentMeta);
    }
  };
  const observer = new MutationObserver(() => {
    processMetadata();
  });
  observer.observe(container, {
    childList: true,
    subtree: true,
    attributes: true,
    attributeFilter: ["data-payload"],
  });
  processMetadata();
}
attachMetadataObserver();
This watches both:
- children inserted/replaced inside the metadata container;
- data-payload attribute changes inside the subtree.
Again: this is a diagnostic or quick patch, not my preferred final solution. The cleaner fix is still Textbox.change.
15. Debug checklist for the Space
Open the browser console on the Space and run these in order.
A. Did custom JavaScript load?
!!window.AudioEngine
Expected:
true
After the patch:
!!window.handleReversiMetadata
Expected:
true
B. Can the Space serve a WAV file?
await fetch("/gradio_api/file=sounds/disk.wav")
  .then(r => [r.status, r.headers.get("content-type"), r.url]);
Expected:
[200, "audio/wav", "...disk.wav"]
If the content type is text/html, the path is wrong.
C. Did sounds decode?
After clicking “Enable sound” or making one move:
Object.keys(window.AudioEngine.buffers)
Expected something like:
["disk.wav", "white.wav", "black.wav", "error.wav", "pass.wav", "border.wav"]
D. Is the audio context running?
window.AudioEngine.ctx?.state
Expected after user interaction:
"running"
If it says:
"suspended"
then the browser has not unlocked audio yet.
E. Does AI metadata arrive?
Add this temporarily to your Gradio .change(..., js=...) listener:
console.log("[GRADIO CHANGE] payload:", payload);
Make a move.
Expected payload shape:
{
  "moves": [
    {
      "type": "move",
      "player": "B",
      "r": 2,
      "c": 3,
      "flips": [[3, 3]]
    },
    {
      "type": "move",
      "player": "W",
      "r": 2,
      "c": 2,
      "flips": [[3, 3]]
    }
  ],
  "ts": 1760000000.123
}
The exact coordinates will differ.
F. Does your handler skip only the optimistic human move?
Your current handleMetadataUpdate() logic intentionally skips the human move that was already sounded locally, then plays the AI move.
Expected log pattern:
[META] Processing 2 moves from metadata
[META] isOptimisticHumanMove: true
[META] Skipping optimistic human move sound
[META] isOptimisticHumanMove: false
[META] Playing move sound for W at (...)
If you get the metadata logs but no sound, the problem is the audio context or sound buffer.
If you never get metadata logs, the problem is the event bridge.
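For reference, the skip logic can be expressed as a pure filter, which makes it easy to check in isolation. This is a sketch of the idea rather than a copy of your handleMetadataUpdate(); movesStillToSound is a hypothetical name, and it assumes the optimistic move is remembered as an object with player, r, and c.

```javascript
// Hypothetical filter: drops the one move that was already sounded locally
// and keeps everything else (AI moves, passes, invalid-move entries).
function movesStillToSound(moves, optimisticMove) {
  let skipped = false;
  return moves.filter((move) => {
    const sameSquare =
      optimisticMove &&
      move.type === "move" &&
      move.player === optimisticMove.player &&
      move.r === optimisticMove.r &&
      move.c === optimisticMove.c;
    if (sameSquare && !skipped) {
      skipped = true; // skip only the first match
      return false;
    }
    return true;
  });
}

const returned = [
  { type: "move", player: "B", r: 2, c: 3, flips: [[3, 3]] },
  { type: "move", player: "W", r: 2, c: 2, flips: [[3, 3]] },
];
const remaining = movesStillToSound(returned, { player: "B", r: 2, c: 3 });
console.log(remaining.map((m) => m.player)); // only "W" is left to play
```

If no optimistic move was recorded, the filter returns every entry, so the human move is still sounded when the local shortcut did not run.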
16. Accessibility notes
Your accessibility direction is good.
Sound is useful for non-visual orientation, but sound should not be the only feedback channel. Keep three feedback layers:
| Layer | Purpose | Example |
| --- | --- | --- |
| Sound | fast spatial/game feedback | disk tone, flip tone, pass sound |
| Visible status | sighted users and debugging | “White played C4 and flipped 2 disks” |
| Screen reader text / live region | semantic non-visual feedback | “White played C4. Flipped 2 disks. Your turn.” |
A good normal move announcement is short:
White played C4. Flipped 2 disks. Your turn.
A good invalid-move announcement is also short:
Invalid move at C4.
Keep normal game updates polite rather than assertive so the screen reader does not interrupt too aggressively.
17. Why this was hard for AI agents to fix
This bug crosses several layers:
Python game state
+ Gradio outputs
+ hidden component rendering
+ DOM mutation detection
+ browser autoplay policy
+ Web Audio context state
+ optimistic local playback
+ screen reader behavior
A tool can check one layer and think everything is fine:
- “The AI move exists.”
- “The board updates.”
- “The WAV files exist.”
- “The browser can play a test sound.”
- “The metadata is generated.”
- “The observer exists.”
All of those can be true while the AI sound still fails.
The real question is more specific:
After Python returns the AI move, does the browser receive a reliable frontend event, and is the Web Audio context already running when that event fires?
Right now, the likely answer is “not reliably.”
18. Recommended patch order
Do this in order:
- Replace hidden gr.HTML metadata with CSS-hidden gr.Textbox.
- Return plain JSON metadata from _build_ui_payload().
- Attach move_metadata_view.change(..., js=...).
- Expose window.handleReversiMetadata = handleMetadataUpdate.
- Add AudioEngine.ensureRunning().
- Make AudioEngine.play() return early if the context is not running.
- Add an explicit “Enable sound” button.
- Load WAV files through /gradio_api/file=sounds/${sound}.
- Remove backend time.sleep().
- Pin Gradio exactly while debugging.
19. Final judgment
The core problem is probably not Reversi logic. It is the event bridge between Gradio’s Python return value and browser-side audio playback.
Your current design says:
AI move happened
↓
Python returns hidden HTML
↓
JavaScript watches for one specific attribute mutation
↓
sound plays if that exact mutation is observed
That is too brittle.
Use this instead:
AI move happened
↓
Python returns JSON metadata
↓
Gradio Textbox value changes
↓
Gradio .change JavaScript listener runs
↓
browser-side audio handler plays the AI move sound
That makes every step explicit, inspectable, and testable.
The fix is not “try another sound file” or “sleep longer in Python.” The fix is to make the backend-to-frontend event channel robust and to unlock Web Audio intentionally from a user gesture.