Using HF Models to Build a Word Game Like Letter Boxed: Ideas & Feedback?

Hi all,

I’ve been exploring ways to use Hugging Face models to help create a word game inspired by the Letter Boxed puzzle (for anyone unfamiliar, it’s a game where you form words from letters arranged around a square, with consecutive letters never coming from the same side, and each new word starting with the last letter of the previous one).

My idea is to have a model generate or validate possible connections based on a set of letters, sort of like an intelligent helper that can:

  • Suggest all valid word combinations from a given letter set

  • Explain why a word is valid or invalid

  • Offer hints when players get stuck

So far I’ve experimented with a few approaches — for example:

  • Prompting a language model like GPT‑2 / GPT‑Neo / Qwen to list words that only use the provided letters

  • Asking it to score or rank suggestions based on game rules

  • Trying rule‑based filters vs pure generative responses

I’ve hit some challenges though:

  1. The model sometimes suggests words that violate the letter rules

  2. When trying to generate all possible combinations, it’s inconsistent unless heavily prompted

  3. Validation logic is tricky — it’s easy to miss duplicates or rule violations

A few questions for the community:

  1. Has anyone tried building or prototyping a Letter Boxed‑style word solver or assistant using HF models?

  2. Is there a recommended way to combine ML with formal rule checking (like merging Hugging Face output with Python logic to filter valid words)?

  3. Any suggestions on prompt design or best models for this use case (especially offline or smaller local models)?

I’m also curious if there are ways to fine‑tune a model specifically for this type of combinatorial word generation or if it’s generally better to handle the heavy rule‑checking outside of the model itself.

Would love to hear your thoughts, examples, or even small snippets of code that helped you in similar tasks!

Thanks :blush:



Your project is strong because it starts from a puzzle that already has hard rules and real strategy, instead of trying to invent fun from unconstrained text generation. That matters. A lot of “LLM game” ideas collapse because the model is asked to be both the rules engine and the entertainer. Your setup is better because the rules can stay exact while the model layer adds ranking, hints, and adaptive interaction. The current tooling also fits that split well: Hugging Face Datasets can load plain CSV data without a custom dataset script, Sentence Transformers recommends a retrieve-then-rerank pipeline for harder selection tasks, and Qwen’s current embedding/reranker family is explicitly built for text embedding and ranking. (Sbert)

What your recent runs already proved

Your project is already past the “can I wire HF into this?” stage. You have shown that a public HF CSV lexicon can load cleanly, that legality checks can be enforced deterministically, and that the search layer can find cover-all chains. The odd outputs you saw are not a failure of Hugging Face integration. They are a sign that your lexicon is broad and your score is solver-oriented, not player-oriented. The English-Valid-Words dataset card says it contains valid English words with frequency, stem, and stem valid probability, which makes it a good bootstrap lexicon, but not automatically a polished game vocabulary. (Hugging Face)

The most important design decision

The best decision for your project is to keep legality symbolic. I would not let the language model decide whether a move is valid. Public Letter Boxed solver repos that actually work do this symbolically with backtracking, chaining, or bitmasking, not with free-form generation. Hugging Face’s constrained beam search and prefix_allowed_tokens_fn are useful tools, but they are best treated as optional control layers, not as the final judge of a puzzle’s rules. (GitHub)
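For completeness, here is roughly what `prefix_allowed_tokens_fn` looks like as a control layer. The vocabulary below is a toy stand-in for a real tokenizer (a real callback receives a tensor of generated ids, and a real version would also encode the adjacency rule), so treat this as a shape sketch, not a drop-in:

```python
# Sketch of transformers' prefix_allowed_tokens_fn as an optional control
# layer. The callback receives (batch_id, input_ids) and returns the token
# ids allowed at the next step. VOCAB is a hypothetical toy tokenizer, and
# a real version would also enforce the same-side adjacency rule.

VOCAB = {"a": 0, "e": 1, "t": 2, "q": 3, "z": 4}  # hypothetical tokenizer
BOARD_LETTERS = set("aet")
ALLOWED_IDS = [tid for tok, tid in VOCAB.items() if tok in BOARD_LETTERS]

def prefix_allowed_tokens_fn(batch_id, input_ids):
    # only letters that exist on the board may continue the generation
    return ALLOWED_IDS

# in real use: model.generate(..., prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
print(prefix_allowed_tokens_fn(0, [0, 2]))  # ids for a, e, t
```

Even with this in place, the symbolic engine should remain the final judge; constrained decoding only shrinks the model's search space.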

That means your architecture should stay split like this:

  • rules engine: board, legality, chaining, search
  • lexicon layer: which words are allowed and which are player-friendly
  • ranking layer: which legal move is best for the current player state
  • assistant layer: how to explain, hint, and pace the experience

That is the version of the project I would bet on.
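The rules-engine layer is small enough to sketch directly. This is a minimal, self-contained version of the core checks (no off-board letters, no same-side adjacency, last-to-first chaining); the board here is illustrative:

```python
# Minimal symbolic legality check for a Letter Boxed-style board.
# The sides are illustrative; the rules are the standard ones:
# consecutive letters must come from different sides, and each word in
# a chain must start with the last letter of the previous word.

SIDES = ["aeo", "rtn", "sli", "cdp"]
SIDE_OF = {ch: i for i, side in enumerate(SIDES) for ch in side}

def legal_word(word: str) -> bool:
    word = word.lower()
    if any(ch not in SIDE_OF for ch in word):
        return False  # uses a letter that is not on the board
    # consecutive letters may not come from the same side
    return all(SIDE_OF[a] != SIDE_OF[b] for a, b in zip(word, word[1:]))

def legal_chain(words: list[str]) -> bool:
    if not all(legal_word(w) for w in words):
        return False
    # each word must start with the previous word's last letter
    return all(a[-1] == b[0] for a, b in zip(words, words[1:]))

print(legal_word("stone"))            # True: every adjacent pair changes side
print(legal_chain(["stone", "end"]))  # True: 'stone' ends where 'end' starts
```

Everything above this layer (ranking, hinting, theming) can then assume its inputs are already legal.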

Where the real value is

The direct solver is useful, but the most valuable part of your project is the assistant behavior. A solver asks, “What works?” A good assistant asks, “What works, what feels fair, and what helps without spoiling?” That difference is where your project becomes more than a clone. Sentence Transformers’ retrieve-and-rerank guidance maps almost perfectly onto this: first get the candidate set efficiently, then rerank for precision. In your game, the symbolic engine produces the legal set, and the semantic layer reranks that legal set for hint quality, thematic fit, or beginner-friendliness. (Sbert)
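The two-stage shape can be sketched with a stand-in similarity function; in practice the second stage would be cosine similarity over sentence-transformers embeddings, but the structural point is the same:

```python
# Two-stage retrieve-then-rerank, specialized to a word game:
# stage 1 (symbolic) produces the complete legal candidate set,
# stage 2 (semantic) only reorders it. toy_similarity is a stand-in
# for an embedding model such as all-MiniLM-L6-v2.

def legal_candidates(lexicon, allowed_letters):
    # stage 1: exact rules, never the model's job
    return [w for w in lexicon if set(w) <= allowed_letters]

def toy_similarity(word, theme):
    # stand-in: letter overlap with the theme string, in place of
    # cosine similarity between embeddings
    return len(set(word) & set(theme)) / len(set(word))

def rerank(candidates, theme):
    # stage 2: reorder the legal set; nothing illegal can ever appear
    return sorted(candidates, key=lambda w: toy_similarity(w, theme), reverse=True)

lexicon = ["stone", "train", "plaid", "xylem"]
legal = legal_candidates(lexicon, set("aeortnslicdp"))
print(rerank(legal, "roads and transport"))  # 'xylem' was filtered in stage 1
```

Because stage 2 only permutes stage 1's output, a bad embedding can make a hint less helpful but never illegal.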

My view on your current lexicon problem

Your biggest short-term problem is not the model. It is the vocabulary surface. The HF lexicon you are using is broad enough to include words that are technically valid but poor for casual gameplay. That is why you started getting outputs that are structurally efficient but aesthetically bad. I would treat English-Valid-Words as a bootstrap source, not as the final player-facing lexicon. A better long-term approach is to intersect a broad lexicon with a more curated common-word list. SCOWL-style wordlists are useful here because they are explicitly organized by commonness: the en-wl/SCOWL project says size 35 is a recommended small list, 50 medium, 70 large, and 80 starts to include the strange and unusual words people like to use in word games. That is almost exactly the distinction your demo exposed. (Hugging Face)

So my recommendation is:

  • keep the broad HF lexicon for coverage and internal search
  • build a clean gameplay lexicon on top of it for hints and first suggestions
  • use frequency and stem-validity metadata to suppress ugly entries
  • optionally intersect with a SCOWL-style common-word list for a “human mode”

That will improve the game more than changing models will.
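A rough sketch of that layering, with hypothetical inputs standing in for the HF CSV and a SCOWL-style common list:

```python
# Layered lexicon policy: a broad list for search coverage, a curated
# common-word list for player-facing hints. The word lists and frequency
# table here are hypothetical; in practice the broad set would come from
# the HF CSV and the curated set from a SCOWL-size-50-style list.

def build_lexicons(broad_words, common_words, freq, min_freq=1000):
    search_lexicon = {w for w in broad_words if freq.get(w, 0) >= min_freq}
    # "human mode": words that are both broadly valid and curated-common
    hint_lexicon = search_lexicon & set(common_words)
    return search_lexicon, hint_lexicon

broad = ["stone", "rald", "train", "cnidae"]
common = ["stone", "train", "plane"]
freq = {"stone": 50_000, "rald": 1_200, "train": 80_000, "cnidae": 900}

search, hints = build_lexicons(broad, common, freq)
print(sorted(search))  # broad but frequency-filtered, for internal search
print(sorted(hints))   # the intersection, for player-facing suggestions
```

The solver searches over the broad set so it never misses a completion, while hints and first suggestions draw only from the intersection.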

My view on models for your project

For demos and local experiments, sentence-transformers/all-MiniLM-L6-v2 is a good fit because it is intended for sentence and short paragraph encoding and is used for retrieval, clustering, and sentence similarity. That is the right job description for “semantic hinting” or “theme-aware reranking.” It is not a legality model, and it does not need to be. (Hugging Face)

For a stronger production path, I would move toward a real embedding-plus-reranker stack such as Qwen/Qwen3-Embedding-0.6B with Qwen/Qwen3-Reranker-0.6B. The Qwen model cards say the series is specifically designed for text embedding and ranking tasks, with sizes from 0.6B to 8B, and inherits multilingual and long-context strengths from the Qwen3 base models. That makes it a better long-term fit than a small general-purpose sentence embedder once your candidate set and scoring logic are already solid. (Hugging Face)

The key point is that the model should rank already legal candidates. It should not generate the legal set from scratch.

Where I would not spend time yet

I would not fine-tune early. The project still has higher-leverage work in:

  • lexicon curation
  • score design
  • hint ladder design
  • board generation and evaluation

Fine-tuning a model to enumerate legal words would be the wrong abstraction. Exact combinatorial legality is cheaper and more reliable in code. If you fine-tune anything later, fine-tune a reranker or a hint model, not the legality engine. Sentence Transformers’ docs are very clear that rerankers are second-stage precision tools, and that is much closer to your actual bottleneck. (Sbert)

The project directions I think are strongest

I see three especially strong directions.

1. Puzzle assistant

This is the safest and most immediately useful version. It validates words, explains failures, ranks next moves, and offers spoiler-controlled hints. It is easy to test and easy to understand.

2. Semantic variant

This is the most original version. The next word is not only legal by letters; it must also be semantically related, contrastive, or theme-consistent. This is where embeddings and rerankers become central rather than optional.

3. Board generator

This is where the project becomes more than a helper. You can score candidate boards by solvability, number of short solutions, branching factor, and the quality of beginner-friendly hints. Solver repos show how to solve boards; your bigger opportunity is to generate boards that are actually fun. (GitHub)
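A hedged sketch of what a board-quality score over those metrics might look like; the weights are illustrative, not tuned, and the measurements would come from the symbolic solver:

```python
# Sketch of a board-quality score over the metrics mentioned above:
# number of short solutions, average branching factor, and how many
# common words the board admits. Weights are illustrative placeholders.

from statistics import mean

def board_quality(short_solutions, branching_factors, common_word_count):
    # a board with no short solution is not playable
    if not short_solutions:
        return 0.0
    return (
        2.0 * min(len(short_solutions), 5)   # a handful of short outs
        + 1.0 * mean(branching_factors)      # room to maneuver mid-game
        + 0.5 * min(common_word_count, 40)   # enough friendly vocabulary
    )

# hypothetical measurements for two candidate boards
print(board_quality(["a->b", "c->d"], [6, 8, 5], 30))  # higher is better
print(board_quality([], [2, 1], 4))  # unsolvable board scores 0.0
```

Generating many random boards and keeping only the top scorers turns the solver into a board-design tool rather than just a helper.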

The real hard problem

The hardest problem in your project is not legality. It is taste.

A mathematically efficient move is not always a good move for a player. Your current scores already showed that. So I would explicitly separate:

  • solver score: shortest or strongest completion
  • assistant score: common, elegant, hintable, human-friendly
  • semantic score: theme fit or conceptual continuity

If you do not separate those, the assistant will keep sounding like a brute-force optimizer.
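One way to keep the separation honest is to carry all three scores through the pipeline and blend only at the edge, per mode. A small sketch with illustrative weights:

```python
# Sketch of keeping the three scores separate instead of collapsing them
# into one number inside the engine. Field names and weights are
# illustrative, not tuned.

from dataclasses import dataclass

@dataclass
class MoveScore:
    solver: float     # coverage / completion speed
    assistant: float  # commonness, hintability
    semantic: float   # theme fit

    def blended(self, w_solver=0.2, w_assist=0.6, w_sem=0.2) -> float:
        # the blend happens at the end, per mode, not inside the engine
        return w_solver * self.solver + w_assist * self.assistant + w_sem * self.semantic

s = MoveScore(solver=9.0, assistant=3.0, semantic=0.4)
print(round(s.blended(), 2))  # assistant-weighted blend
```

Switching between solver mode and assistant mode then means changing three weights, not rewriting the scoring code.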

What I would build next

I would do the next phase in this order:

  1. keep the current symbolic engine
  2. add stronger lexicon filters for player-facing suggestions
  3. create a clean assistant-mode score that penalizes overlong or obscure first moves
  4. only then turn semantic reranking back on
  5. after that, add a hint ladder: structural hint, semantic hint, constrained shortlist, explanation

That order keeps you focused on the actual player experience rather than on model novelty.
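The hint ladder in step 5 can be sketched as a single spoiler-level function; the wording and rung boundaries are illustrative:

```python
# Sketch of the four-rung hint ladder: structural hint, semantic hint,
# constrained shortlist, full explanation. Each rung reveals strictly
# more than the previous one, and the caller controls spoiler level
# with a single integer.

def hint_ladder(best_word: str, theme: str, shortlist: list[str], level: int) -> str:
    if level <= 0:
        # rung 1: structural hint only (shape, plus the starting letter)
        return f"Try a {len(best_word)}-letter word starting with '{best_word[0]}'."
    if level == 1:
        # rung 2: semantic hint (theme, still no spelling)
        return f"Think about: {theme}."
    if level == 2:
        # rung 3: constrained shortlist (spoils options, not the answer)
        return "One of these works: " + ", ".join(sorted(shortlist))
    # rung 4: full explanation
    return f"The suggested word is '{best_word}'."

for level in range(4):
    print(hint_ladder("stone", "rocks and minerals", ["stone", "slate"], level))
```

Keeping the ladder monotone (each rung a strict superset of information) is what makes hints feel fair instead of arbitrary.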

My bottom line

Your project is good because it uses models where models actually help: ranking, hinting, theming, and adaptation. It avoids the trap of asking the model to replace exact rules. The current Hugging Face ecosystem supports this kind of system well, and your latest runs already showed that the technical foundation works. The next leap is not a bigger model. It is a better vocabulary policy and a better assistant score. (Hugging Face)

The shortest version of my advice is:

Keep rules symbolic. Curate the lexicon aggressively. Use embeddings to improve candidate quality, not legality. Treat the assistant as a game designer, not a raw solver.


Below is a single-file demo.

It is designed around three current facts:

  • Hugging Face Datasets can load plain CSV files with the generic csv loader, so you do not need a custom dataset script for a dataset like this. (Hugging Face)
  • Maximax67/English-Valid-Words explicitly says it contains valid English words plus frequency, stem, and stem valid probability. (Hugging Face)
  • sentence-transformers/all-MiniLM-L6-v2 is intended for sentence and short paragraph encoding for retrieval, clustering, and similarity, and inputs longer than 256 word pieces are truncated. That makes it fine for lightweight semantic reranking of short word candidates. (Hugging Face)
# deps:
#   pip install datasets sentence-transformers transformers torch numpy
#
# demo goals:
# - one file
# - no argparse
# - public Hugging Face dataset
# - no dataset builder script required
# - CPU-safe by default
# - GPU-safer if CUDA is available
# - cleaner vocabulary than the earlier demos
# - separate "assistant" scoring from "solver" scoring
#
# URLs used:
# Dataset page:
#   https://huggingface.co/datasets/Maximax67/English-Valid-Words
# Raw CSV:
#   https://huggingface.co/datasets/Maximax67/English-Valid-Words/resolve/main/valid_words_sorted_by_frequency.csv
# Model page:
#   https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
# HF datasets docs:
#   https://huggingface.co/docs/datasets/main/en/dataset_script
# Sentence Transformers semantic similarity docs:
#   https://sbert.net/docs/sentence_transformer/usage/semantic_textual_similarity.html
#
# Why these choices:
# - HF docs say generic loaders are provided for CSV / JSON / text style data,
#   so a plain CSV dataset can be loaded without a custom dataset script.
# - The chosen dataset exposes word/frequency/stem fields, which is useful
#   for filtering common playable words.
# - The chosen embedding model is small and practical for a 10GB / 16GB setup,
#   and is meant for similarity / retrieval style tasks rather than generation.
#
# Notes:
# - This demo keeps legality fully symbolic.
# - The HF model is optional and only used to rerank a short legal shortlist.
# - On CPU, float32 is preferred for safety.
# - On CUDA, float16 is attempted to save VRAM.
# - This is "assistant-first", not "brute-force shortest-solver-first".

from __future__ import annotations

import math
from collections import defaultdict

import numpy as np
import torch
from datasets import load_dataset


# ============================================================
# 1) EDITABLE SETTINGS
# ============================================================

# Friendlier default board than abc/def/ghi/jkl
BOARD_SIDES = ["aeo", "rtn", "sli", "cdp"]

# Use a standard Letter Boxed-like rule:
# consecutive letters cannot come from the same side
FORBID_SAME_SIDE_ADJACENT = True

# This is NOT standard Letter Boxed, but some variants want it.
# Keep False for a friendlier game.
FORBID_REPEATED_LETTERS_IN_WORD = False

# Standard chaining rule: next word starts with previous word's last letter
REQUIRE_LAST_TO_FIRST_CHAIN = True

MIN_WORD_LEN = 3
MAX_WORD_LEN = 8

# Stronger lexical filters for cleaner gameplay
MIN_FREQUENCY = 1_000_000
MIN_STEM_VALID_PROB = 0.60

# Scan only part of the dataset for a safer demo
MAX_WORDS_TO_SCAN = 150_000

# Search / output
MAX_CHAIN_LEN = 4
MAX_RESULTS = 10
TOP_K = 12

# Assistant vs solver mode:
# - "assistant" prefers more common, cleaner, easier-to-hint words
# - "solver" prefers bigger coverage and fast completion
MODE = "assistant"   # or "solver"

# Optional semantic rerank
USE_SEMANTIC_RERANK = False
SEMANTIC_THEME = "movement, travel, transport"
SEMANTIC_SHORTLIST = 40
EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

# Partial chain, if any
CURRENT_CHAIN = []

# Example words to validate
EXAMPLE_WORDS = ["stone", "crane", "plane", "trade", "cider", "loop"]


# ============================================================
# 2) PUBLIC HF DATA SOURCE
# ============================================================

WORD_CSV_URL = (
    "https://huggingface.co/datasets/Maximax67/English-Valid-Words/"
    "resolve/main/valid_words_sorted_by_frequency.csv"
)


# ============================================================
# 3) HELPERS
# ============================================================

def normalize(text: str) -> str:
    return "".join(ch.lower() for ch in text if ch.isalpha())


def find_column(columns, exact_names, contains_tokens, required=True):
    lower_map = {c.lower(): c for c in columns}

    for name in exact_names:
        if name.lower() in lower_map:
            return lower_map[name.lower()]

    for c in columns:
        cl = c.lower()
        if any(tok in cl for tok in contains_tokens):
            return c

    if required:
        raise ValueError(f"Could not find expected column in {columns}")
    return None


def safe_float(x, default=None):
    try:
        if x in (None, ""):
            return default
        return float(x)
    except Exception:
        return default


def safe_int(x, default=0):
    try:
        if x in (None, ""):
            return default
        return int(float(x))
    except Exception:
        return default


# ============================================================
# 4) BOARD / RULE ENGINE
# ============================================================

BOARD_SIDES = [normalize(s) for s in BOARD_SIDES if normalize(s)]
BOARD_LETTERS = set("".join(BOARD_SIDES))
SIDE_OF = {ch: i for i, side in enumerate(BOARD_SIDES) for ch in side}
CURRENT_CHAIN = [normalize(w) for w in CURRENT_CHAIN if normalize(w)]

if len(BOARD_LETTERS) != sum(len(s) for s in BOARD_SIDES):
    raise ValueError("Board letters must be unique across sides for this demo.")

VOWELS = set("aeiou")


def invalid_reason(word: str) -> str | None:
    w = normalize(word)

    if len(w) < MIN_WORD_LEN:
        return f"too short (minimum is {MIN_WORD_LEN})"

    if len(w) > MAX_WORD_LEN:
        return f"too long (maximum is {MAX_WORD_LEN})"

    bad = sorted({ch for ch in w if ch not in BOARD_LETTERS})
    if bad:
        return f"contains letters not on the board: {', '.join(bad)}"

    if FORBID_REPEATED_LETTERS_IN_WORD and len(set(w)) != len(w):
        return "repeats a letter, which this variant forbids"

    if FORBID_SAME_SIDE_ADJACENT:
        for a, b in zip(w, w[1:]):
            if SIDE_OF[a] == SIDE_OF[b]:
                return f"uses the same side twice in a row at '{a}{b}'"

    return None


def is_legal_word(word: str) -> bool:
    return invalid_reason(word) is None


def chain_ok(prev_word: str, next_word: str) -> bool:
    if not REQUIRE_LAST_TO_FIRST_CHAIN:
        return True
    return normalize(prev_word)[-1] == normalize(next_word)[0]


def explain_word(word: str, prev_word: str | None = None) -> str:
    reason = invalid_reason(word)
    if reason is not None:
        return f"INVALID: {reason}"

    if prev_word is not None and not chain_ok(prev_word, word):
        return (
            f"INVALID CHAIN: previous word ends with '{normalize(prev_word)[-1]}', "
            f"but '{normalize(word)}' starts with '{normalize(word)[0]}'."
        )

    return "VALID"


# ============================================================
# 5) HUMAN-FRIENDLY FILTERS
# ============================================================

def looks_human_friendly(word: str, stem_prob: float | None) -> bool:
    # Require at least one classic vowel
    if not any(ch in VOWELS for ch in word):
        return False

    # Reject very abbreviation-like short forms
    if len(word) <= 3 and sum(ch not in VOWELS for ch in word) >= 3:
        return False

    # If stem-validity exists, require a reasonably confident value
    if stem_prob is not None and stem_prob < MIN_STEM_VALID_PROB:
        return False

    return True


# ============================================================
# 6) LOAD HF CSV WITH GENERIC CSV LOADER
# ============================================================

print("Loading word list from Hugging Face CSV...")
ds = load_dataset("csv", data_files=WORD_CSV_URL, split="train")
print("Columns:", ds.column_names)

word_col = find_column(
    ds.column_names,
    exact_names=["Word", "word"],
    contains_tokens=["word"],
)

freq_col = find_column(
    ds.column_names,
    exact_names=["Frequency count", "frequency count", "frequency", "freq", "count"],
    contains_tokens=["frequency", "freq", "count"],
    required=False,
)

stem_col = find_column(
    ds.column_names,
    exact_names=["Stem", "stem"],
    contains_tokens=["stem"],
    required=False,
)

stem_prob_col = find_column(
    ds.column_names,
    exact_names=["Stem valid probability", "stem valid probability"],
    contains_tokens=["stem valid probability", "probability"],
    required=False,
)


# ============================================================
# 7) BUILD FILTERED LEGAL LEXICON
# ============================================================

lexicon = []
seen = set()

for i, row in enumerate(ds):
    if i >= MAX_WORDS_TO_SCAN:
        break

    word = normalize(str(row[word_col]))
    if not word or word in seen:
        continue
    seen.add(word)

    freq = safe_int(row.get(freq_col) if freq_col else None, default=0)
    stem = normalize(str(row.get(stem_col))) if stem_col and row.get(stem_col) is not None else ""
    stem_prob = safe_float(row.get(stem_prob_col) if stem_prob_col else None, default=None)

    # Stronger filtering than earlier demos
    if freq < MIN_FREQUENCY:
        continue
    if len(word) > MAX_WORD_LEN:
        continue
    if not looks_human_friendly(word, stem_prob):
        continue
    if not is_legal_word(word):
        continue

    lexicon.append(
        {
            "word": word,
            "freq": freq,
            "stem": stem,
            "stem_prob": stem_prob,
            "letters": set(word),
        }
    )

print(f"Loaded {len(lexicon):,} filtered legal words.")


# ============================================================
# 8) FAST INDICES
# ============================================================

by_start = defaultdict(list)
for item in lexicon:
    by_start[item["word"][0]].append(item)

for ch in by_start:
    by_start[ch].sort(key=lambda x: (-len(x["letters"]), -x["freq"], x["word"]))


# ============================================================
# 9) SCORING
# ============================================================

def assistant_score(item, uncovered_letters, is_first_move: bool) -> float:
    """
    Assistant mode:
    - prefer common words
    - prefer decent continuation count
    - still value new coverage
    - penalize overlong first words
    """
    new_cover = len(item["letters"] & uncovered_letters)
    continuation_count = len(by_start.get(item["word"][-1], []))
    length_penalty = 0.0

    if is_first_move and len(item["word"]) > 7:
        length_penalty = 3.0 + 0.8 * (len(item["word"]) - 7)

    return (
        3.5 * new_cover
        + 0.45 * continuation_count
        + 0.45 * math.log1p(item["freq"])
        - length_penalty
    )


def solver_score(item, uncovered_letters) -> float:
    """
    Solver mode:
    - aggressively reward big coverage
    - still prefer continuation count and commonness
    """
    new_cover = len(item["letters"] & uncovered_letters)
    continuation_count = len(by_start.get(item["word"][-1], []))
    return (
        6.0 * new_cover
        + 0.25 * continuation_count
        + 0.20 * math.log1p(item["freq"])
        + 0.10 * len(item["word"])
    )


def score_item(item, uncovered_letters, is_first_move: bool) -> float:
    if MODE == "assistant":
        return assistant_score(item, uncovered_letters, is_first_move)
    return solver_score(item, uncovered_letters)


def candidate_pool(chain_words):
    if not chain_words:
        return lexicon

    last_word = chain_words[-1]
    used_words = set(chain_words)

    pool = []
    for item in by_start.get(last_word[-1], []):
        if item["word"] not in used_words and chain_ok(last_word, item["word"]):
            pool.append(item)
    return pool


def rank_candidates(chain_words):
    used_letters = set("".join(chain_words))
    uncovered = BOARD_LETTERS - used_letters
    is_first_move = len(chain_words) == 0

    ranked = []
    for item in candidate_pool(chain_words):
        sc = score_item(item, uncovered, is_first_move)
        ranked.append(
            {
                **item,
                "symbolic_score": sc,
                "score": sc,
            }
        )

    ranked.sort(key=lambda x: x["score"], reverse=True)
    return ranked


# ============================================================
# 10) OPTIONAL SEMANTIC RERANK
# ============================================================

def maybe_semantic_rerank(candidates, theme_text):
    if not USE_SEMANTIC_RERANK or not theme_text or not candidates:
        return candidates

    try:
        from sentence_transformers import SentenceTransformer
    except ImportError:
        print("sentence-transformers not installed; skipping semantic rerank.")
        return candidates

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Loading semantic model on {device}: {EMBED_MODEL}")

    model = SentenceTransformer(EMBED_MODEL, device=device)

    # CPU-safe preference
    if device == "cpu":
        try:
            model = model.float()
        except Exception:
            pass
    else:
        # CUDA-safer preference
        try:
            model = model.half()
        except Exception:
            pass

    short = candidates[:SEMANTIC_SHORTLIST]
    texts = [theme_text] + [c["word"] for c in short]

    embs = model.encode(
        texts,
        batch_size=32,
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=False,
    )

    query = embs[0]
    docs = embs[1:]
    sims = docs @ query  # normalized => dot product == cosine similarity

    reranked = []
    for cand, sim in zip(short, sims):
        item = dict(cand)
        item["semantic_score"] = float(sim)
        item["score"] = item["symbolic_score"] + 1.20 * float(sim)
        reranked.append(item)

    reranked.sort(key=lambda x: x["score"], reverse=True)
    return reranked + candidates[SEMANTIC_SHORTLIST:]


# ============================================================
# 11) DFS SOLVER
# ============================================================

def solve_cover_all(max_results=MAX_RESULTS):
    target = BOARD_LETTERS
    results = []

    # Start from a moderately strong seed set
    seed_words = sorted(
        lexicon,
        key=lambda x: (-len(x["letters"]), -x["freq"], x["word"])
    )[:500]

    def dfs(chain_items, covered_letters):
        if len(results) >= max_results:
            return

        if covered_letters == target:
            results.append([x["word"] for x in chain_items])
            return

        if len(chain_items) >= MAX_CHAIN_LEN:
            return

        if not chain_items:
            candidates = seed_words
        else:
            used_words = {x["word"] for x in chain_items}
            next_start = chain_items[-1]["word"][-1]
            candidates = [
                x for x in by_start.get(next_start, [])
                if x["word"] not in used_words
            ]

        uncovered = target - covered_letters
        is_first_move = len(chain_items) == 0

        scored = sorted(
            candidates,
            key=lambda x: score_item(x, uncovered, is_first_move),
            reverse=True,
        )[:120]

        for nxt in scored:
            dfs(chain_items + [nxt], covered_letters | nxt["letters"])

    dfs([], set())

    seen = set()
    dedup = []
    for chain in results:
        key = tuple(chain)
        if key not in seen:
            seen.add(key)
            dedup.append(chain)
    return dedup


# ============================================================
# 12) HINTS
# ============================================================

def make_hint(chain_words, ranked):
    if not ranked:
        return "No legal next move found."

    used_letters = set("".join(chain_words))
    uncovered = sorted(BOARD_LETTERS - used_letters)
    best = ranked[0]

    if not chain_words:
        return (
            f"Structural hint: start with a common word of length {len(best['word'])} "
            f"that covers letters like {', '.join(sorted(best['letters'] & set(uncovered)))}."
        )

    return (
        f"Structural hint: the next word should start with '{best['word'][0]}' "
        f"and helps cover {', '.join(sorted(best['letters'] & set(uncovered)))}."
    )


# ============================================================
# 13) RUN DEMO
# ============================================================

print("\nBoard sides:", BOARD_SIDES)
print("Board letters:", "".join(sorted(BOARD_LETTERS)))
print("Current chain:", CURRENT_CHAIN if CURRENT_CHAIN else "(empty)")
print("Mode:", MODE)
print("Semantic rerank:", "ON" if USE_SEMANTIC_RERANK else "OFF")

print("\nValidation examples:")
for word in EXAMPLE_WORDS:
    prev = CURRENT_CHAIN[-1] if CURRENT_CHAIN else None
    print(f"  {word:>8} -> {explain_word(word, prev)}")

ranked = rank_candidates(CURRENT_CHAIN)
ranked = maybe_semantic_rerank(ranked, SEMANTIC_THEME)

print("\nTop next moves:")
if not ranked:
    print("  No legal next moves found.")
else:
    for item in ranked[:TOP_K]:
        stem_prob_text = f"{item['stem_prob']:.3f}" if item["stem_prob"] is not None else "None"
        extra = f", semantic={item['semantic_score']:.3f}" if "semantic_score" in item else ""
        print(
            f"  {item['word']:<12} "
            f"score={item['score']:.2f}, "
            f"freq={item['freq']}, "
            f"stem_prob={stem_prob_text}{extra}"
        )

print("\nShort cover-all chains:")
solutions = solve_cover_all()
if not solutions:
    print("  No chain found with current depth / search budget.")
else:
    for i, chain in enumerate(solutions, 1):
        covered = "".join(sorted(set("".join(chain))))
        print(f"  {i}. {' -> '.join(chain)}   [covers: {covered}]")

print("\nHint:")
print(" ", make_hint(CURRENT_CHAIN, ranked))