Transformers.js: Retrieving the size of models in MB/GB before running

Hello.

How can I determine how much free space is required before running the pipeline?


Oh… it looks like there isn’t a dedicated API for that yet. :sweat_smile:


No official Transformers.js API currently tells you, before pipeline() starts, “this run needs X MB/GB.” The documented hook is progress_callback, which only reports updates during model construction. There is also an open Transformers.js feature request asking for file-size information before the download starts, because the current callbacks do not provide enough information to render a single total progress bar across all files. (Hugging Face)

What the problem really is

Before a pipeline runs, you need to answer two separate questions:

  1. How many bytes will this pipeline download and cache?
  2. How much storage headroom does this browser origin have right now?

Those are different numbers. The first comes from the model repo and your load options. The second comes from the browser’s Storage API. MDN is explicit that navigator.storage.estimate() returns approximate usage and quota for the current origin, not exact raw disk free space. Browser storage is managed per origin, and eviction rules vary by browser. (MDN Web Docs)
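A minimal sketch of the second check, assuming only the standard StorageEstimate fields (quota and usage), both of which may be missing or approximate:

```javascript
// Compute free headroom from a StorageEstimate-like object.
// Per MDN, quota and usage are approximations for the current origin,
// so treat the result as a hint, not a guarantee.
function headroomBytes(estimate) {
  const quota = estimate.quota ?? 0;
  const usage = estimate.usage ?? 0;
  return Math.max(0, quota - usage);
}

// In a browser:
// const estimate = await navigator.storage.estimate();
// console.log(headroomBytes(estimate));
```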

Why one fixed model-size number is usually wrong

Transformers.js does not always fetch one file. The file set depends on options such as:

  • revision, which can be a branch, tag, or commit id
  • subfolder, which defaults to onnx
  • device
  • dtype
  • use_external_data_format, which the docs say is used for models >= 2GB (Hugging Face)

The default dtype can also change by backend. Transformers.js documents typical choices such as fp32, fp16, q8, and q4, and notes that fp32 is the default for WebGPU while q8 is the default for WASM. That means the same repo can require different storage depending on how you load it. (Hugging Face)
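As a sketch, the documented backend defaults can be captured in a tiny helper. This assumes only the two backends named above; a real app should pass dtype explicitly rather than rely on defaults:

```javascript
// Documented Transformers.js defaults: fp32 on WebGPU, q8 on WASM.
// Illustrative only — pin dtype explicitly in production so your
// storage estimate matches what actually gets downloaded.
function defaultDtypeFor(device) {
  return device === "webgpu" ? "fp32" : "q8";
}
```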

What you should do instead

Use a preflight step before pipeline():

  1. Pin the exact load configuration you will use:

    • repo id
    • revision
    • subfolder
    • device
    • dtype
  2. Fetch the model repo metadata from the Hub with file metadata enabled.

  3. Sum the sizes of the files your configuration will actually need.

  4. Compare that total against quota - usage from navigator.storage.estimate().

  5. Add a safety margin because the browser values are estimates. (Hugging Face)

That is the correct architecture today.

Where the file sizes come from

The Hub API already exposes the size metadata you need. The official Hub docs say:

  • model_info(..., files_metadata=True) can retrieve metadata for files in the repository, including size and LFS metadata
  • RepoSibling.size is the file size in bytes when file metadata is requested
  • RepoFile.size is the file size in bytes (Hugging Face)

So the missing piece is not “file sizes do not exist.” The missing piece is that Transformers.js does not yet wrap that into a built-in “preflight total bytes” API. (GitHub)
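For illustration, here is roughly what those size fields look like on the JSON side and how they sum. The entries below are illustrative shapes, not a full API response; note that LFS-tracked files carry their size under lfs.size:

```javascript
// Simplified sibling entries as returned with size metadata enabled.
// Regular files expose `size` directly; LFS files expose `lfs.size`.
const siblings = [
  { rfilename: "config.json", size: 570 },
  { rfilename: "onnx/model_quantized.onnx", lfs: { size: 67_100_000 } },
];

const totalBytes = siblings.reduce(
  (sum, file) => sum + (file.size ?? file.lfs?.size ?? 0),
  0
);
```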

What counts toward required space

For a browser-first Transformers.js app, the practical first-run footprint can include:

  • model config and tokenizer or processor files
  • ONNX model files in the configured subfolder
  • external .onnx_data files for very large models when external data format is used
  • cached ONNX Runtime WASM binaries, because Transformers.js documents useBrowserCache as true by default if available, and useWasmCache as true by default when cache is available (Hugging Face)

So the question is not just “how big is model.onnx?” It is “how big is the full set of artifacts that this load path will cache?” (Hugging Face)

The formula

A practical estimate is:

required_bytes ≈ sum(selected_repo_files) + safety_buffer
available_bytes ≈ quota - usage
ok_to_start = available_bytes >= required_bytes

Use a buffer such as 10% to 25% because browser storage numbers are approximate and because you may also cache runtime assets such as WASM binaries. (MDN Web Docs)
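The formula translates directly into code. The default safetyFactor here is an assumption within the 10% to 25% range suggested above, not a documented constant:

```javascript
// Preflight gate: apply the safety buffer to the summed file sizes,
// then compare against the (approximate) origin quota headroom.
function preflightGate({ modelBytes, quota, usage, safetyFactor = 1.2 }) {
  const requiredBytes = Math.ceil(modelBytes * safetyFactor);
  const availableBytes = Math.max(0, quota - usage);
  return {
    requiredBytes,
    availableBytes,
    okToStart: availableBytes >= requiredBytes,
  };
}
```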

Minimal browser-side implementation

This version uses the Hub metadata endpoint directly. It is simple and works well in a browser app.

function formatBytes(bytes) {
  const units = ["B", "KB", "MB", "GB", "TB"];
  let n = bytes;
  let i = 0;
  while (n >= 1024 && i < units.length - 1) {
    n /= 1024;
    i++;
  }
  return `${n.toFixed(n >= 10 || i === 0 ? 0 : 1)} ${units[i]}`;
}

async function getModelInfoWithSizes(repoId, revision = "main") {
  const url =
    revision === "main"
      ? `https://huggingface.co/api/models/${repoId}?blobs=true`
      : `https://huggingface.co/api/models/${repoId}/revision/${encodeURIComponent(revision)}?blobs=true`;

  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`Failed to fetch model metadata: ${res.status} ${res.statusText}`);
  }
  return res.json();
}

function getPath(file) {
  return file.rfilename ?? file.path ?? "";
}

function getSize(file) {
  return file.size ?? file.lfs?.size ?? 0;
}

function pickLikelyTransformersJsFiles(siblings, { subfolder = "onnx" } = {}) {
  const sidecars = new Set([
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "added_tokens.json",
    "vocab.json",
    "vocab.txt",
    "merges.txt",
    "spiece.model",
    "preprocessor_config.json",
    "processor_config.json",
    "feature_extractor.json",
    "generation_config.json",
  ]);

  return siblings.filter((file) => {
    const path = getPath(file);
    if (!path) return false;
    if (sidecars.has(path)) return true;
    if (path.startsWith(`${subfolder}/`)) return true;
    return false;
  });
}

async function estimateOriginStorage() {
  if (!navigator.storage?.estimate) {
    return { supported: false, quota: null, usage: null, free: null };
  }
  const { quota = 0, usage = 0 } = await navigator.storage.estimate();
  return {
    supported: true,
    quota,
    usage,
    free: Math.max(0, quota - usage),
  };
}

async function estimatePipelineSpace(repoId, {
  revision = "main",
  subfolder = "onnx",
  safetyFactor = 1.2,
} = {}) {
  const info = await getModelInfoWithSizes(repoId, revision);
  const siblings = info.siblings ?? [];
  const files = pickLikelyTransformersJsFiles(siblings, { subfolder });

  const modelBytes = files.reduce((sum, file) => sum + getSize(file), 0);
  const requiredBytes = Math.ceil(modelBytes * safetyFactor);

  const storage = await estimateOriginStorage();

  return {
    repoId,
    revision,
    files: files.map((f) => ({ path: getPath(f), size: getSize(f) })),
    modelBytes,
    modelHuman: formatBytes(modelBytes),
    requiredBytes,
    requiredHuman: formatBytes(requiredBytes),
    storage,
    enoughSpace:
      storage.supported && storage.free != null
        ? storage.free >= requiredBytes
        : null,
  };
}

Use it like this:

const report = await estimatePipelineSpace(
  "Xenova/distilbert-base-uncased-finetuned-sst-2-english",
  {
    revision: "main",
    subfolder: "onnx",
  }
);

console.log("Estimated model footprint:", report.modelHuman);
console.log("Recommended free space:", report.requiredHuman);
console.log("Enough space?", report.enoughSpace);
console.table(report.files);

How exact this can be

There are three levels of accuracy.

1. Rough but safe

Count all sidecars plus everything under onnx/.
This often overestimates, but it is simple and usually safe. The onnx default comes from the Transformers.js docs. (Hugging Face)

2. Better

Pin revision, dtype, device, and subfolder, then only count the files that match that exact configuration. This is better because those options directly affect what gets loaded. (Hugging Face)

3. Best

Maintain your own manifest of exact filenames and byte totals for each supported model configuration. That is the cleanest production design. The open feature request is effectively asking Transformers.js to expose something like this natively. (GitHub)
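A hypothetical shape for such a manifest. The byte totals below are made-up placeholders, not measured values; in practice you would record them once from the Hub metadata and check them into the app:

```javascript
// Hand-maintained manifest: exact byte totals per supported configuration,
// keyed by "device/dtype". Numbers here are placeholders for illustration.
const MODEL_MANIFEST = {
  "Xenova/distilbert-base-uncased-finetuned-sst-2-english": {
    "wasm/q8": 68_000_000,
    "webgpu/fp32": 268_000_000,
  },
};

function manifestBytes(repoId, device, dtype) {
  return MODEL_MANIFEST[repoId]?.[`${device}/${dtype}`] ?? null;
}
```

Because the totals are fixed at build time, the preflight check becomes a simple lookup with no network round trip.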

Browser versus Node

In a browser, the storage question is always about origin quota. In Node, you can reason about the filesystem and cache directory more directly. The @huggingface/hub package does provide scanCacheDir, downloadFileToCacheDir, and snapshotDownload, but its README explicitly says those cache helpers do not work in the browser. (GitHub)

Practical conclusion

The current answer is:

  • No, Transformers.js does not currently give you one built-in API that returns required free space before pipeline() starts. (GitHub)

  • Yes, you can determine it reliably enough by combining:

    • Hub file metadata for required bytes
    • your chosen Transformers.js load options
    • navigator.storage.estimate() for available origin quota headroom (Hugging Face)

The most correct mental model is:

required space = exact repo files this load path will cache
available space = estimated origin quota headroom (MDN Web Docs)

A robust implementation pins revision, sets dtype explicitly, includes .onnx_data when applicable, adds a safety margin, and treats the result as a preflight gate before pipeline(). (Hugging Face)