Trying to run a model on my local machine fails with: ERROR: Failed building wheel for tokenizers

I would like to run models locally to understand how to eventually integrate them with other apps or expose them behind a Swagger/OpenAPI interface. While researching this I found this post on the forum…

I could not get through it all: I started following the steps in the [installation guide] → huggingface.co/docs/transformers/installation, and then hit this failure:

ERROR: Failed building wheel for tokenizers

Full output:

pip install 'transformers[tf-cpu]'

Requirement already satisfied: transformers[tf-cpu] in ./.env/lib/python3.12/site-packages (4.48.2)
Requirement already satisfied: filelock in ./.env/lib/python3.12/site-packages (from transformers[tf-cpu]) (3.17.0)
Requirement already satisfied: huggingface-hub<1.0,>=0.24.0 in ./.env/lib/python3.12/site-packages (from transformers[tf-cpu]) (0.28.1)


[... middle of the output removed as too long ...]
         


         Compiling tokenizers v0.13.3 (/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/tokenizers-lib)
           Running `/home/bizmate/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc --crate-name tokenizers --edition=2018 tokenizers-lib/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg 'feature="cached-path"' --cfg 'feature="clap"' --cfg 'feature="cli"' --cfg 'feature="default"' --cfg 'feature="dirs"' --cfg 'feature="esaxx_fast"' --cfg 'feature="http"' --cfg 'feature="indicatif"' --cfg 'feature="onig"' --cfg 'feature="progressbar"' --cfg 'feature="reqwest"' --check-cfg 'cfg(docsrs)' --check-cfg 'cfg(feature, values("cached-path", "clap", "cli", "default", "dirs", "esaxx_fast", "fancy-regex", "http", "indicatif", "onig", "progressbar", "reqwest", "unstable_wasm"))' -C metadata=498d720f2099b118 -C extra-filename=-498d720f2099b118 --out-dir /tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps -C strip=debuginfo -L dependency=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps --extern aho_corasick=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libaho_corasick-92bce46bd4a532bd.rmeta --extern cached_path=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libcached_path-850e6ec3e4af1aa4.rmeta --extern clap=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libclap-b4a6caa2b49e8ea9.rmeta --extern derive_builder=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libderive_builder-451cb1c97b4326ce.rmeta --extern dirs=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libdirs-32b195c1f13b56d7.rmeta --extern esaxx_rs=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libesaxx_rs-595f26381d605689.rmeta --extern 
getrandom=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libgetrandom-7790b83d430b3b5c.rmeta --extern indicatif=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libindicatif-10b2c1e790b78c35.rmeta --extern itertools=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libitertools-d0e9333e0e61b9c1.rmeta --extern lazy_static=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/liblazy_static-6441f4b82cf80ee3.rmeta --extern log=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/liblog-d72600b7b332de15.rmeta --extern macro_rules_attribute=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libmacro_rules_attribute-93bd015f9020b99c.rmeta --extern monostate=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libmonostate-ff4c5186c219cd62.rmeta --extern onig=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libonig-a19b9221b1d61e11.rmeta --extern paste=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libpaste-0c1f3f3a0421d29d.so --extern rand=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librand-119b24f19765b5fc.rmeta --extern rayon=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librayon-a0e7f8b1d3e0dc36.rmeta --extern rayon_cond=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librayon_cond-fd26ec6467072342.rmeta --extern regex=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libregex-d8de493f18ddc33c.rmeta --extern regex_syntax=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libregex_syntax-3378409f0198f480.rmeta 
--extern reqwest=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libreqwest-bcfb90a71a25b5a1.rmeta --extern serde=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libserde-15e8f6398ebee345.rmeta --extern serde_json=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libserde_json-7d710ee26110d64d.rmeta --extern spm_precompiled=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libspm_precompiled-db400f3aeeb5353c.rmeta --extern thiserror=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libthiserror-2cdced45e8327508.rmeta --extern unicode_normalization_alignments=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_normalization_alignments-2c96ba8ce0f02598.rmeta --extern unicode_segmentation=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_segmentation-fbe88d076ed11a12.rmeta --extern unicode_categories=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_categories-54e68dc068f662ac.rmeta -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/bzip2-sys-d9b1dc04afffd030/out/lib -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/zstd-sys-2513a00fd82e4a78/out -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/esaxx-rs-8e72c0c81c1a60b9/out -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/onig_sys-ff1305f6aea81deb/out`
      warning: variable does not need to be mutable
         --> tokenizers-lib/src/models/unigram/model.rs:265:21
          |
      265 |                 let mut target_node = &mut best_path_ends_at[key_pos];
          |                     ----^^^^^^^^^^^
          |                     |
          |                     help: remove this `mut`
          |
          = note: `#[warn(unused_mut)]` on by default
      
      warning: variable does not need to be mutable
         --> tokenizers-lib/src/models/unigram/model.rs:282:21
          |
      282 |                 let mut target_node = &mut best_path_ends_at[starts_at + mblen];
          |                     ----^^^^^^^^^^^
          |                     |
          |                     help: remove this `mut`
      
      warning: variable does not need to be mutable
         --> tokenizers-lib/src/pre_tokenizers/byte_level.rs:200:59
          |
      200 |     encoding.process_tokens_with_offsets_mut(|(i, (token, mut offsets))| {
          |                                                           ----^^^^^^^
          |                                                           |
          |                                                           help: remove this `mut`
      
      error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
          = note: `#[deny(invalid_reference_casting)]` on by default
      
      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to 1 previous error; 3 warnings emitted
      
      Caused by:
        process didn't exit successfully: `/home/bizmate/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc --crate-name tokenizers --edition=2018 tokenizers-lib/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg 'feature="cached-path"' --cfg 'feature="clap"' --cfg 'feature="cli"' --cfg 'feature="default"' --cfg 'feature="dirs"' --cfg 'feature="esaxx_fast"' --cfg 'feature="http"' --cfg 'feature="indicatif"' --cfg 'feature="onig"' --cfg 'feature="progressbar"' --cfg 'feature="reqwest"' --check-cfg 'cfg(docsrs)' --check-cfg 'cfg(feature, values("cached-path", "clap", "cli", "default", "dirs", "esaxx_fast", "fancy-regex", "http", "indicatif", "onig", "progressbar", "reqwest", "unstable_wasm"))' -C metadata=498d720f2099b118 -C extra-filename=-498d720f2099b118 --out-dir /tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps -C strip=debuginfo -L dependency=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps --extern aho_corasick=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libaho_corasick-92bce46bd4a532bd.rmeta --extern cached_path=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libcached_path-850e6ec3e4af1aa4.rmeta --extern clap=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libclap-b4a6caa2b49e8ea9.rmeta --extern derive_builder=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libderive_builder-451cb1c97b4326ce.rmeta --extern dirs=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libdirs-32b195c1f13b56d7.rmeta --extern esaxx_rs=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libesaxx_rs-595f26381d605689.rmeta 
--extern getrandom=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libgetrandom-7790b83d430b3b5c.rmeta --extern indicatif=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libindicatif-10b2c1e790b78c35.rmeta --extern itertools=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libitertools-d0e9333e0e61b9c1.rmeta --extern lazy_static=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/liblazy_static-6441f4b82cf80ee3.rmeta --extern log=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/liblog-d72600b7b332de15.rmeta --extern macro_rules_attribute=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libmacro_rules_attribute-93bd015f9020b99c.rmeta --extern monostate=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libmonostate-ff4c5186c219cd62.rmeta --extern onig=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libonig-a19b9221b1d61e11.rmeta --extern paste=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libpaste-0c1f3f3a0421d29d.so --extern rand=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librand-119b24f19765b5fc.rmeta --extern rayon=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librayon-a0e7f8b1d3e0dc36.rmeta --extern rayon_cond=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/librayon_cond-fd26ec6467072342.rmeta --extern regex=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libregex-d8de493f18ddc33c.rmeta --extern 
regex_syntax=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libregex_syntax-3378409f0198f480.rmeta --extern reqwest=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libreqwest-bcfb90a71a25b5a1.rmeta --extern serde=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libserde-15e8f6398ebee345.rmeta --extern serde_json=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libserde_json-7d710ee26110d64d.rmeta --extern spm_precompiled=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libspm_precompiled-db400f3aeeb5353c.rmeta --extern thiserror=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libthiserror-2cdced45e8327508.rmeta --extern unicode_normalization_alignments=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_normalization_alignments-2c96ba8ce0f02598.rmeta --extern unicode_segmentation=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_segmentation-fbe88d076ed11a12.rmeta --extern unicode_categories=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/deps/libunicode_categories-54e68dc068f662ac.rmeta -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/bzip2-sys-d9b1dc04afffd030/out/lib -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/zstd-sys-2513a00fd82e4a78/out -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/esaxx-rs-8e72c0c81c1a60b9/out -L native=/tmp/pip-install-sune5vl5/tokenizers_cab0b55d019740e783c13f612a522b5e/target/release/build/onig_sys-ff1305f6aea81deb/out` (exit status: 1)
      error: `cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml --release -v --features pyo3/extension-module --crate-type cdylib --` failed with code 101
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Failed to build installable wheels for some pyproject.toml based projects (tokenizers)
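
One thing worth checking first (a general suggestion, not verified on this machine): the failure only happens because pip is compiling an old tokenizers release from source. If a prebuilt binary wheel exists for your Python/OS, pip never needs to invoke rustc at all. A sketch:

```shell
# An up-to-date pip recognizes more wheel tags, so it finds
# prebuilt wheels it might otherwise miss.
pip install --upgrade pip

# Refuse source builds for tokenizers: pip will either use a binary
# wheel or fail fast, instead of launching the Rust build.
pip install --only-binary=tokenizers 'transformers[tf-cpu]'
```

If the second command reports that no matching wheel exists, that narrows the problem to the Python version / platform combination rather than the Rust toolchain.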

Note that I have installed Rust as suggested, and when the build complained that OpenSSL was missing I installed that too, but now I have no clue what to try next.


In addition to the causes you mentioned, this also seems to be an error that occurs on Python 3.13.

I had seen a similar suggestion but ignored it, because I am actually running Python 3.12 on the machine where I tried the above.
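
A quick sanity check worth running inside the `.env` virtualenv (a generic check, not specific to this bug): a venv can silently resolve to a different interpreter than expected, so print exactly what it runs.

```python
import sys

# Show the interpreter version and path the active environment
# actually uses, to rule out an accidental 3.13 venv.
major, minor = sys.version_info[:2]
print(f"Python {major}.{minor} at {sys.executable}")
if (major, minor) >= (3, 13):
    print("note: the tokenizers build failure was also reported on 3.13")
```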


If that’s the case, it might be quicker to pin tokenizers to a different version for now. If it’s a bug, it will probably be fixed in a new release eventually.
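
Something like the following (a hedged sketch; the exact version numbers are assumptions to verify against PyPI, and this has not been tested on the setup above):

```shell
# Option A: pin a tokenizers release that ships a binary wheel for
# your Python, so no Rust compilation happens at all (check PyPI
# for a version with a cp312 wheel for your platform).
pip install 'tokenizers>=0.15,<0.21'

# Option B: if something really must build tokenizers 0.13.x from
# source, try an older Rust toolchain; the `invalid_reference_casting`
# lint that aborts the build reportedly became deny-by-default in
# rustc 1.73, so a pre-1.73 toolchain may compile it.
rustup toolchain install 1.72.1
rustup default 1.72.1
pip install --no-cache-dir 'transformers[tf-cpu]'
```

Option A is the less invasive of the two, since it leaves the system Rust toolchain alone.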