The LayoutLM Installation Fails

I want to install LayoutLM in Google Colaboratory.

First, I cloned LayoutLM from this GitHub repository.

After that, I tried to install LayoutLM through its setup.py file by running this code block:

%%bash
cd /content/drive/MyDrive/LayoutLMMM/SROIE2019-20210928T080219Z-001/SROIE2019
# git clone https://github.com/microsoft/unilm.git
cd unilm/layoutlm/deprecated
pip install .

However, when I ran the script, an error occurred:

Successfully built layoutlm sacremoses
Failed to build tokenizers
  error: subprocess-exited-with-error
  
  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

When I looked for the library causing the error, I found that it comes from the tokenizers library, a dependency of HuggingFace transformers.
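
As a rough illustration of how the failing package can be located, the sketch below scans captured pip output for the "Failed building wheel for ..." lines shown in the log above. The helper name `failed_wheels` is hypothetical and not part of pip or LayoutLM; it only pattern-matches the message text, so it assumes pip keeps that wording.

```python
import re


def failed_wheels(pip_output):
    """Return the package names pip reported as failing to build.

    Hypothetical helper: it only matches the 'Failed building wheel for X'
    message, keeps first-seen order, and drops duplicates.
    """
    pattern = re.compile(r"Failed building wheel for (\S+)")
    names = []
    for name in pattern.findall(pip_output):
        if name not in names:
            names.append(name)
    return names


# Excerpt of the log from the question
log = """Successfully built layoutlm sacremoses
Failed to build tokenizers
  ERROR: Failed building wheel for tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
"""

print(failed_wheels(log))  # ['tokenizers']
```

Here the only package reported is tokenizers, which matches the diagnosis above.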

I tried modifying the LayoutLM setup file to omit the transformers library, and I modified the setup script of transformers to install the latest version of tokenizers to see whether that fixes the error. This method works; however, the results are not accurate.

How can I install LayoutLM without the "Building wheel for tokenizers did not run successfully" error?


Did you find a solution for that?

The problem here is that this version of the tokenizers library is not supported on the latest version of Python. Therefore, in Google Colaboratory, I changed the Python version with the following commands:

# Refresh the package index, then download and install Python 3.7
!sudo apt-get update -y
!sudo apt-get install -y python3.7
# Register Python 3.7 as an alternative for python3 in Colab
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
# Select Python 3.7; replace the '2' with the number listed for Python 3.7
!echo "2" | sudo update-alternatives --config python3
# Install pip
!sudo apt install -y python3-pip

If you get an error related to distutils, try this:

# Install distutils
!sudo apt install python3.7-distutils
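
Before installing, it may help to confirm that `python3` now resolves to 3.7. The sketch below is only an illustration: `parse_python_version` and `system_python_version` are hypothetical helper names, and the 3.7 target is taken from this answer rather than from any official compatibility table. It checks the `python3` executable on the PATH; the notebook kernel itself may keep running its original interpreter.

```python
import re
import subprocess


def parse_python_version(text):
    """Extract (major, minor) from output such as 'Python 3.7.17'."""
    match = re.search(r"Python (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no Python version found in {!r}".format(text))
    return int(match.group(1)), int(match.group(2))


def system_python_version(executable="python3"):
    """Run '<executable> --version' and return its (major, minor) version."""
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Some older interpreters print the version to stderr, so read both streams.
    return parse_python_version(result.stdout + result.stderr)


# Assumption from this answer: the install is expected to work on Python 3.7.
EXPECTED = (3, 7)

# Parsing a sample string; in Colab, compare system_python_version() instead.
print(parse_python_version("Python 3.7.17") == EXPECTED)  # True
```

If `system_python_version()` returns something other than `(3, 7)`, the `update-alternatives` selection above probably did not pick the Python 3.7 entry.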

Then you can try installing LayoutLM again with the bash cell:

%%bash

...

pip install .

NOT RECOMMENDED

Another workaround I tried was to omit the tokenizers library from the setup file of the LayoutLM library and install the latest version of tokenizers separately with the command:
!pip install tokenizers