Issue with Loading BLIP Processor and Model for Image Captioning

Hi Hugging Face Community,

I’m having trouble loading the BLIP processor and model for image captioning with the Salesforce/blip-image-captioning-base checkpoint. My script hangs while loading the processor and never reaches the model-loading step. Below are the details of my setup and the script I’m using.

Environment Details

  • Transformers Version: 4.42.3
  • Torch Version: 2.2.2
  • Pillow Version: 10.3.0
  • Python Version: 3.12 (Anaconda environment)
  • Operating System: macOS
  • CUDA Availability: False (Running on CPU)

Script

import logging
from transformers import BlipProcessor, BlipForConditionalGeneration, __version__ as transformers_version
from PIL import Image, __version__ as pillow_version
import torch

# Enable logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Log versions
logger.info(f"Transformers version: {transformers_version}")
logger.info(f"Torch version: {torch.__version__}")
logger.info(f"Pillow version: {pillow_version}")

# Check CUDA availability
logger.info(f"CUDA available: {torch.cuda.is_available()}")

# Load an image
image_path = "./data/images/LHorgan_JDFAirWing_279.jpg"
try:
    image = Image.open(image_path)
    logger.info("Image loaded successfully")
except Exception as e:
    logger.error(f"Failed to load image: {e}")

# Load the processor and model with detailed logging.
# Both start as None so the checks below do not rely on locals().
model_id = "Salesforce/blip-image-captioning-base"
processor = None
model = None
try:
    logger.info("Loading processor...")
    processor = BlipProcessor.from_pretrained(model_id)
    logger.info("Processor loaded successfully")

    logger.info("Loading model...")
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    logger.info("Model loaded successfully")
except Exception:
    # logger.exception logs the full traceback, not just the message
    logger.exception("Failed to load processor or model")

# Check if the model and processor are correctly loaded
if processor is not None:
    logger.info("Processor is ready for use.")
else:
    logger.error("Processor was not loaded properly.")

if model is not None:
    logger.info("Model is ready for use.")
else:
    logger.error("Model was not loaded properly.")

Issue

The script logs the following and then gets stuck while loading the processor:

INFO:__main__:Transformers version: 4.42.3
INFO:__main__:Torch version: 2.2.2
INFO:__main__:Pillow version: 10.3.0
INFO:__main__:CUDA available: False
INFO:__main__:Image loaded successfully
INFO:__main__:Loading processor...

The script hangs indefinitely at “Loading processor…” and never logs “Processor loaded successfully”, so it never reaches the model-loading step. My internet connection is stable and I have checked my Python environment setup, but the problem persists.
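
To narrow down where the call blocks, a watchdog along the following lines might help. This is only a minimal sketch: it uses the standard library's faulthandler module to print the stack of every thread if a call has not returned after a timeout. The helper names run_with_hang_watchdog and load_blip_processor are hypothetical, and the 60-second timeout is an arbitrary assumption.

```python
import faulthandler
import sys
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_hang_watchdog(fn: Callable[[], T], timeout_s: float = 60.0) -> T:
    """Run fn(); if it has not returned after timeout_s seconds, dump the
    traceback of every thread to stderr. The process keeps running."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=sys.stderr)
    try:
        return fn()
    finally:
        # Disarm the watchdog once fn() has returned or raised.
        faulthandler.cancel_dump_traceback_later()


def load_blip_processor():
    """Hypothetical wrapper around the call that hangs in the script above."""
    # Imported lazily so the watchdog helper itself has no third-party imports.
    from transformers import BlipProcessor

    return BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
```

Replacing the direct from_pretrained call with processor = run_with_hang_watchdog(load_blip_processor) should print a stack trace after 60 seconds if the load is still blocked, and the innermost frames should indicate whether the wait is in a network request, a file lock, or somewhere else.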

Could anyone provide insights or suggestions on how to resolve this issue? Are there any specific configurations or steps I might be missing? Any help would be greatly appreciated.

Thank you in advance!