I’m trying to run the CLAP model from Hugging Face (laion/clap-htsat-unfused).
I’m running inference on short audio slices, around 2.6 seconds each. They are all slices of one long audio file.
I’m running this on CPU, on my Windows machine.
For some reason, after roughly 150–300 iterations I hit memory exhaustion and the process crashes. I can only get it running again after restarting the PC.
My code:
```python
import os

from transformers import AutoProcessor, ClapModel

audio_dir = "path_to_audio_slices"
files_list = os.listdir(audio_dir)
path_list = [os.path.join(audio_dir, x) for x in files_list]

model = ClapModel.from_pretrained("laion/clap-htsat-unfused")
processor = AutoProcessor.from_pretrained("laion/clap-htsat-unfused")
input_text = ["Music", "Speech", "Speech with music"]

results = []
for file_path in path_list:
    # process_audio is my helper that runs one slice through the model
    probs = process_audio(file_path, model, processor, input_text)
    results.append(probs)
```
Even if I re-initialize the model every 50 slices, it still happens.
I’d appreciate any help with this, and also tips on how to set up batch inference.
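For reference, a `process_audio` helper along these lines might look like the sketch below. The body is illustrative, not my exact code: `librosa` for loading is an assumption (any loader that returns a 1-D float array works), and the `torch.no_grad()` wrapper is the usual way to keep a pure-inference loop from accumulating autograd state, which is a common cause of memory growing across iterations.

```python
def process_audio(file_path, model, processor, input_text):
    # Local imports so this sketch stands alone; librosa is an assumption,
    # any loader returning a 1-D float waveform would do.
    import librosa
    import torch

    # CLAP's feature extractor expects 48 kHz audio
    audio, sr = librosa.load(file_path, sr=48_000)
    inputs = processor(text=input_text, audios=audio, sampling_rate=sr,
                       return_tensors="pt", padding=True)
    # no_grad stops autograd bookkeeping from piling up across iterations
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_audio has shape (1, num_text_labels)
    return outputs.logits_per_audio.softmax(dim=-1).numpy()
```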
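On the batching question, here is a sketch of how the slices could be run through the processor in batches instead of one at a time. The `batched` helper and the batch size of 16 are illustrative choices, and `librosa` for loading is again an assumption; the processor pads the audios within each batch.

```python
def batched(items, n):
    # Yield successive chunks of at most n items
    for i in range(0, len(items), n):
        yield items[i:i + n]

def run_batched_inference(path_list, model, processor, input_text,
                          batch_size=16):
    # Local imports so this sketch stands alone
    import librosa
    import torch

    all_probs = []
    for batch_paths in batched(path_list, batch_size):
        # Load every slice in the batch at CLAP's expected 48 kHz rate
        audios = [librosa.load(p, sr=48_000)[0] for p in batch_paths]
        inputs = processor(text=input_text, audios=audios,
                           sampling_rate=48_000,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            outputs = model(**inputs)
        # logits_per_audio: (len(batch_paths), num_text_labels)
        all_probs.extend(outputs.logits_per_audio.softmax(dim=-1).tolist())
    return all_probs
```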