Fine-tuning a model at a different sample rate

Hello, I’m a student currently working on my master’s thesis on dolphin click detection. My audio files are sampled at 96,000 Hz, and I can’t downsample them because I would lose important information. I would like to try out some models from Hugging Face, but because of the sample rate of my data, I can’t use any of the pre-trained ones as they are. Are there any alternatives? Is it possible to take a model architecture and train it from scratch on my type of data?
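
To make the constraint concrete, here is a minimal sketch of the arithmetic behind it. A signal sampled at a given rate can only represent frequencies up to half that rate (the Nyquist limit). The 16,000 Hz figure is my assumption about the rate many pre-trained speech models expect, not something I have checked for every model, and `nyquist_hz` is just an illustrative helper name.

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


ORIGINAL_RATE_HZ = 96_000            # sample rate of my recordings
ASSUMED_PRETRAINED_RATE_HZ = 16_000  # assumption: typical rate for speech models

for rate in (ORIGINAL_RATE_HZ, ASSUMED_PRETRAINED_RATE_HZ):
    print(f"{rate} Hz sampling keeps content up to {nyquist_hz(rate):.0f} Hz")

# Bandwidth that would be discarded by resampling to the assumed rate
lost_band_hz = nyquist_hz(ORIGINAL_RATE_HZ) - nyquist_hz(ASSUMED_PRETRAINED_RATE_HZ)
print(f"Resampling would discard a {lost_band_hz:.0f} Hz wide band")
```

Under that assumption, any click energy above the lower limit would be removed by resampling, which is why I want to keep the original rate.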