Is Manual Audio Resampling Required?

In the Speech Recognition Tutorial docs (speech_to_text_2), this line of code reads the audio files: speech, _ = sf.read(batch["file"])
Because the second return value is assigned to _, the sample rate of each file is discarded.
Later, the audio is passed to the processor here: inputs = processor(ds["speech"][0], sampling_rate=16_000, return_tensors="pt")
I notice the sample rate passed here is 16 kHz. If my files are not 16 kHz (for example, 44100 Hz), do I need to downsample them to 16 kHz manually, for example with librosa, or will the processor call resample them automatically?
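
In case manual resampling does turn out to be necessary, here is a minimal sketch of what I have in mind. It assumes mono files, that soundfile and librosa are installed, and that 16 kHz is the rate the model expects. The names TARGET_SR, needs_resampling and read_at_target_rate are my own and do not come from the tutorial.

```python
TARGET_SR = 16_000  # rate passed to the processor in the tutorial (assumed to be what the model expects)


def needs_resampling(sample_rate, target_sr=TARGET_SR):
    """Return True when the file's sample rate differs from the target rate."""
    return sample_rate != target_sr


def read_at_target_rate(path, target_sr=TARGET_SR):
    """Hypothetical helper: read a mono audio file and resample it if needed."""
    # Imported here so the sketch can be loaded without the audio libraries.
    import librosa
    import soundfile as sf

    # Keep the sample rate instead of discarding it with `_`.
    speech, sample_rate = sf.read(path)

    if needs_resampling(sample_rate, target_sr):
        # Assumes a 1-D (mono) array; multi-channel files need extra handling.
        speech = librosa.resample(speech, orig_sr=sample_rate, target_sr=target_sr)

    return speech, target_sr
```

With something like this, the map function from the tutorial would call read_at_target_rate(batch["file"]) in place of sf.read(batch["file"]), so every array reaching the processor would already be at 16 kHz.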