I'm using transformers.MusicgenProcessor for text-to-music generation. Text-only prompts work fine, but when I add an audio prompt, the generated music begins with a segment that is identical to the prompt. Why is this happening? I’m hoping for behavior similar to the demo webpage, where the prompt doesn’t appear verbatim at the start of the generated music.
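For reference, here is a minimal sketch of the workaround I have in mind: cutting the prompt off the front of the output. It assumes the output really is "prompt followed by new audio" along the last (time) axis, as described above. The helper name strip_prompt, the (batch, channels, samples) layout, and the 32000 Hz rate are assumptions for illustration, and synthetic arrays stand in for real model output so the snippet runs without downloading a model.

```python
import numpy as np


def strip_prompt(generated: np.ndarray, prompt_num_samples: int) -> np.ndarray:
    """Drop the first `prompt_num_samples` samples along the last (time) axis.

    Hypothetical helper: assumes the generated audio begins with the prompt.
    """
    if prompt_num_samples < 0:
        raise ValueError("prompt_num_samples must be non-negative")
    return generated[..., prompt_num_samples:]


# Stand-in data (assumption): a 2 s "prompt" followed by 3 s of "new" audio.
sampling_rate = 32000  # assumed sample rate for this illustration
rng = np.random.default_rng(0)
prompt = rng.standard_normal(sampling_rate * 2).astype(np.float32)
continuation = np.zeros(sampling_rate * 3, dtype=np.float32)

# Mimic an assumed (batch, channels, samples) output layout.
generated = np.concatenate([prompt, continuation])[None, None, :]

# Keep only the part that comes after the prompt.
new_audio = strip_prompt(generated, prompt.shape[-1])
```

With real output, prompt_num_samples would be the length of the audio clip passed in as the prompt. The cut may not land exactly on the boundary if the prompt is re-encoded during generation, so this only trims the output and does not change what is generated.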

