What are the product quantization vectors?

I’m trying to understand what the product quantization vectors are. It seems as though each latent speech representation is matched to the most similar quantized representation in each of several codebooks, and the selected entries are then concatenated (as explained here: https://arxiv.org/pdf/2006.11477.pdf).
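
To make my current understanding concrete, here is a minimal sketch of the pick-and-concatenate step in plain Python. Everything in it is an assumption for illustration: the function names, the toy sizes, and the nearest-neighbour (squared L2) lookup are mine, and the random codebooks are only placeholders. I believe the paper actually selects entries with a Gumbel softmax rather than a distance lookup, so this is not the paper's method.

```python
import random


def nearest_index(subvector, codebook):
    """Return the index of the codebook entry closest to subvector (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(subvector, codebook[i]))


def product_quantize(z, codebooks):
    """Split z into one chunk per codebook, pick one entry per chunk, concatenate."""
    groups = len(codebooks)
    chunk = len(z) // groups  # assumes len(z) is divisible by the number of codebooks
    indices, quantized = [], []
    for g, codebook in enumerate(codebooks):
        sub = z[g * chunk:(g + 1) * chunk]
        idx = nearest_index(sub, codebook)
        indices.append(idx)
        quantized.extend(codebook[idx])
    return indices, quantized


# Hypothetical toy sizes, not the paper's settings.
rng = random.Random(0)
G, V, D = 2, 4, 6  # codebooks, entries per codebook, latent dimension
# Placeholder codebooks; how the real ones are initialized is exactly my question.
codebooks = [
    [[rng.gauss(0, 1) for _ in range(D // G)] for _ in range(V)]
    for _ in range(G)
]
z = [rng.gauss(0, 1) for _ in range(D)]

indices, q = product_quantize(z, codebooks)
print(indices, q)
```

With G codebooks of V entries each, this can produce V**G distinct concatenated vectors while storing only G * V entries, which I assume is the point of splitting into several codebooks.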

But I’m not sure what they even are. Are the quantized representations just randomly initialized vectors? Are they frequencies that are closely related to specific phonemes? Are they backpropagated through, or do they stay constant?