Abstract: We present SoundLoCD, a novel text-to-sound generation framework, which incorporates a LoRA-based conditional discrete contrastive latent diffusion model. Unlike recent large-scale sound ...