BibTeX

@article{swerdlow2025unidisc,
  title   = {Unified Multimodal Discrete Diffusion},
  author  = {Swerdlow, Alexander and Prabhudesai, Mihir and Gandhi, Siddharth and Pathak, Deepak and Fragkiadaki, Katerina},
  journal = {arXiv preprint arXiv:2503.20853},
  year    = {2025},
  doi     = {10.48550/arXiv.2503.20853},
}