JavisVerse: A Universe of Joint Audio-Video Intelligence Symphony

A unified family of audio-video models for multimodal generation and understanding, including text-conditional joint audio-video synthesis (JavisDiT) and unified audiovisual comprehension and generation (JavisGPT).

Flagship Research
Workshop
Survey