Overview
CALL FOR PAPERS
We invite extended abstract submissions of works already accepted at recent CVPR/ICCV/ECCV/ICML/ICLR/NeurIPS or in IJCV/TPAMI/TMLR.
Additionally, we will invite authors of accepted papers from both the main conference (ICCV 2025) and the IJCV special issue on audio-visual generation to present their work.
CALL FOR DEMOS
We also invite demo submissions from industry.
In this workshop, we aim to shine a spotlight on this exciting yet underinvestigated field by prioritizing new approaches in audio-visual generation, while also covering a wide range of topics in audio-visual learning, where the convergence of auditory and visual signals opens up opportunities for advancing creativity, understanding, and machine perception. We hope the workshop will bring together researchers, practitioners, and enthusiasts from diverse disciplines in both academia and industry to explore the latest developments, challenges, and breakthroughs in audio-visual generation and learning. The workshop will cover, but is not limited to, the following topics:
- Audio-visual generation, including joint audio-visual or cross-modal generation
  - Image/video-driven audio generation
  - Audio-driven visual media generation
  - Dancing video and talking head animation
- Audio-visual foundation models
- Audio-visual representation learning and transfer learning
- Audio-visual learning applications
  - Applications to scene understanding
  - Applications to localization
- Audio-visual benchmarks, such as datasets and evaluation metrics
- Ethical considerations in audio-visual research