Workshop Overview

A long-term goal of AI research is to build intelligent agents that can see the rich visual environment around us and communicate this understanding in natural language to humans and other agents. To this end, recent work at the intersection of vision and language has made remarkable progress: from generating natural language descriptions of images and videos, to answering questions about them, to even holding free-form conversations about visual content! Vision-language research has attracted many researchers from different communities, such as computer vision, natural language processing, and machine learning.

This workshop proposes to gather these researchers to form a new vision-language community and to attract more people to this topic. We will invite several researchers from this area to present their most recent work. The workshop will end with an open panel discussion.


The goal of this workshop is to provide a comprehensive yet accessible overview of existing work and to lower the entry barrier for new researchers. We also aim to invite speakers from this area to present their latest work and propose new challenges. Overall, the topics we will cover in this workshop are as follows: