MAGMaR 2026

The 2nd Workshop on Multimodal Augmented Generation via MultimodAl Retrieval

magmar@lists.jh.edu

📢 Looking to submit your work? Submit shared task runs via the Google Form in the Shared Task section, and submit research papers and shared task system descriptions via OpenReview (see Submissions).

Important Dates


Workshop Description

Audiovisual media is becoming an increasingly dominant form of online information consumption. From firsthand, "in the wild" video footage of natural disasters to professionally edited news coverage of major political events, videos serve as rich sources of information for producing factual, grounded articles. Especially for actively unfolding events, grounding articles in video can help combat misinformation and provide journalists and analysts with tools to quickly synthesize new developments.

Individual research groups have independently begun addressing this challenge, leading to parallel yet disconnected efforts to define the research space. ACL 2025 hosted the first MAGMaR workshop focused on Video Event Retrieval. This year's iteration focuses on two primary areas: (1) the retrieval of multimodal content spanning text, images, audio, and video; and (2) retrieval-augmented generation, with an emphasis on multimodal retrieval and grounded generation. To further this goal, we are again hosting a shared task, extending this year to full grounded article generation from multiple videos.

Relevant topics include document retrieval, multimodal retrieval, retrieval-augmented generation (RAG), multimodal RAG, multimodal question answering, and research on video, image, and audio understanding.

This workshop is organized in support of ACL's Special Interest Group on Image and Language (SIGIL).

The workshop will be a one-day hybrid event to allow remote participation and will be co-located with ACL 2026 in San Diego, USA on July 4th.


Shared Task

This shared task focuses on retrieving relevant videos and generating grounded reports that respond to information needs. Given a query describing a real-world current event, participating systems must identify pertinent videos from a large multilingual, multimodal collection and use that evidence to produce a coherent and informative written report.

There are two tracks: retrieval and generation. Teams may submit to either track or both. Additional details on submission formats, evaluation, and task instructions are available in the shared task repository.

Submissions will be collected via Google Form:
https://docs.google.com/forms/d/1B_J_iJqisqmcOsNaL_K25hWV_13eF9oQ7xLuVinHd10


Submissions

Research Papers: We welcome the submission of novel research papers on topics including (but not limited to) retrieval-augmented generation, multimodal understanding, multimodal retrieval, document retrieval, multimodal question answering, and video/image/audio understanding. Submissions should follow the ACL guidelines; both archival and non-archival submissions are permitted. Submit via OpenReview: Research Paper Submission

Shared Task System Papers: Teams participating in the retrieval or generation tracks of the shared task are invited to submit system description papers detailing their approaches. Submit via OpenReview: System Paper Submission


Leaderboard

Rankings for each shared task track.

The Retrieval track is evaluated using standard IR metrics: nDCG, Recall, and MAP at various cutoffs.
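For participants unfamiliar with nDCG@k, the sketch below shows how it is computed from graded relevance judgments. This is an illustrative simplification, not the official scorer; the evaluation presumably relies on a standard IR toolkit.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked relevance grades."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_rels, all_rels, k):
    """nDCG@k: DCG of the system's ranking, normalized by the ideal DCG
    obtained by sorting all judged relevance grades in descending order."""
    ideal = dcg_at_k(sorted(all_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0
```

For example, a system that ranks the judged videos in perfect order scores 1.0, while placing a non-relevant video first lowers the score because of the logarithmic position discount.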

The Oracle Generation and RAG tracks are evaluated using ROUGE-L, BERTScore, and MiRAGE (InfoP, InfoR, CiteP, CiteR) to measure both text quality and grounded citation accuracy.
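Of these, ROUGE-L is the simplest to reproduce: it scores the longest common subsequence (LCS) shared between a generated report and a reference. A minimal sketch of the F1 variant over whitespace tokens follows; standard implementations add tokenization and normalization options beyond this.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(cand), lcs / len(ref)
    return 2 * p * r / (p + r)
```

BERTScore and MiRAGE require model-based scoring and citation verification, respectively, and are computed by the organizers' evaluation pipeline.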

Note: All teams will receive emails with their scores, but only the top 5 submissions from each submitting team will be displayed.

🏆 Retrieval

🏆 Oracle Generation

🏆 RAG (Retrieval-Augmented Generation)


Schedule

Time                 Program
9:30 - 9:45 am       Welcome Remarks (Reno Kriz, Johns Hopkins University)
9:45 - 10:30 am      Keynote 1 (Nanyun (Violet) Peng, UCLA)
10:30 - 11:00 am     Break
11:00 am - 12:30 pm  Oral Presentations
12:30 - 2:00 pm      Lunch
2:00 - 3:30 pm       Poster Session
3:30 - 4:00 pm       Break
4:00 - 4:45 pm       Keynote 2 (Chenliang Xu, University of Rochester)
4:45 - 5:00 pm       Paper Awards and Closing

Organizers

Program Committee

  • Eugene Yang
    Research Scientist at the HLTCOE and Johns Hopkins University
  • Matthew Maciejewski
    Research Scientist at the HLTCOE and Johns Hopkins University
  • Benjamin Van Durme
    Associate Professor at Johns Hopkins University and Principal Researcher at Microsoft
  • Tejas Gokhale
    Assistant Professor at University of Maryland, Baltimore County
  • Dave Etter
    Machine Learning Scientist at the HLTCOE
  • Sourajit Saha
    Ph.D. student at University of Maryland, Baltimore County
  • Shubhashis Roy Dipta
    Ph.D. student at University of Maryland, Baltimore County
  • Cameron Carpenter
    Ph.D. student at Johns Hopkins University
  • Tyler Skow
    Ph.D. student at Johns Hopkins University
  • Arun Reddy
    Research Engineer at JHU Applied Physics Laboratory
  • Nellia Dzhubaeva
    Master's student at Saarland University
  • Crystina Zhang
    Ph.D. student at University of Waterloo
  • Xueguang Ma
    Ph.D. student at University of Waterloo
  • Chris Biemann
    Professor at University of Hamburg