MMC is a dataset for multilingual multiparty coreference resolution, released with the paper
Multilingual Coreference Resolution in Multiparty Dialogue. It is built from transcripts and subtitles of The Big Bang Theory and Friends, and contains 1,222 multiparty dialogue scenes annotated with coreference in English, 1,215 in Farsi, and 1,215 in Chinese.
You can download it here [.zip (5.5MB)].
@misc{zheng-etal-2022-multilingual,
url = {https://arxiv.org/abs/2208.01307},
author = {Zheng, Boyuan and Xia, Patrick and Yarmohammadi, Mahsa and Van Durme, Benjamin},
title = {Multilingual Coreference Resolution in Multiparty Dialogue},
publisher = {arXiv},
year = {2022},
}