Track 1

MedDG: Entity-aware medical dialogue generation

A medical dialogue system aims to generate context-consistent and medically meaningful responses conditioned on the dialogue history. In this track, we focus on entity-aware medical dialogue generation [1, 2]. Formally, given the dialogue history X={X_1,X_2,...,X_K} between the doctor and the patient, where X_K is the patient's last utterance, the goal is to generate the doctor's next response X_{K+1} containing as many correct entities as possible.
MedDG is a large-scale, entity-centric medical dialogue dataset covering 12 types of common gastrointestinal diseases, with more than 17K conversations and 385K utterances collected from an online health consultation community. Each conversation is annotated with five categories of entities: diseases, symptoms, attributes, tests, and medicines. For more details about this dataset, please refer to the preprint [1].
In addition, according to Gururangan et al. [3], when tailoring a pretrained model to the domain of a target task, a second phase of in-domain pretraining leads to performance gains. MedDialog [4] provides 3.4 million annotation-free conversations between patients and doctors, which can be used for domain-adaptive pretraining. For details, please refer to MedDialog.

Evaluation: The average of three metrics: BLEU-1, BLEU-4, and Entity-F1.
Download: The training dataset is available at Google Drive
Submission: https://competitions.codalab.org/competitions/29703
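
As a rough illustration of the Track 1 evaluation, the sketch below computes an entity-level F1 and averages it with BLEU-1 and BLEU-4. It is not the official scoring script: the micro-averaging over responses, the function names `entity_f1` and `track1_score`, and the assumption that BLEU values are computed elsewhere and passed in as numbers in [0, 1] are all choices made here for illustration.

```python
from typing import List, Set


def entity_f1(predicted: List[Set[str]], reference: List[Set[str]]) -> float:
    """Micro-averaged F1 over entity sets, one set per generated response.

    Assumption: entities are compared as exact strings and counts are pooled
    over all responses; the official metric may aggregate differently.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    true_pos = sum(len(p & r) for p, r in zip(predicted, reference))
    n_pred = sum(len(p) for p in predicted)
    n_ref = sum(len(r) for r in reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_ref
    return 2 * precision * recall / (precision + recall)


def track1_score(bleu1: float, bleu4: float, ent_f1: float) -> float:
    """Unweighted average of the three metrics named in the task description."""
    return (bleu1 + bleu4 + ent_f1) / 3.0


# Hypothetical example: two generated responses and their reference entities.
pred = [{"gastritis", "abdominal pain"}, {"omeprazole"}]
ref = [{"gastritis"}, {"omeprazole", "gastroscopy"}]
f1 = entity_f1(pred, ref)          # 2 correct of 3 predicted and 3 reference
final = track1_score(0.30, 0.12, f1)
```

Because the three metrics are weighted equally, a gain in Entity-F1 counts as much toward the final score as the same gain in either BLEU metric.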

Track 2

Dialogue system for medical diagnosis

This track targets building a task-oriented dialogue system for automatic medical diagnosis [5, 6, 7], which converses with patients to collect additional symptoms beyond their self-reports and makes a disease diagnosis at the end. Concretely, the system can access only the explicit symptoms at the beginning. When it requests a symptom during the dialogue, the user simulator takes one of three actions: True if the symptom is positive, False if it is negative, and Not sure if the symptom is not mentioned in the user goal. The maximum number of dialogue turns is 22.
Since existing datasets are relatively small, we constructed a new Medical Diagnosis Dialogue dataset named MDD, covering 12 diseases in the general domain. Following the Muzhi dataset [5], we converted source medical records into structured user goals that include only disease tags, explicit symptoms, and implicit symptoms, to protect privacy as much as possible. Compared with previous datasets, MDD is three times larger, comprising 2,374 dialogues, 12 disease types, and 118 symptom types. In addition, it is derived from real-world patients in offline (brick-and-mortar) hospitals and is thus closer to the real clinical diagnosis scenario. The dataset is split 8:1:1 into training, development, and test sets, respectively.
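
The interaction described above can be sketched as a small user simulator. This is a minimal illustration, not the simulator used in the challenge: the `UserGoal` field names, the `UserSimulator` class, and the choice to count each symptom request as one turn against the limit of 22 are assumptions.

```python
from dataclasses import dataclass
from typing import Dict

MAX_TURNS = 22  # limit stated in the task description


@dataclass
class UserGoal:
    """Hypothetical user-goal layout: symptom name -> present (True) or absent (False)."""
    disease_tag: str
    explicit_symptoms: Dict[str, bool]
    implicit_symptoms: Dict[str, bool]


class UserSimulator:
    def __init__(self, goal: UserGoal) -> None:
        self.goal = goal
        self.turn = 0

    def initial_state(self) -> Dict[str, bool]:
        # The system sees only the explicit (self-reported) symptoms at the start.
        return dict(self.goal.explicit_symptoms)

    def answer(self, symptom: str) -> str:
        """Reply to a symptom request with "True", "False", or "Not sure"."""
        if self.turn >= MAX_TURNS:
            raise RuntimeError("maximum number of turns reached")
        self.turn += 1
        known = {**self.goal.explicit_symptoms, **self.goal.implicit_symptoms}
        if symptom not in known:
            return "Not sure"  # symptom is not mentioned in the user goal
        return "True" if known[symptom] else "False"


# Hypothetical goal, for illustration only.
goal = UserGoal(
    disease_tag="gastritis",
    explicit_symptoms={"abdominal pain": True},
    implicit_symptoms={"nausea": True, "fever": False},
)
sim = UserSimulator(goal)
replies = [sim.answer(s) for s in ("nausea", "fever", "cough")]
```

A dialogue policy would call `answer` repeatedly, update its belief about the patient, and emit a diagnosis before the turn limit is reached.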
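
For readers who want to reproduce a comparable partition on their own data, an 8:1:1 split can be sketched as follows. The shuffling, the seed, and the rounding rule are assumptions; the official split and its exact sizes may differ.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle deterministically, then cut into train/dev/test at roughly 8:1:1."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_dev = len(shuffled) // 10
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test


# Dialogue IDs stand in for the 2,374 dialogues mentioned above.
train, dev, test = split_8_1_1(range(2374))
```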

Evaluation: Diagnosis Accuracy * 0.8 + Symptom F1 * 0.2.
Download: Please download the data from MDD Dataset
Submission: https://competitions.codalab.org/competitions/29706
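
The Track 2 score is a weighted sum, which the following sketch makes concrete. The helper names are hypothetical, diagnosis accuracy is assumed to be the fraction of dialogues with the correct disease, and Symptom F1 is taken as a precomputed value because its exact definition is not given here.

```python
from typing import Sequence


def diagnosis_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of dialogues whose predicted disease matches the gold disease tag."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def track2_score(accuracy: float, symptom_f1: float) -> float:
    """Diagnosis Accuracy * 0.8 + Symptom F1 * 0.2, as stated in the evaluation rule."""
    return 0.8 * accuracy + 0.2 * symptom_f1


# Hypothetical numbers: 3 of 4 diagnoses correct, Symptom F1 of 0.5.
acc = diagnosis_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])
score = track2_score(acc, 0.5)  # 0.8 * 0.75 + 0.2 * 0.5
```

With this weighting, diagnosis accuracy dominates the ranking, while Symptom F1 mainly separates systems with similar accuracy.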

Awards

This year, Tencent Jarvis Lab will provide cash awards to the winners of each track. In each track, the 1st, 2nd, and 3rd place winners will receive $1000, $500, and $300, respectively.
Founded in September 2018, Tencent Jarvis Lab is a group focusing on medical artificial intelligence. It is committed to building a real-time evolving knowledge and decision-making platform for healthcare through machine learning and big data analytics.

References

[1] Wenge Liu, Jianheng Tang, Jinghui Qin, Lin Xu, Zhen Li, Xiaodan Liang. MedDG: A Large-scale Medical Consultation Dataset for Building Medical Dialogue System. arXiv, 2020.

[2] Shuai Lin, Pan Zhou, Xiaodan Liang, Jianheng Tang, Ruihui Zhao, Ziliang Chen, Liang Lin. Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation. In AAAI, 2021.

[3] Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, Noah A. Smith. Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. In ACL, 2020.

[4] Guangtao Zeng, Wenmian Yang, Zeqian Ju, Yue Yang, Sicheng Wang, Ruisi Zhang, Meng Zhou, Jiaqi Zeng, Xiangyu Dong, Ruoyu Zhang, Hongchao Fang, Penghui Zhu, Shu Chen, Pengtao Xie. MedDialog: Large-scale Medical Dialogue Datasets. In EMNLP, 2020.

[5] Zhongyu Wei, Qianlong Liu, Baolin Peng, Huaixiao Tou, Ting Chen, Xuanjing Huang, Kam-fai Wong, Xiangying Dai. Task-oriented Dialogue System for Automatic Diagnosis. In ACL, 2018.

[6] Lin Xu, Qixian Zhou, Ke Gong, Xiaodan Liang, Jianheng Tang, Liang Lin. End-to-End Knowledge-Routed Relational Dialogue System for Automatic Diagnosis. In AAAI, 2019.

[7] Yuan Xia, Jingbo Zhou, Zhenhui Shi, Chao Lu, Haifeng Huang. Generative Adversarial Regularized Mutual Information Policy Gradient Framework for Automatic Diagnosis. In AAAI, 2020.