Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems

This dataset provides extra annotations on top of the publicly released Topical-Chat dataset (https://github.com/alexa/Topical-Chat) to help reproduce the results in our paper "Policy-Driven Neural Response Generation for Knowledge-Grounded Dialogue Systems" (https://arxiv.org/abs/2005.12529?context=cs.CL). The dataset contains five files: train.json, valid_freq.json, valid_rare.json, test_freq.json and test_rare.json. Each file adds two kinds of annotations to the original Topical-Chat data: dialogue act annotations and knowledge sentence annotations. The annotations were computed automatically using off-the-shelf models, which are listed in README.txt.
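
As a minimal sketch of working with the five files once they have been downloaded, the snippet below loads whichever of them are present in a local directory. The helper name load_splits and the directory argument are hypothetical, and the snippet makes no assumption about the JSON schema inside each file; consult README.txt for the actual structure.

```python
import json
from pathlib import Path

# File names as listed in the dataset description.
SPLIT_FILES = [
    "train.json",
    "valid_freq.json",
    "valid_rare.json",
    "test_freq.json",
    "test_rare.json",
]


def load_splits(data_dir):
    """Load each split file found in data_dir (hypothetical helper).

    Returns a dict mapping the split name (file name without ".json")
    to the parsed JSON content. Missing files are skipped.
    """
    splits = {}
    for name in SPLIT_FILES:
        path = Path(data_dir) / name
        if path.exists():
            with path.open(encoding="utf-8") as f:
                splits[path.stem] = json.load(f)
    return splits
```

Calling load_splits on the download directory returns a dictionary keyed by split name, so any split that is present can be inspected before writing schema-specific code.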

The Data

ARN: arn:aws:s3:::enriched-topical-chat
Region: us-west-2
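
The ARN and region above can be turned into the bucket name and addresses that most S3 tools expect. The following is a small sketch, not an official access recipe: the helper bucket_from_arn is hypothetical, and the listing does not state the key layout inside the bucket or whether anonymous access is allowed, so neither is assumed here.

```python
def bucket_from_arn(arn):
    """Extract the bucket name from an S3 ARN (hypothetical helper).

    S3 bucket ARNs have the form arn:<partition>:s3:::<bucket>[/<key>].
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3":
        raise ValueError(f"not an S3 ARN: {arn!r}")
    # Drop any object key that follows the bucket name.
    return parts[5].split("/", 1)[0]


ARN = "arn:aws:s3:::enriched-topical-chat"
REGION = "us-west-2"

bucket = bucket_from_arn(ARN)
s3_uri = f"s3://{bucket}/"
# Virtual-hosted-style endpoint for the bucket in its region.
https_endpoint = f"https://{bucket}.s3.{REGION}.amazonaws.com/"
```

With the AWS CLI installed, the resulting s3_uri can be passed to aws s3 ls together with --region us-west-2. If the bucket permits unsigned requests (an assumption, not confirmed by this listing), adding --no-sign-request avoids the need for AWS credentials.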


by Behnam Hedayatnia, Karthik Gopalakrishnan, Seokhwan Kim, Yang Liu, Mihail Eric & Dilek Hakkani-Tur


natural language processing, machine learning, conversation data