The bAbI dialog dataset

The bAbI dialog dataset (introduced by Bordes et al. in 2016) is one of the simplest goal-oriented dialog datasets aimed at testing end-to-end trained systems in the domain of restaurant reservations. The dialog tasks are meant to complement the bAbI tasks for text understanding, which were described earlier.

Complete information about the creation and usage of the bAbI dialog dataset can be found in the paper Learning End-to-End Goal-Oriented Dialog by Antoine Bordes, Y-Lan Boureau, and Jason Weston at http://arxiv.org/abs/1605.07683. The data can be downloaded from the following URL: https://research.fb.com/downloads/babi/.

Set in the domain of restaurant reservations, this synthetically generated dataset breaks down a conversation between a bot and a user into five tasks, each testing a crucial capability that dialog systems should have. Given a knowledge base (KB) of restaurants and their properties (location, type of cuisine, and so on), the aim of the dialog is to book a restaurant for the user. Full dialogs are divided into stages, each of which tests whether models can learn abilities such as implicit dialog state tracking, using KB facts in a dialog, and dealing with new entities that don't appear in the training dialogs.

The following figure will help you understand the tasks better:

The conversations are generated by a simulator (in a fixed template format) based on an underlying KB containing all the restaurants and their properties. Each restaurant is defined by a type of cuisine (ten choices, for example, Italian, Indian), a location (ten choices, for example, London, Tokyo), a price range (cheap, moderate, or expensive), a party size (2, 4, 6, or 8 people), and a rating (from 1 to 8). Each restaurant also has an address and a phone number.

Making an API call to the KB returns a list of facts about all the restaurants that satisfy four parameters: location, cuisine, price range, and party size. In addition to the user and bot utterances, the dialogs in each task include these API calls and the resulting facts. Conversations are generated using natural language patterns after randomly selecting a value for each of the four required fields. There are 43 patterns for the user and 15 for the bot (the user can say something in up to four different ways, while the bot only has one).
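As a rough illustration of the KB lookup described above, the following Python sketch models restaurants as records and an API call as a filter over the four parameters. The Restaurant class, the api_call function, the layout of the fact tuples, and the sample entries are all invented for illustration; they are not the format of the released data or the simulator used by the authors.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Restaurant:
    """One KB entry with the properties listed in the text."""
    name: str
    cuisine: str
    location: str
    price: str
    party_size: int
    rating: int
    address: str
    phone: str


# Toy KB: these entries are made up for the example.
KB = [
    Restaurant("resto_a", "italian", "london", "cheap", 4, 7,
               "resto_a_address", "resto_a_phone"),
    Restaurant("resto_b", "italian", "london", "cheap", 4, 3,
               "resto_b_address", "resto_b_phone"),
    Restaurant("resto_c", "indian", "tokyo", "expensive", 2, 8,
               "resto_c_address", "resto_c_phone"),
]

Fact = Tuple[str, str, object]  # (restaurant name, property, value)


def api_call(kb: List[Restaurant], *, cuisine: str, location: str,
             party_size: int, price: str) -> List[Fact]:
    """Return facts for every restaurant matching all four parameters."""
    facts: List[Fact] = []
    for r in kb:
        if (r.cuisine, r.location, r.party_size, r.price) == (
                cuisine, location, party_size, price):
            facts.extend([
                (r.name, "cuisine", r.cuisine),
                (r.name, "location", r.location),
                (r.name, "price", r.price),
                (r.name, "party_size", r.party_size),
                (r.name, "rating", r.rating),
                (r.name, "address", r.address),
                (r.name, "phone", r.phone),
            ])
    return facts


facts = api_call(KB, cuisine="italian", location="london",
                 party_size=4, price="cheap")
for fact in facts:
    print(fact)
```

A bot that has these facts in its context can, for example, propose the matching restaurants in order of rating, which is the kind of KB-grounded behavior the tasks are meant to test.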

Although the tasks were designed to be used as a framework to analyze the shortcomings of dialog systems in a goal-oriented setting, we will focus on the fifth task: conducting a full conversation. This task combines all aspects of the first four tasks into full dialog scripts and can be used to train a simple chatbot for restaurant reservations.
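Before training on full dialogs, the raw text has to be split into individual conversations and turns. The sketch below assumes a file layout in which every line starts with a turn number, user and bot utterances on the same line are separated by a tab, lines without a tab hold KB facts returned by an API call, and a blank line separates dialogs. This layout, the parse_dialogs function, and the sample text are assumptions made for the example, so check them against the files you actually download.

```python
from typing import Dict, List

# Invented sample in the assumed layout (not copied from the dataset).
SAMPLE = (
    "1 hi\thello what can i help you with today\n"
    "2 can you book a table\ti'm on it\n"
    "3 <SILENCE>\tapi_call italian london four cheap\n"
    "4 resto_a R_rating 7\n"
    "\n"
    "1 hello\thello what can i help you with today\n"
)


def parse_dialogs(text: str) -> List[List[Dict[str, str]]]:
    """Split raw text into dialogs, each a list of turn or fact records."""
    dialogs: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current dialog.
            if current:
                dialogs.append(current)
                current = []
            continue
        # Drop the leading turn number.
        _, _, content = line.partition(" ")
        if "\t" in content:
            user, bot = content.split("\t", 1)
            current.append({"type": "turn", "user": user, "bot": bot})
        else:
            # No tab: treat the line as a KB fact.
            current.append({"type": "fact", "text": content})
    if current:
        dialogs.append(current)
    return dialogs


dialogs = parse_dialogs(SAMPLE)
print(len(dialogs), "dialogs parsed")
for record in dialogs[0]:
    print(record)
```

Keeping turns and facts as separate record types makes it straightforward to build (context, bot response) training pairs later, with the KB facts included in the context.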
