Data description

The data is provided with six fields: id, qid1, qid2, question1, question2, and is_duplicate. The id field identifies the training pair, qid1 and qid2 identify the individual questions, and question1 and question2 hold the full text of each question. The is_duplicate field is the Boolean target value: it is set to 1 if the two questions are duplicates (that is, they mean the same thing semantically) and 0 if they are not. The training data contains approximately 404,000 question pairs, along with their labels.
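To make the field layout concrete, here is a minimal sketch of reading records in this shape with Python's standard csv module. It assumes the data is stored as comma-separated text with a header row, which the description above does not state. The three sample rows and the load_pairs helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample rows in the column layout described above; not real data.
SAMPLE_CSV = '''id,qid1,qid2,question1,question2,is_duplicate
0,1,2,"How do I learn Python?","What is the best way to learn Python?",1
1,3,4,"What is the capital of France?","How tall is the Eiffel Tower?",0
2,5,6,"How can I lose weight, fast?","What is a quick way to lose weight?",1
'''


def load_pairs(handle):
    """Parse rows into dicts, casting the ID and label fields to int."""
    pairs = []
    for row in csv.DictReader(handle):
        pairs.append({
            "id": int(row["id"]),
            "qid1": int(row["qid1"]),
            "qid2": int(row["qid2"]),
            "question1": row["question1"],
            "question2": row["question2"],
            "is_duplicate": int(row["is_duplicate"]),
        })
    return pairs


pairs = load_pairs(io.StringIO(SAMPLE_CSV))

# Share of pairs whose target value is 1 (duplicates).
duplicate_share = sum(p["is_duplicate"] for p in pairs) / len(pairs)
print(len(pairs), "pairs;", f"{duplicate_share:.2f}", "labelled as duplicates")
```

For the real training file, the same function could be given an open file handle instead of the in-memory sample. Checking the share of duplicate labels early is useful because a skewed target distribution affects how model accuracy should be interpreted.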
