Preparing the data for model building

In this section, we will prepare the data needed to develop an author classification model. We will start by tokenizing the articles, which converts the text of each article into a sequence of integers. We will also recode the author labels so that each of the 50 authors is identified by a unique integer. Subsequently, we will use padding and truncation so that the sequence of integers representing each article has the same length. We will end this section by partitioning the training data into train and validation datasets and then carrying out one-hot encoding on the response variable, that is, the author labels.
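The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the code used in this chapter: the four toy articles, the two author names, the sequence length `maxlen = 5`, the 25% validation share, and all helper names (`build_vocab`, `to_sequence`, `pad_or_truncate`, `one_hot`) are hypothetical. We also assume that short sequences are padded with zeros on the left and that long sequences keep their last `maxlen` integers, which is one common convention.

```python
import random

# Toy stand-ins for the articles and their authors (hypothetical data).
texts = [
    "the market rose sharply today",
    "the bank cut rates",
    "shares fell as the market closed",
    "rates rose again today",
]
authors = ["alice", "bob", "alice", "bob"]


def build_vocab(docs):
    """Map each word to a positive integer; 0 is reserved for padding."""
    vocab = {}
    for doc in docs:
        for word in doc.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def to_sequence(doc, vocab):
    """Convert one article into a sequence of integers."""
    return [vocab[w] for w in doc.lower().split() if w in vocab]


def pad_or_truncate(seq, maxlen):
    """Keep the last maxlen integers and left-pad with zeros (assumed convention)."""
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq


def one_hot(index, num_classes):
    """Return a 0/1 list with a single 1 at position index."""
    row = [0] * num_classes
    row[index] = 1
    return row


# 1. Tokenize: text -> sequences of integers of equal length.
vocab = build_vocab(texts)
maxlen = 5  # assumed sequence length for this toy example
x = [pad_or_truncate(to_sequence(t, vocab), maxlen) for t in texts]

# 2. Identify each author by a unique integer, then one-hot encode.
author_ids = {name: i for i, name in enumerate(sorted(set(authors)))}
y_int = [author_ids[a] for a in authors]
y = [one_hot(i, len(author_ids)) for i in y_int]

# 3. Partition into train and validation sets (assumed 75/25 split).
indices = list(range(len(x)))
random.Random(0).shuffle(indices)  # fixed seed for reproducibility
n_val = max(1, int(0.25 * len(indices)))
val_idx, train_idx = indices[:n_val], indices[n_val:]

x_train = [x[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
x_val = [x[i] for i in val_idx]
y_val = [y[i] for i in val_idx]

print(x)
print(y)
```

In this toy run, the four-word articles receive one leading zero, while the six-word article loses its first word, so every row of `x` ends up with exactly five integers.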
