Getting ready

The dataset that we'll be using in this recipe comprises thousands of German phrases with English translations. It is available at http://www.manythings.org/anki/deu-eng.zip. The examples are taken from the Tatoeba Project.

Let's start by loading the required libraries:

library(keras)
library(stringr)
library(reshape2)
library(purrr)
library(ggplot2)
library(readr)
library(stringi)

The data is in the form of a tab-delimited text file. We will be using the first 10,000 phrases. Let's load the dataset and have a look at the sample data:

lines <- readLines("data/deu.txt", n = 10000)
# The file is tab-delimited, so we split each line on the tab character
sentences <- str_split(lines, "\t")
sentences[1:10]

The following screenshot shows a few records from the data, containing German phrases and their English translations:

We will use the preceding dataset to build our neural machine translation model.
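Once each line is split on tabs, the two halves can be pulled apart into parallel vectors of source and target phrases. The following is a minimal sketch using the already-loaded stringr and purrr packages; the sample lines and the English-first column order are assumptions based on the manythings.org file layout, not part of the recipe itself:

```r
library(stringr)
library(purrr)

# Two sample lines standing in for data/deu.txt (tab-delimited;
# assumed layout: English phrase first, then the German translation).
lines <- c("Hi.\tHallo!", "Run!\tLauf!")

# Split each line on the tab character, as above
sentences <- str_split(lines, "\t")

# Extract the first and second field of every pair into parallel vectors
english <- map_chr(sentences, 1)
german  <- map_chr(sentences, 2)

# Assemble a two-column data frame of translation pairs
pairs <- data.frame(english, german, stringsAsFactors = FALSE)
print(pairs)
```

This gives one row per phrase pair, a convenient shape for the tokenization and model-building steps that follow.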
