Neural network architecture

Now that we have built the word vectors by concatenating the character-level and word-level embeddings, we will run a bidirectional LSTM over the sequence of word embeddings and use the concatenated forward and backward hidden states as a semantic representation of each word. This is shown in the following figure.
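The concatenation step itself is a single tf.concat along the feature (last) axis. The following is a minimal sketch of it, assuming illustrative tensors named pretrained_word_vectors and char_lstm_output (these names and shapes are not from the earlier section) that hold the looked-up pretrained vectors and the per-word output of the character-level LSTM:

import tensorflow as tf

# Assumed, illustrative shapes:
# pretrained_word_vectors: [batch_size, max_sentence_length, word_dim]
# char_lstm_output:        [batch_size, max_sentence_length, char_dim]
pretrained_word_vectors = tf.placeholder(tf.float32, shape=[None, None, 300])
char_lstm_output = tf.placeholder(tf.float32, shape=[None, None, 100])

# Concatenate along the last axis to form the per-word embedding that is
# fed to the bidirectional LSTM below.
word_embedding = tf.concat([pretrained_word_vectors, char_lstm_output], axis=-1)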

Implementing the bidirectional LSTM in TensorFlow is straightforward and quite similar to the LSTM we used earlier to learn the character-level embeddings. Unlike in that case, however, we are now interested in the hidden state at every time step:

# Forward and backward LSTM cells, each with hidden_state_size units.
bi_dir_cell_fw = tf.contrib.rnn.LSTMCell(hidden_state_size)
bi_dir_cell_bw = tf.contrib.rnn.LSTMCell(hidden_state_size)

# out_fw and out_bw have shape [batch_size, max_time, hidden_state_size] and
# contain the hidden state of every time step, not just the last one.
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    bi_dir_cell_fw, bi_dir_cell_bw, word_embedding,
    sequence_length=sequence_lengths, dtype=tf.float32)

# One vector of size 2 * hidden_state_size per word.
semantic_representation = tf.concat([out_fw, out_bw], axis=-1)

Hence, the semantic representation, h, captures the meaning of each word, w, using the pretrained word vectors, the characters that make up the word, and the context in which the word appears. With this representation available, we can use a dense layer to produce, for each word, a vector in which each element corresponds to one of the entity tags we would like to predict:

Semantic graph representation

The preceding figure shows how the output of the bidirectional LSTM is fed into the dense layer that produces one score per entity tag.
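A minimal sketch of this dense projection follows, assuming a variable num_tags that holds the number of entity tags; the names W, b, and logits are illustrative and not from the original code. The concatenated states are flattened so that a single matrix multiplication scores every word in the batch, and the result is reshaped back into per-word tag scores:

# num_tags is assumed to be defined as the size of the tag set.
W = tf.get_variable("W", shape=[2 * hidden_state_size, num_tags], dtype=tf.float32)
b = tf.get_variable("b", shape=[num_tags], dtype=tf.float32, initializer=tf.zeros_initializer())

num_time_steps = tf.shape(semantic_representation)[1]

# Flatten to [batch_size * max_time, 2 * hidden_state_size], score each word,
# then reshape back to [batch_size, max_time, num_tags].
flat_representation = tf.reshape(semantic_representation, [-1, 2 * hidden_state_size])
scores = tf.matmul(flat_representation, W) + b
logits = tf.reshape(scores, [-1, num_time_steps, num_tags])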