Memory network pipeline overview

A generic memory network's architecture can be decomposed into four parts: an Input Module, a Question Module, a Memory Module, and an Output Module. As is common practice in neural networks, information passes from one module to another as dense vectors (embeddings), which makes the model's parameters end-to-end trainable using gradient descent.

The model works as follows:

  • The Input Module receives multiple facts and encodes each of them as a vector.
  • The Question Module, similarly to the Input Module, encodes the question as a vector.
  • The Memory Module receives the encoded facts (from the Input Module) and the encoded question (from the Question Module) and applies a soft attention mechanism over the facts to determine their relevance to the question. The result of the attention is a context vector for the given question, encoding the question together with all the contextual information required to answer it.
  • The Output Module receives the context vector and produces an answer in the desired format. This could mean selecting an appropriate response from a candidate set, predicting an answer span, or generating a response token by token.
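The four steps above can be sketched numerically. The snippet below is a minimal illustration, not a full implementation: the fact, question, and candidate encodings are stand-in random vectors (a real model would produce them with trained sentence encoders), and the dimension `d` and the candidate-scoring output module are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (assumed shared across modules)

# Input Module: each fact encoded as a d-dimensional vector
# (random stand-ins here in place of a learned encoder).
facts = rng.normal(size=(5, d))       # 5 encoded facts
# Question Module: the question encoded the same way.
question = rng.normal(size=(d,))

# Memory Module: soft attention of the question over the facts.
scores = facts @ question             # relevance score per fact
weights = np.exp(scores - scores.max())
weights /= weights.sum()              # softmax -> attention distribution
context = weights @ facts             # context vector (weighted sum of facts)

# Output Module (one of the formats mentioned above): select the
# best answer from a candidate set by scoring each candidate
# embedding against the context vector.
candidates = rng.normal(size=(3, d))
answer = int(np.argmax(candidates @ context))
```

Because the softmax weights are differentiable in the scores, gradients flow from the answer loss back through the Memory Module into both encoders, which is what makes the whole pipeline end-to-end trainable.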