We have seen how MAML finds an optimal initial parameter for a model so that it can adapt to a new task in a few gradient steps. Now, we will look at an interesting variant of MAML called CAML. The idea behind CAML is very simple and the same as in MAML: it also tries to find a better initial parameter. Recall that MAML uses two loops. In the inner loop, MAML learns the parameter specific to the task by minimizing the task loss with gradient descent. In the outer loop, it updates the model parameter to reduce the expected loss across several tasks, so that the updated model parameter can serve as a better initialization for related tasks.

In CAML, we make a very small tweak to the MAML algorithm. Instead of using a single model parameter, we split the model parameter into two:

  • Context parameter: It is a task-specific parameter updated in the inner loop. It is denoted by φ and represents the embedding of an individual task.
  • Shared parameter: It is shared across tasks and updated in the outer loop to find the optimal model parameter. It is denoted by θ.

So, the context parameter is adapted in the inner loop for each task, while the shared parameter is common to all tasks and is meta-trained in the outer loop. We initialize the context parameter to zero before each adaptation step.
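To make the two loops concrete, here is a minimal NumPy sketch rather than a definitive implementation. It assumes a toy linear model with prediction w * x + v · phi + b, where (w, v, b) play the role of the shared parameter and phi is the context parameter. The function names, the toy tasks (lines with a shared slope and a task-specific offset), and the learning rates are all hypothetical choices made for illustration. The outer update here is also a first-order simplification that treats the adapted phi as a constant, so it ignores the gradient that flows through the inner step.

```python
import numpy as np

def predict(theta, phi, x):
    # theta = (w, v, b) is the shared parameter; phi is the context parameter.
    w, v, b = theta
    return w * x + v @ phi + b

def mse(theta, phi, x, y):
    return float(np.mean((predict(theta, phi, x) - y) ** 2))

def adapt(theta, x, y, alpha=0.1):
    """Inner loop: reset phi to zero, then take one gradient step on phi only."""
    _, v, _ = theta
    phi0 = np.zeros_like(v)
    err = predict(theta, phi0, x) - y
    grad_phi = 2.0 * np.mean(err) * v  # gradient of the MSE w.r.t. phi for this linear model
    return phi0 - alpha * grad_phi

def meta_step(theta, tasks, alpha=0.1, beta=0.05):
    """Outer loop: update the shared parameter from the post-adaptation query loss.

    First-order simplification (an assumption of this sketch): the adapted phi
    is treated as a constant when differentiating w.r.t. theta.
    """
    w, v, b = theta
    g_w, g_v, g_b = 0.0, np.zeros_like(v), 0.0
    for x_s, y_s, x_q, y_q in tasks:
        phi = adapt(theta, x_s, y_s, alpha)   # task-specific adaptation
        err = predict(theta, phi, x_q) - y_q  # error on held-out query data
        g_w += 2.0 * np.mean(err * x_q)
        g_v += 2.0 * np.mean(err) * phi
        g_b += 2.0 * np.mean(err)
    n = len(tasks)
    return (w - beta * g_w / n, v - beta * g_v / n, b - beta * g_b / n)

def sample_task(rng, n=10):
    # Hypothetical task family: shared slope 2.0, task-specific offset.
    offset = rng.uniform(-1.0, 1.0)
    x_s, x_q = rng.uniform(-1.0, 1.0, n), rng.uniform(-1.0, 1.0, n)
    return x_s, 2.0 * x_s + offset, x_q, 2.0 * x_q + offset

def meta_train(steps=200, tasks_per_step=4, k=2, seed=0):
    rng = np.random.default_rng(seed)
    theta = (0.0, np.ones(k), 0.0)  # v must be nonzero, or phi never moves
    for _ in range(steps):
        batch = [sample_task(rng) for _ in range(tasks_per_step)]
        theta = meta_step(theta, batch)
    return theta

if __name__ == "__main__":
    theta = meta_train()
    x_s, y_s, x_q, y_q = sample_task(np.random.default_rng(123))
    phi = adapt(theta, x_s, y_s)
    print("query loss before adaptation:", mse(theta, np.zeros_like(phi), x_q, y_q))
    print("query loss after adaptation: ", mse(theta, phi, x_q, y_q))
```

Note that adapt never modifies the shared parameter and always starts from a zero context vector, which mirrors the reset described above. Only meta_step changes the shared parameter.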

Okay, but why is it useful to split the parameter into two? Doing so helps avoid overfitting to particular tasks, promotes faster learning, and is memory efficient, since only the small context parameter is adapted for each task.
