Reptile is a simple yet effective meta-learning algorithm that can be implemented in either a serial or a batch version. In the serial version, we sample a single task from the task distribution at each iteration, while in the batch version we sample a batch of tasks and update the parameter using all of them. Here, we'll see how the serial version of Reptile works. The sequence of steps involved in Reptile is as follows:
- Let's say we have a distribution over tasks $p(T)$, and we randomly initialize the model parameter $\theta$.
- Now we sample a task $T$ from the task distribution: $T \sim p(T)$.
- For the sampled task $T$, we sample $k$ data points and prepare our dataset $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_k, y_k)\}$, where each $x_i$ is a feature vector and $y_i$ is its label. We then minimize the loss on this dataset by performing stochastic gradient descent (SGD) for $n$ iterations, starting from $\theta$. After these $n$ iterations of SGD on the sampled task $T$, we get the optimal parameter for that task, $\theta'$.
- We update our randomly initialized parameter $\theta$ by moving it closer to the optimal parameter $\theta'$ obtained in the previous step: $\theta = \theta + \epsilon(\theta' - \theta)$, where $\epsilon$ is the step size.
- We repeat steps 2 to 4 for a chosen number of iterations, sampling a new task each time.
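
The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not a reference implementation: the task distribution (lines $y = ax + b$ with $a$ and $b$ drawn near 2 and -1), the linear model, the squared-error loss, and every function name and hyperparameter value here are hypothetical choices made for the example.

```python
import numpy as np

def sample_task(rng):
    """Hypothetical p(T): each task is a line y = a*x + b with (a, b) near (2, -1)."""
    return rng.normal(2.0, 0.1), rng.normal(-1.0, 0.1)

def sample_dataset(task, k, rng):
    """Sample k (x, y) pairs from the given task to form the dataset D."""
    a, b = task
    x = rng.uniform(-1.0, 1.0, size=k)
    return x, a * x + b

def mse_grad(theta, x, y):
    """Gradient of the mean squared error for the model y_hat = theta[0]*x + theta[1]."""
    err = theta[0] * x + theta[1] - y
    return np.array([2.0 * np.mean(err * x), 2.0 * np.mean(err)])

def inner_sgd(theta, x, y, n_steps, lr, rng):
    """Run n_steps of single-example SGD on one task, starting from theta."""
    theta_prime = theta.copy()
    for _ in range(n_steps):
        i = rng.integers(len(x))  # pick one data point at random
        theta_prime -= lr * mse_grad(theta_prime, x[i:i + 1], y[i:i + 1])
    return theta_prime

def reptile_serial(outer_iters=500, k=10, n_steps=20, inner_lr=0.1,
                   epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=2)                  # step 1: random initialization
    for _ in range(outer_iters):                # step 5: repeat steps 2 to 4
        task = sample_task(rng)                 # step 2: T ~ p(T)
        x, y = sample_dataset(task, k, rng)     # step 3: build D ...
        theta_prime = inner_sgd(theta, x, y, n_steps, inner_lr, rng)  # ... and run SGD
        theta = theta + epsilon * (theta_prime - theta)  # step 4: move toward theta'
    return theta

theta = reptile_serial()
```

With these assumed settings, `theta` should end up close to the center of the toy task distribution, roughly `(2, -1)`, since every task's adapted parameter pulls the initialization a small step toward itself.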