Each training iteration makes a forward pass and then a backward pass through every layer of the network. On the first iteration, the lists of accumulators are initialized. Then, for every set of weights, an update is performed: the learning rate and momentum are updated, the accumulated gradient is applied, the gradients are zeroed out, and the process repeats.
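The loop above can be sketched as follows. This is a minimal illustration, not the text's actual implementation: the network shape, the decay schedule, and all names (`accumulators`, `grads`, and so on) are assumptions chosen to mirror the steps described — lazy initialization of the momentum accumulators on the first iteration, a per-step learning-rate update, the gradient applied to each set of weights, and the gradients zeroed before the next pass.

```python
import numpy as np

# Illustrative sketch only (names and network are assumptions, not from the
# original text): a two-layer network trained with SGD plus momentum.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))             # toy inputs
y = rng.normal(size=(32, 1))             # toy targets

weights = [rng.normal(size=(4, 8)) * 0.1, rng.normal(size=(8, 1)) * 0.1]
grads = [np.zeros_like(w) for w in weights]
accumulators = None                      # momentum buffers, built lazily
lr, momentum = 0.05, 0.9
losses = []

for step in range(200):
    # forward pass through all layers
    h = np.maximum(0.0, X @ weights[0])  # ReLU hidden layer
    pred = h @ weights[1]
    losses.append(float(np.mean((pred - y) ** 2)))

    # backward pass: gradients for each set of weights
    d_pred = 2.0 * (pred - y) / len(X)
    grads[1] = h.T @ d_pred
    d_h = (d_pred @ weights[1].T) * (h > 0)
    grads[0] = X.T @ d_h

    # initialize the accumulators on the first iteration
    if accumulators is None:
        accumulators = [np.zeros_like(w) for w in weights]

    # update the learning rate (simple decay, an assumed schedule)
    lr *= 0.999

    # apply the gradient to every set of weights, then zero the gradients
    for w, v, g in zip(weights, accumulators, grads):
        v *= momentum
        v += g
        w -= lr * v
        g[...] = 0.0
```

After the loop, `losses` should trend downward, confirming that each pass of forward, backward, update, and zeroing moves the weights toward a lower error.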