MobileNets

Howard et al. proposed MobileNets, a family of models designed for faster inference on mobile and embedded devices. The following is an illustration of the use of mobile inference for various models. The models generated with this technique can be served from the cloud as well:

Reproduced from Howard et al.: Use of mobile inference for various models

There are three ways to apply convolution, as shown in the following diagram:

Reproduced from Howard et al.
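The three convolution variants (standard, depthwise, and 1x1 pointwise) differ sharply in parameter count. The following is a minimal sketch of the counts, ignoring biases; the kernel and channel sizes are illustrative choices, not values from the text:

```python
# Parameter counts for the three convolution types, ignoring biases.
# The sizes below (3x3 kernel, 64 input channels, 128 output channels)
# are illustrative assumptions.
D_K = 3    # kernel width/height
M = 64     # input channels
N = 128    # output channels

standard = D_K * D_K * M * N    # standard convolution: full spatial-channel filters
depthwise = D_K * D_K * M       # depthwise: one spatial filter per input channel
pointwise = 1 * 1 * M * N       # pointwise: 1x1 convolution mixing channels

print(standard)               # 73728
print(depthwise + pointwise)  # 8768, the depthwise separable total
```

The depthwise separable combination keeps the channel-mixing capacity of the 1x1 step while paying only a small per-channel spatial cost, which is where most of the savings come from.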

A standard convolution can be replaced with a depthwise separable convolution, as follows:

Reproduced from Howard et al.
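The replacement above factors one convolution into a depthwise step (filtering each channel independently) followed by a pointwise step (mixing channels with a 1x1 convolution). A minimal NumPy sketch of the forward pass, with stride 1 and valid padding assumed for simplicity:

```python
import numpy as np

def depthwise_separable_conv(x, depth_filters, point_filters):
    """Depthwise convolution followed by a 1x1 pointwise convolution.

    x: (H, W, M) input feature map
    depth_filters: (k, k, M) one spatial filter per input channel
    point_filters: (M, N) weights of the 1x1 channel-mixing convolution
    Stride 1, valid padding; a sketch, not an optimized implementation.
    """
    H, W, M = x.shape
    k = depth_filters.shape[0]
    out_h, out_w = H - k + 1, W - k + 1
    # Depthwise step: each channel is convolved independently.
    dw = np.empty((out_h, out_w, M))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + k, j:j + k, :]                    # (k, k, M)
            dw[i, j, :] = (patch * depth_filters).sum(axis=(0, 1))
    # Pointwise step: a 1x1 convolution is a per-pixel matrix multiply.
    return dw @ point_filters                                 # (out_h, out_w, N)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
out = depthwise_separable_conv(
    x,
    rng.standard_normal((3, 3, 4)),   # depthwise filters
    rng.standard_normal((4, 16)),     # pointwise filters
)
print(out.shape)  # (6, 6, 16)
```

In practice a framework layer such as a built-in separable convolution would be used instead of explicit loops; the sketch only makes the two-step structure visible.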

The following is a graph showing how accuracy varies with the number of operations performed:

Reproduced from Howard et al.
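The reduction in operations behind this trade-off can be computed directly. For a D_F x D_F feature map with M input channels, N output channels, and a D_K x D_K kernel, a standard convolution costs D_K² · M · N · D_F² multiply-adds, while the depthwise separable version costs D_K² · M · D_F² + M · N · D_F², a ratio of 1/N + 1/D_K². The layer sizes below are illustrative:

```python
# Multiply-add counts for one layer, for a D_F x D_F feature map with
# M input and N output channels and a D_K x D_K kernel.
# The layer sizes are illustrative assumptions.
D_K, D_F, M, N = 3, 14, 256, 256

standard = D_K**2 * M * N * D_F**2
separable = D_K**2 * M * D_F**2 + M * N * D_F**2
ratio = separable / standard    # equals 1/N + 1/D_K**2

print(round(ratio, 4))  # 0.115
```

With a 3x3 kernel the ratio is dominated by the 1/D_K² = 1/9 term, so roughly an eight- to nine-fold reduction in operations is typical, which matches the shift along the operations axis in the graph.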

The following is a graph representing the dependence of accuracy on the number of parameters. The parameters are plotted on a logarithmic scale:

Reproduced from Howard et al.

From the preceding discussion, it is clear that techniques such as quantization and depthwise separable convolutions give a performance boost to model inference. In the next section, we will see how TensorFlow Serving can be used to serve models in production.
