# Predict with pre-trained models
This tutorial explains how to recognize objects in an image with a
pre-trained model, and how to perform feature extraction.

## Prerequisites
To complete this tutorial, we need:
- MXNet. See the instructions for your operating system in [Setup and Installation](http://mxnet.io/install/index.html)
- [Python Requests](http://docs.python-requests.org/en/master/), [Matplotlib](https://matplotlib.org/), and [Jupyter Notebook](http://jupyter.org/index.html).

```
$ pip install requests matplotlib jupyter opencv-python
```
## Loading

We first download a pre-trained 152-layer ResNet model trained on the full
ImageNet dataset, which has over 10 million images and 10 thousand classes. A
pre-trained model consists of two parts: a JSON file containing the model
definition and a binary file containing the parameters. In addition, there may be
a text file for the labels.
2017-05-07 22:19:25 -07:00
```python
import mxnet as mx
path = 'http://data.mxnet.io/models/imagenet-11k/'
[mx.test_utils.download(path+'resnet-152/resnet-152-symbol.json'),
 mx.test_utils.download(path+'resnet-152/resnet-152-0000.params'),
 mx.test_utils.download(path+'synset.txt')]
```
Next, we load the downloaded model. *Note:* If a GPU is available, we can replace all
occurrences of `mx.cpu()` with `mx.gpu()` to accelerate the computation.
```python
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-152', 0)
mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))],
         label_shapes=mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing=True)
with open('synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]
```
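Each line of `synset.txt` holds a WordNet ID followed by a comma-separated list of class names. A minimal sketch of how such a line can be split (the sample line here is illustrative, not copied from the downloaded file):

```python
# Parse a synset-style label line: "<wordnet-id> <name1>, <name2>, ..."
# The sample line below is illustrative, not from the actual file.
line = 'n02084071 dog, domestic dog, Canis familiaris'
wnid, names = line.split(' ', 1)
class_names = [n.strip() for n in names.split(',')]
print(wnid)         # the WordNet ID, e.g. 'n02084071'
print(class_names)  # list of human-readable names
```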
## Predicting
We first define helper functions for downloading an image and performing the
prediction:
```python
%matplotlib inline
import matplotlib.pyplot as plt
import cv2
import numpy as np
# define a simple data batch
from collections import namedtuple
Batch = namedtuple('Batch', ['data'])

def get_image(url, show=False):
    # download and show the image
    fname = mx.test_utils.download(url)
    img = cv2.imread(fname)
    if img is None:
        return None
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    if show:
        plt.imshow(img)
        plt.axis('off')
    # convert into format (batch, RGB, width, height)
    img = cv2.resize(img, (224, 224))
    img = np.swapaxes(img, 0, 2)
    img = np.swapaxes(img, 1, 2)
    img = img[np.newaxis, :]
    return img

def predict(url):
    img = get_image(url, show=True)
    # compute the predicted probabilities
    mod.forward(Batch([mx.nd.array(img)]))
    prob = mod.get_outputs()[0].asnumpy()
    # print the top-5
    prob = np.squeeze(prob)
    a = np.argsort(prob)[::-1]
    for i in a[0:5]:
        print('probability=%f, class=%s' % (prob[i], labels[i]))
```
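The two `swapaxes` calls in `get_image` reorder the image from height × width × channel to channel × height × width. A small NumPy sketch showing they are equivalent to a single `transpose` (the random array is just a stand-in for a decoded image):

```python
import numpy as np

# stand-in for a decoded 224x224 RGB image in (H, W, C) layout
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

# the two swapaxes calls from get_image: (H, W, C) -> (C, H, W)
a = np.swapaxes(np.swapaxes(img, 0, 2), 1, 2)
# the same reordering expressed as a single transpose
b = img.transpose(2, 0, 1)

assert (a == b).all()
print(a.shape)  # (3, 224, 224)
```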
Now, we can perform prediction with any downloadable URL:

```python
predict('http://writm.com/wp-content/uploads/2016/08/Cat-hd-wallpapers.jpg')
```
```python
predict('http://thenotoriouspug.com/wp-content/uploads/2015/01/Pug-Cookie-1920x1080-1024x576.jpg')
```
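The top-5 loop in `predict` works because `np.argsort` returns indices in ascending order of probability, which `[::-1]` then reverses. A small sketch on a made-up probability vector:

```python
import numpy as np

# made-up probabilities over 6 classes (purely illustrative)
prob = np.array([0.05, 0.40, 0.10, 0.25, 0.15, 0.02])

# indices sorted by descending probability, as in predict()
top5 = np.argsort(prob)[::-1][:5]
print(top5.tolist())  # [1, 3, 4, 2, 0]
```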
## Feature extraction

By feature extraction, we mean representing the input images by the output of an
internal layer rather than the last softmax layer. These outputs, which can be
viewed as features of the raw input image, can then be used by other
applications such as object detection.

We can use the `get_internals` method to get all internal layers from a
Symbol.
```python
# list the last 10 layers
all_layers = sym.get_internals()
all_layers.list_outputs()[-10:]
```
A commonly used layer for feature extraction is the one before the last fully
connected layer. For ResNet, and also Inception, it is the flattened layer with
name `flatten0`, which reshapes the 4-D convolutional layer output into 2-D for
the fully connected layer. The following source code extracts a new Symbol which
outputs the flattened layer and creates a model.
```python
fe_sym = all_layers['flatten0_output']
fe_mod = mx.mod.Module(symbol=fe_sym, context=mx.cpu(), label_names=None)
fe_mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
fe_mod.set_params(arg_params, aux_params)
```
We can now invoke `forward` to obtain the features:
2017-05-07 22:19:25 -07:00
` ``python
img = get_image('http://writm.com/wp-content/uploads/2016/08/Cat-hd-wallpapers.jpg')
fe_mod.forward(Batch([mx.nd.array(img)]))
features = fe_mod.get_outputs()[0].asnumpy()
print(features)
assert features.shape == (1, 2048)
` ``
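Once extracted, the 2048-dimensional features can feed other applications, for instance comparing two images by cosine similarity. A minimal NumPy sketch using random stand-ins for the feature vectors (`cosine_similarity` is a hypothetical helper, not part of MXNet):

```python
import numpy as np

def cosine_similarity(a, b):
    # cosine of the angle between two feature vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# random stand-ins for two 2048-dimensional features from fe_mod
rng = np.random.default_rng(0)
f1 = rng.standard_normal(2048)
f2 = rng.standard_normal(2048)

print(cosine_similarity(f1, f1))  # ~1.0 for identical features
print(cosine_similarity(f1, f2))  # near 0 for unrelated random vectors
```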
<!-- INSERT SOURCE DOWNLOAD BUTTONS -->