# Module - Neural network training and inference

The `module` package (`mod` for short) bundles commonly used code for training
and inference. It provides intermediate-level and high-level interfaces for
executing predefined networks.

## Preliminary

In this tutorial, we will train a multilayer perceptron on the
[UCI letter recognition](https://archive.ics.uci.edu/ml/datasets/letter+recognition)
dataset to demonstrate the usage of `Module`.

We first download and split the dataset, and then create iterators that return a
batch of examples each time.

```python
import logging
logging.getLogger().setLevel(logging.INFO)
import mxnet as mx
import numpy as np

# download the dataset; each row is a letter label followed by 16 features
fname = mx.test_utils.download('http://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/letter-recognition.data')
data = np.genfromtxt(fname, delimiter=',')[:,1:]
# map the letters 'A'..'Z' to the class indices 0..25
with open(fname, 'r') as f:
    label = np.array([ord(l.split(',')[0])-ord('A') for l in f])

# use the first 80% of the examples for training and the rest for validation
batch_size = 32
ntrain = int(data.shape[0]*0.8)
train_iter = mx.io.NDArrayIter(data[:ntrain, :], label[:ntrain], batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(data[ntrain:, :], label[ntrain:], batch_size)
```
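
To make the label parsing and the 80/20 split concrete, here is a minimal sketch in plain Python that applies the same steps to a few made-up rows. The rows and the helper name `parse_rows` are assumptions for illustration; they are not part of the dataset or of MXNet.

```python
# Hypothetical rows in the same layout as letter-recognition.data:
# a capital letter followed by integer features (real rows have 16 features).
rows = [
    "T,2,8,3,5",
    "I,5,12,3,7",
    "D,4,11,6,8",
    "N,7,11,6,6",
    "G,2,1,3,1",
]

def parse_rows(lines):
    """Split each line into a class index (A=0, ..., Z=25) and its features."""
    labels, features = [], []
    for line in lines:
        fields = line.strip().split(',')
        labels.append(ord(fields[0]) - ord('A'))
        features.append([float(v) for v in fields[1:]])
    return labels, features

labels, features = parse_rows(rows)

# Same split rule as in the tutorial: the first 80% of rows are for training.
ntrain = int(len(rows) * 0.8)
train_x, train_y = features[:ntrain], labels[:ntrain]
val_x, val_y = features[ntrain:], labels[ntrain:]

print(train_y)  # class indices of the training rows
print(val_y)    # class indices of the validation rows
```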

Next we define the network:

```python
net = mx.sym.Variable('data')
net = mx.sym.FullyConnected(net, name='fc1', num_hidden=64)
net = mx.sym.Activation(net, name='relu1', act_type="relu")
net = mx.sym.FullyConnected(net, name='fc2', num_hidden=26)
net = mx.sym.SoftmaxOutput(net, name='softmax')
mx.viz.plot_network(net)
```
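
As a sanity check on the network size, the sketch below counts the trainable parameters of the two fully connected layers in plain Python. It assumes the 16 input features of the letter recognition dataset; `fc_param_count` is a hypothetical helper used only for this illustration.

```python
def fc_param_count(num_inputs, num_hidden):
    """Parameters of a fully connected layer: a weight matrix plus a bias vector."""
    return num_inputs * num_hidden + num_hidden

num_features = 16                        # assumption: 16 features per example
fc1 = fc_param_count(num_features, 64)   # fc1: 16 inputs -> 64 hidden units
fc2 = fc_param_count(64, 26)             # fc2: 64 hidden units -> 26 classes

print(fc1, fc2, fc1 + fc2)
```

The activation and softmax layers add no trainable parameters, so the sum covers the whole network.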

## High-level Interface

### Create Module

Now we are ready to introduce modules. The most commonly used module class is
`Module`. We can construct a module by specifying:

- `symbol`: the network definition
- `context`: the device (or a list of devices) for execution
- `data_names`: the list of input data variable names
- `label_names`: the list of input label variable names

For `net`, we have only one data input, named `data`, and one label, named
`softmax_label`. The label name is generated automatically from the name
`softmax` that we specified for the `SoftmaxOutput` operator.

```python
mod = mx.mod.Module(symbol=net,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])
```

### Train, Predict, and Evaluate

Modules provide high-level APIs for training, predicting and evaluating. To fit
a module, simply call the `fit` function.

```python
mod.fit(train_iter,
        eval_data=val_iter,
        optimizer='sgd',
        optimizer_params={'learning_rate':0.1},
        eval_metric='acc',
        num_epoch=8)
```

To predict with a module, simply call `predict()`. It will collect and return
all the prediction results.

```python
y = mod.predict(val_iter)
assert y.shape == (4000, 26)
```
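
Each row of the prediction output holds one score per class, 26 in this case. The following is a minimal sketch of turning such rows into letters. It uses made-up scores over three classes instead of the real output, and `to_letter` is a hypothetical helper.

```python
def to_letter(scores):
    """Return the letter whose class index has the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return chr(ord('A') + best)

# Made-up scores for three examples over the first three classes (A, B, C).
fake_predictions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.6, 0.1],
]

letters = [to_letter(row) for row in fake_predictions]
print(letters)
```

With the real output, `y.asnumpy()` should give a NumPy array whose rows can be processed the same way.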

If we do not need the prediction outputs, but just need to evaluate on a test
set, we can call the `score()` function:

```python
mod.score(val_iter, ['mse', 'acc'])
```
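
For intuition about what the `acc` metric measures, the plain-Python sketch below computes the fraction of examples whose highest-scoring class matches the label. It is an illustration on made-up numbers, not MXNet's implementation, and the function name `accuracy` is our own.

```python
def accuracy(predictions, labels):
    """Fraction of rows whose arg-max class equals the corresponding label."""
    correct = 0
    for scores, label in zip(predictions, labels):
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        correct += int(predicted == label)
    return correct / len(labels)

# Made-up scores over three classes and the true class indices.
predictions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.6, 0.1],
    [0.5, 0.4, 0.1],
]
labels = [0, 2, 0, 0]

print(accuracy(predictions, labels))
```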

### Save and Load

We can save the module parameters in each training epoch by using a checkpoint
callback.

```python
# construct a callback function to save checkpoints
model_prefix = 'mx_mlp'
checkpoint = mx.callback.do_checkpoint(model_prefix)

mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)
```
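
As far as we know, the checkpoint callback writes the network definition to `<prefix>-symbol.json` and the parameters after each epoch to `<prefix>-<epoch>.params`, with a four-digit, one-based epoch number. Under that assumption, this sketch lists the files we would expect after the five epochs above. `expected_checkpoint_files` is a hypothetical helper, not an MXNet function.

```python
def expected_checkpoint_files(prefix, num_epoch):
    """File names assumed to be produced by a per-epoch checkpoint callback."""
    files = ['%s-symbol.json' % prefix]
    for epoch in range(1, num_epoch + 1):
        files.append('%s-%04d.params' % (prefix, epoch))
    return files

for name in expected_checkpoint_files('mx_mlp', 5):
    print(name)
```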

To load the saved module parameters, call the `load_checkpoint` function. It
loads the symbol and the associated parameters. We can then set the loaded
parameters into the module.

```python
sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 3)
assert sym.tojson() == net.tojson()

# assign the loaded parameters to the module
mod.set_params(arg_params, aux_params)
```

Or, if we just want to resume training from a saved checkpoint, instead of
calling `set_params()`, we can directly call `fit()`, passing the loaded
parameters, so that `fit()` knows to start from those parameters instead of
initializing them randomly. We also set `begin_epoch` so that `fit()` knows we
are resuming from a previously saved epoch.

```python
mod = mx.mod.Module(symbol=sym)
mod.fit(train_iter,
        num_epoch=8,
        arg_params=arg_params,
        aux_params=aux_params,
        begin_epoch=3)
```
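
It can help to spell out which epochs such a resumed run covers. The sketch below assumes that `fit()` iterates over epochs from `begin_epoch` up to, but not including, `num_epoch`, so that `num_epoch` is the total epoch count rather than the number of additional epochs.

```python
begin_epoch = 3   # epochs 0, 1, 2 were finished before the checkpoint was saved
num_epoch = 8     # total number of epochs, counting the ones already done

# Assumption: training covers the half-open range [begin_epoch, num_epoch).
remaining_epochs = list(range(begin_epoch, num_epoch))
print(remaining_epochs)
```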

## Intermediate-level Interface

We have already seen how to use a module for basic training and inference. Now
we are going to show a more flexible usage of modules. Instead of calling the
high-level `fit` and `predict`, we can write a training program with the
intermediate-level interface, using functions such as `forward` and `backward`.

```python
# create module
mod = mx.mod.Module(symbol=net)
# allocate memory given the input data and label shapes
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
# initialize parameters by uniform random numbers
mod.init_params(initializer=mx.init.Uniform(scale=.1))
# use SGD with learning rate 0.1 to train
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))
# use accuracy as the metric
metric = mx.metric.create('acc')
# train for 5 epochs; each epoch is one pass over the data iterator
for epoch in range(5):
    train_iter.reset()
    metric.reset()
    for batch in train_iter:
        mod.forward(batch, is_train=True)       # compute predictions
        mod.update_metric(metric, batch.label)  # accumulate prediction accuracy
        mod.backward()                          # compute gradients
        mod.update()                            # update parameters
    print('Epoch %d, Training %s' % (epoch, metric.get()))
```
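
The four calls in the inner loop correspond to the usual steps of gradient-based training. To show what each step does without any framework, here is a minimal plain-Python sketch for logistic regression on made-up 1-D data. It swaps the tutorial's network for a two-parameter model and is not how `Module` is implemented; all names in it are our own.

```python
import math

# Toy 1-D data: the label is 1 when the feature is positive (made-up values).
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]

def forward(w, b, x):
    """Predicted probability of class 1 under logistic regression."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

w, b = 0.0, 0.0       # parameters, initialized to zero for reproducibility
learning_rate = 0.1
history = []          # training accuracy per epoch

for epoch in range(5):
    correct = 0                                 # reset the metric
    for x, y in data:
        p = forward(w, b, x)                    # forward: compute the prediction
        correct += int((p > 0.5) == (y == 1))   # accumulate prediction accuracy
        grad_w = (p - y) * x                    # backward: cross-entropy gradients
        grad_b = p - y
        w -= learning_rate * grad_w             # update: one SGD step
        b -= learning_rate * grad_b
    history.append(correct / len(data))
    print('Epoch %d, Training accuracy %.2f' % (epoch, history[-1]))
```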