2017-03-18 23:45:00 -07:00
# Optimization: initialize and update weights
## Overview
This document summaries the APIs used to initialize and update the model weights
during training
``` eval_rst
.. autosummary::
:nosignatures:
mxnet.initializer
mxnet.optimizer
mxnet.lr_scheduler
```
2017-03-20 12:49:13 -07:00
and how to develop a new optimization algorithm in MXNet.
2017-03-18 23:45:00 -07:00
Assume there there is a pre-defined ``Symbol` ` and a ` `Module` ` is created for
it
` ``python
>>> data = mx.symbol.Variable('data')
>>> label = mx.symbol.Variable('softmax_label')
>>> fc = mx.symbol.FullyConnected(data, name='fc', num_hidden=10)
>>> loss = mx.symbol.SoftmaxOutput(fc, label, name='softmax')
>>> mod = mx.mod.Module(loss)
>>> mod.bind(data_shapes=[('data', (128,20))], label_shapes=[('softmax_label', (128,))])
` ``
Next we can initialize the weights with values sampled uniformly from
` `[-1,1]` `:
` ``python
>>> mod.init_params(mx.initializer.Uniform(scale=1.0))
` ``
Then we will train a model with standard SGD which decreases the learning rate
by multiplying 0.9 for each 100 batches.
` ``python
>>> lr_sch = mx.lr_scheduler.FactorScheduler(step=100, factor=0.9)
>>> mod.init_optimizer(
... optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ('lr_scheduler', lr_sch)))
` ``
Finally run ` `mod.fit(...)` ` to start training.
## The ` `mxnet.initializer` ` package
` ``eval_rst
.. currentmodule:: mxnet.initializer
` ``
The base class ` `Initializer` ` defines the default behaviors to initialize
various parameters, such as set bias to 1, except for the weight. Other classes
then defines how to initialize the weight.
` ``eval_rst
.. autosummary::
:nosignatures:
Initializer
Uniform
Normal
Load
Mixed
Zero
One
Constant
Orthogonal
Xavier
MSRAPrelu
Bilinear
FusedRNN
` ``
## The ` `mxnet.optimizer` ` package
` ``eval_rst
.. currentmodule:: mxnet.optimizer
` ``
The base class ` `Optimizer` ` accepts commonly shared arguments such as
` `learning_rate` ` and defines the interface. Each other class in this package
implements one weight updating function.
` ``eval_rst
.. autosummary::
:nosignatures:
Optimizer
SGD
NAG
RMSProp
Adam
AdaGrad
AdaDelta
2018-02-13 08:01:14 -08:00
Adamax
Nadam
2017-03-18 23:45:00 -07:00
DCASGD
SGLD
2018-02-13 08:01:14 -08:00
Signum
FTML
LBSGD
Ftrl
2017-03-18 23:45:00 -07:00
` ``
## The ` `mxnet.lr_scheduler` ` package
` ``eval_rst
.. currentmodule:: mxnet.lr_scheduler
` ``
The base class ` `LRScheduler` ` defines the interface, while other classes
implement various schemes to change the learning rate during training.
` ``eval_rst
.. autosummary::
:nosignatures:
LRScheduler
FactorScheduler
MultiFactorScheduler
` ``
2017-03-20 12:49:13 -07:00
## Implement a new algorithm
2017-03-18 23:45:00 -07:00
Most classes listed in this document are implemented in Python by using ` `NDArray` `.
So implementing new weight updating or initialization functions is
straightforward.
For ` initializer`, create a subclass of ` `Initializer` ` and define the
` _init_weight` method. We can also change the default behaviors to initialize
other parameters such as ` _init_bias`. See
[` initializer.py`](https://github.com/dmlc/mxnet/blob/master/python/mxnet/initializer.py)
for examples.
For ` `optimizer` `, create a subclass of ` `Optimizer` `
and implement two methods ` `create_state` ` and ` `update` `. Also add
` `@mx.optimizer.Optimizer.register` ` before this class. See
[` optimizer.py`](https://github.com/dmlc/mxnet/blob/master/python/mxnet/optimizer.py)
for examples.
For ` lr_scheduler`, create a subclass of ` LRScheduler` and then implement the
` __call__` method. See
[` lr_scheduler.py`](https://github.com/dmlc/mxnet/blob/master/python/mxnet/lr_scheduler.py)
for examples.
## API Reference
<script type="text/javascript" src='../../_static/js/auto_module_index.js'></script>
` ``eval_rst
.. automodule:: mxnet.optimizer
:members:
.. automodule:: mxnet.lr_scheduler
:members:
.. automodule:: mxnet.initializer
:members:
` ``
<script>auto_index("api-reference");</script>