# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Some of the tests using CUDNN require a special GPU instruction called dp4a.
Ref: http://images.nvidia.com/content/pdf/tesla/184457-Tesla-P4-Datasheet-NV-Final-Letter-Web.pdf
"""
import os
import mxnet as mx
import numpy as np
from mxnet.gluon.model_zoo import vision
from mxnet.test_utils import assert_almost_equal, assert_exception, rand_ndarray, rand_shape_nd, same, DummyIter
from common import with_seed
from mxnet.module import Module
from mxnet.io import NDArrayIter
import unittest
import operator


def is_test_for_gpu():
    return mx.current_context().device_type == 'gpu'


def is_test_for_mkldnn():
    return (mx.current_context().device_type == 'cpu'
            and os.environ.get('ENABLE_MKLDNN_QUANTIZATION_TEST') == '1')


def is_test_for_native_cpu():
    return (mx.current_context().device_type == 'cpu'
            and os.environ.get('ENABLE_MKLDNN_QUANTIZATION_TEST') is None)
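

# Usage sketch (illustrative only; `_example_guarded_test` is a hypothetical
# name, not part of this suite): each quantization test consults the active
# context and the guards above to decide which backend variant to exercise.
def _example_guarded_test():
    if is_test_for_gpu():
        pass  # exercise the cuDNN int8 path (may additionally require dp4a)
    elif is_test_for_mkldnn():
        pass  # exercise the MKL-DNN int8 path
    elif is_test_for_native_cpu():
        pass  # exercise the naive CPU reference path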


@with_seed()
def test_quantize_float32_to_int8():
    shape = rand_shape_nd(4)
    data = rand_ndarray(shape, 'default', dtype='float32')
    min_range = mx.nd.min(data)
    max_range = mx.nd.max(data)
    qdata, min_val, max_val = mx.nd.contrib.quantize(data, min_range, max_range, out_type='int8')
    data_np = data.asnumpy()
    min_range = min_range.asscalar()
    max_range = max_range.asscalar()
    # Quantization is zero-centered (symmetric): the effective float range is
    # [-real_range, real_range] with real_range = max(|min|, |max|).
    real_range = np.maximum(np.abs(min_range), np.abs(max_range))
    quantized_range = 127.0
    scale = quantized_range / real_range
    assert qdata.dtype == np.int8
    assert min_val.dtype == np.float32
    assert max_val.dtype == np.float32
    assert same(min_val.asscalar(), -real_range)
    assert same(max_val.asscalar(), real_range)
    # Baseline: round half away from zero, then saturate to [-127, 127].
    qdata_np = (np.sign(data_np) * np.minimum(np.abs(data_np) * scale + 0.5, quantized_range)).astype(np.int8)
    assert_almost_equal(qdata.asnumpy(), qdata_np, atol=1)
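

# Minimal usage sketch (illustrative; `_example_quantize_usage` is a
# hypothetical helper, not part of this suite). With real_range = 2.0 the
# scale is 127 / 2.0 = 63.5, so 1.0 quantizes to 64 and 2.0 to 127 after
# rounding and saturation.
def _example_quantize_usage():
    data = mx.nd.array([[-2.0, -1.0], [1.0, 2.0]], dtype='float32')
    qdata, min_val, max_val = mx.nd.contrib.quantize(
        data, mx.nd.min(data), mx.nd.max(data), out_type='int8')
    print(qdata.asnumpy())                         # [[-127 -64] [64 127]]
    print(min_val.asscalar(), max_val.asscalar())  # -2.0 2.0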


@with_seed()
def test_dequantize_int8_to_float32():
    def get_test_data(real_range, qdata_np):
        qdata = mx.nd.array(qdata_np, dtype=np.int8)
        min_range = mx.nd.array([-real_range], dtype=np.float32)
        max_range = mx.nd.array([real_range], dtype=np.float32)
        return qdata, min_range, max_range

    def baseline_dequantization(qdata, real_range, qdata_np):
        # Dequantization maps an int8 value q back to q * real_range / 127.
        quantized_range = 127.0
        scale = real_range / quantized_range
        data_np = qdata_np * scale
        return data_np

    def test_nd_array_dequantization(qdata, min_range, max_range, expected_result):
        data = mx.nd.contrib.dequantize(qdata, min_range, max_range, out_type='float32')
        assert data.dtype == np.float32
        assert_almost_equal(data.asnumpy(), expected_result, atol=1)

    def test_symbolic_api_dequantization(qdata, min_range, max_range, expected_result):
        sym_data = mx.sym.Variable('data')
        sym_min_range = mx.sym.Variable('min_range')
        sym_max_range = mx.sym.Variable('max_range')
        dequant = mx.sym.contrib.dequantize(sym_data, sym_min_range,
                                            sym_max_range, out_type='float32')
        out = dequant.bind(ctx=mx.current_context(),
                           args={'data': qdata, 'min_range': min_range, 'max_range': max_range})
        data = out.forward()[0]
        assert data.dtype == np.float32
        assert_almost_equal(data.asnumpy(), expected_result, atol=1)

    real_range = 128
    shape = rand_shape_nd(4)
    qdata_np = np.random.uniform(low=-127, high=127, size=shape).astype(dtype=np.int8)
    qdata, min_range, max_range = get_test_data(real_range, qdata_np)
    expected_result = baseline_dequantization(qdata, real_range, qdata_np)
    # Test the NDArray implementation.
    test_nd_array_dequantization(qdata, min_range, max_range, expected_result)
    # Test the symbolic API implementation.
    test_symbolic_api_dequantization(qdata, min_range, max_range, expected_result)
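

# Round-trip sketch (illustrative; `_example_dequantize_round_trip` is a
# hypothetical helper): dequantize recovers q * real_range / 127, so a
# quantize -> dequantize round trip is lossy by up to one scale step.
def _example_dequantize_round_trip():
    qdata = mx.nd.array([[-127, -64], [64, 127]], dtype=np.int8)
    min_range = mx.nd.array([-2.0], dtype=np.float32)
    max_range = mx.nd.array([2.0], dtype=np.float32)
    data = mx.nd.contrib.dequantize(qdata, min_range, max_range, out_type='float32')
    print(data.asnumpy())  # approx. [[-2.0 -1.008] [1.008 2.0]], scale = 2/127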


@with_seed()
def test_requantize_int32_to_int8():
    def quantized_int32_to_float(qdata, min_range, max_range):
        assert qdata.dtype == 'int32'
        quantized_range = np.iinfo('int32').max
        real_range = np.maximum(np.abs(min_range), np.abs(max_range))
        scale = float(real_range) / float(quantized_range)
        return qdata.astype('float32') * scale

    def float_to_quantized_int8(data, min_range, max_range):
        assert data.dtype == 'float32'
        real_range = np.maximum(np.abs(min_range), np.abs(max_range))
        quantized_range = np.iinfo('int8').max
        scale = float(quantized_range) / float(real_range)
        return (np.sign(data) * np.minimum(np.abs(data) * scale + 0.5, quantized_range)).astype('int8')

    def requantize(qdata, min_data, max_data, real_range):
        # Requantization recovers the float value from the int32 encoding and
        # re-quantizes it into int8 over [-real_range, real_range].
        data = quantized_int32_to_float(qdata, min_data, max_data)
        output = float_to_quantized_int8(data, -real_range, real_range)
        return output, -real_range, real_range

    def requantize_baseline(qdata, min_data, max_data, min_calib_range=None, max_calib_range=None):
        if min_calib_range is not None and max_calib_range is not None:
            # Calibrated mode: the output range is supplied ahead of time.
            real_range = np.maximum(np.abs(min_calib_range), np.abs(max_calib_range))
            return requantize(qdata, min_data, max_data, real_range)
        else:
            # Uncalibrated mode: derive the output range from the data itself.
            min_range = quantized_int32_to_float(np.min(qdata), min_data, max_data)
            max_range = quantized_int32_to_float(np.max(qdata), min_data, max_data)
            return requantize(qdata, min_data, max_data, np.maximum(np.abs(min_range), np.abs(max_range)))

    def check_requantize(shape, min_calib_range=None, max_calib_range=None):
        qdata = mx.nd.random.uniform(low=-1000.0, high=1000.0, shape=shape).astype('int32')
        min_range = mx.nd.array([-1010.0])
        max_range = mx.nd.array([1020.0])
        if min_calib_range is None or max_calib_range is None:
            qdata_int8, min_output, max_output = mx.nd.contrib.requantize(qdata, min_range, max_range)
        else:
            qdata_int8, min_output, max_output = mx.nd.contrib.requantize(qdata, min_range, max_range,
                                                                          min_calib_range=min_calib_range,
                                                                          max_calib_range=max_calib_range)
        qdata_int8_np, min_output_np, max_output_np = requantize_baseline(qdata.asnumpy(), min_range.asscalar(),
                                                                          max_range.asscalar(),
                                                                          min_calib_range=min_calib_range,
                                                                          max_calib_range=max_calib_range)
        assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np, atol=1)
        assert_almost_equal(min_output.asnumpy(), np.array([min_output_np]))
        assert_almost_equal(max_output.asnumpy(), np.array([max_output_np]))
    def check_requantize_with_symbol(shape, min_calib_range=None, max_calib_range=None):
        qdata = mx.nd.random.uniform(low=-1000.0, high=1000.0, shape=shape).astype('int32')
        min_range = mx.nd.array([-1010.0])
        max_range = mx.nd.array([1020.0])
        sym_data = mx.sym.Variable('data')
        sym_min_range = mx.sym.Variable('min_range')
        sym_max_range = mx.sym.Variable('max_range')
        if min_calib_range is None or max_calib_range is None:
            requant = mx.sym.contrib.requantize(sym_data, sym_min_range, sym_max_range)
            out = requant.bind(ctx=mx.current_context(),
                               args={'data': qdata, 'min_range': min_range,
                                     'max_range': max_range})
            qdata_int8, min_output, max_output = out.forward()
        else:
            requant = mx.sym.contrib.requantize(sym_data, sym_min_range, sym_max_range,
                                                min_calib_range=min_calib_range,
                                                max_calib_range=max_calib_range)
            out = requant.bind(ctx=mx.current_context(), args={'data': qdata, 'min_range': min_range,
                                                               'max_range': max_range})
            qdata_int8, min_output, max_output = out.forward()
        qdata_int8_np, min_output_np, max_output_np = requantize_baseline(qdata.asnumpy(), min_range.asscalar(),
                                                                          max_range.asscalar(),
                                                                          min_calib_range=min_calib_range,
                                                                          max_calib_range=max_calib_range)
        assert_almost_equal(qdata_int8.asnumpy(), qdata_int8_np, atol=1)
        assert_almost_equal(min_output.asnumpy(), np.array([min_output_np]))
        assert_almost_equal(max_output.asnumpy(), np.array([max_output_np]))
    # Test with symbol API
    check_requantize_with_symbol((3, 4, 10, 10))
    check_requantize_with_symbol((32, 3, 23, 23))
    check_requantize_with_symbol((3, 4, 10, 10), min_calib_range=-1050.0, max_calib_range=1040.0)
    check_requantize_with_symbol((32, 3, 23, 23), min_calib_range=-134.349, max_calib_range=523.43)
    # Test with ndarray API
    check_requantize((3, 4, 10, 10))
    check_requantize((32, 3, 23, 23))
    check_requantize((3, 4, 10, 10), min_calib_range=-1050.0, max_calib_range=1040.0)
    check_requantize((32, 3, 23, 23), min_calib_range=-134.349, max_calib_range=523.43)
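
# A minimal NumPy sketch, under assumed conventions, of the zero-centered
# int32 -> int8 requantize arithmetic the checks above exercise. The helper
# name `_requantize_sketch` is illustrative only; requantize_baseline, defined
# earlier in this file, remains the authoritative reference.
def _requantize_sketch(data_int32, min_range, max_range,
                       min_calib_range=None, max_calib_range=None):
    # Target float range: the calibrated one when supplied, otherwise the
    # range carried alongside the int32 tensor.
    if min_calib_range is not None and max_calib_range is not None:
        real_range = max(abs(min_calib_range), abs(max_calib_range))
    else:
        real_range = max(abs(min_range), abs(max_range))
    # int32 values encode floats with scale_in; int8 outputs with scale_out.
    scale_in = max(abs(min_range), abs(max_range)) / float(np.iinfo(np.int32).max)
    scale_out = real_range / 127.0
    qdata_int8 = np.clip(np.round(data_int32 * (scale_in / scale_out)),
                         -127, 127).astype(np.int8)
    return qdata_int8, -real_range, real_range
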
@with_seed()
def test_quantized_conv():
    def check_quantized_conv(data_shape, kernel, num_filter, pad, stride, dilate, no_bias, qdtype):
        if is_test_for_native_cpu():
            print('skipped testing quantized_conv for native cpu since it is not supported yet')
            return
        elif is_test_for_mkldnn():
            # TODO(Xinyu): https://github.com/apache/mxnet/issues/16830
            print('skipped testing quantized_conv for mkldnn cpu since it is a flaky case')
            return
        elif qdtype == 'uint8' and is_test_for_gpu():
            print('skipped testing quantized_conv for gpu uint8 since it is not supported yet')
            return
        elif is_test_for_gpu() and len(data_shape) != 4:
            print('skipped testing quantized_conv for gpu 5d layout since it is not supported yet')
            return
        # run fp32 conv
        data = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
        conv = mx.sym.Convolution(data=data, kernel=kernel, num_filter=num_filter, pad=pad, stride=stride,
                                  dilate=dilate, no_bias=no_bias, cudnn_off=False, name='conv')
        arg_shapes, _, _ = conv.infer_shape(data=data_shape)
        arg_names = conv.list_arguments()
        conv_exe_fp32 = conv.simple_bind(ctx=mx.current_context(), grad_req='null')
        if qdtype == 'uint8':
            data_low = 0.0
            data_high = 127.0
        else:
            data_low = -127.0
            data_high = 127.0
        conv_exe_fp32.arg_dict[arg_names[0]][:] = mx.nd.random.uniform(low=data_low, high=data_high,
                                                                       shape=data_shape).astype('int32')
        conv_exe_fp32.arg_dict[arg_names[1]][:] = mx.nd.random.uniform(low=-127.0, high=127.0,
                                                                       shape=arg_shapes[1]).astype('int32')
        if not no_bias:
            conv_exe_fp32.arg_dict[arg_names[2]][:] = mx.nd.random.uniform(low=-127.0, high=127.0,
                                                                           shape=arg_shapes[2]).astype('int32')
        output = conv_exe_fp32.forward()[0]

        # run quantized conv
        qdata = mx.sym.Variable(name='qdata', shape=data_shape, dtype=qdtype)
        qweight = mx.sym.Variable(name='qweight', dtype='int8')
        min_data = mx.sym.Variable(name='min_data')
        max_data = mx.sym.Variable(name='max_data')
        min_weight = mx.sym.Variable(name='min_weight')
        max_weight = mx.sym.Variable(name='max_weight')
        quantized_conv = mx.sym.contrib.quantized_conv(data=qdata, weight=qweight, min_data=min_data,
                                                       max_data=max_data, min_weight=min_weight,
                                                       max_weight=max_weight, kernel=kernel,
                                                       num_filter=num_filter, pad=pad, stride=stride,
                                                       dilate=dilate, no_bias=no_bias)
        qarg_names = quantized_conv.list_arguments()
        type_dict = None
        if not no_bias:
            type_dict = {qarg_names[2]: 'int8'}
        conv_exe_int8 = quantized_conv.simple_bind(ctx=mx.current_context(), type_dict=type_dict, grad_req='null')
        conv_exe_int8.arg_dict[qarg_names[0]][:] = conv_exe_fp32.arg_dict[arg_names[0]].astype(qdtype)
        conv_exe_int8.arg_dict[qarg_names[1]][:] = conv_exe_fp32.arg_dict[arg_names[1]].astype('int8')
        quantized_range = 127.0
        # The remaining arguments are the (min, max) float ranges of data and weight
        # (and bias, when present); pinning them to +/-127 gives scale 1, so the
        # int8 path should reproduce the fp32 run on this integer-valued input.
        if no_bias:
            conv_exe_int8.arg_dict[qarg_names[2]][:] = -quantized_range
            conv_exe_int8.arg_dict[qarg_names[3]][:] = quantized_range
            conv_exe_int8.arg_dict[qarg_names[4]][:] = -quantized_range
            conv_exe_int8.arg_dict[qarg_names[5]][:] = quantized_range
        else:
            conv_exe_int8.arg_dict[qarg_names[2]][:] = conv_exe_fp32.arg_dict[arg_names[2]].astype('int8')
            conv_exe_int8.arg_dict[qarg_names[3]][:] = -quantized_range
            conv_exe_int8.arg_dict[qarg_names[4]][:] = quantized_range
            conv_exe_int8.arg_dict[qarg_names[5]][:] = -quantized_range
            conv_exe_int8.arg_dict[qarg_names[6]][:] = quantized_range
            conv_exe_int8.arg_dict[qarg_names[7]][:] = -quantized_range
            conv_exe_int8.arg_dict[qarg_names[8]][:] = quantized_range
        qoutput, min_range, max_range = conv_exe_int8.forward()
        if no_bias:
            assert_almost_equal(output.asnumpy(), qoutput.asnumpy(), atol=1)
        else:
            # with bias added, no element should be off by more than 2
            diff = mx.nd.abs(output - qoutput.astype(output.dtype))
            cond = mx.nd.lesser(2, diff).sum().asscalar()
            assert cond == 0
    for qdtype in ['int8', 'uint8']:
        check_quantized_conv((3, 4, 28, 28), (3, 3), 128, (1, 1), (1, 1), (1, 1), True, qdtype)
        check_quantized_conv((3, 4, 28, 28), (3, 3), 128, (1, 1), (1, 1), (1, 1), False, qdtype)
        check_quantized_conv((1, 3, 4, 28, 28), (1, 3, 3), 128, (1, 1, 1), (1, 1, 1), (1, 1, 1), False, qdtype)
        check_quantized_conv((1, 3, 4, 28, 28), (1, 3, 3), 128, (1, 1, 1), (1, 1, 1), (1, 1, 1), True, qdtype)
        check_quantized_conv((1, 3, 4, 28, 28), (1, 3, 3), 128, (1, 1, 1), (1, 1, 1), (2, 2, 2), False, qdtype)
        check_quantized_conv((1, 3, 4, 28, 28), (1, 3, 3), 128, (1, 1, 1), (1, 1, 1), (2, 2, 2), True, qdtype)
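
# For context: these operator-level checks underpin the user-facing calibration
# flow in mx.contrib.quantization. A hedged usage sketch (argument names per the
# MXNet 1.x contrib API; `sym`, `arg_params`, `aux_params`, and `calib_iter` are
# placeholders for a loaded fp32 model and a calibration data iterator):
#
#   qsym, qarg_params, aux_params = mx.contrib.quantization.quantize_model(
#       sym=sym, arg_params=arg_params, aux_params=aux_params,
#       ctx=mx.current_context(), excluded_sym_names=None,
#       calib_mode='entropy', calib_data=calib_iter,
#       num_calib_examples=32, quantized_dtype='int8')
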
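# Each quantized-operator test below follows the same pattern: build an FP32
# reference symbol, feed it integer-valued inputs, run the corresponding
# mx.sym.contrib quantized operator on the same (re-typed) data, and compare
# the dequantized result against the FP32 reference.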
@with_seed()
def test_quantized_elemwise_add():
def check_quantized_elemwise_add(data_shape, qtype):
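        # is_test_for_native_cpu() / is_test_for_gpu() are helpers from this
        # test module that inspect the current context and build flavor;
        # unsupported backend/dtype combinations are skipped, not failed.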
if is_test_for_native_cpu():
print('skipped testing quantized_elemwise_add for native cpu since it is not supported yet')
return
        elif qtype not in ('uint8', 'int8'):
            print('skipped testing quantized_elemwise_add for unsupported data type')
return
elif is_test_for_gpu():
print('skipped testing quantized_elemwise_add for gpu since it is not supported yet')
return
dataA = mx.sym.Variable(name='dataA', shape=data_shape, dtype='float32')
dataB = mx.sym.Variable(name='dataB', shape=data_shape, dtype='float32')
elemwise_add_fp32 = mx.sym.elemwise_add(dataA, dataB)
arg_names = elemwise_add_fp32.list_arguments()
elemwise_add_fp32_exe = elemwise_add_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
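        # Choose input ranges per dtype: uint8 spans [0, 255]; int8 uses the
        # symmetric range [-127, 127] (the value -128 is left unused, as is
        # common in symmetric quantization schemes).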
if qtype == 'uint8':
data_low = 0.0
data_high = 255.0
else:
data_low = -127.0
data_high = 127.0
dataA_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
dataB_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
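        # The uniform samples are truncated to integers so the FP32 reference
        # and the quantized path operate on exactly representable values.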
elemwise_add_fp32_exe.arg_dict[arg_names[0]][:] = dataA_val
elemwise_add_fp32_exe.arg_dict[arg_names[1]][:] = dataB_val
output = elemwise_add_fp32_exe.forward()[0]
qdataA = mx.sym.Variable(name='qdataA', shape=data_shape, dtype=qtype)
qdataB = mx.sym.Variable(name='qdataB', shape=data_shape, dtype=qtype)
min_dataA = mx.sym.Variable(name='min_dataA')
max_dataA = mx.sym.Variable(name='max_dataA')
min_dataB = mx.sym.Variable(name='min_dataB')
max_dataB = mx.sym.Variable(name='max_dataB')
quantized_elemwise_add = mx.sym.contrib.quantized_elemwise_add(qdataA, qdataB, min_dataA, max_dataA, min_dataB, max_dataB)
elemwise_add_int8_exe = quantized_elemwise_add.simple_bind(ctx=mx.current_context(), grad_req='null')
qarg_names = quantized_elemwise_add.list_arguments()
elemwise_add_int8_exe.arg_dict[qarg_names[0]][:] = elemwise_add_fp32_exe.arg_dict[arg_names[0]].astype(qtype)
elemwise_add_int8_exe.arg_dict[qarg_names[1]][:] = elemwise_add_fp32_exe.arg_dict[arg_names[1]].astype(qtype)
elemwise_add_int8_exe.arg_dict[qarg_names[2]][:] = data_low
elemwise_add_int8_exe.arg_dict[qarg_names[3]][:] = data_high
elemwise_add_int8_exe.arg_dict[qarg_names[4]][:] = data_low
elemwise_add_int8_exe.arg_dict[qarg_names[5]][:] = data_high
qoutput, min_range, max_range = elemwise_add_int8_exe.forward()
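        # Dequantize the op's output: scale by max_range / 0x7fffffff (INT32_MAX)
        # to map the quantized values back to float. The check below then
        # tolerates an absolute difference of at most 2 to absorb integer
        # rounding.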
        int8_rslt = qoutput.astype(output.dtype) * max_range / 0x7fffffff
diff = mx.nd.abs(output - int8_rslt)
cond = mx.nd.lesser(2, diff).sum().asscalar()
assert cond == 0
for qtype in ['int8', 'uint8']:
check_quantized_elemwise_add((4, 6), qtype)
check_quantized_elemwise_add((13, 74, 52), qtype)
check_quantized_elemwise_add((3, 4, 56, 56), qtype)
check_quantized_elemwise_add((32, 56, 64, 11), qtype)
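
# A minimal standalone sketch (not part of the test suite) of the symmetric
# quantize/dequantize arithmetic these tests exercise; the helper names and
# the `scale`/`max_abs` parameters are illustrative, not MXNet APIs.
def _toy_quantize_int8(x, max_abs):
    # map real values in [-max_abs, max_abs] onto the symmetric int8 grid [-127, 127]
    scale = 127.0 / max_abs
    return mx.nd.clip(mx.nd.round(x * scale), -127.0, 127.0), scale

def _toy_dequantize_int8(q, scale):
    # invert the mapping; per-element rounding error is at most 0.5 / scale
    return q / scale
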
@with_seed()
def test_quantized_elemwise_mul():
def check_quantized_elemwise_mul(data_shape, qtype):
if is_test_for_native_cpu():
print('skipped testing quantized_elemwise_mul for native cpu since it is not supported yet')
return
elif qtype != 'int8':
            print('skipped testing quantized_elemwise_mul for unsupported data type')
return
elif is_test_for_gpu():
print('skipped testing quantized_elemwise_mul for gpu since it is not supported yet')
return
dataA = mx.sym.Variable(name='dataA', shape=data_shape, dtype='float32')
dataB = mx.sym.Variable(name='dataB', shape=data_shape, dtype='float32')
elemwise_mul_fp32 = mx.sym.elemwise_mul(dataA, dataB)
arg_names = elemwise_mul_fp32.list_arguments()
elemwise_mul_fp32_exe = elemwise_mul_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
if qtype == 'uint8':
data_low = 0.0
data_high = 255.0
else:
data_low = -127.0
data_high = 127.0
dataA_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
dataB_val = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape).astype('int32')
elemwise_mul_fp32_exe.arg_dict[arg_names[0]][:] = dataA_val
elemwise_mul_fp32_exe.arg_dict[arg_names[1]][:] = dataB_val
output = elemwise_mul_fp32_exe.forward()[0]
qdataA = mx.sym.Variable(name='qdataA', shape=data_shape, dtype=qtype)
qdataB = mx.sym.Variable(name='qdataB', shape=data_shape, dtype=qtype)
min_dataA = mx.sym.Variable(name='min_dataA')
max_dataA = mx.sym.Variable(name='max_dataA')
min_dataB = mx.sym.Variable(name='min_dataB')
max_dataB = mx.sym.Variable(name='max_dataB')
quantized_elemwise_mul = mx.sym.contrib.quantized_elemwise_mul(qdataA, qdataB, min_dataA, max_dataA, min_dataB, max_dataB)
elemwise_mul_int8_exe = quantized_elemwise_mul.simple_bind(ctx=mx.current_context(), grad_req='null')
qarg_names = quantized_elemwise_mul.list_arguments()
elemwise_mul_int8_exe.arg_dict[qarg_names[0]][:] = elemwise_mul_fp32_exe.arg_dict[arg_names[0]].astype(qtype)
elemwise_mul_int8_exe.arg_dict[qarg_names[1]][:] = elemwise_mul_fp32_exe.arg_dict[arg_names[1]].astype(qtype)
elemwise_mul_int8_exe.arg_dict[qarg_names[2]][:] = data_low
elemwise_mul_int8_exe.arg_dict[qarg_names[3]][:] = data_high
elemwise_mul_int8_exe.arg_dict[qarg_names[4]][:] = data_low
elemwise_mul_int8_exe.arg_dict[qarg_names[5]][:] = data_high
qoutput, min_range, max_range = elemwise_mul_int8_exe.forward()
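        # The quantized product is cast to the FP32 dtype and compared directly;
        # because the inputs are exactly representable integers, only a tiny
        # float tolerance is needed.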
fp32_rslt = output.asnumpy()
int8_rslt = qoutput.astype(output.dtype)
        assert_almost_equal(fp32_rslt, int8_rslt, atol=1e-4)
for qtype in ['int8', 'uint8']:
check_quantized_elemwise_mul((4, 6), qtype)
check_quantized_elemwise_mul((13, 74, 52), qtype)
check_quantized_elemwise_mul((3, 4, 56, 56), qtype)
check_quantized_elemwise_mul((32, 56, 64, 11), qtype)
@with_seed()
def test_quantized_pooling():
def check_quantized_pooling(data_shape, kernel, pool_type, pad, stride, global_pool, qdtype, convention='valid'):
if is_test_for_native_cpu():
print('skipped testing quantized_pooling for native cpu since it is not supported yet')
return
elif qdtype == 'uint8' and is_test_for_gpu():
print('skipped testing quantized_pooling for gpu uint8 since it is not supported yet')
return
elif is_test_for_gpu() and len(data_shape) != 4:
print('skipped testing quantized_pooling for gpu 5d layout since it is not supported yet')
return
data = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
pooling_fp32 = mx.sym.Pooling(data=data, kernel=kernel, pad=pad, stride=stride,
pool_type=pool_type, global_pool=global_pool, cudnn_off=False,
pooling_convention=convention)
arg_shapes, _, _ = pooling_fp32.infer_shape(data=data_shape)
arg_names = pooling_fp32.list_arguments()
pooling_fp32_exe = pooling_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
if qdtype == 'uint8':
data_low = 0.0
data_high = 127.0
else:
data_low = -127.0
data_high = 127.0
pooling_fp32_exe.arg_dict[arg_names[0]][:] = mx.nd.random.uniform(low=data_low, high=data_high,
shape=data_shape).astype('int32')
output = pooling_fp32_exe.forward()[0]
qdata = mx.sym.Variable(name='qdata', shape=data_shape, dtype=qdtype)
min_data = mx.sym.Variable(name='min_data')
max_data = mx.sym.Variable(name='max_data')
quantized_pooling = mx.sym.contrib.quantized_pooling(data=qdata, min_data=min_data,
max_data=max_data, kernel=kernel,
pad=pad, stride=stride, pool_type=pool_type,
global_pool=global_pool,
pooling_convention=convention)
pooling_int8_exe = quantized_pooling.simple_bind(ctx=mx.current_context(), grad_req='null')
qarg_names = quantized_pooling.list_arguments()
pooling_int8_exe.arg_dict[qarg_names[0]][:] = pooling_fp32_exe.arg_dict[arg_names[0]].astype(qdtype)
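        # Pooling does not enlarge the value range, so min_data/max_data for
        # the quantized op are simply pinned to the symmetric range +/-127.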
quantized_range = 127.0
pooling_int8_exe.arg_dict[qarg_names[1]][:] = -quantized_range
pooling_int8_exe.arg_dict[qarg_names[2]][:] = quantized_range
qoutput, min_range, max_range = pooling_int8_exe.forward()
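        # Max pooling commutes with (monotonic) quantization, so results must
        # match exactly; average pooling involves integer division, so small
        # rounding differences (at most 2) are tolerated.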
if pool_type == 'max':
assert_almost_equal(output.asnumpy(), qoutput.asnumpy())
elif pool_type == 'avg': # for avg pooling, fp32 and int8 may be different due to rounding errors
diff = mx.nd.abs(output - qoutput.astype(output.dtype))
cond = mx.nd.lesser(2, diff).sum().asscalar()
assert cond == 0
for qdtype in ['int8', 'uint8']:
check_quantized_pooling((3, 4, 56, 56), (3, 3), 'max', (0, 0), (2, 2), False, qdtype)
check_quantized_pooling((3, 4, 56, 56), (3, 3), 'max', (0, 0), (2, 2), True, qdtype)
check_quantized_pooling((3, 512, 7, 7), (7, 7), 'avg', (0, 0), (1, 1), False, qdtype)
check_quantized_pooling((3, 512, 7, 7), (7, 7), 'avg', (0, 0), (1, 1), True, qdtype)
check_quantized_pooling((3, 4, 3, 56, 56), (1, 3, 3), 'max', (0, 0, 0), (1, 2, 2), False, qdtype)
check_quantized_pooling((3, 4, 3, 56, 56), (1, 3, 3), 'max', (0, 0, 0), (1, 2, 2), True, qdtype)
check_quantized_pooling((3, 512, 3, 7, 7), (1, 7, 7), 'avg', (0, 0, 0), (1, 2, 2), False, qdtype)
check_quantized_pooling((3, 512, 3, 7, 7), (1, 7, 7), 'avg', (0, 0, 0), (1, 2, 2), True, qdtype)
check_quantized_pooling((3, 4, 56, 56), (3, 3), 'max', (0, 0), (2, 2), False, qdtype, 'full')
check_quantized_pooling((3, 4, 56, 56), (3, 3), 'max', (0, 0), (2, 2), True, qdtype, 'full')
check_quantized_pooling((3, 512, 7, 7), (7, 7), 'avg', (0, 0), (1, 1), False, qdtype, 'full')
check_quantized_pooling((3, 512, 7, 7), (7, 7), 'avg', (0, 0), (1, 1), True, qdtype, 'full')
check_quantized_pooling((3, 4, 3, 56, 56), (1, 3, 3), 'max', (0, 0, 0), (1, 2, 2), False, qdtype, 'full')
check_quantized_pooling((3, 4, 3, 56, 56), (1, 3, 3), 'max', (0, 0, 0), (1, 2, 2), True, qdtype, 'full')
check_quantized_pooling((3, 512, 3, 7, 7), (1, 7, 7), 'avg', (0, 0, 0), (1, 2, 2), False, qdtype, 'full')
check_quantized_pooling((3, 512, 3, 7, 7), (1, 7, 7), 'avg', (0, 0, 0), (1, 2, 2), True, qdtype, 'full')
@with_seed()
def test_quantized_fc():
def check_quantized_fc(data_shape, num_hidden, no_bias, qdtype, flatten=True):
if is_test_for_native_cpu():
            hasMKL = False
            for key in os.environ.keys():
                if key == "BUILD_TAG":
                    if "MKL" in os.environ['BUILD_TAG']:
                        hasMKL = True
                    break
            if not hasMKL:
                print('skipped testing quantized_fc on cpu since s8u8s32 is only supported by MKL BLAS library')
                return
elif qdtype == 'uint8' and is_test_for_gpu():
print('skipped testing quantized_fc for gpu uint8 since it is not supported yet')
return
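        # maxabs(a, b) returns the elementwise max of |a| and |b|; it is used
        # below to turn observed min/max statistics into a symmetric range.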
def maxabs(a, b):
return mx.nd.maximum(mx.nd.abs(a), mx.nd.abs(b))
data = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
fc_fp32 = mx.sym.FullyConnected(data=data, num_hidden=num_hidden, no_bias=no_bias, flatten=flatten)
arg_shapes, _, _ = fc_fp32.infer_shape(data=data_shape)
arg_names = fc_fp32.list_arguments()
fc_fp32_exe = fc_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
int8_range = 127.0
if qdtype == 'uint8':
data_low = 0.0
data_high = 63.0
quantized_range = 255.0
        else:
            data_low = -63.0
            data_high = 63.0
            quantized_range = 127.0

        data = mx.nd.random.uniform(low=data_low, high=data_high,
                                    shape=data_shape).astype('int32')
        weight = mx.nd.random.uniform(low=data_low, high=data_high,
                                      shape=arg_shapes[1]).astype('int32')
        fc_fp32_exe.arg_dict[arg_names[0]][:] = data
        fc_fp32_exe.arg_dict[arg_names[1]][:] = weight
        data_min = mx.nd.min(data).astype('float32')
        data_max = mx.nd.max(data).astype('float32')
        weight_min = mx.nd.min(weight).astype('float32')
        weight_max = mx.nd.max(weight).astype('float32')
        data_range = maxabs(data_min, data_max)
        weight_range = maxabs(weight_min, weight_max)
        if not no_bias:
            bias = mx.nd.random.uniform(low=data_low, high=data_high,
                                        shape=arg_shapes[2]).astype('int32')
            bias_min = mx.nd.min(bias).astype('float32')
            bias_max = mx.nd.max(bias).astype('float32')
            bias_range = maxabs(bias_min, bias_max)
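            # The int8 kernel accumulates data * weight in units of
            # data_scale * weight_scale, while the bias was generated in units of
            # int8_range / bias_range. Rescale the bias fed to the FP32 reference
            # by the ratio of the two so both executors see the same effective bias.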
            bias_scale = int8_range / bias_range
            data_scale = quantized_range / data_range
            weight_scale = int8_range / weight_range
            bias_int32_rescale = data_scale * weight_scale / bias_scale
            new_bias = mx.nd.cast(bias, dtype='float32') * bias_int32_rescale
            fc_fp32_exe.arg_dict[arg_names[2]][:] = new_bias.astype('int32')
        output = fc_fp32_exe.forward()[0]

        qdata = mx.sym.Variable(name='qdata', shape=data_shape, dtype=qdtype)
        fc_int8 = mx.sym.contrib.quantized_fully_connected(data=qdata, num_hidden=num_hidden,
                                                           no_bias=no_bias, flatten=flatten)
        qarg_names = fc_int8.list_arguments()
        type_dict = {qarg_names[1]: 'int8'}
        if not no_bias:
            type_dict.update({qarg_names[2]: 'int8'})
        fc_int8_exe = fc_int8.simple_bind(ctx=mx.current_context(), type_dict=type_dict, grad_req='null')
        fc_int8_exe.arg_dict[qarg_names[0]][:] = fc_fp32_exe.arg_dict[arg_names[0]].astype(qdtype)
        fc_int8_exe.arg_dict[qarg_names[1]][:] = fc_fp32_exe.arg_dict[arg_names[1]].astype('int8')
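        # A quantized op takes one (min, max) float pair per quantized input,
        # appended after data/weight(/bias), so the argument layout below depends
        # on whether a bias is present.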
        if no_bias:
            fc_int8_exe.arg_dict[qarg_names[2]][:] = -data_range
            fc_int8_exe.arg_dict[qarg_names[3]][:] = data_range
            fc_int8_exe.arg_dict[qarg_names[4]][:] = -weight_range
            fc_int8_exe.arg_dict[qarg_names[5]][:] = weight_range
        else:
            fc_int8_exe.arg_dict[qarg_names[2]][:] = bias.astype('int8')
            fc_int8_exe.arg_dict[qarg_names[3]][:] = -data_range
            fc_int8_exe.arg_dict[qarg_names[4]][:] = data_range
            fc_int8_exe.arg_dict[qarg_names[5]][:] = -weight_range
            fc_int8_exe.arg_dict[qarg_names[6]][:] = weight_range
            fc_int8_exe.arg_dict[qarg_names[7]][:] = -bias_range
            fc_int8_exe.arg_dict[qarg_names[8]][:] = bias_range
        qoutput, min_range, max_range = fc_int8_exe.forward()

        if no_bias:
            assert_almost_equal(output.asnumpy(), qoutput.asnumpy())
        else:
            # with a bias the rescaling introduces rounding error, so allow an
            # absolute difference of at most 2
            diff = mx.nd.abs(output - qoutput.astype(output.dtype))
            cond = mx.nd.lesser(2, diff).sum().asscalar()
            assert cond == 0

    for qdtype in ['int8', 'uint8']:
        if is_test_for_mkldnn():
            check_quantized_fc((32, 512, 2), 100, True, qdtype, flatten=False)
            check_quantized_fc((32, 512, 2), 100, False, qdtype, flatten=False)
            check_quantized_fc((32, 512, 2, 2), 100, True, qdtype, flatten=False)
            check_quantized_fc((32, 512, 2, 2), 100, False, qdtype, flatten=False)
        check_quantized_fc((32, 512, 2, 2), 100, True, qdtype)
        check_quantized_fc((32, 111, 2, 2), 100, True, qdtype)
        check_quantized_fc((32, 512, 2, 2), 100, False, qdtype)
        check_quantized_fc((32, 111, 2, 2), 100, False, qdtype)
        check_quantized_fc((256, 2048, 2, 2), 800, False, qdtype)
        check_quantized_fc((256, 111, 2, 2), 800, False, qdtype)
        check_quantized_fc((256, 2048, 2, 2), 800, True, qdtype)
        check_quantized_fc((256, 111, 2, 2), 800, True, qdtype)

@with_seed()
def test_quantized_rnn():
    def check_quantized_rnn(num_layers, bidirectional, seq_len, batch_size, input_dim, state_dim):
        if is_test_for_gpu():
            print('skipped testing test_quantized_rnn for gpu since it is not supported yet')
            return
        if is_test_for_native_cpu():
            print('skipped testing test_quantized_rnn for native cpu since it is not supported yet')
            return
        data_shape = (seq_len, batch_size, input_dim)
        data = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
        rnn_fp32 = mx.sym.RNN(data=data,
                              num_layers=num_layers,
                              bidirectional=bidirectional,
                              state_outputs=True,
                              state_size=state_dim,
                              mode='lstm',
                              name='rnn')
        arg_shapes, _, _ = rnn_fp32.infer_shape(data=data_shape)
        arg_names = rnn_fp32.list_arguments()
        rnn_fp32_exe = rnn_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
        data = mx.nd.random.uniform(low=-1, high=1, shape=arg_shapes[0])
        weight = mx.nd.random.uniform(low=-1, high=1, shape=arg_shapes[1])
        state = mx.nd.random.uniform(low=-1, high=1, shape=arg_shapes[2])
        cell = mx.nd.random.uniform(low=-1, high=1, shape=arg_shapes[3])
        rnn_fp32_exe.arg_dict[arg_names[0]][:] = data
        rnn_fp32_exe.arg_dict[arg_names[1]][:] = weight
        rnn_fp32_exe.arg_dict[arg_names[2]][:] = state
        rnn_fp32_exe.arg_dict[arg_names[3]][:] = cell
        output = rnn_fp32_exe.forward()[0]

        data_min = mx.nd.min(data)
        data_max = mx.nd.max(data)
        qdata = mx.sym.Variable(name='qdata', shape=data_shape, dtype='uint8')
        rnn_int8 = mx.sym.contrib.quantized_rnn(data=qdata,
                                                num_layers=num_layers,
                                                bidirectional=bidirectional,
                                                state_outputs=True,
                                                state_size=state_dim,
                                                mode='lstm',
                                                name='qrnn')
        qarg_names = rnn_int8.list_arguments()
        rnn_int8_exe = rnn_int8.simple_bind(ctx=mx.current_context(), grad_req='null')
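        # Asymmetric uint8 quantization of the input: with this scale/shift pair
        # [data_min, data_max] maps onto [0, 128], and the +0.5 below implements
        # round-to-nearest before the truncating cast to uint8.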
        data_scale = 128.0 / (data_max - data_min)
        data_shift = 128.0 - data_max * data_scale
        qdata = (data * data_scale + data_shift + 0.5).astype('uint8')
        rnn_int8_exe.arg_dict[qarg_names[0]][:] = qdata
        rnn_int8_exe.arg_dict[qarg_names[1]][:] = weight
        rnn_int8_exe.arg_dict[qarg_names[2]][:] = state
        rnn_int8_exe.arg_dict[qarg_names[3]][:] = cell
        rnn_int8_exe.arg_dict[qarg_names[4]][:] = data_scale
        rnn_int8_exe.arg_dict[qarg_names[5]][:] = data_shift
        qoutput = rnn_int8_exe.forward()[0]

        mse = np.mean((output.asnumpy() - qoutput.asnumpy())**2)
        assert mse < 0.001

    check_quantized_rnn(1, False, 5, 2, 16, 16)
    check_quantized_rnn(1, True, 5, 2, 16, 16)

@with_seed()
def test_quantized_embedding():
    def check_quantized_embedding(data_shape, input_dim, output_dim):
        if is_test_for_gpu():
            print('skipped testing test_quantized_embedding for gpu since it is not supported yet')
            return
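        # Symmetric calibration range max(|min|, |max|), so zero maps to zero
        # under int8 quantization.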
        def maxabs(a, b):
            return mx.nd.maximum(mx.nd.abs(a), mx.nd.abs(b))

        data0 = mx.sym.Variable(name='data', shape=data_shape, dtype='int32')
        embedding_fp32 = mx.sym.Embedding(data=data0, input_dim=input_dim, output_dim=output_dim)
        arg_shapes, _, _ = embedding_fp32.infer_shape(data=data_shape)
        arg_names = embedding_fp32.list_arguments()
        embedding_fp32_exe = embedding_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
        int8_range = 127.0
        data = mx.nd.random.uniform(low=0, high=input_dim,
                                    shape=arg_shapes[0]).astype('int32')
        weight = mx.nd.random.uniform(low=-int8_range, high=int8_range,
                                      shape=arg_shapes[1]).astype('int32')
        embedding_fp32_exe.arg_dict[arg_names[0]][:] = data
        embedding_fp32_exe.arg_dict[arg_names[1]][:] = weight
        weight_min = mx.nd.min(weight).astype('float32')
        weight_max = mx.nd.max(weight).astype('float32')
        weight_range = maxabs(weight_min, weight_max)
        output = embedding_fp32_exe.forward()[0]

        embedding_int8 = mx.sym.contrib.quantized_embedding(data=data0, input_dim=input_dim, output_dim=output_dim)
        qarg_names = embedding_int8.list_arguments()
        type_dict = {qarg_names[1]: 'int8'}
        embedding_int8_exe = embedding_int8.simple_bind(ctx=mx.current_context(), type_dict=type_dict, grad_req='null')
        embedding_int8_exe.arg_dict[qarg_names[0]][:] = embedding_fp32_exe.arg_dict[arg_names[0]]
        embedding_int8_exe.arg_dict[qarg_names[1]][:] = embedding_fp32_exe.arg_dict[arg_names[1]].astype('int8')
        embedding_int8_exe.arg_dict[qarg_names[2]][:] = -weight_range
        embedding_int8_exe.arg_dict[qarg_names[3]][:] = weight_range
        qoutput, min_range, max_range = embedding_int8_exe.forward()
        assert_almost_equal(output.asnumpy(), qoutput.asnumpy())

    check_quantized_embedding((1,), 1000, 256)
    check_quantized_embedding((1,), 1024, 512)
    check_quantized_embedding((32,), 1000, 256)
    check_quantized_embedding((32,), 1024, 512)

@with_seed()
def test_quantized_flatten():
    def check_quantized_flatten(shape, qdtype):
        if qdtype == 'uint8':
            data_low = 0.0
            data_high = 127.0
        else:
            data_low = -127.0
            data_high = 127.0
        qdata = mx.nd.random.uniform(low=data_low, high=data_high, shape=shape).astype(qdtype)
        min_data = mx.nd.array([-1023.343], dtype='float32')
        max_data = mx.nd.array([2343.324275], dtype='float32')
        qoutput, min_output, max_output = mx.nd.contrib.quantized_flatten(qdata, min_data, max_data)
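        # Flatten must only reshape: the quantized payload and the calibration
        # min/max must pass through unchanged.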
        assert qoutput.ndim == 2
        assert qoutput.shape[0] == qdata.shape[0]
        assert qoutput.shape[1] == np.prod(qdata.shape[1:])
        assert same(qdata.asnumpy().flatten(), qoutput.asnumpy().flatten())
        assert same(min_data.asnumpy(), min_output.asnumpy())
        assert same(max_data.asnumpy(), max_output.asnumpy())
    for qdtype in ['int8', 'uint8']:
        check_quantized_flatten((10,), qdtype)
        check_quantized_flatten((10, 15), qdtype)
        check_quantized_flatten((10, 15, 18), qdtype)
        check_quantized_flatten((3, 4, 23, 23), qdtype)

@with_seed()
def test_quantized_act():
    def check_quantized_act(data_shape, qdtype):
        if is_test_for_native_cpu():
            print('skipped testing quantized_act for native cpu since it is not supported yet')
            return
        elif qdtype == 'int8' and is_test_for_mkldnn():
            print('skipped testing quantized_act for mkldnn cpu int8 since it is not supported yet')
            return
        elif is_test_for_gpu():
            print('skipped testing quantized_act for gpu since it is not supported yet')
            return
        data = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
        act_fp32 = mx.sym.Activation(data=data, act_type='relu', name='relu')
        arg_shapes, _, _ = act_fp32.infer_shape(data=data_shape)
        arg_names = act_fp32.list_arguments()
        act_fp32_exe = act_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
        if qdtype == 'uint8':
            data_low = 0.0
            data_high = 127.0
        else:
            data_low = -127.0
            data_high = 127.0
        act_fp32_exe.arg_dict[arg_names[0]][:] = mx.nd.random.uniform(low=data_low,
                                                                      high=data_high, shape=data_shape).astype(qdtype)
        output = act_fp32_exe.forward()[0]

        qdata = mx.sym.Variable(name='qdata', shape=data_shape, dtype=qdtype)
        min_data = mx.sym.Variable(name='min_data')
        max_data = mx.sym.Variable(name='max_data')
        quantized_act = mx.sym.contrib.quantized_act(data=qdata, min_data=min_data, max_data=max_data, act_type='relu')
        act_int8_exe = quantized_act.simple_bind(ctx=mx.current_context(), grad_req='null')
        qarg_names = quantized_act.list_arguments()
        act_int8_exe.arg_dict[qarg_names[0]][:] = act_fp32_exe.arg_dict[arg_names[0]].astype(qdtype)
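        # The input already consists of quantized values, so the expected output
        # range is simply the observed min/max of the input tensor.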
        quantized_range_min = mx.nd.min(act_int8_exe.arg_dict[qarg_names[0]][:])
        quantized_range_max = mx.nd.max(act_int8_exe.arg_dict[qarg_names[0]][:])
        act_int8_exe.arg_dict[qarg_names[1]][:] = quantized_range_min.astype(qdtype)
        act_int8_exe.arg_dict[qarg_names[2]][:] = quantized_range_max.astype(qdtype)
        qoutput, min_range, max_range = act_int8_exe.forward()

        assert_almost_equal(output.asnumpy(), qoutput.asnumpy())
        assert_almost_equal(min_range.asscalar(), quantized_range_min.asscalar())
        assert_almost_equal(max_range.asscalar(), quantized_range_max.asscalar())

    for qdtype in ['int8', 'uint8']:
        check_quantized_act((10,), qdtype)
        check_quantized_act((10, 15), qdtype)
        check_quantized_act((10, 15, 18), qdtype)
        check_quantized_act((3, 4, 23, 23), qdtype)

@with_seed()
def test_quantized_bn():
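    # Reference per-channel (axis=1) batch statistics for NCHW input; these
    # seed the moving_mean/moving_var aux states of the fp32 BatchNorm below.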
    def get_mean_var(data):
        mean = mx.ndarray.mean(data, axis=1, exclude=True)
        mean_broad = mx.ndarray.expand_dims(mean, axis=0)
        mean_broad = mx.ndarray.expand_dims(mean_broad, axis=2)
        mean_broad = mx.ndarray.expand_dims(mean_broad, axis=3)
        mean_broad = mx.ndarray.broadcast_like(mean_broad, data)
        var = mx.ndarray.multiply(data - mean_broad, data - mean_broad)
        var = mx.ndarray.mean(var, axis=1, exclude=True)
        return mean, var
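    # Quantize a standalone BatchNorm and compare its output against the
    # fp32 reference within loose tolerances.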
    def check_quantized_bn(data_shape, qdtype):
        if is_test_for_native_cpu():
            print('skipped testing quantized_bn for native cpu since it is not supported yet')
            return
        elif is_test_for_gpu():
            print('skipped testing quantized_bn for gpu since it is not supported yet')
            return
        # pick the input range that matches the quantized dtype
        if qdtype == 'uint8':
            data_low = 0.0
            data_high = 255.0
        else:
            data_low = -127.0
            data_high = 127.0
        # run fp32 bn
        data_sym = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
        bn_fp32 = mx.sym.BatchNorm(data=data_sym, name='bn', use_global_stats=True, fix_gamma=False)
        arg_shapes, out_shapes, aux_shapes = bn_fp32.infer_shape(data=data_shape)
        arg_names = bn_fp32.list_arguments()
        aux_names = bn_fp32.list_auxiliary_states()
        data = mx.nd.random.uniform(low=data_low, high=data_high, shape=data_shape)
        gamma = mx.nd.random.uniform(low=data_low, high=data_high, shape=arg_shapes[1])
        beta = mx.nd.random.uniform(low=data_low, high=data_high, shape=arg_shapes[2])
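        # With use_global_stats=True, BatchNorm normalizes with the aux states,
        # so seed them with the true statistics of this batch.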
        moving_mean, moving_var = get_mean_var(data)
        bn_fp32_exe = bn_fp32.simple_bind(ctx=mx.current_context(), grad_req='null')
        bn_fp32_exe.arg_dict[arg_names[0]][:] = data
        bn_fp32_exe.arg_dict[arg_names[1]][:] = gamma
        bn_fp32_exe.arg_dict[arg_names[2]][:] = beta
        bn_fp32_exe.aux_dict[aux_names[0]][:] = moving_mean
        bn_fp32_exe.aux_dict[aux_names[1]][:] = moving_var
        output = bn_fp32_exe.forward()[0]
        # generate int8 bn from fp32 bn
        arg_params = dict()
        for k, v in bn_fp32_exe.arg_dict.items():
            if 'data' in k or 'softmax_label' in k:
                continue
            arg_params[k] = v
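        # Wrap the batch in a repeating iterator so the naive calibration pass
        # can draw num_calib_examples samples and collect min/max ranges.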
        calib_data = NDArrayIter(data=data, batch_size=data_shape[0])
        calib_data = DummyIter(calib_data)
        qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(
            sym=bn_fp32,
            arg_params=arg_params,
            aux_params=bn_fp32_exe.aux_dict,
            ctx=mx.current_context(),
            quantized_dtype=qdtype,
            quantize_mode='full',
            calib_mode='naive',
            calib_data=calib_data,
            num_calib_examples=20)
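        # Run the quantized symbol on the same input; its output is dequantized
        # back to fp32 for comparison with the reference.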
        mod = mx.mod.Module(symbol=qsym, label_names=None, context=mx.current_context())
        mod.bind(for_training=False, data_shapes=[('data', data_shape)])
        mod.set_params(qarg_params, qaux_params)
        batch = mx.io.DataBatch([data], [])
        mod.forward(batch, is_train=False)
        output_int8_to_fp32 = mod.get_outputs()[0]
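        # int8 BN is lossy, hence the loose rtol/atol below.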
        assert_almost_equal(output.asnumpy(), output_int8_to_fp32.asnumpy(), rtol=1e-1, atol=8)
    for qdtype in ['int8', 'uint8']:
        check_quantized_bn((32, 512, 4, 4), qdtype)
        check_quantized_bn((32, 1024, 8, 8), qdtype)
        check_quantized_bn((32, 3, 224, 224), qdtype)
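
# A minimal sketch (illustrative only, not run by the test driver) of the same
# calibration flow applied to a single Convolution layer. `_sketch_quantize_conv`
# is a hypothetical name; it assumes only the mx.contrib.quant.quantize_model
# signature and the NDArrayIter/DummyIter helpers already used above.
def _sketch_quantize_conv():
    data_shape = (4, 3, 16, 16)
    data_sym = mx.sym.Variable(name='data', shape=data_shape, dtype='float32')
    conv = mx.sym.Convolution(data=data_sym, kernel=(3, 3), num_filter=8,
                              no_bias=True, name='conv')
    exe = conv.simple_bind(ctx=mx.current_context(), grad_req='null')
    # fill data and weight with random fp32 values
    for arr in exe.arg_dict.values():
        arr[:] = mx.nd.random.uniform(-1.0, 1.0, shape=arr.shape)
    # parameters to quantize offline: everything except the input
    arg_params = {k: v for k, v in exe.arg_dict.items() if k != 'data'}
    calib_data = DummyIter(NDArrayIter(data=exe.arg_dict['data'],
                                       batch_size=data_shape[0]))
    qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(
        sym=conv, arg_params=arg_params, aux_params={},
        ctx=mx.current_context(), quantized_dtype='int8',
        quantize_mode='full', calib_mode='naive',
        calib_data=calib_data, num_calib_examples=20)
    return qsym, qarg_params, qaux_params
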

@with_seed()
def test_quantize_params():
    if is_test_for_native_cpu():
        print('skipped testing quantized_params for native cpu since it is not supported yet')
        return

    data = mx.sym.Variable('data')
    conv = mx.sym.Convolution(data, kernel=(1, 1), num_filter=2048, name='conv')
    sym = mx.sym.BatchNorm(data=conv, eps=2e-05, fix_gamma=False, momentum=0.9, use_global_stats=False, name='bn')
    offline_params = [name for name in sym.list_arguments()
                      if not name.startswith('data') and not name.endswith('label')]
    params = {}
    for name in offline_params:
        params[name] = mx.nd.uniform(shape=(2, 2))
    qsym, _ = mx.contrib.quant._quantize_symbol(sym, ctx=mx.current_context(),
                                                offline_params=offline_params, quantize_mode='full')
    qparams = mx.contrib.quant._quantize_params(qsym, params, th_dict={})
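
    # With an empty th_dict, _quantize_params should emit quantized copies of the
    # conv params (their names contain 'quantize') while the bn params pass
    # through in fp32; the checks below verify exactly that.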
    param_names = params.keys()
    qparam_names = qparams.keys()
    for name in qparam_names:
        if name.startswith('bn'):
            assert name in param_names
        elif name.startswith('conv'):
            assert name not in param_names
            assert name.find('quantize') != -1
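
# Helper symbols: small FP32 graphs that the tests below feed through the
# int8/uint8 quantization passes.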
def get_fp32_sym():
    data = mx.sym.Variable('data')
    conv = mx.sym.Convolution(data, kernel=(1, 1), num_filter=16, name='conv')
    bn = mx.sym.BatchNorm(data=conv, eps=2e-05, fix_gamma=False, momentum=0.9, use_global_stats=False, name='bn')
    act = mx.sym.Activation(data=bn, act_type='relu', name='relu')
    pool = mx.sym.Pooling(act, kernel=(4, 4), pool_type='avg', name='pool')
    fc = mx.sym.FullyConnected(pool, num_hidden=10, flatten=True, name='fc')
    sym = mx.sym.SoftmaxOutput(fc, grad_scale=1, ignore_label=-1, multi_output=False,
                               out_grad=False, preserve_shape=False, use_ignore=False, name='softmax')
    return sym
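
# FP32 graph with a residual connection (elemwise_add of the bn output and the
# input), used to exercise quantization around a skip connection.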
def get_fp32_residual():
    data = mx.sym.Variable('data')
    conv0 = mx.sym.Convolution(data=data, num_filter=4, kernel=(1, 1), pad=(0, 0),
                               no_bias=True, name='conv0')
    bn = mx.sym.BatchNorm(data=conv0, fix_gamma=False, eps=2e-5, momentum=0.9, name='bn')
    sum0 = mx.sym.elemwise_add(bn, data, name='sum0')
    act0 = mx.sym.Activation(data=sum0, act_type='relu', name='relu0')
    pool0 = mx.sym.Pooling(act0, kernel=(4, 4), pool_type='avg', name='pool0')
    conv1 = mx.sym.Convolution(data=pool0, num_filter=4, kernel=(1, 1), pad=(0, 0),
                               no_bias=False, name='conv1')
    act1 = mx.sym.Activation(data=conv1, act_type='relu', name='relu1')
    pool1 = mx.sym.Pooling(act1, kernel=(4, 4), pool_type='avg', name='pool1')
    fc = mx.sym.FullyConnected(pool1, num_hidden=10, flatten=True, name='fc')
    sym = mx.sym.SoftmaxOutput(fc, grad_scale=1, ignore_label=-1, multi_output=False,
                               out_grad=False, preserve_shape=False, use_ignore=False, name='softmax')
    return sym

def get_fp32_sym_with_multiple_outputs(length=1):
    data = mx.sym.Variable('data')
    inputs = list(mx.sym.split(data, axis=0, num_outputs=length, squeeze_axis=1, name='split'))
    _conv_outs = []
    for i in range(length):
        _conv_outs.append(mx.sym.Convolution(data=inputs[i], kernel=(1, 1), num_filter=16,
                                             name='conv_{0}'.format(i)))
    conv_out = [mx.sym.expand_dims(i, axis=0) for i in _conv_outs]
    conv_out = mx.sym.Concat(*conv_out, dim=0, name='concat')
    reshape_out = mx.sym.reshape(data=conv_out, shape=(length, -1), name='reshape')
    fc_out = mx.sym.FullyConnected(reshape_out, num_hidden=10, flatten=True, name='fc')
    sym = mx.sym.SoftmaxOutput(fc_out, grad_scale=1, ignore_label=-1, multi_output=False,
                               out_grad=False, preserve_shape=False, use_ignore=False, name='softmax')
    return sym
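
# Note: 'split' slices its input along axis 0 (with squeeze_axis=1), so callers
# feed data of shape (length, N, C, H, W); test_quantize_model below uses
# (4, 4, 4, 10, 10).
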
@with_seed()
def test_quantize_model():
    def check_quantize_model(qdtype):
        if is_test_for_native_cpu():
            print('skipped testing quantize_model for native cpu since it is not supported yet')
            return
        elif qdtype == 'int8' and is_test_for_mkldnn():
            print('skipped testing quantize_model for mkldnn cpu int8 since it is not supported yet')
            return
        elif qdtype == 'uint8' and is_test_for_gpu():
            print('skipped testing quantize_model for gpu uint8 since it is not supported yet')
            return

        def check_params(params, qparams, qsym=None):
            if qsym is None:
                assert len(params) == len(qparams)
                for k, v in params.items():
                    assert k in qparams
                    assert same(v.asnumpy(), qparams[k].asnumpy())
            else:
                qparams_ground_truth = mx.contrib.quant._quantize_params(qsym, params, th_dict={})
                assert len(qparams) == len(qparams_ground_truth)
                for k, v in qparams_ground_truth.items():
                    assert k in qparams
                    assert same(v.asnumpy(), qparams[k].asnumpy())

        def check_qsym_calibrated(qsym):
            attrs = qsym.attr_dict()
            for k, v in attrs.items():
                if k.find('requantize_') != -1:
                    assert 'min_calib_range' in v
                    assert 'max_calib_range' in v

        def check_qsym_qdtype(qsym, qdtype):
            attrs = qsym.attr_dict()
            for k, v in attrs.items():
                if k.find('_quantize') != -1:
                    assert 'out_type' in v
                    assert v['out_type'] == qdtype

        sym = get_fp32_sym()
        batch_size = 4
        label_shape = (batch_size, 10)
        data_shape = (batch_size, 4, 10, 10)
        length = batch_size  # specify num of outputs from split op
        msym = get_fp32_sym_with_multiple_outputs(length)
        msym_label_shape = (length, 10)
        msym_data_shape = (length, 4, 4, 10, 10)
        for s, dshape, lshape in zip((sym, msym), (data_shape, msym_data_shape),
                                     (label_shape, msym_label_shape)):
            mod = Module(symbol=s)
            mod.bind(data_shapes=[('data', dshape)], label_shapes=[('softmax_label', lshape)])
            mod.init_params()
            arg_params, aux_params = mod.get_params()
            qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(sym=s,
                                                                             arg_params=arg_params,
                                                                             aux_params=aux_params,
                                                                             ctx=mx.current_context(),
                                                                             quantized_dtype=qdtype,
                                                                             calib_mode='none',
                                                                             quantize_mode='full')
            check_params(arg_params, qarg_params, qsym)
            check_params(aux_params, qaux_params)
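
            # Second pass: naive (min/max) calibration on random data; the
            # calibrated graph must carry calib ranges and the requested qdtype.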
            calib_data = mx.nd.random.uniform(shape=dshape)
            calib_data = NDArrayIter(data=calib_data, batch_size=batch_size)
            calib_data = DummyIter(calib_data)
            qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(sym=s,
                                                                             arg_params=arg_params,
                                                                             aux_params=aux_params,
                                                                             ctx=mx.current_context(),
                                                                             quantized_dtype=qdtype,
                                                                             calib_mode='naive',
                                                                             calib_data=calib_data,
                                                                             num_calib_examples=20,
                                                                             quantize_mode='full')
            check_params(arg_params, qarg_params, qsym)
            check_params(aux_params, qaux_params)
            check_qsym_calibrated(qsym)
            check_qsym_qdtype(qsym, qdtype)

    for qdtype in ['int8', 'uint8']:
        check_quantize_model(qdtype)

@with_seed()
def test_quantize_model_with_forward():
    def check_quantize_model(qdtype):
        if is_test_for_native_cpu():
            print('skipped testing test_quantize_model_with_forward for native cpu since it is not supported yet')
            return
        elif qdtype == 'uint8' and is_test_for_gpu():
            print('skipped testing test_quantize_model_with_forward for gpu uint8 since it is not supported yet')
            return

        def check_params(params, qparams, qsym=None):
            if qsym is None:
                assert len(params) == len(qparams)
                for k, v in params.items():
                    assert k in qparams
                    assert same(v.asnumpy(), qparams[k].asnumpy())
            else:
                qparams_ground_truth = mx.contrib.quant._quantize_params(qsym, params, th_dict={})
                assert len(qparams) == len(qparams_ground_truth)
                for k, v in qparams_ground_truth.items():
                    assert k in qparams
                    assert same(v.asnumpy(), qparams[k].asnumpy())

        def check_qsym_calibrated(qsym):
            attrs = qsym.attr_dict()
            for k, v in attrs.items():
                if k.find('requantize_') != -1:
                    assert 'min_calib_range' in v
                    assert 'max_calib_range' in v

        def check_qsym_qdtype(qsym, qdtype):
            attrs = qsym.attr_dict()
            for k, v in attrs.items():
                if k.find('_quantize') != -1:
                    assert 'out_type' in v
                    assert v['out_type'] == qdtype

        def check_qsym_forward(qsym, qarg_params, qaux_params, data_shape, label_shape=None):
            if label_shape is None:
                mod = mx.mod.Module(symbol=qsym, label_names=None, context=mx.current_context())
                mod.bind(for_training=False,
                         data_shapes=[('data', data_shape)])
            else:
                mod = mx.mod.Module(symbol=qsym, context=mx.current_context())
                mod.bind(for_training=False,
                         data_shapes=[('data', data_shape)],
                         label_shapes=[('softmax_label', label_shape)])
            mod.set_params(qarg_params, qaux_params)
            data = [mx.random.uniform(-1.0, 1.0, shape=shape) for _, shape in mod.data_shapes]
            batch = mx.io.DataBatch(data, [])
            mod.forward(batch, is_train=False)
            for output in mod.get_outputs():
                output.wait_to_read()
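                # wait_to_read() blocks until the asynchronous forward pass has
                # finished, so failures in the quantized kernels surface here.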

        batch_size = 4
        length = batch_size  # specify num of outputs from split op
        sym_list = []
        name_list = []
        dshape_list = []
        lshape_list = []
        # sym 1
        sym_list.append(get_fp32_residual())
        name_list.append('sym1')
        dshape_list.append((batch_size, 4, 10, 10))
        lshape_list.append((batch_size, 10))
        # sym 2
        sym_list.append(get_fp32_sym_with_multiple_outputs(length))
        name_list.append('sym2')
        dshape_list.append((length, 4, 4, 10, 10))
        lshape_list.append((length, 10))
        data = mx.sym.Variable('data')
        # sym 3
        sym_list.append(mx.sym.Convolution(data, kernel=(1, 1), num_filter=16, name='conv0'))
        name_list.append('sym3')
        dshape_list.append((batch_size, 4, 10, 10))
        lshape_list.append(None)
        # sym 4
        cell = mx.rnn.LSTMCell(num_hidden=64)
        outputs, _ = cell.unroll(length, data)
        sym_list.append(mx.sym.Group(outputs))
        name_list.append('sym4')
        dshape_list.append((batch_size, length, 32))
        lshape_list.append(None)
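
        # The four graphs above cover a residual conv net (sym1), a multi-output
        # split net (sym2), a bare convolution (sym3) and an unrolled LSTM (sym4).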

        for s, dshape, lshape, name in zip(sym_list, dshape_list, lshape_list, name_list):
            if qdtype == 'int8' and name in ['sym1', 'sym2', 'sym3']:
                print('mkldnn_quantized_conv op only supports uint8 as input type, skip test with int8.')
                continue
            if qdtype == 'uint8' and name in ['sym1']:
                print('mkldnn_quantized_bn doesn\'t support calib_mode=None')
                continue
            if lshape is None:
                mod = Module(symbol=s, label_names=None)
                mod.bind(for_training=False,
                         data_shapes=[('data', dshape)])
            else:
                mod = Module(symbol=s)
                mod.bind(for_training=False,
                         data_shapes=[('data', dshape)],
                         label_shapes=[('softmax_label', lshape)])
            mod.init_params()
            arg_params, aux_params = mod.get_params()

            excluded_sym_names = []
            excluded_op_names = []
            # sym3/sym4 don't have such layers
            if name not in ['sym3', 'sym4']:
                excluded_names = []
                if mx.current_context() == mx.cpu():
                    excluded_op_names += ['FullyConnected']
                excluded_names += ['conv1']
                excluded_names += ['concat']
                optional_names = ['pool0']
                for skip_optional_names in [False, True]:
                    excluded_sym_names = []
                    if skip_optional_names:
                        excluded_sym_names = excluded_names
                    else:
                        excluded_sym_names = excluded_names + optional_names
            if name == 'sym4':
                excluded_op_names += ['elemwise_add', 'elemwise_mul']
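
            # Symbols and ops listed here are kept in FP32 by quantize_model
            # rather than being rewritten into quantized operators.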
            qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(sym=s,
                                                                             arg_params=arg_params,
                                                                             aux_params=aux_params,
                                                                             excluded_sym_names=excluded_sym_names,
                                                                             excluded_op_names=excluded_op_names,
                                                                             ctx=mx.current_context(),
                                                                             quantized_dtype=qdtype,
                                                                             calib_mode='none',
                                                                             quantize_mode='full')
            check_params(arg_params, qarg_params, qsym)
            check_params(aux_params, qaux_params)
            check_qsym_forward(qsym, qarg_params, qaux_params, dshape, lshape)
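
            # Repeat with naive calibration, then verify the calib attributes and
            # run a forward pass on the calibrated graph.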
            calib_data = mx.nd.random.uniform(shape=dshape)
            calib_data = NDArrayIter(data=calib_data, batch_size=batch_size)
            calib_data = DummyIter(calib_data)
            qsym, qarg_params, qaux_params = mx.contrib.quant.quantize_model(sym=s,
                                                                             arg_params=arg_params,
                                                                             aux_params=aux_params,
                                                                             excluded_sym_names=excluded_sym_names,
                                                                             excluded_op_names=excluded_op_names,
                                                                             ctx=mx.current_context(),
                                                                             quantized_dtype=qdtype,
                                                                             calib_mode='naive',
                                                                             calib_data=calib_data,
                                                                             num_calib_examples=20,
                                                                             quantize_mode='full')
            check_params(arg_params, qarg_params, qsym)
            check_params(aux_params, qaux_params)
            check_qsym_calibrated(qsym)
            check_qsym_qdtype(qsym, qdtype)
            check_qsym_forward(qsym, qarg_params, qaux_params, dshape, lshape)

    for qdtype in ['int8', 'uint8']:
        check_quantize_model(qdtype)

@with_seed()
def test_quantize_gluon_with_forward():
    def check_quantize_net(qdtype):
        if is_test_for_native_cpu():
            print('skipped testing test_quantize_gluon_with_forward for native cpu since it is not supported yet')
            return
        elif is_test_for_gpu():
            print('skipped testing test_quantize_gluon_with_forward for gpu uint8 since it is not supported yet')
            return

        data_shape = (32, 3, 224, 224)
        data_shapes = [mx.io.DataDesc(name='data', shape=data_shape)]
        label_shape = (32, 1)
        batch_size = 1
        resnet18_v1 = vision.resnet18_v1(pretrained=True)
        resnet18_v1.collect_params().reset_ctx(mx.current_context())
        excluded_names_match = []
        if mx.current_context() == mx.gpu():
            excluded_names_match += ['activation', 'relu', 'conv0']
        num_calib_examples = 5

        random_data = mx.random.uniform(shape=data_shape)
        random_label = mx.random.uniform(shape=label_shape)
        dataset = mx.gluon.data.dataset.ArrayDataset(random_data, random_label)
        calib_data = mx.gluon.data.DataLoader(dataset, batch_size=batch_size)

        quantized_resnet18_v1 = mx.contrib.quant.quantize_net(resnet18_v1, quantized_dtype=qdtype,
                                                              exclude_layers=None,
                                                              exclude_layers_match=excluded_names_match,
                                                              calib_mode='none',
                                                              data_shapes=data_shapes,
                                                              ctx=mx.current_context())
        quantized_resnet18_v1.hybridize(static_alloc=True, static_shape=True)
        quantized_resnet18_v1(random_data)

        for mode in ['naive', 'entropy']:
            qdtype = qdtype if mode == 'naive' else 'auto'
            quantized_resnet18_v1 = mx.contrib.quant.quantize_net(resnet18_v1, quantized_dtype=qdtype,
                                                                  exclude_layers=None,
                                                                  exclude_layers_match=excluded_names_match,
                                                                  calib_data=calib_data,
                                                                  calib_mode=mode,
                                                                  num_calib_examples=num_calib_examples,
                                                                  ctx=mx.current_context())
            quantized_resnet18_v1.hybridize(static_alloc=True, static_shape=True)
            quantized_resnet18_v1(random_data)
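
        # 'entropy' calibration runs with quantized_dtype='auto', which lets the
        # quantization pass choose the output type instead of forcing qdtype.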

    for qdtype in ['int8', 'uint8']:
        check_quantize_net(qdtype)

@with_seed()
def test_quantize_sym_with_calib():
    if is_test_for_native_cpu():
        print('skipped testing quantize_sym_with_calib for native cpu since it is not supported yet')
        return

@with_seed()
def test_quantize_sym_with_calib():
    sym = get_fp32_sym()
    offline_params = [name for name in sym.list_arguments()
                      if not name.startswith('data') and not name.endswith('label')]
    qsym, _ = mx.contrib.quant._quantize_symbol(sym, ctx=mx.current_context(),
                                                offline_params=offline_params, quantize_mode='full')
    requantize_op_names = ['requantize_conv', 'requantize_fc']
    th_dict = {'conv_output': (np.random.uniform(low=100.0, high=200.0), np.random.uniform(low=100.0, high=200.0)),
               'fc_output': (np.random.uniform(low=100.0, high=200.0), np.random.uniform(low=100.0, high=200.0))}
    op_name_to_th_name = {'requantize_conv': 'conv_output', 'requantize_fc': 'fc_output'}
    cqsym = mx.contrib.quant._calibrate_quantized_sym(qsym, th_dict)
    attr_dict = cqsym.attr_dict()
    # After calibration, each requantize node should carry the calibrated
    # min/max range of its corresponding output in its node attributes.
    for name in requantize_op_names:
        assert name in attr_dict
        lhs = float(attr_dict[name]['min_calib_range'])
        rhs = th_dict[op_name_to_th_name[name]][0]
        assert_almost_equal(np.array([lhs]), np.array([rhs]), rtol=1e-3, atol=1e-4)
        lhs = float(attr_dict[name]['max_calib_range'])
        rhs = th_dict[op_name_to_th_name[name]][1]
        assert_almost_equal(np.array([lhs]), np.array([rhs]), rtol=1e-3, atol=1e-4)
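
# Editorial sketch (not MXNet's internal kernel): one plausible way a calibrated
# (min_calib_range, max_calib_range) pair is turned into an int8 mapping is
# symmetric quantization, where the larger of the two magnitudes maps to 127.
# The helper name below is hypothetical and for illustration only.
def _quantize_with_calib_range_sketch(data, min_range, max_range):
    real_range = max(abs(min_range), abs(max_range))
    scale = 127.0 / real_range
    # Round to the nearest integer and saturate to the int8 range.
    return np.clip(np.round(data * scale), -127, 127).astype(np.int8), scale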

@with_seed()
def test_smooth_distribution():
    # An all-zero histogram has no probability mass to redistribute, so
    # smoothing it should raise a ValueError.
    assert_exception(lambda: mx.contrib.quant._smooth_distribution(np.zeros((2,)), eps=1e-3), ValueError)
    dirac_delta = np.zeros((5,))
    dirac_delta[2] = 1
    smooth_dirac_delta = dirac_delta.copy()
    # Each of the four zero bins gains eps = 1e-3; the single nonzero bin loses
    # the corresponding 4e-3 in total (it first gains 1e-3, then gives up 5e-3).
    smooth_dirac_delta += 1e-3
    smooth_dirac_delta[2] -= 5e-3
    assert_almost_equal(mx.contrib.quant._smooth_distribution(dirac_delta, eps=1e-3), smooth_dirac_delta)
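
# Editorial sketch of the smoothing the test above exercises (an assumption
# inferred from the assertions, not a copy of MXNet's private implementation):
# move eps of mass onto each zero bin, taking the total from the nonzero bins
# in proportion to their count so the distribution still sums to the same mass.
def _smooth_distribution_sketch(p, eps=1e-3):
    is_zeros = (p == 0).astype(np.float64)
    is_nonzeros = (p != 0).astype(np.float64)
    n_zeros = int(is_zeros.sum())
    n_nonzeros = p.size - n_zeros
    if n_nonzeros == 0:
        raise ValueError('all histogram bins are zero; nothing to smooth')
    eps1 = eps * float(n_zeros) / float(n_nonzeros)
    return p + eps * is_zeros - eps1 * is_nonzeros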

@with_seed()
def test_optimal_threshold_adversarial_case():
    # The worst case for the optimal-threshold search is a histogram whose mass
    # is concentrated at one edge: [0, 0, ..., 1000]. The optimal threshold in
    # this case should be the maximum of the range.
    hist = []
    hist_edges = []
    min_val = -2
    max_val = 2
    for i in range(0, 998):
        hist.append(0)
    for i in range(0, 999):
        hist_edges.append((max_val - min_val) / 999 * i + min_val)
    hist.append(1000)
    hist_edges.append(max_val)
    hist_data = (hist, hist_edges, min_val, max_val, max_val)
    for dtype in ['uint8', 'int8', 'auto']:
        res = mx.contrib.quant._get_optimal_threshold(hist_data, dtype, num_quantized_bins=5)
        # The threshold should be 2.
        print(res)
        assert abs(res[2] - 2) < 1e-5
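
# Background note: _get_optimal_threshold implements KL-divergence calibration.
# For each candidate threshold it clips the reference histogram P to that range,
# approximates it with num_quantized_bins bins to get Q, and keeps the threshold
# minimizing KL(P || Q). A toy sketch of the divergence step (assumes scipy is
# available; the helper name is illustrative, not part of MXNet's API):
from scipy import stats

def _kl_divergence_sketch(p_hist, q_hist):
    # Normalize both histograms into probability distributions and compare.
    p = np.asarray(p_hist, dtype=np.float64)
    q = np.asarray(q_hist, dtype=np.float64)
    return stats.entropy(p / p.sum(), q / q.sum())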

@with_seed()
def test_get_optimal_thresholds():
    # Given an ndarray with elements following a uniform distribution, the optimal threshold
    # for quantizing the ndarray should be either abs(min(nd)) or abs(max(nd)).
    def get_threshold(nd):
        min_nd = mx.nd.min(nd)
        max_nd = mx.nd.max(nd)
        return mx.nd.maximum(mx.nd.abs(min_nd), mx.nd.abs(max_nd)).asnumpy()

    for dtype in ['uint8', 'int8', 'auto']:
        nd = mx.nd.uniform(low=-10.532, high=11.3432, shape=(8, 3, 23, 23), dtype=np.float64)
        expected_threshold = get_threshold(nd)
        arr = nd.asnumpy()
        min_range = np.min(arr)
        max_range = np.max(arr)
        th = max(abs(min_range), abs(max_range))
        hist, hist_edges = np.histogram(arr, bins=8001, range=(-th, th))
        hist_dict = {'layer1': (hist, hist_edges, min_range, max_range, th)}
        th_dict = mx.contrib.quant._get_optimal_thresholds(hist_dict, dtype)
        assert 'layer1' in th_dict
        assert_almost_equal(np.array([th_dict['layer1'][1]]), expected_threshold, rtol=1e-2, atol=1e-4)
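
# Editorial note: the histogram above is deliberately symmetric around zero
# (range=(-th, th)), matching the symmetric thresholds the KL calibration
# searches over; the odd bin count (8001) keeps zero centered in its own bin.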

if __name__ == "__main__":
    import nose
    nose.runmodule()