apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor ccode * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that is used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indice to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDarray to csrndarray and rwosparse ndarray * support storage fallback with mutable inputs (#147) * include mutatable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse brnach (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initilazer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comemtn * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
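The commit log above repeatedly refers to the CSR storage format and its `indptr`/`indices`/`values` arrays (e.g. "change indptr, values, etc to be private member", `to_csr()`, dot(csr, dns)=dns). As a minimal sketch of what that layout looks like — using `scipy.sparse`, which the script below imports as its CPU baseline, rather than MXNet's own `CSRNDArray` — consider a small dense matrix and its CSR form:

```python
import numpy as np
import scipy.sparse as sp

# A 3x4 matrix with 4 non-zeros. CSR stores it as three flat arrays:
#   data    - the non-zero values, row by row
#   indices - the column index of each stored value
#   indptr  - indptr[i]:indptr[i+1] delimits row i's slice of data/indices
dense = np.array([[1., 0., 0., 2.],
                  [0., 0., 3., 0.],
                  [4., 0., 0., 0.]])
csr = sp.csr_matrix(dense)

# Row 0 holds values 1 (col 0) and 2 (col 3), row 1 holds 3 (col 2),
# row 2 holds 4 (col 0), so:
#   data    = [1, 2, 3, 4]
#   indices = [0, 3, 2, 0]
#   indptr  = [0, 2, 3, 4]
```

MXNet's `CSRNDArray` uses the same three-array layout (with `indptr`/`indices` exposed as aux data), which is why the scipy matrix serves as a convenient reference implementation in the tests.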
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
import ctypes
import os
import time
import argparse
import subprocess
import scipy.sparse as sp
import mxnet as mx
import numpy as np
import numpy.random as rnd
from mxnet.test_utils import rand_ndarray, set_default_device, assert_almost_equal, get_bz2_data
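The imports above (`time`, `scipy.sparse`, `numpy.random`, `rand_ndarray`) suggest a benchmark comparing MXNet's sparse dot against a scipy baseline, as described in the "add benchmark scritp for dot (#59)" and "Add scipy as dot baseline" commits. A minimal sketch of such a baseline timing loop — `bench_csr_dot` and its parameters are hypothetical names for illustration, not part of the original script:

```python
import time
import numpy as np
import scipy.sparse as sp

def bench_csr_dot(m=1000, k=1000, n=50, density=0.01, repeat=3):
    """Time dot(csr, dense) = dense with scipy as a CPU baseline.

    Returns the last result and the mean wall-clock seconds per call.
    """
    # Random CSR lhs with the requested density, dense rhs.
    lhs = sp.random(m, k, density=density, format='csr', dtype=np.float64)
    rhs = np.random.rand(k, n)
    start = time.time()
    for _ in range(repeat):
        out = lhs.dot(rhs)
    elapsed = (time.time() - start) / repeat
    return out, elapsed
```

The actual script additionally measures `mx.nd.sparse.dot` on the same operands and sweeps density, transpose, and thread-count options; the scipy loop above only shows the shape of the measurement.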
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support resolve race condition in kvstore * resolve error after rebase * fix lint (#113) * rename some python function (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm unit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unit test and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
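The sparse dot commits above repeatedly refer to kernels like dot(csr, dns)=dns, i.e. multiplying a CSR matrix (compressed row pointers `indptr`, column ids `indices`, values `data`) by a dense matrix. A minimal pure-Python sketch of that computation, for illustration only (plain lists, no parallelism, none of MXNet's actual kernel code):

```python
def csr_dot_dense(indptr, indices, data, dense, n_cols_out):
    """dot(csr, dns) = dns: multiply a CSR matrix by a dense 2-D list.

    Row i of the CSR matrix occupies positions indptr[i]..indptr[i+1]
    of `indices` (column ids) and `data` (values).
    """
    out = []
    for i in range(len(indptr) - 1):
        row = [0.0] * n_cols_out
        for k in range(indptr[i], indptr[i + 1]):
            j, v = indices[k], data[k]
            for c in range(n_cols_out):
                row[c] += v * dense[j][c]
        out.append(row)
    return out
```

Because work is done only for stored entries, the cost scales with the number of non-zeros rather than with the full matrix size, which is the point of the optimized CPU/GPU kernels discussed in these commits.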
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
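The commit above describes a GPU implementation of cast_storage from dense to CSR. The conversion itself is straightforward; a pure-Python sketch of what dense-to-csr casting produces (plain lists, sequential, purely illustrative — the actual operator builds NDArray aux arrays and runs CUDA kernels):

```python
def dense_to_csr(mat):
    """Compress a dense 2-D list into CSR (indptr, indices, data) arrays.

    indptr[i]..indptr[i+1] delimits row i's entries in indices/data.
    """
    indptr, indices, data = [0], [], []
    for row in mat:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)   # column id of a non-zero
                data.append(float(v))
        indptr.append(len(indices))  # cumulative non-zero count per row
    return indptr, indices, data
```

The "temporary storage" mentioned in the commit is needed on the GPU because the per-row non-zero counts must be computed and prefix-summed before the output arrays can be sized, a step the serial version above gets implicitly.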
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor code * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the code review Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indices to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to CSRNDArray and RowSparseNDArray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * code review comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * code review comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
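Several of the kvstore commits in this log revolve around row_sparse pull: instead of pulling a whole weight array, the worker requests only the row ids it will touch, and the server deduplicates the ids ("use engine to compute unique") before gathering. A minimal pure-Python sketch of that gather step, with hypothetical names (not MXNet's kvstore API):

```python
def row_sparse_pull(store, row_ids):
    """Gather only the requested rows from a parameter store.

    store:   dense 2-D list of parameter rows held by the server
    row_ids: possibly duplicated, unordered row ids requested by a worker
    Returns (unique sorted ids, corresponding rows) -- the row_sparse result.
    """
    unique_ids = sorted(set(row_ids))   # dedupe so each row is sent once
    return unique_ids, [store[i] for i in unique_ids]
```

Deduplicating first is what keeps communication proportional to the number of distinct rows actually needed, which is the motivation for the sparse pull API.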
2017-08-22 14:56:33 -07:00
from mxnet.base import check_call, _LIB
from util import estimate_density
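The snippet above imports an `estimate_density` helper from a local util module whose source is not shown here. Assuming it computes the fraction of non-zero entries (the usual meaning of density for the sparse benchmarks in this PR), a pure-Python sketch of such a helper might look like this (hypothetical reconstruction, not the actual util code):

```python
def estimate_density(mat):
    """Estimate density as nnz / total entries of a dense 2-D list."""
    total = sum(len(row) for row in mat)
    nnz = sum(1 for row in mat for v in row if v != 0)
    return nnz / total if total else 0.0
```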
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace
* minor unittest update
* removed whitespace
* add cast storage benchmark params info
Conflicts:
    tests/python/gpu/test_operator_gpu.py
* Sparse square sum (#7206)
* Add square_sum op
* Add unit test and fix check_numeric_gradient
* Add .cu file and example
* Fix lint
* Remove gpu registration
* Use square_sum in test_module_fm
* Modify and Add documentation for mx.nd.zeros (#7197)
* Modify and Add documentation for mx.nd.zeros
* Change context to cpu
* Change stype to optional
* Change ordering and remove optional for _zeros_sparse_ndarray
* Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133)
* expose kWriteInplace to FComputeEx and FStatefulComputeEx
* refactor code
* remove duplicated test
* Operator add_n for row sparse ndarrays (#7244)
* Add add_n op for row-sparse ndarrays and identity FComputeEx
* Fix bug in square_sum
* Remove test_cast_storage_ex from gpu test since it's not implemented yet
* Fix according to the CR
Conflicts:
    src/operator/tensor/elemwise_sum.cc
    src/operator/tensor/elemwise_unary_op.cc
    tests/python/gpu/test_operator_gpu.py
resolve conflict
* GPU implementation of cast_storage (dense to rsp) (#7223)
* CastStorageDnsRsp GPU Implementation
* updating function doc and some variable types and names
* adding cuda_get_device_prop() util function
* added rand_shape function for n-dimensional tensors
* updated cast storage unit test
* added dns_to_rsp to cast storage benchmark script
* removing redundant unit test
* fix lint
* minor change in benchmark script
* fix lint
* correct function description
* change storage_type to stype
* changed scope of using namespaces
* changed variable types from index_t to dim_t
* resolve merge conflict in ndarray.load
* Improve StatefulOp/FCompute storage fallback (#134)
* test for fcomp fallback
add storage fallback test and optimize fallback logic
rename function, add comments
use std size()
* add autograd test with sparse inputs
* update sparse ndarray api (#139)
* support mx.nd.empty for sparse ndarray
Change SparseNDArray to BaseSparseNDArray
support mx.nd.array with BaseSparseNDArray inputs; update documentation with explicit subclasses of NDArrays
Conflicts:
    python/mxnet/ndarray/__init__.py
    python/mxnet/ndarray/ndarray.py
    python/mxnet/ndarray/sparse_ndarray.py
    tests/python/unittest/test_sparse_ndarray.py
* fix print msg in test
* Handle ograd_stype='row_sparse' for square_sum backward (#143)
* Add one kernel for square_sum backward pass to take rsp ograd
* Add kNullOp and change to use type_assign in infer stype fallback
* Sparse retain improvement (#138)
* Add one more kernel for sparse retain
* Fix compile
* Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback
* Fix
* Add gpu compile
* ignoring variables in SimpleBind that are used on python's sparse branch for now (#135)
* add bias term to fm test (#145)
* update ndarray.nd, remove `invoke` from excluded members (#137)
remove __weakref__ from SparseNDArray
add data indices to doc
revert dlpack update
revert mxdoc changes
move methods from BaseSparseNDArray to CSRNDArray and RowSparseNDArray
* support storage fallback with mutable inputs (#147)
* include mutable inputs in storage fallback; refactor executor
add fallback test for rms prop and adam
fix lint
fix test in optimizer
* update according to comments
* fix unit tests
* fix gpu compilation err
* Code changes based on reviews (#144)
* code changes according to review comments
remove executor debug; add doc to optimizer
update sparse sgd test
add dtype option to rand_sparse_ndarray
* overhauled reqs for sparse operators
* patch FCompExFallback with mutable inputs; update test_optimizer with more fallback cases
* change executor debug macro to env var
* add comment
* update doc
* change ndarray.aux_shape() to return const reference
* remove todense, to_rsp, to_csr;
replace with tostype
* replace manual calls to cast_storage with tostype
* disable gpu fallback test for optimizer
* fix lint
* add backward pass for cast_storage; refactor cast_storage test
* rand_sparse_ndarray bug fix
* fix cast_storage for gpu
* disable csr test for fp16
* update row sparse ndarray doc
* update doc
* small edits according to reviews (#151)
* fix lint (#152)
* add license to all new files in sparse branch (#154)
* Allocate temp data on the fly for some casting operations (#149)
* fix utf8 encoding in sparse ndarray
* Extending the GPU dot operator (#7226)
* Added GPU DotCsrRspDnsImpl declaration and TODOs
* cleaning up function doc, variable types, and code-style
* minor bug fixes
* enable GPU dot(csr,rsp)=dns unit test
* extend sparse dot unit test
* adding GPU impl of DotCsrRspDns and its kernels
* add TODO
* changed variable types from index_t to dim_t
* fix function description
* added DotCsrRspRspImpl and its kernels (baseline, functionality)
* added DotCsrDnsRspImpl and its kernels (baseline, functionality), plus code documentation
* refactored dot benchmark
* optimized DotCsrTransDnsRsp GPU kernel
* change of dot impl interface to include OpContext, for temp storage
* removing __device__ flag from CPU kernels
* minor fixes and changing variable data types
* minor fixes based on code reviews
Conflicts:
    benchmark/python/sparse_op.py
    tests/python/gpu/test_operator_gpu.py
    tests/python/unittest/test_sparse_operator.py
* Add get_synthetic_dataset function to util (#146)
* Add get_synthetic_datasets
* Move to test_utils
* Remove _get_uniform_dataset
* Move validation to its own function
* Refactor the validation code for csr generation
* Make test_powerlaw a nested function
* Change SparseNDArray to CSRNDArray
* Merge with dtype specific changes in test_utils
* temporary fix for batch norm storage fallback (#156)
* support random_uniform/normal/gamma with row_sparse output (#155)
* add support for initializer with rowsparse output
* add scalar assignment to row_sparse
* add setitem test to gpu
* Revert "add scalar assignment to row_sparse"
This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30.
* Revert "add setitem test to gpu"
This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed.
* Square sum backward support one more case (#161)
* Add documentation for sparse ops (#148)
* draft doc for sparse op
* add more stype doc for operators
* add doc for cast_storage
* see also cast_storage; remove base sparse ndarray; fix aux_types comment
* grammar / spelling fix
* A few fixes (#163)
* fix batch norm gpu kernel; register random operators on gpu
* register sparse random op on gpu, too
* Minor fixes sparse ops (#160)
* change CPU kernel inline directives, data types, and function doc
* update dot dtype switch to use 32 and 64bit floating point only
* use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK
* added tensor_util-inl.cuh file for common tensor operator GPU kernels
* sparse Adam optimizer (#164)
* add sparse adam
* register gpu op
* add comments
* cr comments
* kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs.
multi-GPUs (#150)
* Add gpu support for BroadcastRowSparse
* Fix bugs
* Add benchmark script
* Increase output dim size
* Update weight on CPU using single GPU for sparse tensors
* More fix
* Optimize sparse_retain for special case
* Change row sparse pull locations
* Avoid sparse retain on cpu if possible
* Use acc for metric
* Fix misc
* fix bug in adam update (#167)
fix a bug in adam update
* change sparse example from regression to classification (#165)
* fix python import (#166)
* Add waitall to sparse_end2end.py (#169)
* Add waitall()
* Add dummy metric option
* Add header license
* Dot script changes (#159)
* Add get_synthetic_datasets
* Move to test_utils
* Remove _get_uniform_dataset
* Move validation to its own function
* Refactor the validation code for csr generation
* Make test_powerlaw a nested function
* Change SparseNDArray to CSRNDArray
* Refactoring changes to dot.py
* Fix mxnet test_utils changes
* Remove pdb statement
* Add distribution parameter
* Refactor benchmarking script
* Remove unused code
* Make style changes and remove unused code
* Change typo in comment
* Add transpose support
* Change typo
* 4 decimal points needed for density
* Add rsp support for real datasets
* Correct variable name mini_file_name
* Move wait_to_read outside if
* Separate out scipy and mxnet logic in bench_dot
* Fix lhs_trans issue
* Move transpose outside measure_cost
* Compute transpose inside measure_cost
* Remove unused variables
* Transpose only if trans_lhs (#171)
* fix default val for distribution (#172)
* fix lint (#175)
* avoid cast_storage in dist-kvstore-server (#174)
* avoid cast_storage in dist-kvstore-server
* add stream arg to mshadow::copy
* fix copy order
* Add sparse namespace to ndarray and symbol (#177)
* Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse
* Add sparse to symbol namespace
* Delete commented code
* mv sparse_ndarray.py sparse.py
* Clean up
* Change docstring
* changes based on code reviews (#176)
* remove scipy dependency
* move kvstore checks to backend
* add const to lambda
* temp fix to ndarray.md (#178)
* Fix sparse namespace pylint (#179)
* add comments and error msg (#181)
* add clarification for csr (#182)
* add clarification for csr
* cr comments
* revert change in test util (#183)
* fix amalgamation (#184)
* fix lint
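The log above centers on two sparse storage types, `csr` and `row_sparse`, and the `cast_storage` operator that converts between them and dense (`default`) storage. As a point of reference, here is a minimal numpy sketch of what each layout stores; the helper names are hypothetical and this is not MXNet's implementation:

```python
import numpy as np

def dense_to_csr(dense):
    """Sketch of cast_storage(dense -> csr): per-row compressed layout
    made of data (non-zero values), indices (their column ids), and
    indptr (where each row's slice of data/indices begins and ends)."""
    indptr, indices, data = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return np.array(data), np.array(indices), np.array(indptr)

def dense_to_rsp(dense):
    """Sketch of cast_storage(dense -> row_sparse): keep only the rows
    that contain a non-zero, plus the indices of those rows."""
    row_idx = np.flatnonzero((dense != 0).any(axis=1))
    return dense[row_idx], row_idx

dense = np.array([[0., 0., 0.],
                  [1., 0., 2.],
                  [0., 0., 0.],
                  [0., 3., 0.]])
data, indices, indptr = dense_to_csr(dense)   # csr arrays
values, row_idx = dense_to_rsp(dense)         # row_sparse arrays
```

The `row_sparse` layout is what the sparse SGD/Adam and kvstore changes above exploit: a gradient with few non-zero rows is stored (and pushed/pulled) as just those rows plus their indices.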
2017-08-22 14:56:33 -07:00
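Many commits above implement and optimize `dot(csr, dns)=dns` on CPU and GPU. A minimal numpy sketch of that product over the csr arrays (`data`/`indices`/`indptr`); the helper name is illustrative, not MXNet's kernel:

```python
import numpy as np

def csr_dot_dense(data, indices, indptr, rhs):
    """Sketch of dot(csr, dense) = dense: for each stored lhs value,
    accumulate value * matching rhs row into the output row."""
    num_rows = len(indptr) - 1
    out = np.zeros((num_rows, rhs.shape[1]))
    for i in range(num_rows):                      # one output row per csr row
        for k in range(indptr[i], indptr[i + 1]):  # non-zeros of that row
            out[i] += data[k] * rhs[indices[k]]
    return out

# lhs = [[1, 0, 2], [0, 0, 3]] in csr form
data = np.array([1., 2., 3.])
indices = np.array([0, 2, 2])
indptr = np.array([0, 2, 3])
rhs = np.arange(6.).reshape(3, 2)
result = csr_dot_dense(data, indices, indptr, rhs)
```

The per-row loop structure is also why the log discusses parallelization: each output row depends only on one csr row, so rows can be computed independently across threads.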
import argparse

PARSER = argparse.ArgumentParser(description="Benchmark sparse operators",
                                 formatter_class=argparse.ArgumentDefaultsHelpFormatter)
PARSER.add_argument('--num-omp-threads', type=int,
                    default=1, help='number of omp threads to set in MXNet')
PARSER.add_argument('--gpu', action='store_true',
                    help="to be run on gpu")
# TODO: Use logging later
PARSER.add_argument('--verbose', action='store_true',
                    help="Verbose output")
ARGS = PARSER.parse_args()
# some data information
KDDA = {
'data_mini': 'kdda.t.mini',
'data_name': 'kdda.t',
'data_origin_name': 'kdda.t.bz2',
'url': "https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/kdda.t.bz2",
'feature_dim': 20216830,
'm': [1, 8, 32],
'batch_size': [64],
'default_index': {'batch_size': 0,
                  'output_dim': 2},
'num_batches': 10
}
AVAZU = {
'data_mini': 'avazu-app.t.mini',
'data_name': 'avazu-app.t',
'data_origin_name': 'avazu-app.t.bz2',
'url': "https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/avazu-app.t.bz2",
'feature_dim': 1000000,
'm': [1, 1000, 2000],
'batch_size': [128, 256],
'default_index': {'batch_size': 0,
                  'output_dim': 1},
'num_batches': 10
}
CRITEO = {
'data_mini': 'criteo.t.mini',
'data_name': 'criteo.t',
'data_origin_name': 'criteo.t.bz2',
'url': "https://s3-us-west-2.amazonaws.com/sparse-dataset/criteo.t.bz2",
implementation for Csr slice on cpu (#36)
* CPU implementation for CSR
remove seg_len from csr slice
add some docs for slice csr
change indptr, values, etc. to be private members
bug fix in sparse embedding
update nnvm submodule
fix lint
update unit test for sparse nd
* add const for SliceCsrIndPtr kernel
Fix sparse dot according to the new RSP definition (#35)
* Fix csr dot dns
* Fix sparse dot
* Add fallback and test cases for dot(csr, dns)=dns
* Add int type switch
* Fix
* Fix
* Fix
update mshadow submodule (#44)
Fix dns to rsp (#46)
fix lint (#47)
add runtime storage fallback detection (#48)
* add runtime storage fallback detection
* replace cast storage ex with cast storage impl
Fm example (#45)
* update csr slice logic to avoid confusion; add more examples
* add hint to module.update
* more testcases (fallback) for sparse_nd
* add to_csr() and to_rsp() method; more unit tests (fallback now)
* add fm test; fix lint
* register sparse sgd under Optim.SGD
* update dmlc-core submodule
* change indptr to _indptr temporarily; add const ref to fname
fix lint
fix lint (#51)
Guard gpu cast storage (#50)
* Clean up
* Fix typo
Rearrange unit test files (#52)
fix lint; add scipy for python_test; fix scipy.sparse import error; fix truediv for python3
fix travis test (#54)
* remove pyc files
* add verbose for travis nosetests
cleanup some testing code and enums (#57)
* update Makefile
* refactor test_sparse_operator
* change `default_storage` back to `default`
* remove unused cpp tests
port libsvm parser to mxnet as libsvm iter (#55)
* copied csv iter to libsvm iter
test libsvm iter draft
handle round batch == false for csr batch loader
code refactoring
add get stype, shape interface to iter
separate class for sparse iter
add missing file
fix mem corruption
rename variables
add comments
also read label from libsvm
add test; update docs
update submodule
Conflicts:
    python/mxnet/sparse_ndarray.py
* update submodule
* fix lint
* update test
* revert naming change
add benchmark script for dot (#59)
* add benchmark script for dot
add gpu option for bench
add get_data function for benchmark
print t_sparse, too; add comment
change nnz to density
add backward
* add comment
update fm test (#62)
introduce CSRNDarray and rowsparseNDarray to python frontend api (#58)
* introduce CSRNDarray and rowsparseNDarray to python frontend api
* temporarily disable fm_module test
fix lint (#64)
fix typo; disable libsvm io test (#65)
Improve dot (#61)
* Init checkin
* Fix
* Adjust dot parallelization methods
* Set num_omp_threads for benchmark from command line
* Fix omp thread number
* Clean up
* Add scipy as dot baseline
* Fix format
sparse_retain op (#66)
* Initial checkin
* Fix bugs
* Add unit test for sparse_retain
* Add example and modify test
add storage cast for outputs that have non-default storage (#67)
fix gpu build (#69)
Fix test_sparse_retain python3 issue (#68)
revert nnvm version
* draft for sgd rsp rsp (#75)
support sgd(rsp, rsp)
support dot(csr, rsp) when rsp is full
add ref to const ndarray params
support sparse embedding with rsp weight
fix lint
modify embedding backward to produce dense grad
remove invalid_rid for rsp->dns
remove previous embedding op changes
pass sparse embedding test
add STORAGE_TYPE_ASSIGN_CHECK
remove backward storage infer
* fix lint (#78)
* fix lint (#79)
* serial elemwise sum impl (#80)
update module kvstore interface
add other missing params and functions
revert some interface changes
revert some more changes
remove explicit casting for gradients on kvstore
update Comm interface
update fm example
Conflicts:
    python/mxnet/model.py
    python/mxnet/ndarray.py
* bug fix for initializing module with row_sparse weight (#81)
* bug fix for initializing module with row_sparse weight
* update log message
* Sparse ndarray serialization and deserialization (#77)
* Initial checkin
* Add unit
tests
* Fix lint
* Fix lint (#84)
* Sgd with row_sparse weight, dns gradient (#83)
* sgd rsp dns draft
* support sgd_mom(rsp, dns, rsp)
* update doc
* remove cast storage for kv updater
* code refactoring
* update mshadow version (#88)
* csr slice bug fix (#90)
* benchmark dot code refactor (#87)
* add some code in benchmark
* refactor
* minor fixes
* fix
* lint fix
* Add unit test (#91)
* add unittest
* minor fix
* remove commented lines
* change test func name
* add test rsp
* kvstore push row sparse (#93)
* Add multi-thread cpu elemwise sum for rsps
* Minor fix
* Add flag to switch between serial and multi-thread kvstore push
* Fix lint in sparse_ndarray.py
* Revert "Fix lint in sparse_ndarray.py"
This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d.
* Fix ndarray init in copy(ctx)
* Add env var to control the flow of serial/parallel reduce
* Refactor
* Fix copy ndarray bug
* Fix lint
* Refactor
* Fix windows openmp build failure (#94)
* update mshadow submodule (#95)
* Revert "update mshadow submodule (#95)" (#96)
This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3.
* Refactor sparse tensor code (#99)
* Initial checkin; test_sparse_ndarray passes
* Fix test failure
* Clean up
* Clean up
* Move init backend op to ndarray_utils
* Fix lint
* Eliminate circular dependency on headers
* More refactor
* Fix gpu build and consolidate Slice for dense and sparse
* Clean up
* More refactor
* Clean up
* Fix gpu build
* Fix comment
* fix pylint (#100)
* Fix refactor sparse gpu test (#104)
* Fix gpu build
* Fix
* Fix gpu test failure
* change idx types from int32 to int64 (#101)
Conflicts:
    python/mxnet/test_utils.py
    tests/python/unittest/test_sparse_operator.py
update mshadow submodule
fix extra quotes in test script
change indptr type to int64
better err message for rsp
* revert LOG(DEBUG) change (#105)
* fix undefined zeros in optimizer.py (#106)
* move init dns zeros to init_op.h for kvstore to use (#107)
* Refactor cast storage (#109)
* Refactor cast_storage
* Add cast_storage cc and cu files
* Remove redundant comments
* Replace std::accumulate with ParallelAccumulate
* Clean up
* Fix windows build
* Rowsparse kv (#111)
* update kvstore unit test
Conflicts:
    tests/python/unittest/test_kvstore.py
update model/module.py
Conflicts:
    python/mxnet/model.py
    python/mxnet/module/module.py
fix lint
resolve conflict
remove int keys in kvstore
update cast to str function
* fix failed dist_sync_kv test
* bug fix in comm to ensure merged gradient is of the right type
bug fix in comm
* row sparse dist kvstore draft (push only)
row_sparse pull
* add ndarray row sparse shared mem constructor
* code refactoring
* add test for row_sparse weight
bug fix for kv server slicing
add async support
resolve race condition in kvstore
* resolve error after rebase
* fix lint (#113)
* rename some python functions (#114)
* _to_rsp
* _to_csr;
raise NotImplementedError
* todense
* fix lint (#115)
enable libsvm unit test (#6839)
remove shared mem slice for csr
add csr ndarray iter test
make osx nose test verbose
disable libsvm iter test
Move InferAttr to mxnet from nnvm (#6830)
* Move InferAttr to mxnet from nnvm
Replace nnvm infer attr functions in c_api
Initial checkin
Clean up
Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType
Add new interface for InferStorageType
Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType"
This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de.
Fix and clean up
Fix lint
Add nnvm changes
Change infer function interface to accept only rvalue reference of graph
Clean up
Flush commits to show up in PR
Add error handling for storage type inference failure
Update nnvm
* Fix pylint
Change idx type switch for aux data (#6860)
* Change idx type switch for aux data
* Add mshadow commit
Sparse dot enhancement (#6842)
* Initial checkin
Initial checkin
Fix sparse dot test
Fix unittest and add fallback for sparse dot
* Add benchmark code
* Revert "Add benchmark code"
This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742.
* Fix bug
* Fix storage shape
* Remove unnecessary test code
* Use idx type switch
Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902)
* Initial checkin
Add dot(csr.T, rsp)=rsp2
Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2
* Fix comments
* Replace std::lower_bound with own impl for gpu use too
* Add time profiling
* Revert "Add time profiling"
This reverts commit 8f5bb982867731df0305148b1b150b05661f8529.
* Move dot and batch_dot to a single file
* Move dot gpu impl to a .cuh file
* More refactor
* Fix include error
LibsvmIter fix (#6898)
* fix bug in libsvm iter which causes mem corruption
* add test for news dataset
* fix wrong path in test
* fix import error for urllib
* update url
* replace bz command with bz module
Optimized gpu dot kernels (#6937)
* pulled update to mshadow
* mshadow update
* added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test
* added __syncwarp to vector kernel and reduced number of writes to shared memory
Refactor sparse tensor code (#6955)
* Save stype in frontend to avoid c-api call for stype
* Change storage_type to stype
* Revert "Change storage_type to stype"
This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763.
* Revert "Revert "Change storage_type to stype""
This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88.
Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder
More refactor
Move elementwise sum for rsp to ndarray_function.cc
Remove unnecessary import in ndarray module
Fix pylint
Remove redundant code
Remove _stype from slots
Fix cpp-package build error caused by the change to imperative invoke interface
Use relative import
Remove print line
Rename _ndarray_internal.py to _internal.py
* Relaunch test...
minor bug fix in warp synchronous code (#7029)
* move storage type vector from nnvm to mxnet (#7054)
* move storage type vector from nnvm to mxnet
* update nnvm
* update nnvm
* Improve copy sparse tensors (#7003)
* Use cast_storage when copying ndarrays of different stypes on same context
* Relaunch test
* fix failed tests; add back 64bit support for dot
fix lint
* bug fix for IdentityComputeRsp
* fix lint
fix lint
fix lint
* add data partition for libsvm iter (#7027)
* remove sparse embedding (#7165)
* fix ndarray namespace
* remove untested gpu operators (#7172)
* skip sparse dot gpu test.
2017-08-22 14:56:33 -07:00
    'feature_dim': 8388621,
    'm': [1, 8, 16, 32, 64],
    'batch_size': [64, 128],
    'default_index': {'batch_size': 1,
                      'output_dim': 3},
    'num_batches': 10
}
SYNTHETIC1 = {
    'feature_dim': [1000000],
    'm': [256, 1000],
    'density': [0.001, 0.005, 0.01, 0.02, 0.05,
                0.1, 0.2, 0.5, 0.65],
    'batch_size': [64, 128],
    'default_index': {'batch_size': 1,
                      'density': 2,
                      'output_dim': 1,
                      'feature_dim': 0},
    'num_repeat': 10
}
SYNTHETIC2 = {
    'feature_dim': [8000000, 16000000],
    'm': [1, 32],
    'density': [0.001, 0.005, 0.01, 0.02, 0.05,
                0.1, 0.2, 0.5, 0.65],
    'batch_size': [64, 128],
    'default_index': {'batch_size': 1,
                      'density': 2,
                      'output_dim': 1,
                      'feature_dim': 0},
    'num_repeat': 10
}
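The `default_index` entries pick which element of each parameter list is used when that parameter is held fixed while another one is swept. A minimal sketch of resolving a config into one concrete run (the `resolve_defaults` helper is hypothetical, not part of the benchmark script; parameters without a default fall back to index 0):

```python
def resolve_defaults(config):
    # For every key whose value is a list, pick the entry named in
    # 'default_index'; scalar values pass through unchanged.
    defaults = config.get('default_index', {})
    run = {}
    for key, value in config.items():
        if key == 'default_index':
            continue
        run[key] = value[defaults.get(key, 0)] if isinstance(value, list) else value
    return run

SYNTHETIC1 = {
    'feature_dim': [1000000],
    'm': [256, 1000],
    'density': [0.001, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 0.65],
    'batch_size': [64, 128],
    'default_index': {'batch_size': 1, 'density': 2,
                      'output_dim': 1, 'feature_dim': 0},
    'num_repeat': 10
}

run = resolve_defaults(SYNTHETIC1)
```

With the defaults above, `run` selects `batch_size=128`, `density=0.01`, `feature_dim=1000000`, and (lacking a default) `m=256`.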
def measure_cost(repeat, scipy_trans_lhs, scipy_dns_lhs, func_name, *args, **kwargs):
    """Measure time cost of running a function."""
    mx.nd.waitall()
    args_list = []
    for arg in args:
        args_list.append(arg)
    start = time.time()
    if scipy_trans_lhs:
        args_list[0] = np.transpose(args_list[0]) if scipy_dns_lhs else sp.spmatrix.transpose(args_list[0])
    for _ in range(repeat):
        func_name(*args_list, **kwargs)
    mx.nd.waitall()
    end = time.time()
    diff = end - start
    return diff / repeat
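`measure_cost` calls `mx.nd.waitall()` before starting the clock and again before stopping it, so the measured interval covers all asynchronously scheduled mxnet work rather than just the time to enqueue it. The same measure-then-average pattern can be sketched without mxnet (the `busy_work` function is a stand-in for the operator being timed, not part of the script):

```python
import time

def average_cost(repeat, func, *args, **kwargs):
    # In the real script, mx.nd.waitall() would run here and again after
    # the loop, draining the async engine on both sides of the timing.
    start = time.time()
    for _ in range(repeat):
        func(*args, **kwargs)
    end = time.time()
    return (end - start) / repeat

def busy_work(n):
    return sum(i * i for i in range(n))

cost = average_cost(5, busy_work, 10000)
```

Averaging over `repeat` runs smooths out scheduling noise; for sparse operators the first run is often slower, so a warm-up call before timing is also common.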
def _get_iter(path, data_shape, batch_size):
    data_train = mx.io.LibSVMIter(data_libsvm=path,
                                  data_shape=data_shape,
                                  batch_size=batch_size)
    data_iter = iter(data_train)
    return data_iter
def _line_count(path):
    return int(subprocess.check_output('wc -l {}'.format(path), shell=True).split()[0])
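`_line_count` shells out to `wc -l`, which is unavailable on Windows and unsafe for paths containing shell metacharacters. A portable pure-Python equivalent (the name `_line_count_py` is ours, not from the script) counts newlines in fixed-size chunks so memory stays flat even for large LibSVM files:

```python
import os
import tempfile

def _line_count_py(path):
    # Count newline bytes without spawning a shell; matches `wc -l`,
    # which also counts newline characters rather than "lines".
    count = 0
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            count += chunk.count(b'\n')
    return count

# Quick check against a throwaway three-line file.
with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
    tmp.write(b'1 1:0.5\n0 2:0.25\n1 3:0.75\n')
n = _line_count_py(tmp.name)
os.unlink(tmp.name)
```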
def _compare_sparse_dense(data_dir, file_name, mini_file_name, feature_dim,
                          output_dim, density, batch_size, num_batches=3,
                          num_repeat=5, transpose=False, rsp=False):

    def create_mini_path(mini_path, path, num_batches):
        """Samples batches of size: batch_size, total number: num_batches
        from the dataset files for running benchmarks."""
        if not os.path.exists(mini_path):
            last = _line_count(path) - num_batches * batch_size
            last = last if last >= 1 else 1
            start = int(rnd.uniform(1, last))
            os.system("sed -n '{},{}p' {} > {}".format(
                start, start + num_batches * batch_size, repr(path), repr(mini_path)))
        assert os.path.exists(mini_path)

    def run_benchmark(mini_path):
        """Run benchmarks."""
        data_shape = (feature_dim, )
        train_iter = _get_iter(mini_path, data_shape, batch_size)
        weight_row_dim = batch_size if transpose else feature_dim
        weight_shape = (weight_row_dim, output_dim)
        if not rsp:
            weight = mx.nd.random.uniform(low=0, high=1, shape=weight_shape)
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
  update submodule
  Conflicts: python/mxnet/sparse_ndarray.py
  * update submodule
  * fix lint
  * update test
  * revert naming change
* add benchmark script for dot (#59)
  * add benchmark script for dot
    add gpu option for bench
    add get_data function for benchmark
    print t_sparse, too; add comment
    change nnz to density
    add backward
  * add comment
* update fm test (#62)
* introduce CSRNDArray and RowSparseNDArray to python frontend api (#58)
  * introduce CSRNDArray and RowSparseNDArray to python frontend api
  * temporarily disable fm_module test
* fix lint (#64)
* fix typo. disable libsvm io test (#65)
* Improve dot (#61)
  * Init checkin
  * Fix
  * Adjust dot parallelization methods
  * Set num_omp_threads for benchmark from command line
  * Fix omp thread number
  * Clean up
  * Add scipy as dot baseline
  * Fix format
* sparse_retain op (#66)
  * Initial checkin
  * Fix bugs
  * Add unit test for sparse_retain
  * Add example and modify test
* add storage cast for outputs that have non-default storage (#67)
* fix gpu build (#69)
* Fix test_sparse_retain python3 issue (#68)
* revert nnvm version
* draft for sgd rsp rsp (#75)
  support sgd(rsp, rsp)
  support dot(csr, rsp) when rsp is full
  add ref to const ndarray params
  support sparse embedding with rsp weight
  fix lint
  modify embedding backward to produce dense grad
  remove invalid_rid for rsp->dns
  remove previous embedding op changes
  pass sparse embedding test
  add STORAGE_TYPE_ASSIGN_CHECK
  remove backward storage infer
* fix lint (#78)
* fix lint (#79)
* serial elemwise sum impl (#80)
  update module kvstore interface
  add other missing params and functions
  revert some interface changes
  revert some more changes
  remove explicit casting for gradients on kvstore
  update Comm interface
  update fm example
  Conflicts: python/mxnet/model.py python/mxnet/ndarray.py
* bug fix for initializing module with row_sparse weight (#81)
  * bug fix for initializing module with row_sparse weight
  * update log message
* Sparse ndarray serialization and deserialization (#77)
  * Initial checkin
  * Add unit tests
  * Fix lint
* Fix lint (#84)
* Sgd with row_sparse weight, dns gradient (#83)
  * sgd rsp dns draft
  * support sgd_mom(rsp, dns, rsp)
  * update doc
  * remove cast storage for kv updater
  * code refactoring
* update mshadow version (#88)
* csr slice bug fix (#90)
* benchmark dot code refactor (#87)
  * add some code in benchmark
  * refactor
  * minor fixes
  * fix
  * lint fix
* Add unit test (#91)
  * add unittest
  * minor fix
  * remove commented lines
  * change test func name
  * add test rsp
* kvstore push row sparse (#93)
  * Add multi-thread cpu elemwise sum for rsps
  * Minor fix
  * Add flag to switch between serial and multi-thread kvstore push
  * Fix lint in sparse_ndarray.py
  * Revert "Fix lint in sparse_ndarray.py"
    This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d.
  * Fix ndarray init in copy(ctx)
  * Add env var to control the flow of serial/parallel reduce
  * Refactor
  * Fix copy ndarray bug
  * Fix lint
  * Refactor
* Fix windows openmp build failure (#94)
* update mshadow submodule (#95)
* Revert "update mshadow submodule (#95)" (#96)
  This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3.
* Refactor sparse tensor code (#99)
  * Initial checkin, test_sparse_ndarray passes
  * Fix test failure
  * Clean up
  * Clean up
  * Move init backend op to ndarray_utils
  * Fix lint
  * Eliminate circular dependency on headers
  * More refactor
  * Fix gpu build and consolidate Slice for dense and sparse
  * Clean up
  * More refactor
  * Clean up
  * Fix gpu build
  * Fix comment
* fix pylint (#100)
* Fix refactor sparse gpu test (#104)
  * Fix gpu build
  * Fix
  * Fix gpu test failure
* change idx types from int32 to int64 (#101)
  Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py
  update mshadow submodule
  fix extra quotes in test script
  change indptr type to int64
  better err message for rsp
* revert LOG(DEBUG) change (#105)
* fix undefined zeros in optimizer.py (#106)
* move init dns zeros to init_op.h for kvstore to use (#107)
* Refactor cast storage (#109)
  * Refactor cast_storage
  * Add cast_storage cc and cu files
  * Remove redundant comments
  * Replace std::accumulate with ParallelAccumulate
  * Clean up
  * Fix windows build
* Rowsparse kv (#111)
  * update kvstore unit test
    Conflicts: tests/python/unittest/test_kvstore.py
    update model/module.py
    Conflicts: python/mxnet/model.py python/mxnet/module/module.py
    fix lint
    resolve conflict
    remove int keys in kvstore
    update cast to str function
  * fix failed dist_sync_kv test
  * bug fix in comm to ensure merged gradient is of the right type
    bug fix in comm
  * row sparse dist kvstore draft (push only)
    row_sparse pull
  * add ndarray row sparse shared mem constructor
  * code refactoring
  * add test for row_sparse weight
    bug fix for kv server slicing
    add async support
    resolve race condition in kvstore
  * resolve error after rebase
  * fix lint (#113)
* rename some python functions (#114)
  * _to_rsp
  * _to_csr. raise NotImplementedError
  * todense
* fix lint (#115)
* enable libsvm unit test (#6839)
  remove shared mem slice for csr
  add csr ndarray iter test
  make osx nose test verbose
  disable libsvm iter test
* Move InferAttr to mxnet from nnvm (#6830)
  * Move InferAttr to mxnet from nnvm
    Replace nnvm infer attr functions in c_api
    Initial checkin
    Clean up
    Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType
    Add new interface for InferStorageType
    Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType"
    This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de.
    Fix and clean up
    Fix lint
    Add nnvm changes
    Change infer function interface to accept only rvalue reference of graph
    Clean up
    Flush commits to show up in PR
    Add error handling for storage type inference failure
    Update nnvm
  * Fix pylint
* Change idx type switch for aux data (#6860)
  * Change idx type switch for aux data
  * Add mshadow commit
* Sparse dot enhancement (#6842)
  * Initial checkin
    Fix sparse dot test
    Fix unit test and add fallback for sparse dot
  * Add benchmark code
  * Revert "Add benchmark code"
    This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742.
  * Fix bug
  * Fix storage shape
  * Remove unnecessary test code
  * Use idx type switch
* Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902)
  * Initial checkin
    Add dot(csr.T, rsp)=rsp2
    Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2
  * Fix comments
  * Replace std::lower_bound with own impl for gpu use too
  * Add time profiling
  * Revert "Add time profiling"
    This reverts commit 8f5bb982867731df0305148b1b150b05661f8529.
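The `dot(csr, dns)=dns` kernels referenced above (#6842, #6902) all reduce to the same per-row traversal of the CSR arrays: for each stored element `(i, j)`, accumulate `value * rhs[j, :]` into output row `i`. A naive numpy sketch of that loop (for intuition only; the real CPU/GPU kernels parallelize and tile it):

```python
import numpy as np

def csr_dot_dense(data, indices, indptr, rhs):
    """out[i, :] += data[k] * rhs[indices[k], :] for each stored element k of row i."""
    out = np.zeros((len(indptr) - 1, rhs.shape[1]))
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            out[i] += data[k] * rhs[indices[k]]
    return out

# CSR arrays for [[1,0,2],[0,0,0],[0,3,0]] (toy example)
data, indices, indptr = np.array([1., 2., 3.]), np.array([0, 2, 1]), np.array([0, 2, 2, 3])
rhs = np.arange(6.).reshape(3, 2)
dense_lhs = np.array([[1., 0., 2.], [0., 0., 0.], [0., 3., 0.]])
assert np.allclose(csr_dot_dense(data, indices, indptr, rhs), dense_lhs @ rhs)
```

The cost is proportional to the number of stored elements times the rhs width, which is the entire point of keeping the lhs in CSR rather than falling back to dense.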
* Move dot and batch_dot to a single file
* Move dot gpu impl to a .cuh file
* More refactor
* Fix include error
* LibsvmIter fix (#6898)
  * fix bug in libsvm iter which causes mem corruption
  * add test for news dataset
  * fix wrong path in test
  * fix import error for urllib
  * update url
  * replace bz command with bz module
* Optimized gpu dot kernels (#6937)
  * pulled update to mshadow
  * mshadow update
  * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test
  * added __syncwarp to vector kernel and reduced number of writes to shared memory
* Refactor sparse tensor code (#6955)
  * Save stype in frontend to avoid c-api call for stype
  * Change storage_type to stype
  * Revert "Change storage_type to stype"
    This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763.
  * Revert "Revert "Change storage_type to stype""
    This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88.
    Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder
    More refactor
    Move elementwise sum for rsp to ndarray_function.cc
    Remove unnecessary import in ndarray module
    Fix pylint
    Remove redundant code
    Remove _stype from slots
    Fix cpp-package build error caused by the change to imperative invoke interface
    Use relative import
    Remove print line
    Rename _ndarray_internal.py to _internal.py
  * Relaunch test...
* minor bug fix in warp synchronous code (#7029)
* move storage type vector from nnvm to mxnet (#7054)
  * move storage type vector from nnvm to mxnet
  * update nnvm
  * update nnvm
* Improve copy sparse tensors (#7003)
  * Use cast_storage when copying ndarrays of different stypes on same context
  * Relaunch test
* fix failed tests. add back 64bit support for dot
  fix lint
* bug fix for IdentityComputeRsp
* fix lint
  fix lint
  fix lint
* add data partition for libsvm iter (#7027)
* remove sparse embedding (#7165)
* fix ndarray namespace
* remove untested gpu operators (#7172)
  * skip sparse dot gpu test. add sparse_nd_zeros gpu test
  * remove sparse_retain gpu
  Conflicts: tests/python/gpu/test_operator_gpu.py
* Fix ndarray aux data issue (#7098)
  * Fix getting sparse ndarray data/aux_data issues
  * Add tests for func csr and row_sparse
  * Make get/set data/aux_data thread safe
  * Fix a bug
  * Fix typo and comment
  * More comments
  * Correct comment
  Conflicts: tests/python/gpu/test_operator_gpu.py
* Support K-dimensional row-sparse tensor (#7179)
  * remove check for k dimensional rowsparse tensor
  * change var name for rsp sgd operator
  * add checks for sparse dot
  * bug fix for kdim rowsparse cast storage cpu
  * update IdentityLikeRhsComputeEx interface
  * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor
  * use get_with_shape instead of reshape
  * update according to comments
  Conflicts: src/operator/tensor/elemwise_unary_op.h
* Improve sparse ndarray error message (#7181)
  * add test for broadcast_to
  * add comments
  Conflicts: python/mxnet/base.py
* construct row_sparse ndarray for dist-async
  fix bug in rsp add
  rsp sync push
  race condition for push
  fix bug in rsp pull. refactor test
  cleanup comments
  refactor dist server
  fix lint
  fix storage shape issue with the new ndarray constructor
  data sharding draft
  fix lint. add comment
  add support for zeros gradients
  use std::upper_bound/lower_bound
  remove special init function for rowsparse dist kvstore
  temporary support for inplace operators for sparse
  add test.
  fix return type
  store kRowSparseNDArray in kv server
  remove fcomp_ex sgd with dns weight and rsp gradient
  bug fix in sparse retain
  sparse pull c_api
  revise rowsparse pull api
  use engine to compute unique to ensure thread safety
  add rowsparse pull to dist-kv
  fix lint
  add example for rsp_pull
  remove name2idx; add sparse_pull_dict param to module
  fix unit test and c rowid conversion
* support str key type in kvstore (#6765)
  * update kvstore unit test
  * update model/module.py
  * fix lint
  * remove int keys in kvstore
  * update cast to str function
  * remove _cast_to_str_keys
  * fix lint
  * always cast to str
  Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py
  update module API for other submodules
  update stypes in kvstore after refactoring
  change type of size from size_t to int64_t
  add sparse linear regression example
  remove sparse_pull_dict from module
  fix init_optim for seq_module. update sparse example
  resolve conflict for binary add rsp rsp
  Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py
* fix DotCsrRspRspImpl error message (#7191)
* GPU implementation of cast_storage (dense to csr) (#7081)
  * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels.
  * fixed whitespace
  * minor unittest update
  * removed whitespace
  * add cast storage benchmark params info
  Conflicts: tests/python/gpu/test_operator_gpu.py
* Sparse square sum (#7206)
  * Add square_sum op
  * Add unit test and fix check_numeric_gradient
  * Add .cu file and example
  * Fix lint
  * Remove gpu registration
  * Use square_sum in test_module_fm
* Modify and Add documentation for mx.nd.zeros (#7197)
  * Modify and Add documentation for mx.nd.zeros
  * Change context to cpu
  * Change stype to optional
  * Change ordering and remove optional for _zeros_sparse_ndarray
* Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133)
  * expose kWriteInplace to FComputeEx and FStatefulComputeEx
  * refactor code
  * remove duplicated test
* Operator add_n for row sparse ndarrays (#7244)
  * Add add_n op for row-sparse ndarrays and identity FComputeEx
  * Fix bug in square_sum
  * Remove test_cast_storage_ex from gpu test since it's not implemented yet
  * Fix according to the cr
  Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py
  resolve conflict
* GPU implementation of cast_storage (dense to rsp) (#7223)
  * CastStorageDnsRsp GPU Implementation
  * updating function doc and some variable types and names
  * adding cuda_get_device_prop() util function
  * added rand_shape function for n-dimensional tensors
  * updated cast storage unit test
  * added dns_to_rsp to cast storage benchmark script
  * removing redundant unit test
  * fix lint
  * minor change in benchmark script
  * fix lint
  * correct function description
  * change storage_type to stype
  * changed scope of using namespaces
  * changed variable types from index_t to dim_t
  * resolve merge conflict in ndarray.load
* Improve StatefulOp/FCompute storage fallback (#134)
  * test for fcomp fallback
    add storage fallback test and optimize fallback logic
    rename function, add comments
    use std size()
  * add autograd test with sparse inputs
* update sparse ndarray
  api (#139)
  * support mx.nd.empty for sparse ndarray
    Change SparseNDArray to BaseSparseNDArray
    support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays
    Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py
  * fix print msg in test
* Handle ograd_stype='row_sparse' for square_sum backward (#143)
  * Add one kernel for square_sum backward pass to take rsp ograd
  * Add kNullOp and change to use type_assign in infer stype fallback
* Sparse retain improvement (#138)
  * Add one more kernel for sparse retain
  * Fix compile
  * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback
  * Fix
  * Add gpu compile
* ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135)
* add bias term to fm test (#145)
* update ndarray.nd, remove `invoke` from excluded members (#137)
  remove __weakref__ from SparseNDArray
  add data indices to doc
  revert dlpack update
  revert mxdoc changes
  move methods from BaseSparseNDArray to csr ndarray and rowsparse ndarray
* support storage fallback with mutable inputs (#147)
  * include mutable inputs in storage fallback. refactor executor
    add fallback test for rms prop and adam
    fix lint
    fix test in optimizer
  * update according to comments
  * fix unit tests
  * fix gpu compilation err
* Code changes based on reviews (#144)
  * code changes according to review comments
    remove executor debug. add doc to optimizer
    update sparse sgd test
    add dtype option to rand_sparse_ndarray
  * overhauled reqs for sparse operators
  * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases
  * change executor debug macro to env var
  * add comment
  * update doc
  * change ndarray.aux_shape() to return const reference
  * remove todense to_rsp to_csr. replace with tostype
* replace manual calls to cast_storage with tostype
* disable gpu fallback test for optimizer
* fix lint
* add backward pass for cast_storage. refactor cast_storage test
* rand_sparse_ndarray bug fix
* fix cast_storage for gpu
* disable csr test for fp16
* update row sparse ndarray doc
* update doc
* small edits according to reviews (#151)
* fix lint (#152)
* add license to all new files in sparse branch (#154)
* Allocate temp data on the fly for some casting operations (#149)
* fix utf8 encoding in sparse ndarray
* Extending the GPU dot operator (#7226)
  * Added GPU DotCsrRspDnsImpl declaration and TODOs
  * cleaning up function doc, variable types, and code-style
  * minor bug fixes
  * enable GPU dot(csr,rsp)=dns unit test
  * extend sparse dot unit test
  * adding GPU impl of DotCsrRspDns and its kernels
  * add TODO
  * changed variable types from index_t to dim_t
  * fix function description
  * added DotCsrRspRspImpl and its kernels (baseline, functionality)
  * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation
  * refactored dot benchmark
  * optimized DotCsrTransDnsRsp GPU kernel
  * change of dot impl interface to include OpContext, for temp storage
  * removing __device__ flag from CPU kernels
  * minor fixes and changing variable data types
  * minor fixes based on code reviews
  Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py
* Add get_synthetic_dataset function to util (#146)
  * Add get_synthetic_datasets
  * Move to test_utils
  * Remove _get_uniform_dataset
  * Move validation to its own function
  * Refactor the validation code for csr generation
  * Make test_powerlaw a nested function
  * Change SparseNDArray to CSRNDArray
  * Merge with dtype specific changes in test_utils
* temporary fix for batch norm storage fallback (#156)
* support random_uniform/normal/gamma with row_sparse output (#155)
  * add support for initializer with rowsparse output
  * add scalar assignment to row_sparse
  * add setitem test to gpu
  * Revert "add scalar assignment to row_sparse"
    This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30.
  * Revert "add setitem test to gpu"
    This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed.
* Square sum backward support one more case (#161)
* Add documentation for sparse ops (#148)
  * draft doc for sparse op
  * add more stype doc for operators
  * add doc for cast_storage
  * see also cast_storage. remove base sparse ndarray. fix aux_types comment
  * grammar / spelling fix
* A few fixes (#163)
  * fix batch norm gpu kernel. register random operators on gpu
  * register sparse random op on gpu, too
* Minor fixes sparse ops (#160)
  * change CPU kernel inline directives, data types, and function doc
  * update dot dtype switch to use 32 and 64bit floating point only
  * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK
  * added tensor_util-inl.cuh file for common tensor operator GPU kernels
* sparse Adam optimizer (#164)
  * add sparse adam
  * register gpu op
  * add comments
  * cr comments
* kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. multi-GPUs (#150)
  * Add gpu support for BroadcastRowSparse
  * Fix bugs
  * Add benchmark script
  * Increase output dim size
  * Update weight on CPU using single GPU for sparse tensors
  * More fix
  * Optimize sparse_retain for special case
  * Change row sparse pull locations
  * Avoid sparse retain on cpu if possible
  * Use acc for metric
  * Fix misc
* fix bug in adam update (#167)
  fix a bug in adam update
* change sparse example from regression to classification (#165)
* fix python import (#166)
* Add waitall to sparse_end2end.py (#169)
  * Add waitall()
  * Add dummy metric option
  * Add header license
* Dot script changes (#159)
  * Add get_synthetic_datasets
  * Move to test_utils
  * Remove _get_uniform_dataset
  * Move validation to its own function
  * Refactor the validation code for csr generation
  * Make test_powerlaw a nested function
  * Change SparseNDArray to CSRNDArray
  * Refactoring changes to dot.py
  * Fix mxnet test_utils changes
  * Remove pdb statement
  * Add distribution parameter
  * Refactor benchmarking script
  * Remove unused code
  * Make style changes and remove unused code
  * Change typo in comment
  * Add transpose support
  * Change typo
  * 4 decimal points needed for density
  * Add rsp support for real datasets
  * Correct variable name mini_file_name
  * Move wait_to_read outside if
  * Separate out scipy and mxnet logic in bench_dot
  * Fix lhs_trans issue
  * Move transpose outside measure_cost
  * Compute transpose inside measure_cost
  * Remove unused variables
* Transpose only if trans_lhs (#171)
* fix default val for distribution (#172)
* fix lint (#175)
* avoid cast_storage in dist-kvstore-server (#174)
  * avoid cast_storage in dist-kvstore-server
  * add stream arg to mshadow::copy
  * fix copy order
* Add sparse namespace to ndarray and symbol (#177)
  * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse
  * Add sparse to symbol namespace
  * Delete commented code
  * mv sparse_ndarray.py sparse.py
  * Clean up
  * Change docstring
* changes based on code reviews (#176)
  * remove scipy dependency
  * move kvstore checks to backend
  * add const to lambda
* temp fix to ndarray.md (#178)
* Fix sparse namespace pylint (#179)
* add comments and error msg (#181)
* add clarification for csr (#182)
  * add clarification for csr
  * cr comments
* revert change in test util (#183)
* fix amalgamation (#184)
* fix lint
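Several of the entries above (`sparse_retain`, `BroadcastRowSparse`, `kvstore.row_sparse_pull`) revolve around the row_sparse layout: only a subset of rows is stored, as a dense `data` block paired with an `indices` array of row ids, and "retain" filters that set down to the rows a worker actually needs. A numpy sketch of the retain semantics (illustrative names, not MXNet's kernel):

```python
import numpy as np

def sparse_retain(data, indices, rids):
    """Keep only stored rows whose row id appears in rids (row_sparse retain)."""
    mask = np.isin(indices, rids)
    return data[mask], indices[mask]

# row_sparse array: rows 0, 2, and 5 of a tall matrix are stored
data    = np.array([[1., 1.], [2., 2.], [3., 3.]])
indices = np.array([0, 2, 5])

d, i = sparse_retain(data, indices, np.array([2, 5, 7]))
# d = [[2., 2.], [3., 3.]], i = [2, 5]
```

This is why a sparse pull can ship only the rows referenced by a mini-batch instead of the full (potentially huge) weight matrix.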
2017-08-22 14:56:33 -07:00
else:
    weight = rand_ndarray(weight_shape, "row_sparse", density=0.05, distribution="uniform")
total_cost = {}
average_cost = {}
count = 0
total_cost["sparse"] = 0.
total_cost["dense"] = 0.
for _ in train_iter:
    csr_data = train_iter.getdata()
    dns_data = csr_data.tostype('default')
    cost_sparse = measure_cost(num_repeat, False, False, mx.nd.sparse.dot, csr_data, weight, transpose_a=transpose)
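The fragment above relies on a `measure_cost` helper that is not shown here. A hypothetical minimal version of such a timer is sketched below; the real helper's extra boolean arguments and its MXNet synchronization (the async engine must finish, e.g. via `wait_to_read()`, before the clock stops) are assumptions not confirmed by this excerpt:

```python
import time

def measure_cost(repeat, f, *args, **kwargs):
    # Hypothetical sketch: average wall-clock seconds per call of f.
    # A real MXNet benchmark would also force pending async work to
    # complete (e.g. out.wait_to_read()) inside the timed region.
    start = time.time()
    for _ in range(repeat):
        f(*args, **kwargs)
    return (time.time() - start) / repeat

cost = measure_cost(10, sum, range(1000))  # non-negative seconds per call
```

Averaging over `repeat` calls smooths out scheduler noise, which matters when comparing the sparse and dense dot costs accumulated in `total_cost` above.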
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file
* Move dot gpu impl to a .cuh file
* More refactor
* Fix include error
LibsvmIter fix (#6898)
* fix bug in libsvm iter which causes mem corruption
* add test for news dataset
* fix wrong path in test
* fix import error for urllib
* update url
* replace bz command with bz module
Optimized gpu dot kernels (#6937)
* pulled update to mshadow
* mshadow update
* added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test
* added __syncwarp to vector kernel and reduced number of writes to shared memory
Refactor sparse tensor code (#6955)
* Save stype in frontend to avoid c-api call for stype
* Change storage_type to stype
* Revert "Change storage_type to stype"
This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763.
* Revert "Revert "Change storage_type to stype""
This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88.
Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder
More refactor
Move elementwise sum for rsp to ndarray_function.cc
Remove unnecessary import in ndarray module
Fix pylint
Remove redundant code
Remove _stype from slots
Fix cpp-package build error caused by the change to imperative invoke interface
Use relative import
Remove print line
Rename _ndarray_internal.py to _internal.py
* Relaunch test...
minor bug fix in warp synchronous code (#7029)
* move storage type vector from nnvm to mxnet (#7054)
* update nnvm
* update nnvm
* Improve copy sparse tensors (#7003)
* Use cast_storage when copying ndarrays of different stypes on same context
* Relaunch test
* fix failed tests. add back 64bit support for dot
fix lint
* bug fix for IdentityComputeRsp
* fix lint
fix lint
fix lint
* add data partition for libsvm iter (#7027)
* remove sparse embedding (#7165)
* fix ndarray namespace
* remove untested gpu operators (#7172)
* skip sparse dot gpu test. add sparse_nd_zeros gpu test
* remove sparse_retain gpu
Conflicts: tests/python/gpu/test_operator_gpu.py
* Fix ndarray aux data issue (#7098)
* Fix getting sparse ndarray data/aux_data issues
* Add tests for func csr and row_sparse
* Make get/set data/aux_data thread safe
* Fix a bug
* Fix typo and comment
* More comments
* Correct comment
Conflicts: tests/python/gpu/test_operator_gpu.py
* Support K-dimensional row-sparse tensor (#7179)
* remove check for k dimensional rowsparse tensor
* change var name for rsp sgd operator
* add checks for sparse dot
* bug fix for kdim rowsparse cast storage cpu
* update IdentityLikeRhsComputeEx interface
* remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor
* use get_with_shape instead of reshape
* update according to comments
Conflicts: src/operator/tensor/elemwise_unary_op.h
* Improve sparse ndarray error message (#7181)
* add test for broadcast_to
* add comments
Conflicts: python/mxnet/base.py
* construct row_sparse ndarray for dist-async
fix bug in rsp add
rsp sync push
race condition for push
fix bug in rsp pull. refactor test
cleanup comments
refactor dist server
fix lint
fix storage shape issue with the new ndarray constructor
data sharding draft; fix lint. add comment
add support for zeros gradients
use std::upper_bound/lower_bound
remove special init function for rowsparse dist kvstore
temporary support for inplace operators for sparse
add test.
fix return type
store kRowSparseNDArray in kv server
remove fcomp_ex sgd with dns weight and rsp gradient
bug fix in sparse retain
sparse pull c_api
revise rowsparse pull api
use engine to compute unique to ensure thread safety
add rowsparse pull to dist-kv
fix lint
add example for rsp_pull
remove name2idx; add sparse_pull_dict param to module
fix unit test and c rowid conversion
support str key type in kvstore (#6765)
* update kvstore unit test
* update model/module.py
* fix lint
* remove int keys in kvstore
* update cast to str function
* remove _cast_to_str_keys
* fix lint
* always cast to str
Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py
update module API for other submodules
update stypes in kvstore after refactoring
change type of size from size_t to int64_t
add sparse linear regression example
remove sparse_pull_dict from module
fix init_optim for seq_module. update sparse example
resolve conflict for binary add rsp rsp
Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py
* fix DotCsrRspRspImpl error message (#7191)
* GPU implementation of cast_storage (dense to csr) (#7081)
* Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels.
* fixed whitespace
* minor unittest update
* removed whitespace
* add cast storage benchmark params info
Conflicts: tests/python/gpu/test_operator_gpu.py
* Sparse square sum (#7206)
* Add square_sum op
* Add unit test and fix check_numeric_gradient
* Add .cu file and example
* Fix lint
* Remove gpu registration
* Use square_sum in test_module_fm
* Modify and add documentation for mx.nd.zeros (#7197)
* Change context to cpu
* Change stype to optional
* Change ordering and remove optional for _zeros_sparse_ndarray
* Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133)
* expose kWriteInplace to FComputeEx and FStatefulComputeEx
* refactor code
* remove duplicated test
* Operator add_n for row sparse ndarrays (#7244)
* Add add_n op for row-sparse ndarrays and identity FComputeEx
* Fix bug in square_sum
* Remove test_cast_storage_ex from gpu test since it's not implemented yet
* Fix according to the cr
Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py
resolve conflict
* GPU implementation of cast_storage (dense to rsp) (#7223)
* CastStorageDnsRsp GPU Implementation
* updating function doc and some variable types and names
* adding cuda_get_device_prop() util function
* added rand_shape function for n-dimensional tensors
* updated cast storage unit test
* added dns_to_rsp to cast storage benchmark script
* removing redundant unit test
* fix lint
* minor change in benchmark script
* fix lint
* correct function description
* change storage_type to stype
* changed scope of using namespaces
* changed variable types from index_t to dim_t
* resolve merge conflict in ndarray.load
* Improve StatefulOp/FCompute storage fallback (#134)
* test for fcomp fallback
add storage fallback test and optimize fallback logic
rename function, add comments
use std size()
* add autograd test with sparse inputs
* update sparse ndarray api (#139)
* support mx.nd.empty for sparse ndarray
Change SparseNDArray to BaseSparseNDArray
support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays
Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py
* fix print msg in test
* Handle ograd_stype='row_sparse' for square_sum backward (#143)
* Add one kernel for square_sum backward pass to take rsp ograd
* Add kNullOp and change to use type_assign in infer stype fallback
* Sparse retain improvement (#138)
* Add one more kernel for sparse retain
* Fix compile
* Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback
* Fix
* Add gpu compile
* ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135)
* add bias term to fm test (#145)
* update ndarray.nd, remove `invoke` from excluded members (#137)
remove __weakref__ from SparseNDArray
add data indices to doc
revert dlpack update
revert mxdoc changes
move methods from BaseSparseNDArray to csr ndarray and row sparse ndarray
* support storage fallback with mutable inputs (#147)
* include mutable inputs in storage fallback. refactor executor
add fallback test for rms prop and adam
fix lint
fix lint
fix test in optimizer
* update according to comments
* fix unit tests
* fix gpu compilation err
* Code changes based on reviews (#144)
* code changes according to review comments
remove executor debug. add doc to optimizer
update sparse sgd test
add dtype option to rand_sparse_ndarray
* overhauled reqs for sparse operators
* patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases
* change executor debug macro to env var
* add comment
* update doc
* change ndarray.aux_shape() to return const reference
* remove todense to_rsp to_csr. replace with tostype
* replace manual calls to cast_storage with tostype
* disable gpu fallback test for optimizer
* fix lint
* add backward pass for cast_storage. refactor cast_storage test
* rand_sparse_ndarray bug fix
* fix cast_storage for gpu
* disable csr test for fp16
* update row sparse ndarray doc
* update doc
* small edits according to reviews (#151)
* fix lint (#152)
* add license to all new files in sparse branch (#154)
* Allocate temp data on the fly for some casting operations (#149)
* fix utf8 encoding in sparse ndarray
* Extending the GPU dot operator (#7226)
* Added GPU DotCsrRspDnsImpl declaration and TODOs
* cleaning up function doc, variable types, and code-style
* minor bug fixes
* enable GPU dot(csr,rsp)=dns unit test
* extend sparse dot unit test
* adding GPU impl of DotCsrRspDns and its kernels
* add TODO
* changed variable types from index_t to dim_t
* fix function description
* added DotCsrRspRspImpl and its kernels (baseline, functionality)
* added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation
* refactored dot benchmark
* optimized DotCsrTransDnsRsp GPU kernel
* change of dot impl interface to include OpContext, for temp storage
* removing __device__ flag from CPU kernels
* minor fixes and changing variable data types
* minor fixes based on code reviews
Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py
* Add get_synthetic_dataset function to util (#146)
* Add get_synthetic_datasets
* Move to test_utils
* Remove _get_uniform_dataset
* Move validation to its own function
* Refactor the validation code for csr generation
* Make test_powerlaw a nested function
* Change SparseNDArray to CSRNDArray
* Merge with dtype specific changes in test_utils
* temporary fix for batch norm storage fallback (#156)
* support random_uniform/normal/gamma with row_sparse output (#155)
* add support for initializer with rowsparse output
* add scalar assignment to row_sparse
* add setitem test to gpu
* Revert "add scalar assignment to row_sparse"
This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30.
* Revert "add setitem test to gpu"
This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed.
* Square sum backward support one more case (#161)
* Add documentation for sparse ops (#148)
* draft doc for sparse op
* add more stype doc for operators
* add doc for cast_storage
* see also cast_storage. remove base sparse ndarray. fix aux_types comment
* grammar / spelling fix
* A few fixes (#163)
* fix batch norm gpu kernel. register random operators on gpu
* register sparse random op on gpu, too
* Minor fixes sparse ops (#160)
* change CPU kernel inline directives, data types, and function doc
* update dot dtype switch to use 32 and 64bit floating point only
* use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK
* added tensor_util-inl.cuh file for common tensor operator GPU kernels
* sparse Adam optimizer (#164)
* add sparse adam
* register gpu op
* add comments
* cr comments
* kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. multi-GPUs (#150)
* Add gpu support for BroadcastRowSparse
* Fix bugs
* Add benchmark script
* Increase output dim size
* Update weight on CPU using single GPU for sparse tensors
* More fix
* Optimize sparse_retain for special case
* Change row sparse pull locations
* Avoid sparse retain on cpu if possible
* Use acc for metric
* Fix misc
* fix a bug in adam update (#167)
* change sparse example from regression to classification (#165)
* fix python import (#166)
* Add waitall to sparse_end2end.py (#169)
* Add waitall()
* Add dummy metric option
* Add header license
* Dot script changes (#159)
* Add get_synthetic_datasets
* Move to test_utils
* Remove _get_uniform_dataset
* Move validation to its own function
* Refactor the validation code for csr generation
* Make test_powerlaw a nested function
* Change SparseNDArray to CSRNDArray
* Refactoring changes to dot.py
* Fix mxnet test_utils changes
* Remove pdb statement
* Add distribution parameter
* Refactor benchmarking script
* Remove unused code
* Make style changes and remove unused code
* Change typo in comment
* Add transpose support
* Change typo
* 4 decimal points needed for density
* Add rsp support for real datasets
* Correct variable name mini_file_name
* Move wait_to_read outside if
* Separate out scipy and mxnet logic in bench_dot
* Fix lhs_trans issue
* Move transpose outside measure_cost
* Compute transpose inside measure_cost
* Remove unused variables
* Transpose only if trans_lhs (#171)
* fix default val for distribution (#172)
* fix lint (#175)
* avoid cast_storage in dist-kvstore-server (#174)
* add stream arg to mshadow::copy
* fix copy order
* Add sparse namespace to ndarray and symbol (#177)
* Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse
* Add sparse to symbol namespace
* Delete commented code
* mv sparse_ndarray.py sparse.py
* Clean up
* Change docstring
* changes based on code reviews (#176)
* remove scipy dependency
* move kvstore checks to backend
* add const to lambda
* temp fix to ndarray.md (#178)
* Fix sparse namespace pylint (#179)
* add comments and error msg (#181)
* add clarification for csr (#182)
* cr comments
* revert change in test util (#183)
* fix amalgamation (#184)
* fix lint
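A recurring theme in the log above is the `dot(csr, dns)=dns` family of kernels. As a rough illustration only (a pure-Python sketch with hypothetical names, not MXNet's implementation), a CSR matrix stores its non-zeros in `values`, their column ids in `indices`, and per-row boundaries in `indptr`, so a CSR-times-dense product visits each stored element exactly once:

```python
def csr_dot_dense(indptr, indices, values, dense, num_cols_out):
    """Multiply a CSR matrix by a dense matrix (given as a list of rows)."""
    num_rows = len(indptr) - 1
    out = [[0.0] * num_cols_out for _ in range(num_rows)]
    for i in range(num_rows):
        # non-zeros of row i live in positions indptr[i] .. indptr[i+1]-1
        for j in range(indptr[i], indptr[i + 1]):
            col, val = indices[j], values[j]
            for k in range(num_cols_out):
                out[i][k] += val * dense[col][k]
    return out

# [[1, 0], [0, 2]] in CSR form:
indptr, indices, values = [0, 1, 2], [0, 1], [1.0, 2.0]
dense = [[1.0, 2.0], [3.0, 4.0]]
print(csr_dot_dense(indptr, indices, values, dense, 2))  # [[1.0, 2.0], [6.0, 8.0]]
```

The GPU work in #6937 and #7226 parallelizes exactly this row-wise traversal; the sketch only shows why `indptr`/`indices`/`values` (the "aux data" mentioned throughout) are sufficient to run the product without materializing zeros.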
2017-08-22 14:56:33 -07:00
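The sgd(rsp, rsp) and sparse Adam commits above rely on the row_sparse ("rsp") format: only the rows listed in an index array are stored, so an optimizer step touches just those rows. A minimal pure-Python sketch of that idea (hypothetical names, not the MXNet kernels):

```python
def sparse_sgd_update(weight, grad_indices, grad_values, lr):
    """In-place SGD on only the rows present in a row-sparse gradient."""
    for row, grad_row in zip(grad_indices, grad_values):
        weight[row] = [w - lr * g for w, g in zip(weight[row], grad_row)]
    return weight

weight = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
# gradient is non-zero only for row 2, so rows 0 and 1 are never read
sparse_sgd_update(weight, [2], [[1.0, 2.0]], lr=0.1)
print(weight[2])
```

This is also why kvstore.row_sparse_pull pays off in the end-to-end benchmark: a worker only needs the weight rows whose gradients are non-zero.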
        cost_dense = measure_cost(num_repeat, False, False, mx.nd.dot,
                                  dns_data, weight, transpose_a=transpose)
        total_cost["sparse"] += cost_sparse
        total_cost["dense"] += cost_dense
        count += 1
    average_cost["sparse"] = total_cost["sparse"] / count
    average_cost["dense"] = total_cost["dense"] / count
    return (average_cost["sparse"], average_cost["dense"])

def print_result(average_cost_sparse, average_cost_dense):
    """Print result of comparison between sparse and dense"""
    ratio = average_cost_dense / average_cost_sparse
    fmt = '{:15.4f} {:10d} {:10d} {:10d} {:20.2f} {:15.2f} {:15.2f} {:10} {:10}'
    print(fmt.format(density * 100, batch_size, output_dim, feature_dim,
                     ratio, average_cost_dense * 1000, average_cost_sparse * 1000,
                     transpose, rsp))

mini_path = os.path.join(data_dir, mini_file_name)
path = os.path.join(data_dir, file_name)
create_mini_path(mini_path, path, num_batches)
average_cost_sparse, average_cost_dense = run_benchmark(mini_path)
print_result(average_cost_sparse, average_cost_dense)

def test_dot_real(data_dict):
    """Dot operator testing with real datasets"""
    data_dir = os.path.join(os.getcwd(), 'data')
    path = os.path.join(data_dir, data_dict['data_name'])
    if not os.path.exists(path):
        get_bz2_data(
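The fragment above calls `measure_cost(...)`, which is defined elsewhere in the benchmark script (its real signature carries extra flags for scipy/transpose handling). A minimal stand-in, an assumption rather than the original implementation, just averages wall-clock time over repeated calls:

```python
import time

def measure_cost(repeat, f, *args, **kwargs):
    """Return the average wall-clock seconds of one call to f(*args, **kwargs)."""
    start = time.time()
    for _ in range(repeat):
        f(*args, **kwargs)
    return (time.time() - start) / repeat

cost = measure_cost(10, sum, range(1000))
print('avg seconds per call: %.6f' % cost)
```

The sparse/dense ratio printed by print_result is then simply the quotient of two such averages for the same dot-product workload.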
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
        data_dir,
        data_dict['data_name'],
        data_dict['url'],
        data_dict['data_origin_name']
    )
    assert os.path.exists(path)
    k = data_dict['feature_dim']
    m = data_dict['m']
    batch_size_list = data_dict['batch_size']
    default_output_index = data_dict['default_index']['output_dim']
    default_batch_size_index = data_dict['default_index']['batch_size']
    density = estimate_density(path, data_dict['feature_dim'])
    num_batches = data_dict['num_batches']
    assert default_batch_size_index < len(batch_size_list)
    assert default_output_index < len(m)
    if ARGS.verbose:
        print(f"Benchmarking on {repr(data_dict['data_mini'])} data")
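The hunk above calls an `estimate_density` helper whose definition falls outside this hunk. A hypothetical reconstruction, assuming one libsvm-format record (`label idx:val idx:val ...`) per line and sampling a bounded number of rows, might look like:

```python
def estimate_density(path, feature_dim, sample_lines=1000):
    """Estimate nnz / (rows * feature_dim) for a libsvm file by sampling lines.

    Sketch only: the name and sampling strategy are assumptions, not the
    benchmark script's actual implementation.
    """
    total_nnz = 0
    rows = 0
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            total_nnz += len(fields) - 1  # first field is the label
            rows += 1
            if rows >= sample_lines:
                break
    if rows == 0:
        return 0.0
    return total_nnz / (rows * feature_dim)
```

Sampling only the first `sample_lines` rows keeps the estimate cheap for the multi-gigabyte datasets the benchmark downloads, at the cost of assuming density is roughly uniform across the file.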
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor ccode * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that is used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indice to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDarray to csrndarray and rwosparse ndarray * support storage fallback with mutable inputs (#147) * include mutatable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse brnach (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initilazer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comemtn * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
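Much of the log above concerns the CSR storage type: `cast_storage` from dense to csr and operators such as `dot(csr, dns) = dns`. As a plain-Python illustration of the `data`/`indices`/`indptr` layout those operators manipulate (a sketch, not MXNet code — the function names here are invented for illustration):

```python
def dense_to_csr(mat):
    """Cast a dense 2-D list to CSR triples (data, indices, indptr),
    the layout referred to by cast_storage(dns -> csr) in the log above."""
    data, indices, indptr = [], [], [0]
    for row in mat:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        indptr.append(len(data))  # running count of nonzeros closes each row
    return data, indices, indptr

def csr_dot_dense(data, indices, indptr, rhs):
    """dot(csr, dns) = dns: multiply a CSR matrix by a dense matrix,
    visiting only the stored nonzeros of each row."""
    n_cols = len(rhs[0])
    out = []
    for r in range(len(indptr) - 1):
        acc = [0] * n_cols
        for p in range(indptr[r], indptr[r + 1]):
            v, j = data[p], indices[p]
            for c in range(n_cols):
                acc[c] += v * rhs[j][c]
        out.append(acc)
    return out
```

Because the inner loop touches only stored entries, the cost of `csr_dot_dense` scales with nnz rather than with the full dense shape — the property the benchmark script measures against a dense baseline.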
    print('{:>15} {:>10} {:>10} {:>10} {:>20} {:>15} {:>15} {:>10} {:>10}'.format(
        'density(%)', 'n', 'm', 'k', 't_dense/t_sparse',
        't_dense(ms)', 't_sparse(ms)', 'is_transpose', 'rhs_rsp'))
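The `t_dense(ms)` and `t_sparse(ms)` columns printed above are filled in by timing repeated operator calls; the log mentions a `measure_cost` helper (and debates whether the transpose belongs inside or outside it). A minimal sketch of such a helper — assumed shape, not the script's exact code, and omitting the `mx.nd.waitall()` the real benchmark needs to flush MXNet's async engine:

```python
import time

def measure_cost(repeat, f, *args, **kwargs):
    """Return average wall-clock seconds per call of f over `repeat` runs."""
    f(*args, **kwargs)  # warm up once so one-time setup is not measured
    start = time.time()
    for _ in range(repeat):
        f(*args, **kwargs)
    return (time.time() - start) / repeat
```

Whether work like a transpose sits inside or outside the timed region changes what the ratio column compares, which is why the log revisits that choice twice.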
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor ccode * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that is used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indice to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDarray to csrndarray and rwosparse ndarray * support storage fallback with mutable inputs (#147) * include mutatable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse brnach (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initilazer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comemtn * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
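The commit log above centers on CSR (compressed sparse row) storage, the format the benchmark code below feeds to the sparse dot operator. As a small illustration of the CSR layout, sketched with `scipy.sparse` (which the script already uses as its dense-dot baseline) rather than MXNet's own `CSRNDArray`:

```python
import numpy as np
from scipy import sparse

# A 3x4 matrix with 4 non-zeros. CSR stores three arrays:
#   data    - the non-zero values, row by row
#   indices - the column index of each value
#   indptr  - for each row i, its values live in data[indptr[i]:indptr[i+1]]
dense = np.array([[1., 0., 2., 0.],
                  [0., 0., 0., 0.],
                  [0., 3., 0., 4.]])
csr = sparse.csr_matrix(dense)

print(csr.data)     # [1. 2. 3. 4.]
print(csr.indices)  # [0 2 1 3]
print(csr.indptr)   # [0 2 2 4]
```

The empty middle row shows up only as a repeated entry in `indptr` (`indptr[1] == indptr[2]`), which is why CSR is compact for the power-law row distributions the benchmark generates.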
for output_dim in m:
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, output_dim, density,
                          batch_size_list[default_batch_size_index], num_batches)
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, output_dim, density,
                          batch_size_list[default_batch_size_index], num_batches,
                          transpose=True)
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, output_dim, density,
                          batch_size_list[default_batch_size_index], num_batches, rsp=True)
for batch_size in batch_size_list:
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, m[default_output_index], density, batch_size, num_batches)
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, m[default_output_index], density, batch_size, num_batches,
                          transpose=True)
    _compare_sparse_dense(data_dir, data_dict['data_name'], data_dict['data_mini'],
                          k, m[default_output_index], density, batch_size, num_batches,
                          rsp=True)
def test_dot_synthetic(data_dict):
    """Benchmark sparse MXNet dot and SciPy dot operators with matrices of a given density.
    `t_sparse` is the runtime of the invoked sparse dot operator in ms, while `t_dense` is the
    runtime of dot(dns, dns) with the same matrices, except that they are in default storage type.
    """
    # Benchmark MXNet's and SciPy's dot operators
    def bench_dot(lhs_shape, rhs_shape, lhs_stype, rhs_stype,
                  lhs_den, rhs_den, trans_lhs, ctx, num_repeat=10,
                  fw="mxnet", distribution="uniform"):
        set_default_device(ctx)
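`bench_dot` times the chosen operator over `num_repeat` runs and reports the average in ms. A minimal sketch of that measurement idea, using only NumPy/SciPy (the `fw="scipy"` baseline path); `measure_cost_ms` and the shapes here are illustrative stand-ins, not the script's own helpers:

```python
import time
import numpy as np
from scipy import sparse

def measure_cost_ms(repeat, f, *args):
    """Average wall-clock time of f(*args) over `repeat` runs, in milliseconds."""
    start = time.time()
    for _ in range(repeat):
        f(*args)
    return (time.time() - start) * 1000.0 / repeat

lhs = sparse.random(256, 512, density=0.05, format='csr')  # sparse lhs
rhs = np.random.rand(512, 64)                              # dense rhs

t_sparse = measure_cost_ms(10, lhs.dot, rhs)                   # dot(csr, dns)
t_dense = measure_cost_ms(10, np.dot, lhs.toarray(), rhs)      # dot(dns, dns) baseline
print('t_sparse=%.3f ms  t_dense=%.3f ms' % (t_sparse, t_dense))
```

The MXNet path differs mainly in calling `waitall()` before stopping the clock, since its operators execute asynchronously.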
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm unit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unit test and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, the cast_storage interface changed to accommodate the need for temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor code * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the code review Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indices to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to csrndarray and rowsparse ndarray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * code review comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * code review comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
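The change log above centers on two sparse storage types, `csr` (compressed sparse row) and `row_sparse`, plus conversions to dense (`default`) storage via `cast_storage`/`tostype`. Since the commits use SciPy as the reference baseline ("Add scipy as dot baseline", "add scipy pip install for dockerfile"), the layouts can be sketched with NumPy and SciPy alone; MXNet's `CSRNDArray` keeps the same three arrays (`data`, `indices`, `indptr`) as its aux data, but no MXNet-specific API is assumed here.

```python
import numpy as np
import scipy.sparse as sp

# A small dense matrix with mostly zero entries and one all-zero row.
dense = np.array([[0., 0., 1.],
                  [0., 0., 0.],
                  [2., 3., 0.]])

# CSR stores three arrays, mirroring the aux data of a CSRNDArray:
#   data    - the non-zero values, row by row
#   indices - the column index of each stored value
#   indptr  - data[indptr[i]:indptr[i + 1]] holds row i
csr = sp.csr_matrix(dense)

# dot(csr, dns) = dns, the case several commits above implement and optimize.
rhs = np.ones((3, 2))
out = csr.dot(rhs)

# Round-tripping back to dense corresponds to cast_storage / tostype('default').
assert np.array_equal(csr.toarray(), dense)

# row_sparse keeps whole non-zero rows plus their row indices - the format
# the sgd/kvstore commits use for gradients that touch only a few rows.
row_idx = np.flatnonzero(dense.any(axis=1))  # indices of non-empty rows
rsp_data = dense[row_idx]                    # only those rows are stored
```

The `row_sparse` half of the sketch is why operations like `sparse_retain` and `kvstore.row_sparse_pull` in the log take a list of row ids: only the stored rows need to be touched or transferred.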
2017-08-22 14:56:33 -07:00
assert fw == "mxnet" or fw == "scipy"
# Set funcs
dot_func_sparse = mx.nd.sparse.dot if fw == "mxnet" else sp.spmatrix.dot
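The excerpt above is from the dot benchmark script the log mentions ("Add scipy as dot baseline", "Refactoring changes to dot.py"); `fw`, `mx`, and `sp` are defined in the surrounding file. Below is a self-contained sketch of the same framework dispatch; the helper name `get_dot_funcs` and the stubbed `mxnet` branch are assumptions of this sketch, and only the SciPy baseline is exercised since MXNet is not assumed installed.

```python
import numpy as np
import scipy.sparse as sp

def get_dot_funcs(fw):
    """Pick (dense, sparse) dot implementations for a framework name,
    mirroring the dispatch in the excerpt above (hypothetical helper)."""
    assert fw in ("mxnet", "scipy")
    if fw == "scipy":
        # csr_matrix.dot plays the role of sp.spmatrix.dot in the excerpt.
        return np.dot, sp.csr_matrix.dot
    # In the benchmark itself this branch returns mx.nd.dot / mx.nd.sparse.dot.
    raise NotImplementedError("mxnet is not assumed available in this sketch")

dot_dense, dot_sparse = get_dot_funcs("scipy")

# Benchmark-style inputs: a random CSR lhs against a dense rhs.
lhs = sp.random(4, 5, density=0.5, format="csr", random_state=0)
rhs = np.ones((5, 2))
result = dot_sparse(lhs, rhs)  # dot(csr, dns) = dns
```

Passing the class-level `dot` rather than a bound method is what lets the script swap frameworks with a single variable, which is the point of the `dot_func_sparse` line in the excerpt.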
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comemtn * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
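Several of the changes listed above concern `cast_storage`, which converts ndarrays between storage types. Conceptually, the dense-to-CSR direction compacts each row's nonzeros into `data`/`indices`/`indptr` arrays. A hedged pure-numpy sketch of that idea (illustrative only, not mxnet's kernel):

```python
import numpy as np

def dense_to_csr(dense):
    """Conceptual dense -> CSR conversion: returns (data, indices, indptr),
    where indptr[i]:indptr[i+1] delimits the nonzeros of row i."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        nz = np.nonzero(row)[0]          # column indices of nonzeros in this row
        indices.extend(nz.tolist())
        data.extend(row[nz].tolist())
        indptr.append(len(indices))      # running count of nonzeros so far
    return np.array(data), np.array(indices), np.array(indptr)
```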
dot_func_dense = mx.nd.dot if fw == "mxnet" else np.dot
# Create matrix instances
lhs_nd = rand_ndarray(lhs_shape, lhs_stype, density=lhs_den, distribution=distribution)
# only the uniform distribution is supported for a non-csr rhs
if rhs_stype == 'csr':
    rhs_nd = rand_ndarray(rhs_shape, rhs_stype, density=rhs_den, distribution=distribution)
else:
    rhs_nd = rand_ndarray(rhs_shape, rhs_stype, density=rhs_den, distribution="uniform")
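`rand_ndarray` comes from `mxnet.test_utils` and returns an ndarray of the requested storage type and density. As a rough illustration of what the `density` parameter means for the uniform case, here is a hypothetical scipy analogue (the helper name and exact sampling are assumptions, not mxnet's implementation):

```python
import numpy as np
import scipy.sparse as sp

def rand_csr_uniform(shape, density, rng=None):
    """Hypothetical analogue of rand_ndarray(shape, 'csr', density=...,
    distribution='uniform'): each entry is nonzero with probability
    `density`, with values drawn uniformly from [0, 1)."""
    rng = rng or np.random.RandomState(0)
    mask = rng.rand(*shape) < density    # pick which entries are nonzero
    dense = rng.rand(*shape) * mask      # fill them with uniform values
    return sp.csr_matrix(dense)
```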
lhs_dns = None
rhs_dns = None
dense_cost = None
sparse_cost = None
if fw == "mxnet":
    lhs_dns = lhs_nd if lhs_stype == 'default' else lhs_nd.tostype('default')
    rhs_dns = rhs_nd if rhs_stype == 'default' else rhs_nd.tostype('default')
    # One warm up run, verify correctness
    out = dot_func_sparse(lhs_nd, rhs_dns, trans_lhs)
    out_expected = dot_func_dense(lhs_dns, rhs_dns, trans_lhs)
    assert_almost_equal(out.asnumpy(), out_expected.asnumpy(), rtol=1e-1, atol=1e-1)
    sparse_cost = measure_cost(num_repeat, False, False, dot_func_sparse, lhs_nd, rhs_nd, trans_lhs)
    dense_cost = measure_cost(num_repeat, False, False, dot_func_dense, lhs_dns, rhs_dns, trans_lhs)
else:
    lhs_dns = lhs_nd.asnumpy()
    rhs_dns = rhs_nd.asnumpy()
    lhs_nd = sp.csr_matrix(lhs_nd.asnumpy())
    rhs_nd = rhs_nd.asnumpy()
    # One warm up run, verify correctness
    lhs_nd_copy = sp.spmatrix.transpose(lhs_nd) if trans_lhs else lhs_nd
    out = dot_func_sparse(lhs_nd_copy, rhs_dns)
    sparse_cost = measure_cost(num_repeat, trans_lhs, False, dot_func_sparse, lhs_nd, rhs_nd)
    dense_cost = measure_cost(num_repeat, trans_lhs, True, dot_func_dense, lhs_dns, rhs_dns)
speedup = dense_cost / sparse_cost
# Print results
m = lhs_shape[0]
k = lhs_shape[1]
n = rhs_shape[1]
result_pattern = '{:15.1f} {:15.1f} {:>10} {:8d} {:8d} {:8d} {:13.2f} {:13.2f} {:8.2f}'
results = result_pattern.format(lhs_den * 100,
                                rhs_den * 100,
                                str(ctx),
                                m,
                                k,
                                n,
                                sparse_cost * 1000,
                                dense_cost * 1000,
                                speedup)
print(results)
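The timing calls above go through a `measure_cost` helper that is defined elsewhere in this script and not shown here. A minimal sketch of such a helper, under assumptions read off the call sites (repeat count first, a transpose flag for the scipy path, then the function and its arguments); the real helper also distinguishes dense from sparse transposes and should call `mx.nd.waitall()` before reading the clock so MXNet's asynchronous execution does not skew the measurement:

```python
import time


def measure_cost(repeat, trans_lhs, func, *args, **kwargs):
    """Return the average wall-clock seconds per call of func over `repeat` runs."""
    args_list = list(args)
    if trans_lhs:
        # Transpose the lhs operand once, outside the timed loop,
        # so only the dot product itself is measured.
        args_list[0] = args_list[0].transpose()
    start = time.time()
    for _ in range(repeat):
        func(*args_list, **kwargs)
    return (time.time() - start) / repeat
```

Dividing by `repeat` amortizes per-call overhead; warming up once before timing (as the code above does) keeps lazy allocation out of the measurement.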

def print_benchmark_info(lhs, rhs, lhs_trans, fw):
    trans_str = "^T" if lhs_trans else ""
    print("========================================================")
    print(f" {fw} sparse dot benchmark: dot({lhs}, {rhs}) = {rhs} ")
    print(f" (matrix multiplication: (m x k){trans_str} * (k x n) = m x n) ")
    print("========================================================")
    headline_pattern = '{:>15} {:>15} {:>10} {:>8} {:>8} {:>8} {:>13} {:>13} {:>8}'
    headline = headline_pattern.format('lhs_density(%)',
                                       'rhs_density(%)',
                                       'context',
                                       'm', 'k', 'n',
                                       't_sparse(ms)',
                                       't_dense(ms)',
                                       'speedup')
    print(headline)
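The headline pattern here is meant to line up column-for-column with the `result_pattern` used when printing each measurement (both have nine fields with matching widths 15, 15, 10, 8, 8, 8, 13, 13, 8). A standalone snippet with made-up numbers illustrating the alignment:

```python
headline_pattern = '{:>15} {:>15} {:>10} {:>8} {:>8} {:>8} {:>13} {:>13} {:>8}'
result_pattern = '{:15.1f} {:15.1f} {:>10} {:8d} {:8d} {:8d} {:13.2f} {:13.2f} {:8.2f}'

# Hypothetical measurement: a 1%-dense csr lhs times a fully dense rhs on cpu.
headline = headline_pattern.format('lhs_density(%)', 'rhs_density(%)', 'context',
                                   'm', 'k', 'n', 't_sparse(ms)', 't_dense(ms)', 'speedup')
row = result_pattern.format(1.0, 100.0, 'cpu(0)', 128, 1000000, 256, 12.34, 456.78, 37.02)
print(headline)
print(row)
```

Numeric fields right-align by default in `str.format`, so the row values sit flush under the right-aligned (`>`) headline labels.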

def run_benchmark(ctx=None, lhs="csr", lhs_trans=False, rhs="dns", fw="mxnet", rhs_density=1,
                  distribution="uniform"):
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submodule (#95) * Revert "update mshadow submodule (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3.
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support resolve race condition in kvstore * resolve error after rebase * fix lint (#113) * rename some python function (#114) * _to_rsp * _to_csr.
raise NotImplementedError * todense * fix lint (#115) enable libsvm unit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unittest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529.
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test.
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor code * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indices to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to csrndarray and rowsparse ndarray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr.
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs.
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
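The log above circles repeatedly around the CSR layout (values, `indices`, `indptr`) and the `dot(csr, dns)` operator. As a minimal, pure-Python sketch of what that storage format encodes (illustrative only; the function name is hypothetical, and mxnet's actual operator is a C++ kernel with OpenMP/GPU variants):

```python
def csr_dot_dense(data, indices, indptr, num_rows, dense_vec):
    """Multiply a CSR matrix by a dense vector: out[i] = sum_j A[i, j] * v[j].

    Standard CSR layout: row i's nonzero values live in
    data[indptr[i]:indptr[i + 1]], with their column ids in indices.
    """
    out = [0.0] * num_rows
    for i in range(num_rows):
        for k in range(indptr[i], indptr[i + 1]):
            out[i] += data[k] * dense_vec[indices[k]]
    return out

# A = [[1, 0, 2],
#      [0, 0, 3]]
data, indices, indptr = [1.0, 2.0, 3.0], [0, 2, 2], [0, 2, 3]
print(csr_dot_dense(data, indices, indptr, 2, [1.0, 1.0, 1.0]))  # [3.0, 3.0]
```

The `indptr` dtype change to int64 mentioned in the log (#101) matters precisely because `indptr` indexes into `data`, whose length is the total nonzero count and can exceed the int32 range.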
2017-08-22 14:56:33 -07:00
if rhs_density > 1 or rhs_density < 0:
    raise ValueError("rhs_density has to be between 0 and 1")
print_benchmark_info(lhs, rhs, lhs_trans, fw)
if rhs == "csr":
lhs_stype = "default"
rhs_stype = "csr"
assert (lhs_stype == 'default'), "Only dot(default, csr) supported"
# Arrange dimensions according to use case: for the csr case below, num_rows << num_cols,
# so the dimension lists are remapped accordingly
feature_dim_list = data_dict['batch_size']
batch_size_list = data_dict['m']
output_dim_list = data_dict['feature_dim']
density_list = data_dict['density']
default_output_index = data_dict['default_index']['feature_dim']
default_density_index = data_dict['default_index']['density']
default_feature_index = data_dict['default_index']['batch_size']
default_batch_size_index = data_dict['default_index']['output_dim']
num_repeat = data_dict['num_repeat']
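The fragment above only selects the dimension and density lists; the timing loop they feed follows a warm-up-then-average pattern. A minimal stdlib-only sketch of that pattern (this `measure_cost` is an illustrative reimplementation, not the script's exact helper; the real benchmark additionally forces MXNet's lazy results, e.g. via `wait_to_read()`, before stopping the clock):

```python
import time

def measure_cost(repeat, func, *args, **kwargs):
    """Return average wall-clock seconds per call over `repeat` runs."""
    func(*args, **kwargs)  # warm-up run, excluded from the measurement
    start = time.time()
    for _ in range(repeat):
        func(*args, **kwargs)
    return (time.time() - start) / repeat

# usage sketch: time a CPU-side baseline computation
avg = measure_cost(10, sum, range(100000))
print("%.6f sec/call" % avg)
```

Averaging over `num_repeat` calls after a warm-up smooths out one-time costs (allocation, kernel compilation) that would otherwise dominate a single-shot measurement.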
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submodule fix lint update unit test for sparse nd * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection (#48) * add runtime storage fallback detection * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more examples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submodule * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iter separate class for sparse iter add missing file fix mem corruption rename variables add comments also read label from libsvm add test. update docs.
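The CSR slice work mentioned here (#36) exploits the layout directly: slicing rows [start, stop) only requires rebasing `indptr` and copying one contiguous span of `data`/`indices`, with no per-element scan. A pure-Python sketch under that assumption (hypothetical helper name, not the mxnet kernel):

```python
def csr_slice_rows(data, indices, indptr, start, stop):
    """Slice rows [start, stop) of a CSR matrix.

    Cuts indptr at the row boundaries, rebases it to start at 0, and copies
    the single contiguous span of data/indices those rows occupy.
    """
    lo, hi = indptr[start], indptr[stop]
    new_indptr = [p - lo for p in indptr[start:stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr

# A = [[1, 0],
#      [0, 2],
#      [3, 4]]  -> keep rows 1..2
data, indices, indptr = [1.0, 2.0, 3.0, 4.0], [0, 1, 0, 1], [0, 1, 2, 4]
print(csr_slice_rows(data, indices, indptr, 1, 3))
# ([2.0, 3.0, 4.0], [1, 0, 1], [0, 1, 3])
```

This contiguity is also why row slicing is cheap for CSR but column slicing is not, and why the RSP (row-sparse) format discussed throughout the log is organized around whole rows.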
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
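The `cast_storage` (dense to csr) operator above converts a dense matrix into the three CSR arrays; a CPU sketch in NumPy (the GPU version in the commit additionally needs temporary storage, hence the interface change; the helper name here is hypothetical):

```python
import numpy as np

def cast_dense_to_csr(dense):
    # Produce the CSR arrays: data (non-zero values), indices (their
    # column ids), and indptr (per-row offsets into data/indices).
    data, indices, indptr = [], [], [0]
    for row in dense:
        cols = np.flatnonzero(row)
        indices.extend(cols.tolist())
        data.extend(row[cols].tolist())
        indptr.append(len(indices))
    return data, indices, indptr
```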
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor ccode * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that is used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indice to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDarray to csrndarray and rwosparse ndarray * support storage fallback with mutable inputs (#147) * include mutatable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
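`sparse_retain`, improved above with an extra kernel (#138), keeps only the requested rows of a row_sparse tensor and drops the rest. A NumPy sketch (illustrative; not the operator's actual signature):

```python
import numpy as np

def sparse_retain(indices, values, to_retain):
    # Keep only the stored rows whose id appears in to_retain;
    # every other row of the result is implicitly zero.
    keep = np.isin(indices, to_retain)
    return indices[keep], values[keep]
```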
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse brnach (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initilazer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comemtn * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Seperate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow;;copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
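The sparse SGD and Adam optimizers threaded through these commits exploit the same idea: when the gradient is row_sparse, only its stored rows of the weight need updating. A minimal NumPy sketch of that lazy SGD update (illustrative; the real operators also handle momentum, weight decay, and storage-type fallback):

```python
import numpy as np

def sparse_sgd_update(weight, grad_indices, grad_values, lr=0.1):
    # Update in place only the rows present in the row_sparse gradient;
    # rows with an all-zero gradient slice are never touched.
    weight[grad_indices] -= lr * grad_values
    return weight
```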
2017-08-22 14:56:33 -07:00
else:
lhs_stype = "csr"
rhs_stype = "row_sparse" if rhs == "rsp" else "default"
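The fragment above is part of the benchmark's mapping from short argument names to storage types; its enclosing branch is elided in this excerpt. An illustrative, self-contained reconstruction of the same mapping (the dict and function names are hypothetical):

```python
# Illustrative reconstruction; the benchmark's actual control flow is elided.
STYPE_NAME = {"dns": "default", "csr": "csr", "rsp": "row_sparse"}

def to_stypes(lhs, rhs):
    # Map short benchmark argument names to MXNet storage-type strings.
    return STYPE_NAME[lhs], STYPE_NAME[rhs]
```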
scipy dependency * move kvstore checks to backned * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
feature_dim_list = data_dict['feature_dim']
output_dim_list = data_dict['m']
batch_size_list = data_dict['batch_size']
density_list = data_dict['density']
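The lines above read sweep lists out of a loaded benchmark config. As a sketch only, here is a hypothetical `data_dict` with the shape those reads assume — every value below is illustrative, not the project's actual defaults:

```python
# Hypothetical benchmark config in the shape the surrounding lines read from.
# Keys mirror the accesses in the script; the values are made up.
data_dict = {
    'feature_dim': [1000, 10000],     # input feature dimensions to sweep
    'm': [1, 32],                     # output dimensions
    'batch_size': [64, 128],
    'density': [0.01, 0.005],         # fraction of non-zero entries
    'default_index': {'output_dim': 0, 'batch_size': 0,
                      'feature_dim': 1, 'density': 0},
    'num_repeat': 3,                  # timing repetitions per configuration
}

feature_dim_list = data_dict['feature_dim']
output_dim_list = data_dict['m']
batch_size_list = data_dict['batch_size']
density_list = data_dict['density']
```

The `default_index` entries pick which element of each sweep list to hold fixed while another dimension is varied.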
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatibility * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1 (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submodule fix lint update unit test for sparse nd * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection (#48) * add runtime storage fallback detection * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more examples. * add hint to module.update * more test cases (fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submodule * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iter separate class for sparse iter add missing file fix mem corruption rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark script for dot (#59) * add benchmark script for dot add gpu option for bench add get_data function for benchmark print t_sparse, too; add comment change nnz to density add backward * add comment update fm test (#62) introduce CSRNDArray and RowSparseNDArray to python frontend api (#58) * introduce CSRNDArray and RowSparseNDArray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes remove explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submodule (#95) * Revert "update mshadow submodule (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support resolve race condition in kvstore * resolve error after rebase * fix lint (#113) * rename some python function (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm unit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unit test and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor code * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indices to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to CSRNDArray and RowSparseNDArray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
2017-08-22 14:56:33 -07:00
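The CPU CSR slice work (#36) in the log above reduces to copying the value/index range selected by `indptr` and rebasing the new `indptr` to zero. A minimal pure-Python sketch of that idea (not MXNet's implementation; `slice_csr_rows` is a hypothetical name):

```python
def slice_csr_rows(data, indices, indptr, start, stop):
    """Row-slice a CSR matrix: copy the indptr-selected range of values and
    column indices, and rebase the new indptr to start at zero."""
    lo, hi = indptr[start], indptr[stop]
    new_indptr = [p - lo for p in indptr[start:stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr

# The 3x3 matrix [[1,0,2],[0,0,3],[4,5,0]] in CSR form:
data, indices, indptr = [1, 2, 3, 4, 5], [0, 2, 2, 0, 1], [0, 2, 3, 5]
d, i, p = slice_csr_rows(data, indices, indptr, 1, 3)  # keep rows 1..2
# d == [3, 4, 5], i == [2, 0, 1], p == [0, 1, 3]
```

Because only `indptr` needs rewriting, a row slice never touches values outside the selected range, which is why no `seg_len` bookkeeping is needed.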
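Several commits above concern `dot(csr, dns) = dns` kernels. The core loop is the same on CPU and GPU: for each stored `(row, col, val)` triple, accumulate `val * dense[col, :]` into `out[row, :]`. A pure-Python sketch for illustration only (the real kernels are parallelized C++/CUDA):

```python
def dot_csr_dense(data, indices, indptr, dense, num_rows):
    """Sketch of dot(csr, dns) = dns: out[i, :] += val * dense[col, :]
    for every stored (i, col, val) of the CSR left-hand side."""
    cols_out = len(dense[0])
    out = [[0.0] * cols_out for _ in range(num_rows)]
    for i in range(num_rows):
        for k in range(indptr[i], indptr[i + 1]):
            val, col = data[k], indices[k]
            for j in range(cols_out):
                out[i][j] += val * dense[col][j]
    return out
```

For `dot(csr.T, dns)` the same triples scatter into `out[col, :]` instead, which is why the transposed kernels above need atomic or segmented accumulation on the GPU.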
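The `row_sparse` (rsp) storage type that the sgd/adam and kvstore commits above build on keeps only the rows that contain non-zeros, plus an index array naming those rows. A small illustration of the layout in plain Python (format illustration only, not MXNet code; both helper names are hypothetical):

```python
def dense_to_row_sparse(dense):
    """Keep only rows with at least one non-zero entry, plus their indices."""
    indices = [i for i, row in enumerate(dense) if any(v != 0 for v in row)]
    data = [dense[i] for i in indices]
    return data, indices

def row_sparse_to_dense(data, indices, shape):
    """Inverse: scatter stored rows back into a zero-filled dense matrix."""
    out = [[0] * shape[1] for _ in range(shape[0])]
    for row, i in zip(data, indices):
        out[i] = list(row)
    return out
```

This layout is what makes sparse-gradient updates cheap: an optimizer like the sparse Adam above only needs to visit the rows listed in `indices`.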
default_output_index = data_dict['default_index']['output_dim']
default_batch_size_index = data_dict['default_index']['batch_size']
default_feature_index = data_dict['default_index']['feature_dim']
default_density_index = data_dict['default_index']['density']
num_repeat = data_dict['num_repeat']
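The "Dot script changes" commits above move work in and out of a `measure_cost` helper; the `num_repeat` read here presumably feeds it. A minimal sketch of such a timing helper under that assumption (the signature is illustrative, not necessarily the script's actual one):

```python
import time

def measure_cost(num_repeat, f, *args, **kwargs):
    """Average wall-clock seconds per call of f over num_repeat runs.
    Keeping setup work (e.g. a transpose) outside this function, as the
    commits above do, keeps it out of the measured cost."""
    start = time.time()
    for _ in range(num_repeat):
        f(*args, **kwargs)
    return (time.time() - start) / num_repeat
```

Usage: `cost = measure_cost(num_repeat, some_dot_fn, lhs, rhs)`.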
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submodule fix lint update unit test for sparse nd * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection (#48) * add runtime storage fallback detection * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more examples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submodule * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark script for dot (#59) * add benchmark script for dot add gpu option for bench add get_data function for benchmark print t_sparse, too; add comment change nnz to density add backward * add comment update fm test (#62) introduce CSRNDArray and RowSparseNDArray to python frontend api (#58) * introduce CSRNDArray and RowSparseNDArray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes remove explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submodule (#95) * Revert "update mshadow submodule (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
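The CSR slice work referenced above (#36, #90) boils down to re-basing `indptr` and copying the spanned entries. A pure-numpy sketch of that idea (`slice_csr` is an illustrative helper, not the MXNet kernel):

```python
import numpy as np

def slice_csr(data, indices, indptr, begin, end):
    """Return the CSR pieces for rows [begin, end)."""
    lo, hi = indptr[begin], indptr[end]
    # re-base indptr so the first sliced row starts at offset 0
    new_indptr = indptr[begin:end + 1] - lo
    return data[lo:hi], indices[lo:hi], new_indptr

# rows 1..2 of the 3x3 matrix [[1,0,2],[0,0,3],[4,5,0]]
data = np.array([1., 2., 3., 4., 5.])
indices = np.array([0, 2, 2, 0, 1])
indptr = np.array([0, 2, 3, 5])
d, i, p = slice_csr(data, indices, indptr, 1, 3)
# d -> [3. 4. 5.], i -> [2 0 1], p -> [0 1 3]
```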
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support resolve race condition in kvstore * resolve error after rebase * fix lint (#113) * rename some python function (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm unit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unit test and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor ccode * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indice to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to csr ndarray and rowsparse ndarray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
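The other storage type that recurs throughout this log is row-sparse (RSP): only the non-zero rows are stored, together with their row indices. A hedged numpy sketch of what casting between dense and RSP means (the helper names are illustrative, not the MXNet `cast_storage`/`tostype` API):

```python
import numpy as np

def dense_to_rsp(dense):
    """Keep only the rows with at least one non-zero entry."""
    row_idx = np.flatnonzero((dense != 0).any(axis=1))
    return dense[row_idx], row_idx

def rsp_to_dense(values, row_idx, shape):
    """Scatter the stored rows back into a dense array."""
    out = np.zeros(shape)
    out[row_idx] = values
    return out

dense = np.array([[0., 0.], [1., 2.], [0., 0.], [3., 0.]])
vals, idx = dense_to_rsp(dense)
# vals -> [[1. 2.], [3. 0.]], idx -> [1 3]
assert np.array_equal(rsp_to_dense(vals, idx, dense.shape), dense)
```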
2017-08-22 14:56:33 -07:00
for output_dim in output_dim_list:
    if lhs_trans:
        output_row_dim = batch_size_list[default_batch_size_index]
    else:
        output_row_dim = feature_dim_list[default_feature_index]
    bench_dot((batch_size_list[default_batch_size_index],
               feature_dim_list[default_feature_index]),
              (output_row_dim, output_dim),
              lhs_stype, rhs_stype,
              density_list[default_density_index], rhs_density,
              lhs_trans, ctx, num_repeat=num_repeat,
              fw=fw, distribution=distribution)
for feature_dim in feature_dim_list:
    if lhs_trans:
        output_row_dim = batch_size_list[default_batch_size_index]
    else:
        output_row_dim = feature_dim
    bench_dot((batch_size_list[default_batch_size_index], feature_dim),
              (output_row_dim, output_dim_list[default_output_index]),
              lhs_stype, rhs_stype, density_list[default_density_index], rhs_density,
              lhs_trans, ctx, num_repeat=num_repeat, fw=fw, distribution=distribution)
for batch_size in batch_size_list:
    if lhs_trans:
        output_row_dim = batch_size
    else:
        output_row_dim = feature_dim_list[default_feature_index]
    bench_dot((batch_size, feature_dim_list[default_feature_index]),
              (output_row_dim, output_dim_list[default_output_index]),
              lhs_stype, rhs_stype, density_list[default_density_index],
              rhs_density, lhs_trans, ctx, num_repeat=num_repeat,
              fw=fw, distribution=distribution)
for density in density_list:
    if lhs_trans:
        output_row_dim = batch_size_list[default_batch_size_index]
    else:
        output_row_dim = feature_dim_list[default_feature_index]
    bench_dot((batch_size_list[default_batch_size_index],
               feature_dim_list[default_feature_index]),
              (output_row_dim, output_dim_list[default_output_index]),
              lhs_stype, rhs_stype, density, density, lhs_trans, ctx,
              num_repeat=num_repeat, fw=fw, distribution=distribution)
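`bench_dot` is defined elsewhere in the benchmark script. As a rough illustration of what such a helper measures, here is a hypothetical scipy-only stand-in; the name, signature, and timing scheme are assumptions for illustration, not the script's actual implementation:

```python
import time
import numpy as np
import scipy.sparse as sp

def bench_dot_sketch(lhs_shape, rhs_shape, density, num_repeat=3):
    """Time dot(csr_lhs, dense_rhs), averaged over num_repeat runs.

    Hypothetical stand-in for the script's bench_dot helper.
    """
    # random CSR lhs at the requested density, dense rhs
    lhs = sp.random(lhs_shape[0], lhs_shape[1], density=density, format='csr')
    rhs = np.random.rand(*rhs_shape)
    start = time.time()
    for _ in range(num_repeat):
        out = lhs.dot(rhs)
    return (time.time() - start) / num_repeat, out.shape

cost, shape = bench_dot_sketch((128, 512), (512, 64), density=0.05)
# shape -> (128, 64); cost is seconds per repeat
```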
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarray folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu test. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need for temporary storage in cuda kernels. 
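The cast_storage(dns -> csr) operation these commits implement compresses a dense matrix into the three CSR arrays: nonzero values, their column indices, and a row-pointer array of length num_rows + 1. A minimal pure-Python sketch of that layout (the helper name is hypothetical; the real implementation is a C++/CUDA kernel):

```python
def cast_dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.
    indptr[i]:indptr[i + 1] delimits row i's entries in data/indices."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, val in enumerate(row):
            if val != 0:
                data.append(val)
                indices.append(col)
        indptr.append(len(data))  # one pointer per completed row
    return data, indices, indptr

data, indices, indptr = cast_dense_to_csr([[1, 0, 2], [0, 0, 0], [0, 3, 0]])
```

Note the empty middle row still gets an indptr entry (equal to the previous one), which is why indptr always has num_rows + 1 elements.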
* fixed whitespace * minor unittest update * removed whitespace * add cast storage benchmark params info Conflicts: tests/python/gpu/test_operator_gpu.py * Sparse square sum (#7206) * Add square_sum op * Add unit test and fix check_numeric_gradient * Add .cu file and example * Fix lint * Remove gpu registration * Use square_sum in test_module_fm * Modify and Add documentation for mx.nd.zeros (#7197) * Modify and Add documentation for mx.nd.zeros * Change context to cpu * Change stype to optional * Change ordering and remove optional for _zeros_sparse_ndarray * Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133) * expose kWriteInplace to FComputeEx and FStatefulComputeEx * refactor code * remove duplicated test * Operator add_n for row sparse ndarrays (#7244) * Add add_n op for row-sparse ndarrays and identity FComputeEx * Fix bug in square_sum * Remove test_cast_storage_ex from gpu test since it's not implemented yet * Fix according to the cr Conflicts: src/operator/tensor/elemwise_sum.cc src/operator/tensor/elemwise_unary_op.cc tests/python/gpu/test_operator_gpu.py resolve conflict * GPU implementation of cast_storage (dense to rsp) (#7223) * CastStorageDnsRsp GPU Implementation * updating function doc and some variable types and names * adding cuda_get_device_prop() util function * added rand_shape function for n-dimensional tensors * updated cast storage unit test * added dns_to_rsp to cast storage benchmark script * removing redundant unit test * fix lint * minor change in benchmark script * fix lint * correct function description * change storage_type to stype * changed scope of using namespaces * changed variable types from index_t to dim_t * resolve merge conflict in ndarray.load * Improve StatefulOp/FCompute storage fallback (#134) * test for fcomp fallback add storage fallback test and optimize fallback logic rename function, add comments use std size() * add autograd test with sparse inputs * update sparse ndarray 
api (#139) * support mx.nd.empty for sparse ndarray Change SparseNDArray to BaseSparseNDArray support mx.nd.array with BaseSparseNDArray inputs. Update documentation with explicit subclasses of NDArrays Conflicts: python/mxnet/ndarray/__init__.py python/mxnet/ndarray/ndarray.py python/mxnet/ndarray/sparse_ndarray.py tests/python/unittest/test_sparse_ndarray.py * fix print msg in test * Handle ograd_stype='row_sparse' for square_sum backward (#143) * Add one kernel for square_sum backward pass to take rsp ograd * Add kNullOp and change to use type_assign in infer stype fallback * Sparse retain improvement (#138) * Add one more kernel for sparse retain * Fix compile * Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback * Fix * Add gpu compile * ignoring variables in SimpleBind that are used on python's sparse branch for now. (#135) * add bias term to fm test (#145) * update ndarray.nd, remove `invoke` from excluded members (#137) remove __weakref__ from SparseNDArray add data indices to doc revert dlpack update revert mxdoc changes move methods from BaseSparseNDArray to csrndarray and row_sparse ndarray * support storage fallback with mutable inputs (#147) * include mutable inputs in storage fallback. refactor executor add fallback test for rms prop and adam fix lint fix lint fix test in optimizer * update according to comments * fix unit tests * fix gpu compilation err * Code changes based on reviews (#144) * code changes according to review comments remove executor debug. add doc to optimizer update sparse sgd test add dtype option to rand_sparse_ndarray * overhauled reqs for sparse operators * patch FCompExFallback with mutable inputs. update test_optimizer with more fallback cases * change executor debug macro to env var * add comment * update doc * change ndarray.aux_shape() to return const reference * remove todense to_rsp to_csr. 
replace with tostype * replace manual calls to cast_storage with tostype * disable gpu fallback test for optimizer * fix lint * add backward pass for cast_storage. refactor cast_storage test * rand_sparse_ndarray bug fix * fix cast_storage for gpu * disable csr test for fp16 * update row sparse ndarray doc * update doc * small edits according to reviews (#151) * fix lint (#152) * add license to all new files in sparse branch (#154) * Allocate temp data on the fly for some casting operations (#149) * fix utf8 encoding in sparse ndarray * Extending the GPU dot operator (#7226) * Added GPU DotCsrRspDnsImpl declaration and TODOs * cleaning up function doc, variable types, and code-style * minor bug fixes * enable GPU dot(csr,rsp)=dns unit test * extend sparse dot unit test * adding GPU impl of DotCsrRspDns and its kernels * add TODO * changed variable types from index_t to dim_t * fix function description * added DotCsrRspRspImpl and its kernels (baseline, functionality) * added DotCsrDnsRspImpl and its kernels (baseline, functionality); plus code documentation * refactored dot benchmark * optimized DotCsrTransDnsRsp GPU kernel * change of dot impl interface to include OpContext, for temp storage * removing __device__ flag from CPU kernels * minor fixes and changing variable data types * minor fixes based on code reviews Conflicts: benchmark/python/sparse_op.py tests/python/gpu/test_operator_gpu.py tests/python/unittest/test_sparse_operator.py * Add get_synthetic_dataset function to util (#146) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Merge with dtype specific changes in test_utils * temporary fix for batch norm storage fallback (#156) * support random_uniform/normal/gamma with row_sparse output (#155) * add support for initializer with rowsparse output * add scalar 
assignment to row_sparse * add setitem test to gpu * Revert "add scalar assignment to row_sparse" This reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30. * Revert "add setitem test to gpu" This reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed. * Square sum backward support one more case (#161) * Add documentation for sparse ops (#148) * draft doc for sparse op * add more stype doc for operators * add doc for cast_storage * see also cast_storage. remove base sparse ndarray. fix aux_types comment * grammar / spelling fix * A few fixes (#163) * fix batch norm gpu kernel. register random operators on gpu * register sparse random op on gpu, too * Minor fixes sparse ops (#160) * change CPU kernel inline directives, data types, and function doc * update dot dtype switch to use 32 and 64bit floating point only * use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK * added tensor_util-inl.cuh file for common tensor operator GPU kernels * sparse Adam optimizer (#164) * add sparse adam * register gpu op * add comments * cr comments * kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. 
multi-GPUs (#150) * Add gpu support for BroadcastRowSparse * Fix bugs * Add benchmark script * Increase output dim size * Update weight on CPU using single GPU for sparse tensors * More fix * Optimize sparse_retain for special case * Change row sparse pull locations * Avoid sparse retain on cpu if possible * Use acc for metric * Fix misc * fix bug in adam update (#167) fix a bug in adam update * change sparse example from regression to classification (#165) * fix python import (#166) * Add waitall to sparse_end2end.py (#169) * Add waitall() * Add dummy metric option * Add header license * Dot script changes (#159) * Add get_synthetic_datasets * Move to test_utils * Remove _get_uniform_dataset * Move validation to its own function * Refactor the validation code for csr generation * Make test_powerlaw a nested function * Change SparseNDArray to CSRNDArray * Refactoring changes to dot.py * Fix mxnet test_utils changes * Remove pdb statement * Add distribution parameter * Refactor benchmarking script * Remove unused code * Make style changes and remove unused code * Change typo in comment * Add transpose support * Change typo * 4 decimal points needed for density * Add rsp support for real datasets * Correct variable name mini_file_name * Move wait_to_read outside if * Separate out scipy and mxnet logic in bench_dot * Fix lhs_trans issue * Move transpose outside measure_cost * Compute transpose inside measure_cost * Remove unused variables * Transpose only if trans_lhs (#171) * fix default val for distribution (#172) * fix lint (#175) * avoid cast_storage in dist-kvstore-server (#174) * avoid cast_storage in dist-kvstore-server * add stream arg to mshadow::copy * fix copy order * Add sparse namespace to ndarray and symbol (#177) * Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse * Add sparse to symbol namespace * Delete commented code * mv sparse_ndarray.py sparse.py * Clean up * Change docstring * changes based on code reviews (#176) * remove 
scipy dependency * move kvstore checks to backend * add const to lambda * temp fix to ndarray.md (#178) * Fix sparse namespace pylint (#179) * add comments and error msg (#181) * add clarification for csr (#182) * add clarification for csr * cr comments * revert change in test util (#183) * fix amalgamation (#184) * fix lint
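Many of the commits above (sparse SGD, the sparse Adam optimizer, kvstore row_sparse pull) rely on the same idea: a row_sparse gradient stores only the touched rows plus their row indices, so the optimizer updates only those rows instead of the whole weight matrix. A minimal pure-Python sketch of that lazy update (the helper name and list-based layout are hypothetical, not the actual sgd_update operator):

```python
def sparse_sgd_update(weight, grad_row_idx, grad_rows, lr):
    """SGD with a row_sparse gradient: only the rows listed in
    grad_row_idx are modified, which is what makes sparse SGD/Adam
    cheap for embedding-style weights."""
    for i, grow in zip(grad_row_idx, grad_rows):
        weight[i] = [w - lr * g for w, g in zip(weight[i], grow)]
    return weight

w = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
# gradient touches rows 0 and 2 only; row 1 is left untouched
w = sparse_sgd_update(w, [0, 2], [[1.0, 1.0], [2.0, 2.0]], lr=0.5)
```

With a vocabulary-sized weight and a minibatch touching a handful of rows, this turns an O(rows x cols) update into one proportional to the nonzero rows of the gradient.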
2017-08-22 14:56:33 -07:00
num_repeat=num_repeat, fw=fw, distribution=distribution)
check_call(_LIB.MXSetNumOMPThreads(ctypes.c_int(ARGS.num_omp_threads)))
context = mx.gpu() if ARGS.gpu else mx.cpu()
# TODO(anirudh): make the data dicts to config which can be passed at runtime
distributions = ["uniform", "powerlaw"]
for distribution in distributions:
    run_benchmark(context, lhs="csr",
                  rhs="default", lhs_trans=False,
                  fw="mxnet", rhs_density=1,
                  distribution=distribution)
    run_benchmark(context, lhs="csr",
                  rhs="default", lhs_trans=True,
                  fw="mxnet", rhs_density=1,
                  distribution=distribution)
    run_benchmark(context, lhs="csr",
                  rhs="rsp", lhs_trans=False,
                  fw="mxnet", rhs_density=0.05,
                  distribution=distribution)
    run_benchmark(context, lhs="default",
                  rhs="csr", lhs_trans=False,
                  fw="mxnet", rhs_density=0.001,
                  distribution=distribution)
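run_benchmark is defined elsewhere in this script; per the commit log it times each configuration through a measure_cost helper ("Move transpose outside measure_cost"). A minimal sketch of what such a helper could look like (the signature is an assumption for illustration, not the script's actual code):

```python
import time

def measure_cost(repeat, f, *args, **kwargs):
    """Hypothetical sketch of a measure_cost helper: average
    wall-clock seconds per call over `repeat` invocations of f."""
    start = time.time()
    for _ in range(repeat):
        f(*args, **kwargs)
    return (time.time() - start) / repeat

# e.g. time a cheap stdlib call 10 times
cost = measure_cost(10, sum, range(1000))
```

Keeping expensive one-time work such as transposes outside the timed region, as the log describes, ensures the reported cost reflects only the dot operation itself.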
Sparse Tensor: request for reviews (#7082) * [WIP] Sparse Tensor (#5800) * squash merge with 38f7c5584016e92ba1e0ee1b00ea6632740f67ce compiles on GPU update check alloc: Checkpoint. Pass elem-sum gpu test bug fix for copyfromto. sparse sgd test pass on gpu inefficient implementation for csr copy update submodule fix lint Simple bind with infer storage type (#32) * Symbol binding for sparse tensor development. (#31) * Initial checkin * Add init functions for simple bind in graph_executor * Add simple_bind c_api * Add simple bind c-api * Assign zeros to in_args, arg_grads, and aux_states * Add simple_bind2 python interface * Fix python interface bugs * Interface changes * Fix * Fix core dump * Add bind_ith_exec c_api * Change simple_bind2 * Fix seg fault * Finish simple_bind * Change _bind_ith_exec * Refactor simple_bind initialization flow for bind * Consolidate bind and simple_bind graph init flow * Fix bug * Clean up * Add comments * Clean up * Clean up * Minor correction * Rename APIs in graph executor * Refactor * Rebase * Delete deprecated functions * Move more front-end work to backend * Bug fix * Fix failed tests * Minor fix * Fix lint * Fix lint * Revert unnecessary changes * Revert * Revert * Clean up * Fix lint Conflicts: python/mxnet/symbol.py src/executor/graph_executor.cc * Add inferstorage to graph executor * re-enable tests for sparse embedding with simple_bind * type switch fix in sparse embedding" ; change `default` to `default_storage` for cast storage op (#33) * change default to default_storage * disable cpp test build temporarily attempt to fix windows build error, and fix lint (#34) update nnvm submodule (#37) Scipy build (#38) * update nnvm submodule * add scipy pip install for dockerfile Python3 unit tests (#39) * change xrange to range for python3 compatiblity" * remove more xrange from tests replace long with int for python3 (#40) fix the rest of TShape constructor errors (#41) fix lint (#42) fix wrong usage of mshadow::Shape1" (#43) 
implementation for Csr slice on cpu (#36) * CPU implementation for CSR remove seg_len from csr slice add some docs for slice csr change indptr, values, etc to be private member bug fix in sparse embedding update nnvm submoduel fix lint update unit test for sparse nd" * add const for SliceCsrIndPtr kernel Fix sparse dot according to the new RSP definition (#35) * Fix csr dot dns * Fix sparse dot * Add fallback and test cases for dot(csr, dns)=dns * Add int type switch * Fix * Fix * Fix update mshadow submodule (#44) Fix dns to rsp (#46) fix lint (#47) add runtime storage fallback detection" (#48) * add runtime storage fallback detection" * replace cast storage ex with cast storage impl Fm example (#45) * update csr slice logic to avoid confusion. add more exmaples. * add hint to module.update * more testcases(fallback) for sparse_nd * add to_csr() and to_rsp() method. More unit test (fallback now) * add fm test. fix lint * register sparse sgd under Optim.SGD * update dmlc-core submoduel * change indptr to _indptr temporarily. add const ref to fname fix lint fix lint; (#51) Guard gpu cast storage (#50) * Clean up * Fix typo Rearrange unit test files (#52) fix lint. add scipy for python_test. fix scipy.sparse import error. fix truediv for python3 fix travis test (#54) * remove pyc files * add verbose for travis nosetests cleanup some testing code and enums (#57) * update Makefile * refactor test_sparse_operator * change `default_storage` back to `default` * remove unused cpp tests port libsvm parser to mxnet as libsvm iter (#55) * copied csv iter to libsvm iter test libsvm iter draft handle round batch == false for csr batch loader code refactoring add get stype, shape interface to iiter separate class for sparse iter add missing file fix mem corruption' rename variables add comments also read label from libsvm add test. update docs. 
update submodule Conflicts: python/mxnet/sparse_ndarray.py * update submodule * fix lint * update test * revert naming change add benchmark scritp for dot (#59) * add benchmark scritp for dot add gpu option for bench add get_data funciton for benchmark print t_sparse, too; add comment change nnz to dnesity add backward * add comment update fm test (#62) introduce CSRNDarray and rowsparseNDarray to python frontend api (#58) * introduce CSRNDarray and rowsparseNDarray to python frontend api * temporarily disable fm_module test fix lint (#64) fix typo. disable libsvm io test (#65) Improve dot (#61) * Init checkin * Fix * Adjust dot parallelization methods * Set num_omp_threads for benchmark from command line * Fix omp thread number * Clean up * Add scipy as dot baseline * Fix format sparse_retain op (#66) * Initial checkin * Fix bugs * Add unit test for sparse_retain * Add example and modify test add storage cast for outputs that have non-default storage (#67) fix gpu build (#69) Fix test_sparse_retain python3 issue (#68) revert nnvm version * draft for sgd rsp rsp (#75) support sgd(rsp, rsp) support dot(csr, rsp) when rsp is full add ref to const ndarray params support sparse embedding with rsp weight' fix lint modify embedding backward to produce dense grad remove invalid_rid for rsp->dns remove previous embedding op changes pass sparse embedding test add STORAGE_TYPE_ASSIGN_CHECK remove backward storage infer * fix lint (#78) * fix lint (#79) * serial elemwise sum impl (#80) update module kvstore interface add other missing params and functions revert some interface changes revert some more changes reomve explicit casting for gradients on kvstore update Comm interface update fm example Conflicts: python/mxnet/model.py python/mxnet/ndarray.py * bug fix for initializing module with row_sparse weight (#81) * bug fix for initializing module with row_sparse weight * update log message * Sparse ndarray serialization and deserialization (#77) * Initial checkin * Add unit 
tests * Fix lint * Fix lint (#84) * Sgd with row_sparse weight, dns gradient (#83) * sgd rsp dns draft * support sgd_mom(rsp, dns, rsp) * update doc * remove cast storage for kv updater * code refactoring * update mshadow version (#88) * csr slice bug fix (#90) * benchmark dot code refactor (#87) * q^x6x add some code in benchmark * refactor * minor fixes * fix * lint fix * Add unit test (#91) * add unittest * minor fix * remove commented lines * change test func name * add test rsp * kvstore push row sparse (#93) * Add multi-thread cpu elemwise sum for rsps * Minor fix * Add flag to switch between serial and multi-thread kvstore push * Fix lint in sparse_ndarray.py * Revert "Fix lint in sparse_ndarray.py" This reverts commit d7225ec267a1e8c0c3c8074d25af5844ed39a14d. * Fix ndarray init in copy(ctx) * Add env var to control the flow of serial/parallel reduce * Refactor * Fix copy ndarray bug * Fix lint * Refactor * Fix windows openmp build failure (#94) * update mshadow submoduel (#95) * Revert "update mshadow submoduel (#95)" (#96) This reverts commit 1a129e4cc39514a6c7b3aa1189949969b818aec3. 
* Refactor sparse tensor code (#99) * Initial checkin test_sparse_ndarray passes * Fix test failure * Clean up * Clean up * Move init backend op to ndarray_utils * Fix lint * Eliminate circular dependency on headers * More refactor * Fix gpu build and consolidate Slice for dense and sparse * Clean up * More refactor * Clean up * Fix gpu build * Fix comment * fix pylint (#100) * Fix refactor sparse gpu test (#104) * Fix gpu build * Fix * Fix gpu test failure * change idx types from int32 to int64 (#101) Conflicts: python/mxnet/test_utils.py tests/python/unittest/test_sparse_operator.py update mshadow submodule fix extra quotes in test script change indptr type to int64 better err message for rsp" * revert LOG(DEBUG) change (#105) * fix undefined zeros in optimizer.py (#106) * move init dns zeros to init_op.h for kvstore to use (#107) * Refactor cast storage (#109) * Refactor cast_storage * Add cast_storage cc and cu files * Remove redundant comments * Replace std::accumulate with ParallelAccumulate * Clean up * Fix windows build * Rowsparse kv (#111) * update kvstore unit test Conflicts: tests/python/unittest/test_kvstore.py update model/module.py Conflicts: python/mxnet/model.py python/mxnet/module/module.py fix lint resolve conflict remove int keys in kvstore update cast to str function * fix failed dist_sync_kv test * bug fix in comm to ensure merged gradient is of the right type bug fix in comm * row sparse dist kvstore draft (push only) row_sparse pull * add ndarray row sparse shared mem constructor * code refactoring * add test for row_sparse weight bug fix for kv server slicing add async support rsolve race condition in kvstore * resolve error after reb ase * fix lint (#113) * rename some python funciton (#114) * _to_rsp * _to_csr. 
raise NotImplementedError * todense * fix lint (#115) enable libsvm uniit test (#6839) remove shared mem slice for csr add csr ndarray iter test make osx nose test verbose disable libsvm iter test Move InferAttr to mxnet from nnvm (#6830) * Move InferAttr to mxnet from nnvm Replace nnvm infer attr functions in c_api Initial checkin Clean up Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType Add new interface for InferStorageType Revert "Remove nnvm namespace for FInferShape, FInferType, and FInferStorageType" This reverts commit 8aedf054bfe29b076c6fcb6f54d996fd2752e4de. Fix and clean up Fix lint Add nnvm changes Change infer function interface to accept only rvalue reference of graph Clean up Flush commits to show up in PR Add error handling for storage type inference failure Update nnvm * Fix pylint Change idx type switch for aux data (#6860) * Change idx type switch for aux data * Add mshadow commit Sparse dot enhancement (#6842) * Initial checkin Initial checkin Fix sparse dot test Fix unitest and add fallback for sparse dot * Add benchmark code * Revert "Add benchmark code" This reverts commit be009fe4c5a2a321aa92e99ac6e9cc511198c742. * Fix bug * Fix storage shape * Remove unnecessary test code * Use idx type switch Implement dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp and refactor (#6902) * Initial checkin Add dot(csr.T, rsp)=rsp2 Add infer storage for dot(csr, rsp)=dns and dot(csr.T, rsp)=rsp2 * Fix comments * Replace std::lower_bound with own impl for gpu use too * Add time profiling * Revert "Add time profiling" This reverts commit 8f5bb982867731df0305148b1b150b05661f8529. 
* Move dot and batch_dot to a single file * Move dot gpu impl to a .cuh file * More refactor * Fix include error LibsvmIter fix (#6898) * fix bug in libsvm iter which causes mem corruption * add test for news dataset * fix wrong path in test * fix import error for urllib * update url * replace bz command with bz module Optimized gpu dot kernels (#6937) * pulled update to mshadow * mshadow update * added optimized gpu kernels for dot(csr,dns)=dns and dot(csr.T,dns)=dns, and unit test * added __syncwarp to vector kernel and reduced number of writes to shared memory Refactor sparse tensor code (#6955) * Save stype in frontend to avoid c-api call for stype * Change storage_type to stype * Revert "Change storage_type to stype" This reverts commit 90db7d18b624f3ee4ffd37bf5680205e77ca2763. * Revert "Revert "Change storage_type to stype"" This reverts commit 09328382e926b92a42ba5b3df6f169f825975d88. Move ndarray.py, sparse_ndarray.py, ndarray_utils.py, and _ndarray_internal to ndarrary folder More refactor Move elementwise sum for rsp to ndarray_function.cc Remove unnecessary import in ndarray module Fix pylint Remove redundant code Remove _stype from slots Fix cpp-package build error caused by the change to imperative invoke interface Use relative import Remove print line Rename _ndarray_internal.py to _internal.py * Relaunch test... minor bug fix in warp synchronous code (#7029) * move storage type vector from nnvm to mxnet (#7054) * move storage type vector from nnvm to mxnet * update nnvm * update nnvm * Improve copy sparse tensors (#7003) * Use cast_storage when copying ndarrays of different stypes on same context * Relaunch test * fix failed tests. add back 64bit support for dot fix lint * bug fix for IdentityComputeRsp * fix lint fix lint fix lint * add data partition for libsvm iter (#7027) * remove sparse embedding (#7165) * fix ndarray namespace * remove untested gpu operators (#7172) * skip sparse dot gpu tset. 
add sparse_nd_zeros gpu test * remove sparse_retain gpu Conflicts: tests/python/gpu/test_operator_gpu.py * Fix ndarray aux data issue (#7098) * Fix getting sparse ndarray data/aux_data issues * Add tests for func csr and row_sparse * Make get/set data/aux_data thread safe * Fix a bug * Fix typo and comment * More comments * Correct comment Conflicts: tests/python/gpu/test_operator_gpu.py * Support K-dimensional row-sparse tensor (#7179) * remove check for k dimensional rowsparse tensor * change var name for rsp sgd operator * add checks for sparse dot * bug fix for kdim rowsparse cast storage cpu * update IdentityLikeRhsComputeEx interface * remove set_storage_shape from ndarray. support elemwise_add with kdim row_sparse tensor * use get_with_shape instead of reshape * update according to comments Conflicts: src/operator/tensor/elemwise_unary_op.h * Improve sparse ndarray error message (#7181) * add test for broadcast_to * add comments Conflicts: python/mxnet/base.py * construct row_sparse ndarray for dist-async fix bug in rsp add rsp sync push race condition for push fix bug in rsp pull. refactor test cleanup comments refactor dist server fix lint fix storage shape issue with the new ndarray constructor data sharding draft; fix lint. add comment add support for zeros gradients use std::upper_bound/lower_bound remove special init function for rowsparse dist kvstore temporary support for inplace operators for sparse add test. 
fix return type store kRowSparseNDArray in kv server remove fcomp_ex sgd with dns weight and rsp gradient bug fix in sparse retain sparse pull c_api revise rowsparse pull api use engine to compute unique to ensure thread safety add rowsparse pull to dist-kv fix lint add example for rsp_pull remove name2idx; add sparse_pull_dict param to module fix unit test and c rowid conversion support str key type in kvstore (#6765) * update kvstore unit test * update model/module.py * fix lint * remove int keys in kvstore * update cast to str function * remove _cast_to_str_keys * fix lint * always cast to str Conflicts: include/mxnet/c_api.h include/mxnet/kvstore.h python/mxnet/kvstore.py python/mxnet/model.py python/mxnet/module/module.py src/c_api/c_api.cc src/kvstore/kvstore_local.h tests/python/unittest/test_kvstore.py update module API for other submodules update stypes in kvstore after refactoring change type of size from size_t to int64_t add sparse linear regression example remove sparse_pull_dict from module fix init_optim for seq_module. update sparse example resolve conflict for binary add rsp rsp Conflicts: python/mxnet/kvstore.py tests/python/unittest/test_kvstore.py * fix DotCsrRspRspImpl error message (#7191) * GPU implementation of cast_storage (dense to csr) (#7081) * Added gpu implementation for cast_storage dense to csr, unit tests, and benchmark. Additionally, cast_storage interface change to accommodate the need of temporary storage in cuda kernels. 
* fixed whitespace
* minor unittest update
* removed whitespace
* add cast storage benchmark params info

Conflicts:
    tests/python/gpu/test_operator_gpu.py

* Sparse square sum (#7206)
* Add square_sum op
* Add unit test and fix check_numeric_gradient
* Add .cu file and example
* Fix lint
* Remove gpu registration
* Use square_sum in test_module_fm
* Modify and add documentation for mx.nd.zeros (#7197)
* Change context to cpu
* Change stype to optional
* Change ordering and remove optional for _zeros_sparse_ndarray
* Expose kWriteInplace for imperative execution (fcompute_ex and fstatefulcompute_ex) (#133)
* expose kWriteInplace to FComputeEx and FStatefulComputeEx
* refactor code
* remove duplicated test
* Operator add_n for row sparse ndarrays (#7244)
* Add add_n op for row-sparse ndarrays and identity FComputeEx
* Fix bug in square_sum
* Remove test_cast_storage_ex from gpu test since it's not implemented yet
* Fix according to the CR

Conflicts:
    src/operator/tensor/elemwise_sum.cc
    src/operator/tensor/elemwise_unary_op.cc
    tests/python/gpu/test_operator_gpu.py

* resolve conflict
* GPU implementation of cast_storage (dense to rsp) (#7223)
* CastStorageDnsRsp GPU implementation
* updating function doc and some variable types and names
* adding cuda_get_device_prop() util function
* added rand_shape function for n-dimensional tensors
* updated cast storage unit test
* added dns_to_rsp to cast storage benchmark script
* removing redundant unit test
* fix lint
* minor change in benchmark script
* correct function description
* change storage_type to stype
* changed scope of using namespaces
* changed variable types from index_t to dim_t
* resolve merge conflict in ndarray.load
* Improve StatefulOp/FCompute storage fallback (#134)
* test for fcomp fallback; add storage fallback test and optimize fallback logic; rename function, add comments; use std size()
* add autograd test with sparse inputs
* update sparse ndarray api (#139)
* support mx.nd.empty for sparse ndarray; change SparseNDArray to BaseSparseNDArray; support mx.nd.array with BaseSparseNDArray inputs; update documentation with explicit subclasses of NDArrays

Conflicts:
    python/mxnet/ndarray/__init__.py
    python/mxnet/ndarray/ndarray.py
    python/mxnet/ndarray/sparse_ndarray.py
    tests/python/unittest/test_sparse_ndarray.py

* fix print msg in test
* Handle ograd_stype='row_sparse' for square_sum backward (#143)
* Add one kernel for square_sum backward pass to take rsp ograd
* Add kNullOp and change to use type_assign in infer stype fallback
* Sparse retain improvement (#138)
* Add one more kernel for sparse retain
* Fix compile
* Change STORAGE_TYPE_ASSIGN_CHECK to type_assign for fallback
* Add gpu compile
* ignore variables in SimpleBind that are used on python's sparse branch for now (#135)
* add bias term to fm test (#145)
* update ndarray.nd, remove `invoke` from excluded members (#137); remove __weakref__ from SparseNDArray; add data and indices to doc; revert dlpack update; revert mxdoc changes; move methods from BaseSparseNDArray to CSRNDArray and RowSparseNDArray
* support storage fallback with mutable inputs (#147)
* include mutable inputs in storage fallback; refactor executor; add fallback test for rms prop and adam; fix lint; fix test in optimizer
* update according to comments
* fix unit tests
* fix gpu compilation err
* Code changes based on reviews (#144)
* code changes according to review comments: remove executor debug; add doc to optimizer; update sparse sgd test; add dtype option to rand_sparse_ndarray
* overhauled reqs for sparse operators
* patch FCompExFallback with mutable inputs; update test_optimizer with more fallback cases
* change executor debug macro to env var
* change ndarray.aux_shape() to return const reference
* remove todense, to_rsp, to_csr; replace with tostype
* replace manual calls to cast_storage with tostype
* disable gpu fallback test for optimizer
* add backward pass for cast_storage; refactor cast_storage test
* rand_sparse_ndarray bug fix
* fix cast_storage for gpu
* disable csr test for fp16
* update row sparse ndarray doc
* small edits according to reviews (#151)
* fix lint (#152)
* add license to all new files in sparse branch (#154)
* Allocate temp data on the fly for some casting operations (#149)
* fix utf8 encoding in sparse ndarray
* Extending the GPU dot operator (#7226)
* Added GPU DotCsrRspDnsImpl declaration and TODOs
* cleaning up function doc, variable types, and code style
* minor bug fixes
* enable GPU dot(csr,rsp)=dns unit test
* extend sparse dot unit test
* adding GPU impl of DotCsrRspDns and its kernels
* changed variable types from index_t to dim_t
* fix function description
* added DotCsrRspRspImpl and its kernels (baseline, functionality)
* added DotCsrDnsRspImpl and its kernels (baseline, functionality), plus code documentation
* refactored dot benchmark
* optimized DotCsrTransDnsRsp GPU kernel
* changed dot impl interface to include OpContext, for temp storage
* removing __device__ flag from CPU kernels
* minor fixes based on code reviews

Conflicts:
    benchmark/python/sparse_op.py
    tests/python/gpu/test_operator_gpu.py
    tests/python/unittest/test_sparse_operator.py

* Add get_synthetic_dataset function to util (#146)
* Add get_synthetic_datasets; move to test_utils
* Remove _get_uniform_dataset
* Move validation to its own function
* Refactor the validation code for csr generation
* Make test_powerlaw a nested function
* Change SparseNDArray to CSRNDArray
* Merge with dtype-specific changes in test_utils
* temporary fix for batch norm storage fallback (#156)
* support random_uniform/normal/gamma with row_sparse output (#155)
* add support for initializer with rowsparse output
* add scalar assignment to row_sparse
* add setitem test to gpu
* Revert "add scalar assignment to row_sparse" (reverts commit 8aef7a56c44038f67bbec93811977ea2f9fa3c30)
* Revert "add setitem test to gpu" (reverts commit 3b969ac0980e8d7166a1cf46878ed2bd457986ed)
* Square sum backward support one more case (#161)
* Add documentation for sparse ops (#148)
* draft doc for sparse op; add more stype doc for operators; add doc for cast_storage
* see also cast_storage; remove base sparse ndarray; fix aux_types comment
* grammar / spelling fix
* A few fixes (#163)
* fix batch norm gpu kernel; register random operators on gpu
* register sparse random op on gpu, too
* Minor fixes sparse ops (#160)
* change CPU kernel inline directives, data types, and function doc
* update dot dtype switch to use 32- and 64-bit floating point only
* use type_assign instead of STORAGE_TYPE_ASSIGN_CHECK
* added tensor_util-inl.cuh file for common tensor operator GPU kernels
* sparse Adam optimizer (#164)
* add sparse adam; register gpu op; add comments; CR comments
* kvstore.row_sparse_pull for GPU and end-to-end benchmark: CPU vs. multi-GPUs (#150)
* Add gpu support for BroadcastRowSparse
* Add benchmark script
* Increase output dim size
* Update weight on CPU using single GPU for sparse tensors
* Optimize sparse_retain for special case
* Change row sparse pull locations
* Avoid sparse retain on cpu if possible
* Use acc for metric
* Fix misc
* fix a bug in adam update (#167)
* change sparse example from regression to classification (#165)
* fix python import (#166)
* Add waitall to sparse_end2end.py (#169)
* Add dummy metric option
* Add header license
* Dot script changes (#159)
* Refactoring changes to dot.py
* Fix mxnet test_utils changes
* Remove pdb statement
* Add distribution parameter
* Refactor benchmarking script
* Remove unused code
* Make style changes and remove unused code
* Fix typo in comment
* Add transpose support
* 4 decimal points needed for density
* Add rsp support for real datasets
* Correct variable name mini_file_name
* Move wait_to_read outside if
* Separate out scipy and mxnet logic in bench_dot
* Fix lhs_trans issue
* Compute transpose inside measure_cost
* Remove unused variables
* Transpose only if trans_lhs (#171)
* fix default val for distribution (#172)
* fix lint (#175)
* avoid cast_storage in dist-kvstore-server (#174)
* add stream arg to mshadow::Copy
* fix copy order
* Add sparse namespace to ndarray and symbol (#177)
* Register dot, cast_storage, and sparse_retain under mxnet.ndarray.sparse
* Add sparse to symbol namespace
* Delete commented code
* mv sparse_ndarray.py sparse.py
* Change docstring
* changes based on code reviews (#176)
* remove scipy dependency
* move kvstore checks to backend
* add const to lambda
* temp fix to ndarray.md (#178)
* Fix sparse namespace pylint (#179)
* add comments and error msg (#181)
* add clarification for csr (#182)
* CR comments
* revert change in test util (#183)
* fix amalgamation (#184)
* fix lint
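Several commits above (sparse SGD, sparse Adam, kvstore.row_sparse_pull) rely on the row_sparse format, which stores only the non-zero rows plus their row indices, so an optimizer update need only touch those rows. A minimal pure-Python sketch of that idea (hypothetical helper name; the real kernels are C++/CUDA and operate on NDArray aux data):

```python
def row_sparse_sgd_update(weight, grad_indices, grad_rows, lr):
    """Apply SGD only to the rows present in a row-sparse gradient.

    weight:       2-D list (dense weight matrix), modified in place
    grad_indices: row ids that have a non-zero gradient
    grad_rows:    gradient values for those rows, in the same order
    """
    for idx, grow in zip(grad_indices, grad_rows):
        # rows absent from grad_indices are left untouched entirely
        weight[idx] = [w - lr * g for w, g in zip(weight[idx], grow)]
    return weight
```

This is why the sparse optimizers scale with the number of touched rows rather than the full weight matrix, which matters for embedding-style weights pulled via row_sparse_pull.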
2017-08-22 14:56:33 -07:00
if not ARGS.gpu:
    run_benchmark(context, lhs="csr",
                  rhs="default", lhs_trans=False,
                  fw="scipy", rhs_density=1,
                  distribution=distribution)
    run_benchmark(context, lhs="csr",
                  rhs="default", lhs_trans=True,
                  fw="scipy", rhs_density=1,
                  distribution=distribution)
if __name__ == "__main__":
    begin_time = time.time()
    test_dot_real(KDDA)
    test_dot_real(AVAZU)
    test_dot_real(CRITEO)
    test_dot_synthetic(SYNTHETIC1)
    test_dot_synthetic(SYNTHETIC2)
    total_time = time.time() - begin_time
    print(f"total time is {total_time}")