This repository was archived by the owner on Nov 17, 2023. It is now read-only.

AMP issue in _gen_atomic_symbol #15725

@mycpuorg

Hi @fierceX,

I'm tagging you because of another AMP issue.
I'm following https://github.com/apache/incubator-mxnet/blob/master/docs/tutorials/amp/amp_tutorial.md to reduce training time for my RNN model, which is very similar to this one:
https://github.com/apache/incubator-mxnet/blob/master/example/recommenders/demo1-MF.ipynb

Could you please help with the error below?

E           mxnet.base.MXNetError: [21:17:47] src/c_api/c_api_symbolic.cc:857: Check failed: source->outputs.size() == 1U (2 vs. 1) : Generating atomic symbol from other symbol only works for nongrouped symbol.
E           Stack trace:
E             [bt] (0) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x4a357b) [0x7f2648abb57b]
E             [bt] (1) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(MXGenAtomicSymbolFromSymbol+0x19d) [0x7f264ac2afbd]
E             [bt] (2) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7f26cdc1cec0]
E             [bt] (3) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7f26cdc1c87d]
E             [bt] (4) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce) [0x7f26cde31e2e]
E             [bt] (5) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x12865) [0x7f26cde32865]
E             [bt] (6) /home/ubuntu/anaconda3/envs/mxnet_p36/bin/python(_PyObject_FastCallDict+0x8b) [0x5596d2bb2d7b]
E             [bt] (7) /home/ubuntu/anaconda3/envs/mxnet_p36/bin/python(+0x19e7ce) [0x5596d2c427ce]
E             [bt] (8) /home/ubuntu/anaconda3/envs/mxnet_p36/bin/python(_PyEval_EvalFrameDefault+0x2fa) [0x5596d2c64cba]

../../../anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/base.py:253: MXNetError
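
For anyone reading the message above: the failed check requires the source symbol to have exactly one output, and here it has two, which is what "(2 vs. 1)" reports. The sketch below is a toy model in plain Python, not MXNet code. `ToySymbol`, `group`, and `gen_atomic_symbol` are hypothetical names that only mimic the condition in the error text, under the assumption that a "grouped" symbol is one that carries more than one output.

```python
# Toy model of the failing check. This is NOT MXNet code; every name here is
# hypothetical and exists only to illustrate the condition in the error text.
from dataclasses import dataclass
from typing import List


@dataclass
class ToySymbol:
    """Stand-in for a symbolic graph handle: just a list of output names."""
    outputs: List[str]


def group(symbols: List[ToySymbol]) -> ToySymbol:
    """Combine several symbols into one grouped symbol holding all outputs."""
    merged = [name for sym in symbols for name in sym.outputs]
    return ToySymbol(outputs=merged)


def gen_atomic_symbol(source: ToySymbol) -> str:
    """Mirror the reported check: exactly one output is required."""
    if len(source.outputs) != 1:
        raise ValueError(
            f"Check failed: outputs.size() == 1 ({len(source.outputs)} vs. 1): "
            "only works for nongrouped symbol"
        )
    return source.outputs[0]


single = ToySymbol(outputs=["score"])
grouped = group([ToySymbol(outputs=["user_embed"]),
                 ToySymbol(outputs=["item_embed"])])

print(gen_atomic_symbol(single))  # a single-output symbol passes the check
try:
    gen_atomic_symbol(grouped)
except ValueError as err:
    print(err)  # the two-output group is rejected, as in "(2 vs. 1)"
```

If the real model follows the same pattern, the place to look would be wherever a multi-output (grouped) symbol is handed to the AMP conversion.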

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.894
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.17
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
----------Python Info----------
Version      : 3.6.5
Compiler     : GCC 7.2.0
Build        : ('default', 'Apr 29 2018 16:14:56')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 19.2.1
Directory    : /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/pip
----------MXNet Info-----------
Version      : 1.5.0
Directory    : /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet
Commit Hash   : 75a9e187d00a8b7ebc71412a02ed0e3ae489d91f
Library      : ['/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so']
Build features:
✔ CUDA
✔ CUDNN
✔ NCCL
✔ CUDA_RTC
✖ TENSORRT
✔ CPU_SSE
✔ CPU_SSE2
✔ CPU_SSE3
✔ CPU_SSE4_1
✔ CPU_SSE4_2
✖ CPU_SSE4A
✔ CPU_AVX
✖ CPU_AVX2
✖ OPENMP
✖ SSE
✔ F16C
✖ JEMALLOC
✖ BLAS_OPEN
✖ BLAS_ATLAS
✖ BLAS_MKL
✖ BLAS_APPLE
✔ LAPACK
✔ MKLDNN
✔ OPENCV
✖ CAFFE
✖ PROFILER
✔ DIST_KVSTORE
✖ CXX14
✖ INT64_TENSOR_SIZE
✔ SIGNAL_HANDLER
✖ DEBUG
----------System Info----------
Platform     : Linux-4.4.0-1085-aws-x86_64-with-debian-stretch-sid
system       : Linux
node         : ip-172-31-30-250
release      : 4.4.0-1085-aws
version      : #96-Ubuntu SMP Tue Jun 11 09:08:32 UTC 2019
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0018 sec, LOAD: 0.6379 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0007 sec, LOAD: 0.3731 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0003 sec, LOAD: 0.3364 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0003 sec, LOAD: 0.1988 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0015 sec, LOAD: 0.2147 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0004 sec, LOAD: 0.1079 sec.
