This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[FFI] Add new containers and Implementations#19685

Merged
leezu merged 44 commits into apache:master from barry-jin:ffi-container
Mar 9, 2021

Conversation

@barry-jin
Contributor

@barry-jin barry-jin commented Dec 16, 2020

Description

This is the follow-up PR for RFC #19672. A Map container is added, and more data types, such as dictionaries and lists of strings, are supported by the new FFI.

  • Make ADT container and MAP container support NDArray type.
  • Adopt PackedFunc based FFI on CachedOp.
    • Some CachedOp functions are implemented: create, free, invoke, get_optimized_symbol
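The core idea behind a PackedFunc-based FFI is that every argument crosses the language boundary tagged with a type code, so a single generic C entry point can accept integers, strings, lists, and maps uniformly. The following is a minimal, self-contained Python sketch of that idea; the type codes, function names, and classes are illustrative stand-ins, not MXNet's actual API:

```python
# Illustrative sketch of a PackedFunc-style FFI: each argument is tagged
# with a type code so one generic entry point can handle all types.
# The codes and names below are hypothetical, not MXNet's real ones.
K_INT, K_STR, K_LIST, K_MAP = 0, 1, 2, 3

def pack_arg(value):
    """Tag a Python value with an FFI type code, recursing into containers."""
    if isinstance(value, int):
        return (K_INT, value)
    if isinstance(value, str):
        return (K_STR, value)
    if isinstance(value, list):
        return (K_LIST, [pack_arg(v) for v in value])
    if isinstance(value, dict):
        return (K_MAP, {k: pack_arg(v) for k, v in value.items()})
    raise TypeError(f"unsupported FFI type: {type(value)!r}")

def packed_call(func, *args):
    """Dispatch any arguments through a single generic calling convention."""
    return func([pack_arg(a) for a in args])

def echo_types(packed_args):
    # A toy "backend" that just reports the type codes it received.
    return [code for code, _ in packed_args]
```

For example, `packed_call(echo_types, 1, "a", ["x"], {"k": 2})` returns `[0, 1, 2, 3]`, showing that scalars and the new container types all travel through the same packed calling convention.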

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

@mxnet-bot

Hey @barry-jin, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [clang, edge, website, windows-gpu, sanity, windows-cpu, unix-gpu, unix-cpu, centos-cpu, centos-gpu, miscellaneous]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@lanking520 lanking520 added the pr-work-in-progress PR is still work in progress label Dec 16, 2020
@barry-jin
Contributor Author

@mxnet-bot run ci [unix-cpu, unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-cpu, unix-gpu]

@barry-jin
Contributor Author

@mxnet-bot run ci [unix-cpu, unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu, unix-cpu]

@lanking520 lanking520 added pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Feb 13, 2021
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Feb 16, 2021
@barry-jin
Contributor Author

After benchmarking on GluonNLP, I see some improvement in the single forward step. The average improvements are listed below (each latency is the average over runs with different batch_size and sequence_length inputs).

| model | training latency without this PR (s) | training latency with this PR (s) | improvement (s) |
| --- | --- | --- | --- |
| google_en_uncased_bert_base | 0.09161326 | 0.09133351 | 0.00027974 |
| google_en_uncased_bert_base | 0.3565172 | 0.35624171 | 0.000275489 |
| google_en_uncased_bert_large | 0.91762223 | 0.9173615 | 0.000260731 |
| google_albert_base_v2 | 0.38036531 | 0.38022336 | 0.00014195 |
| google_albert_large_v2 | 0.74285129 | 0.74271887 | 0.000132424 |
| google_albert_xlarge_v2 | 1.53808278 | 1.53795535 | 0.000127428 |
| google_albert_xxlarge_v2 | 2.49918614 | 2.49904376 | 0.000142379 |
| google_electra_small | 0.07791454 | 0.07770361 | 0.000210933 |
| google_electra_base | 0.35639018 | 0.35617552 | 0.000214658 |
| google_electra_large | 0.91575478 | 0.9154471 | 0.000307674 |
| google_uncased_mobilebert | 0.1725719 | 0.17218696 | 0.000384942 |
| fairseq_bart_base | 0.43927581 | 0.43899117 | 0.00028464 |
| fairseq_bart_large | 0.70489126 | 0.70455636 | 0.0003349 |
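The improvement column above is simply the per-step difference between the two latency columns; for example, for the first google_en_uncased_bert_base row:

```python
# Recompute the first row's improvement from the two latency columns above.
without_pr = 0.09161326  # training latency without this PR (s)
with_pr = 0.09133351     # training latency with this PR (s)
improvement = without_pr - with_pr  # ~0.00027975 s saved per forward step
```

The per-step savings are small in absolute terms, but they are pure framework overhead, so they accumulate over every invocation in a training run.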

I have also compared training and inference time on a real workload: running the google_electra_small model on the SQuAD dataset gives the following results.

| Training/Inferencing | Latency without this PR | Latency with this PR | Throughput without this PR (samples/s) | Throughput with this PR (samples/s) |
| --- | --- | --- | --- | --- |
| Training | 1.59179 h | 1.48754 h | 70 | 75 |
| Inferencing | 55.566 s | 55.41125 s | 216.35 | 216.96 |
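From the training numbers above, the end-to-end relative improvement can be computed directly:

```python
# Relative end-to-end training speedup implied by the SQuAD numbers above.
hours_without = 1.59179  # training time without this PR (h)
hours_with = 1.48754     # training time with this PR (h)
speedup_pct = (hours_without - hours_with) / hours_without * 100
# roughly 6.5% faster end-to-end training
```

This is consistent with the throughput column (70 vs. 75 samples/s), since throughput and wall-clock time are inversely related for a fixed dataset.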

Environment

python_version 3.6.9
instance g4dn.2x
system Linux
cpu x86_64
architecture 64bit
fp16 FALSE
cpu_ram_mb 63622
use_gpu TRUE
num_gpus 1
gpu Tesla T4
gpu_ram_mb 15079
gpu_power_watts 70
gpu_performance_state 0

@barry-jin
Contributor Author

@mxnet-bot run ci [windows-cpu, unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu, windows-cpu]

@barry-jin
Contributor Author

I have replaced the backend APIs (MXInvokeCachedOp, MXNET_REGISTER_GLOBAL("cached_op.invoke")) with simple or dummy implementations so that we can fully expose the overhead of the API call with and without this PR by removing the computational cost. The results are shown below:

Without this PR, a CachedOp invocation call takes around 7.22 us, and most of the overhead is in making cython/python args. With this PR, it takes around 4.041 us, and most of the overhead is in type translation/checking in the PackedFunc system.

CachedOp invocation in cython code:
[Screenshot: Screen Shot 2021-02-17 at 5 42 22 PM]
CachedOp invocation with the new FFI implementation (accelerated by cython):
[Screenshot: Screen Shot 2021-02-17 at 5 42 30 PM]

Comment on lines +521 to +526
// } else if (type_code_ == kStr) {
// return std::string(value_.v_str);
// } else {
// CHECK(IsObjectRef<tvm::runtime::String>());
// return AsObjectRef<tvm::runtime::String>().operator std::string();
// }
Contributor


Let's remove the unused code?

@barry-jin
Contributor Author

@mxnet-bot run ci [windows-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu]

@barry-jin
Contributor Author

@mxnet-bot run ci [windows-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu]

@barry-jin
Contributor Author

@mxnet-bot run ci [windows-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu]

@szha
Member

szha commented Feb 26, 2021

@mxnet-bot run ci [windows-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu]

@szha
Member

szha commented Feb 26, 2021

@mxnet-bot run ci [windows-cpu]

@mxnet-bot

Jenkins CI successfully triggered : [windows-cpu]


Labels

pr-awaiting-review PR is waiting for code review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants