Fix a memory misalignment in topk operator#15948
Conversation
|
@access2rohit @ChaiBapchya Please also help review. Thanks |
35da47d to
b431a59
Compare
| * \return the corresponding dimension size | ||
| */ | ||
| MSHADOW_XINLINE index_t &operator[](index_t idx) { | ||
| MSHADOW_XINLINE index_t &operator[](int idx) { |
There was a problem hiding this comment.
Can you describe this change a bit more in detail ? Why is this required ?
There was a problem hiding this comment.
This is not directly fixing this bug. However, while checking the tensor struct, I noticed this data type should be int because it is indexing the dimension (which is set to int)
e658902 to
2bd2cbc
Compare
| // Temp space needed by the full sorts. | ||
| size_t temp_size = std::max( | ||
| mxnet::op::SortByKeyWorkspaceSize<index_t, DType, xpu>(src.Size()), | ||
| mxnet::op::SortByKeyWorkspaceSize<DType, index_t, xpu>(src.Size())); |
There was a problem hiding this comment.
I think we should also include mxnet::op::SortByKeyWorkspaceSize<index_t, index_t, xpu>(src.Size()).
| * \return the corresponding dimension size | ||
| */ | ||
| MSHADOW_XINLINE const index_t &operator[](index_t idx) const { | ||
| MSHADOW_XINLINE const index_t &operator[](int idx) const { |
There was a problem hiding this comment.
There are other places in the same file that assumes idx have index_t type. we need to fix all of them.
3a1d25a to
040c8bc
Compare
b7f9786 to
f6f0750
Compare
|
Thank you for the fix, @apeforest . Will this be picked to the v1.5.x branch? |
* fix alignment * use correct type for shape index * clean up unnecessary space in topk * fix lint * add additional temp space * address reviewer comment * fix incorrect nidex type
|
@apeforest Could you please open a dummy PR against the v1.5.x branch to make sure the branch can properly pass the CI? |
|
See #15999 |
This reverts commit 42746bc.
Description
Current memory alignment in topk operator is incorrect if index_t is using int64_t. This PR fixes the potential issue. It partially fixes #15703
The PR also fixes an incorrect data type usage in mshadow/tensor.h
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Bug fix
Comments