This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Gpu samplers #8179

Merged
piiswrong merged 2 commits into apache:master from asmushetzel:gpu_samplers
Oct 10, 2017

Conversation

@asmushetzel
Contributor

Provide a generic implementation for all random samplers that can run on cpu and gpu.

Remarks:

  • This will fail until PR 297 is merged into mshadow.

  • The design pattern adopted for the GPU is to generate one random state per sample to be drawn, so that each sample is drawn using its own randstate. According to Nvidia's documentation this pattern is acceptable. There is a theoretical chance that the random sequences generated by the individual randstates are somehow correlated, but Nvidia states that they have never observed such an effect; they merely note that they cannot prove it can never happen theoretically. So in practice, this should generate high-quality, uncorrelated samples.

  • The above design pattern is also acceptable w.r.t. the setup time of the randstates, as we consistently use sequence 0 of each randstate (so the setup does not have to skip through multiple sequences when initializing). Again, according to Nvidia.

  • I decided to use the same design pattern even for the uniform/normal distributions, as this keeps the implementation consistent across all samplers.

  • The design pattern allows easy implementation of all types of rejection sampling methods, so we should be able to add other distributions easily whenever we need them.

  • The above pattern automatically enables multi-threaded sampling on the CPU when OpenMP is enabled.

  • The sampling methods for the exponential/gamma/Poisson/negative binomial distributions are all standard and mostly the same as those used by the STL. The rejection method used for large lambdas in the Poisson distribution is slightly different but also theoretically sound (reference in the code).

  • There is a problem in the fp16 case: no samplers exist that work natively at this limited precision. I don't know why we recently added fp16 as a valid output type for the random sampling operators. We should never have done this; instead we should insist that an explicit cast operator be used, which could then also handle all overflow/underflow issues in a centralized way. As it stands, we have to convert fp32 samples to fp16, and since these are samples from a distribution, there is always a chance of overflow. Ugly. This was already the case in the prior implementation, so it is not newly introduced here, and IMO we should not solve it within the samplers but with a separate cast operator.

  • Adding additional outputs/functionality to the random samplers, such as re-parametrization, CDFs, etc., is not part of this PR. We should add that as a separate step.
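The per-sample randstate pattern described above can be mirrored in a short NumPy sketch (hypothetical, not the actual CUDA code: `sample_exponential`, the master-seed handling, and the inverse-CDF draw are all illustrative). One master seed is split into one child seed per sample, and each sample is drawn from its own independently seeded generator, just as each GPU thread would use its own curand state:

```python
import numpy as np

def sample_exponential(lam, n, master_seed=0):
    """Draw n Exp(lam) samples, using one independent generator per sample."""
    # One child seed per sample, analogous to one curand state per thread.
    children = np.random.SeedSequence(master_seed).spawn(n)
    out = np.empty(n)
    for i, child in enumerate(children):
        rng = np.random.Generator(np.random.PCG64(child))
        # Inverse-CDF sampling: X = -log(1 - U) / lam for U ~ Uniform(0, 1).
        u = rng.random()
        out[i] = -np.log1p(-u) / lam
    return out

samples = sample_exponential(lam=2.0, n=10000)
print(abs(samples.mean() - 0.5) < 0.05)  # mean of Exp(2) is 0.5
```

Because every sample owns its generator, each per-sample loop body is free to draw as many uniforms as it needs, which is exactly what makes rejection-sampling methods easy to slot into this pattern.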
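The fp16 overflow concern from the description can be demonstrated directly (a standalone NumPy sketch, not MXNet code): float32 samples that exceed the float16 range (max ~65504) turn into inf when cast down.

```python
import numpy as np

rng = np.random.default_rng(0)
# An exponential with a large scale routinely produces values above
# the float16 maximum of ~65504.
samples_f32 = rng.exponential(scale=1e5, size=1000).astype(np.float32)
samples_f16 = samples_f32.astype(np.float16)

overflowed = int(np.isinf(samples_f16).sum())
print(overflowed > 0)  # True: some finite fp32 samples became inf in fp16
```

A centralized cast operator, as suggested above, could clamp or saturate such values in one place instead of every sampler handling the edge case itself.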

@asmushetzel asmushetzel force-pushed the gpu_samplers branch 2 times, most recently from d8fd75e to d25943d on October 8, 2017, 17:52
assert np.abs(check_func(ret1, params)) < tol, \
    "ndarray test: %s check for `%s` did not pass" % (check_name, name)

# check multi-distribution sampling, only supports cpu for now
if device.device_type == 'cpu':
Contributor
This is for testing imperative.

@asmushetzel asmushetzel force-pushed the gpu_samplers branch 2 times, most recently from ebfd5f2 to eeaf4dd on October 9, 2017, 11:10
@asmushetzel
Contributor Author

Please do not merge yet; I am investigating one test failure related to test_loss.py (which involves randoms).

@asmushetzel
Contributor Author

Found the issue. Can you please integrate PR 300 into mshadow?

BTW: in test_loss.py there are a lot of calls to np.random.seed() where I think you want to call mx.random.seed() instead. I realized that you changed this recently, Eric, so you may still be working on it.

@piiswrong
Contributor

Could you put the imperative tests back? Then we can merge.

@asmushetzel
Contributor Author

Concerning the imperative tests: they are still in. I just enabled them for GPU as well, which only required changing the indentation of the whole code block. Can you take a second look, please?

Other than that, I'm done.

BTW: Mid-term, I would like to get rid of any dependency on mshadow::random and move the remaining stuff (having some generators centrally allocated) to src/operator/random. What is your take on that?

@piiswrong
Contributor

Sure, sounds good to me.

@piiswrong piiswrong merged commit d22373b into apache:master Oct 10, 2017
mbaijal pushed a commit to mbaijal/incubator-mxnet that referenced this pull request Oct 12, 2017
* Samplers for GPU

* adjustments for GPU samplers
crazy-cat pushed a commit to crazy-cat/incubator-mxnet that referenced this pull request Oct 26, 2017
* Samplers for GPU

* adjustments for GPU samplers
@asmushetzel asmushetzel deleted the gpu_samplers branch October 29, 2017 13:59
