Slide1
rand() Considered Harmful
Stephan T. Lavavej ("Steh-fin Lah-wah-wade")
Senior Developer - Visual C++ Libraries
stl@microsoft.com
Version 1.1 - September 5, 2013
Slide2
What's Wrong With This Code?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}
Slide3
What's Right With This Code?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}
All required headers are included!
All included headers are required!
Headers are sorted!
One True Brace Style!
Unnecessary argc, argv, return 0; omitted!
%d is correct for int!
Slide9
What's Wrong With This Code?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    for (int i = 0; i < 16; ++i) {
        printf("%d ", rand() % 100);
    }
    printf("\n");
}
ABOMINATION!
Frequency: 1 Hz!
warning C4244: 'argument' : conversion from 'time_t' to 'unsigned int', possible loss of data
32-bit seed!
Range: [0, 32767]
Linear congruential, low quality!
Non-uniform distribution!
Slide10
Modulo Non-Uniform Distribution
int src = rand(); // Assume uniform [0, 32767]
int dst = src % 100; // Non-uniform [0, 99]
// [0, 99] src -> [0, 99] dst
// [100, 199] src -> [0, 99] dst
// ...
// [32700, 32767] src -> [0, 67] dst
This is modulo's fault, not rand()'s
Trigger: input range isn't exact multiple of output range
Slide11
Floating-Point Treachery
int src = rand(); // Assume uniform [0, 32767]
int dst = static_cast<int>( // As seen on
    (src * 1.0 / RAND_MAX) * 99 // StackOverflow
); // Hilariously non-uniform [0, 99]
Only one input produces the output 99:
static_cast<int>((32765 * 1.0 / 32767) * 99) == 98
static_cast<int>((32766 * 1.0 / 32767) * 99) == 98
static_cast<int>((32767 * 1.0 / 32767) * 99) == 99
Slide12
Floating-Point Double Treachery
int src = rand(); // Assume uniform [0, 32767]
int dst = static_cast<int>(
    (src * 1.0 / (RAND_MAX + 1)) * 100
); // Subtly non-uniform [0, 99]
Less likely outputs (327/32768 vs. 328/32768):
3, 6, 9, 12, 15, 18, 21, 24, 28, 31, 34, 37, 40, 43, 46, 49, 53, 56, 59, 62, 65, 68, 71, 74, 78, 81, 84, 87, 90, 93, 96, 99
Same problem as src % 100
Nothing can uniformly map 32768 inputs to 100 outputs
Slide13
Floating-Point Triple Treachery
What if the input is [0, 2^32) or [0, 2^64)?
  Non-uniformity is reduced, but not eliminated, when the input is much larger than the output
What if IEEE runs out of bits?
  Example: [0, 2^64) input -> [0, 10^18 ≈ 2^59.8) output
  double has only 53 bits of significand precision
Say you have a problem, so you use floating-point
  Now you have 2.000001 problems
DO NOT MESS WITH FLOATING-POINT
Slide14
<random> URNGs (Uniform Random Number Generators)
Engine templates:
  linear_congruential_engine
  mersenne_twister_engine
  subtract_with_carry_engine
Engine adaptor templates:
  discard_block_engine
  independent_bits_engine
  shuffle_order_engine
Non-deterministic:
  random_device
Engine (adaptor) typedefs:
  minstd_rand0
  minstd_rand
  mt19937
  mt19937_64
  ranlux24_base
  ranlux48_base
  ranlux24
  ranlux48
  knuth_b
  default_random_engine
Slide15
<random> Distributions
Uniform distributions
  uniform_int_distribution
  uniform_real_distribution
Poisson distributions
  poisson_distribution
  exponential_distribution
  gamma_distribution
  weibull_distribution
  extreme_value_distribution
Sampling distributions
  discrete_distribution
  piecewise_constant_distribution
  piecewise_linear_distribution
Bernoulli distributions
  bernoulli_distribution
  binomial_distribution
  geometric_distribution
  negative_binomial_distribution
Normal distributions
  normal_distribution
  lognormal_distribution
  chi_squared_distribution
  cauchy_distribution
  fisher_f_distribution
  student_t_distribution
Slide16
Hello, "Random" World!
#include <iostream>
#include <random>

int main() {
    std::mt19937 mt(1729);
    std::uniform_int_distribution<int> dist(0, 99);
    for (int i = 0; i < 16; ++i) {
        std::cout << dist(mt) << " ";
    }
    std::cout << std::endl;
}
Slide21
Hello, "Random" World!
#include <iostream>
#include <random>

int main() {
    std::mt19937 mt(1729);
    std::uniform_int_distribution<int> dist(0, 99);
    for (int i = 0; i < 16; ++i) {
        std::cout << dist(mt) << " ";
    }
    std::cout << std::endl;
}
Deterministic 32-bit seed
Engine: [0, 2^32)
Distribution: [0, 99]
Note: [inclusive, inclusive]
Run engine, viewed through distribution
Slide22
Hello, Random World!
#include <iostream>
#include <random>

int main() {
    std::random_device rd;
    std::mt19937 mt(rd());
    std::uniform_int_distribution<int> dist(0, 99);
    for (int i = 0; i < 16; ++i) {
        std::cout << dist(mt) << " ";
    }
    std::cout << std::endl;
}
Non-deterministic 32-bit seed
Slide23
mt19937 vs. random_device
mt19937 is:
  Fast (499 MB/s = 6.5 cycles/byte for me)
  Extremely high quality, but not cryptographically secure
  Seedable (with more than 32 bits if you want)
  Reproducible (Standard-mandated algorithm)
random_device is:
  Possibly slow (1.93 MB/s = 1683 cycles/byte for me)
  Strongly platform-dependent (GCC 4.8 can use IVB RDRAND)
  Possibly crypto-secure (check documentation, true for VC)
  Non-seedable, non-reproducible
Slide24
uniform_int_distribution
Takes any Uniform Random Number Generator
  Usually [0, 2^32) or [0, 2^64) but [1701, 1729] works
  If your URNG does that, you are bad and you should feel bad
Emits any desired range of integers [low, high]
  signed/unsigned short/int/long/long long
  Why not char/signed char/unsigned char? Standard Says So™
Preserves perfect uniformity
  Requires obsessive implementers
  Uses bitwise/etc. magic, invokes URNG repeatedly (rare)
Runs fairly quickly (34% raw speed for me)
Deterministic, but not invariant
  Will vary across platforms, may vary across versions
Slide25
random_shuffle() Considered Harmful
template <typename RanIt> void random_shuffle(RanIt f, RanIt l);
  May call rand()
  C++ Standard Library, I trusted you!
template <typename RanIt, typename RNG> void random_shuffle(RanIt f, RanIt l, RNG&& r);
  Not evil, but highly inconvenient
  Knuth shuffle needs r(n) to return [0, n)
Slide26
shuffle() Considered Awesome
template <typename RanIt, typename URNG> void shuffle(RanIt f, RanIt l, URNG&& g);
Takes URNGs directly (e.g. mt19937)
Shuffles perfectly
  All permutations are equally likely
Invokes the URNG in-place (can't copy)
  Other algorithms can copy functors, like generate()
  Special exception: for_each() moves functors
Slide27
Random <random> Notes
Running mt19937 is fast, constructing/copying isn't
  Constructing/copying engines often is already undesirable
URNG/distribution function call ops are non-const
  Multiple threads cannot simultaneously call a single object
When is it safe to skip uniform_int_distribution?
  mt19937's [0, 2^32) or mt19937_64's [0, 2^64) -> [0, 2^N)
  In this case, masking is safe, simple, and efficient
  In all other cases, use uniform_int_distribution