
MONTE CARLO METHODS AND APPLICATIONS
Carnegie Mellon University | 21-387 | 15-327/627 | 15-860 | Fall 2023

LECTURE 6: SAMPLE GENERATION
“Jedenfalls bin ich überzeugt, daß der nicht würfelt.” (In any case, I am
convinced that [God] does not roll the dice.)
—ALBERT EINSTEIN, in a letter to Max Born (1926)

“God does not play dice with the universe, but I will.”
—WALTER MATTHAU, playing Einstein in the 1994 romantic comedy “I.Q.”
Overview
• Some quick comments about numerics & computation
• Random number generation
• Low-discrepancy sequences
• Sampling special distributions
• General techniques for discrete distributions
• General techniques for continuous distributions
– inversion
– rejection sampling
Numerics & Computation
Just in case you forgot… :-) the basic Monte Carlo estimator:

  Î_N := (1/N) ∑_{i=1}^N f(X_i)/p(X_i),   X_i ∼ p

Taking Large Sums
• Easy task for a computer: summing up a bunch of numbers.
• Let’s give it a try: consider the harmonic series 1 + 1/2 + 1/3 + ⋯
• Approximate by taking first N terms, for large N
• Seems to converge to about 15.4037
• Q: Anyone know what it should converge to? 😅
• A: Infinity! (diverges)
• What went wrong?

float sum = 0.f;
for( int k = 1; k <= N; k++ )
{
   sum += 1.f/(float)k;
}

   N      harmonic series partial sum
   1      1
   10     2.92897
   100    5.18738
   1k     7.48548
   10k    9.78761
   100k   12.0909
   1M     14.3574
   10M    15.4037
   100M   15.4037
   1B     15.4037
Taking Large Sums—Part II
• Easy task for a computer: summing up a bunch of
numbers.
• Suppose we now have a convergent sum, and want
to compute it in parallel on a GPU.
• The first phase is easy: if we have N terms in our
sum and M threads, just ask each thread to evaluate
N/M terms and add them up.
• On a recent GPU, M can be something like 160,000.
🤯
• So now we have 160k partial sums stored in
different places. How do we sum them up without
negating the benefit of running the first phase in
parallel?
Taking Large Products
• Easy task for a computer: multiplying a big list of numbers
• Example. Let a = 10, b = 1/10
• Q: What do you get if you multiply abababab… 500 times?
  – A: 1.0
• Q: What do you get if you multiply aaaaa… 500 times, then bbbbb… 500 times?
  – A: NaN (What does this even mean…?)

P = 1
for i in range(500):
   P *= a*b
print( P )
# output: 1.0

P = 1
for i in range(500):
   P *= a
for i in range(500):
   P *= b
print( P )
# output: NaN
Numerical & Computational Issues
• In practice, lots to consider beyond formal proofs of convergence
• Floating point
  – computers don’t actually operate on real numbers!
• Parallel implementation
  – basic Monte Carlo is “embarrassingly parallel”
  – but, can get tripped up by behavior of integrand
• In general, relationship between theory & practice is a two-way street
  – Practical obstacles may require changes to mathematical formulation
  – E.g., different sampling strategies may be better suited to different
    architectures (CPU vs. GPU vs. distributed over network vs. …)

float sum = 0.f;
for( int k = 1; k <= n; k++ )
{
   sum += 1.f/(float)k;
}
Floating Point Numbers
• How do we approximate real number computation so that it’s fast & accurate?
• Several solutions: fixed point, floating point, rational arithmetic, …
• Floating point by far the most common for Monte Carlo methods
  – fixed number of bits of storage (32/64 = “single/double precision”)
  – one bit for the sign, the exponent (b) takes some bits, and the leading
    term/“mantissa” (a) takes the rest:

    x = ±a · 2^b

• Of course, this represents only a small subset of the reals
  [figure: representable floating point numbers on the real line, and in 2D]
Errors in Floating Point
• Sequences of arithmetic operations can lose information/accuracy
• Can also go out of range
• Many common misconceptions, e.g.:
  – “dividing by small numbers is bad for accuracy” — simply not true!
• Division/multiplication ⟹ overflow/underflow
• Addition/subtraction ⟹ cancellation error (bits in smaller term ignored)
  [figure: adding a small term to a large running total — bits below the
  smallest number representable with the same exponent as the running total
  are ignored]
• Often have to be more careful about addition/subtraction
  – Monte Carlo is a big sum!
Compensated Summation
• A variety of ways to deal with cancellation in large sums
• One simple strategy: compensated summation
• Key idea: pile up “little bits” until they’re big enough to add to the sum.

float sum = 0.f; // running total
float c = 0.f;   // cancellation error from most recent addition
for( int k = 1; k <= n; k++ )
{
   float y = (1.f/(float)k) - c; // subtract previous error from this term
   float t = sum + y;            // compute a temporary sum
   c = (t - sum) - y;            // compute the cancellation error
   sum = t;                      // set the new sum to the temporary sum
}
Harmonic Series—revisited
• Let’s try our harmonic series again, this time with compensated summation
• This time, the partial sum keeps increasing, as expected 😮💨

float sum = 0.f; // running total
float c = 0.f;   // cancellation error from most recent addition
for( int k = 1; k <= n; k++ )
{
   float y = (1.f/(float)k) - c; // subtract previous error from this term
   float t = sum + y;            // compute a temporary sum
   c = (t - sum) - y;            // compute the cancellation error
   sum = t;                      // set the new sum to the temporary sum
}

   N      harmonic series partial sum
   1      1
   10     2.92897
   100    5.18738
   1k     7.48548
   10k    9.78761
   100k   12.0909
   1M     14.3927
   10M    16.6953
   100M   18.9978
   1B     21.3004
Log-sum-exp trick
• Now that we have a handle on mitigating cancellation error in large sums,
  how do we avoid underflow/overflow in large products?
• One way: carefully order multiplication so you tend to multiply “small
  terms” with “large terms”
  – e.g., sort list of terms, multiply pairs with roughly reciprocal
    magnitudes, repeat…
  – that’s a lot of effort—and only works if we know all our terms ahead of time!
• Easier way: log-sum-exp
  – take logarithm of each term
  – sum logarithms
  – exponentiate final sum
• Might lose a little accuracy—but much better than failure!

P = 1
for i in range(500):
   P *= a
for i in range(500):
   P *= b
print( P )
# output: NaN

S = 0
for i in range(500):
   S += log(a)
for i in range(500):
   S += log(b)
print( exp(S) )
# output: 1.0000000000001608
Parallel Implementation—Challenges
• At first glance, Monte Carlo seems “embarrassingly parallel”: just ask each thread/
core/node/etc., to evaluate its fair share of the terms in the sum.
• Several problems, though:

1. (Easier.) The partial sums have to be collected and aggregated into the final sum,
which means there’s some communication overhead.
2. (Harder.) Each summand may take a very different amount of work, or use
divergent memory accesses, making load balancing and cache management
challenging.
3. (Also harder.) Distributing MCMC doesn’t help with mixing time. 😞

• Some well-established solutions for, e.g., parallel sum/parallel prefix sum… the other
problems are much harder! (And problem-specific.)
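To make the “easier” problem above concrete, here is a minimal illustrative sketch (not from the slides) of combining partial sums by pairwise tree reduction, the same pattern a GPU reduction kernel follows; the function name tree_reduce and the serial loop standing in for parallel rounds are my own:

def tree_reduce(partial_sums):
    # combine M partial sums in O(log M) pairwise rounds;
    # each round could run in parallel on a GPU
    sums = list(partial_sums)
    while len(sums) > 1:
        paired = [sums[i] + sums[i+1] for i in range(0, len(sums) - 1, 2)]
        if len(sums) % 2 == 1:   # odd element carries over to the next round
            paired.append(sums[-1])
        sums = paired
    return sums[0]

# e.g., 8 threads have each produced a partial sum
print(tree_reduce([1.0, 0.5, 0.33, 0.25, 0.2, 0.17, 0.14, 0.125]))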
Random Number Generation
Random Number Generation—Overview
• Our most basic task will be to sample uniform random numbers
– canonically on interval [0,1]; can be transformed to other distributions
• Lots of questions:
– what does “random” really mean?
– do random numbers really exist?
– how can we simulate numbers that appear random?
– how do we measure/test quality of random numbers?
– what features of randomness are important for Monte Carlo?
• Will try to at least get a brief glimpse at some answers!
How do you generate (good) random numbers?

(This is probably not the way.)


Is this a good random number?

47,692

Randomness is a property of collections of numbers…


(discussion cribbed from Avi Wigderson)
Is a coin toss random?
• Suppose I have a coin that can land “heads” or
“tails” with equal probability
• Q: At best, how often can you predict which way
it will land?
• A: 50% of the time (just always guess “heads”)
• Q: Suppose I now let you take as many
measurements as you want right before I flip it,
and do as much computation as you want?
• A: Probably around 100% of the time…
Key idea: randomness is not an objective property—
depends on how much information and
computational power you have.
Randomness & Determinism
One perspective: if all phenomena in the universe follow deterministic
laws (e.g., Newton’s laws of motion) then with enough computational
power you can always predict exactly what will happen.

If everything is predictable, can anything be truly “random”?


Quantum Indeterminacy & Randomness
Another perspective: aren’t some (e.g., quantum mechanical) phenomena in
the universe inherently random?

…maybe? But what if I have quantum information & quantum computation?


Answer: I don’t know—ask Ryan O’Donnell!
Honestly, this stuff does not matter for Monte Carlo*.

We don’t need to simulate “perfect” random numbers!

We merely want to take advantage of useful properties


of random numbers.

(But which ones?)

*…but if you like this kind of thing, there’s a great Abel Prize lecture by Avi Wigderson
Good vs. Bad Random Numbers
• Does the following list of numbers seem like a “good” or “bad” random
sequence for use in Monte Carlo methods?
– 2, 5, 8, 31, 32, 36, 54, 56, 59, 60, 68, 71, 72, 78, 82, 89
• How about this one—is this a “better” or “worse” sequence?
– 31, 36, 8, 5, 82, 89, 72, 60, 54, 2, 71, 32, 56, 78, 68, 59
• If we use these numbers to generate sample points for Monte Carlo
  integration, does order even matter?
  – or just distribution over the sample space?
• Key idea: “quality” of random numbers depends on what you want to use
  them for!
Good vs. Bad Random Numbers
Are these “good” random numbers / points?

How can you tell which ones are best suited for Monte Carlo?
“Deterministic random numbers…?”

“Any one who considers arithmetical methods of producing


random digits is, of course, in a state of sin.”

— JOHN VON NEUMANN (1951)


Pseudorandom Number Generation
• A pseudorandom number generator (PRNG) is a deterministic algorithm for
  producing a sequence of numbers x1, x2, … that “behave like” uniform
  random numbers
  – e.g., iterative application of function F(x) to initial “seed” x1
    (seed is user-provided, or based on time, “entropy” like cursor motion…)
• Many criteria we might care about for Monte Carlo methods:
  – statistical randomness — want algorithms to match theory
  – long periods — once a PRNG repeats, subsequent integration is useless!
    (Note: all PRNGs have a finite period, because they have finite internal state!)
  – cheap to compute — we need a lot of them!
  – parallel “streams” — doesn’t help if all threads compute same thing!
  – repeatability — especially useful for debugging :-)
  – computationally hard to predict — that’s cryptography, not Monte Carlo!
Measures of Randomness
• Many possible ways to measure “randomness”
• E.g., statistical tests:
  – (1969) Knuth — Art of Computer Programming
  – (1996) DIEHARD battery of tests
  – (2007) TestU01 library
• Basically checks whether PRNG agrees with established (and usually
  arcane…) facts about behavior of random numbers
  – e.g., gaps between random numbers on an interval should follow
    exponential distribution
  – e.g., number of random points that land in regular grid cells should
    follow multinomial distribution
  [figure: point sets from a “good” PRNG vs. a “bad” PRNG]
The dark past of random number generators…

“…in the late 1960s, a truly horrible random number generator called RANDU
was commonly used on most of the world’s computers…”
—Donald Knuth, The Art of Computer Programming (Vol. 2)

“Monte Carlo simulations on [a model] for which the exact answers are known
have shown that ostensibly high-quality random number generators may lead to
subtle, but dramatic systematic errors … these results offer a stern warning
… a specific algorithm must be tested together with the random number
generator being used, regardless of the tests which the generator has passed.”
—Ferrenberg, Landau, & Wong (1992)
First Random Number Generator
• Earliest PRNG was perhaps suggested by John von Neumann:
  – square the current number; take the middle 10 digits; repeat
• At first glance, looks ok… [figure: distribution of digits; pairs as points]
• Big problem: when we hit “0”, just repeats “0” forever!
• More generally: many ad-hoc schemes get stuck in short cycles
• Wisdom: when designing a random number generator, don’t just try
  something at random!
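As a quick illustration (my own sketch, using 4-digit numbers instead of the 10 digits mentioned above), von Neumann’s middle-square rule is easy to write down, and just as easy to watch get stuck:

# middle-square PRNG sketch: square the state, keep the middle 4 digits
def middle_square(seed, count):
    x = seed
    sequence = []
    for _ in range(count):
        x = (x * x // 100) % 10000   # drop the last 2 digits, keep the middle 4
        sequence.append(x)
    return sequence

print(middle_square(1234, 8))   # wanders for a while...
print(middle_square(2500, 4))   # ...but 2500 maps straight back to 2500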
Linear Congruential Generators (LCG)
• A linear congruential generator is a sequence of numbers obtained by
  iteratively applying a linear map & taking the modulus
  – x_{k+1} = mod(a x_k + b, m)
  – a, b, m are integers, with a, b < m
• Surprisingly easy and common way to generate “ok but not great” quality
  pseudorandom numbers (e.g., used historically for rand() in many
  mainstream systems—like the gcc C++ compiler!)
  – need to carefully choose a, b, m
  – various conditions (e.g., Hull–Dobell theorem) ensure good behavior
• Can you tell which of the two below is a “high quality” PRNG?
  [figure: point plots for an LCG with “bad” parameters a = 123, b = 456,
  m = 123456 (repeats after 320 points!); an LCG with a = 48271, b = 0,
  m = 2^31 − 1 (provided in C++11!); and the Mersenne Twister]
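As a minimal sketch (illustration only, not a library-quality generator), here is an LCG using the a = 48271, b = 0, m = 2^31 − 1 parameters listed above:

# x_{k+1} = (a*x_k + b) mod m, mapped to [0,1) by dividing by m
class LCG:
    def __init__(self, seed=1, a=48271, b=0, m=2**31 - 1):
        self.x, self.a, self.b, self.m = seed, a, b, m
    def next_int(self):
        self.x = (self.a * self.x + self.b) % self.m
        return self.x
    def next_float(self):
        return self.next_int() / self.m

g = LCG(seed=42)
print([round(g.next_float(), 4) for _ in range(5)])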
Digits of π
• Already used random numbers to compute π—how about using π to compute
  random numbers? 😁
• Conjecture: π is a “normal number,” i.e., for every k ≥ 1, every string of
  digits of length k appears with equal frequency (in any base b)
• Bonus: infinite period/never repeats (π is irrational)
  – caveat: must still store index of digit (finite memory)
• Downsides:
  – (much) more expensive than other PRNGs
  – loading precomputed values is not cache friendly
  – fun idea, but there are better ways…
  [figure: digits of π]
Mersenne Twister
• Mersenne Twister often cited as a “good” PRNG since:
  – it’s found in many popular languages (Python, C++, R, PHP, …)
  – it has a cool name
• Q: Ok, but how does it work?
• A: There’s a truly marvelous description, which this box is too small to contain
• On the other hand:
  – it’s not particularly fast
  – it fails more recent statistical tests of randomness (e.g., some from TestU01)
  – for unlucky seeds, can be a long “burn in” time before achieving good
    statistical randomness
  – bad for parallel Monte Carlo, since using different seeds on each thread
    still yields correlation
See: Vigna, “It Is High Time We Let Go Of The Mersenne Twister” (2019)
Permuted Congruential Generator
• A permuted congruential generator (PCG) is the most recent hope for a
  “good” PRNG
  – passes all the statistical tests
  – fast, long periods, works well in parallel…
• How does it work?
  – Basically just an LCG! mod(ax + b, m)
  – But with a twist: ordinary LCG exhibits short period in low-order bits.
    So, use high-order bits to “shuffle” low-order bits
  – shuffling means things like XOR or rotating bits
• Python: numpy.random.PCG64

complete implementation (C++):

#include <stdint.h>

uint64_t state = 0x4d595df4d0f33173;
uint64_t const multiplier = 6364136223846793005u;
uint64_t const increment = 1442695040888963407u;

uint32_t rotr32(uint32_t x, unsigned r) {
   return x >> r | x << (-r & 31);
}

uint32_t pcg32(void) {
   uint64_t x = state;
   unsigned count = (unsigned)(x >> 59);
   state = x * multiplier + increment;
   x ^= x >> 18;
   return rotr32((uint32_t)(x >> 27), count);
}

void pcg32_init(uint64_t seed) {
   state = seed + increment;
   (void)pcg32();
}
Comparison of PRNGs
From https://www.pcg-random.org/

Take with a grain of salt—“marketing material” for PCG!


Can AI do better?
• A: I don’t know… maybe?
• Here’s what I get if I ask ChatGPT for random numbers:
  “Please give me 64 two-dimensional points sampled from the uniform
  distribution on the unit square.”
Stratified Random Numbers
• We discussed stratified sampling as method of variance reduction
• Easy win: why not “bake in” (1D) stratification to random number
  generator itself?
• Caveat: to avoid bias, must always request & use numbers in batches of a
  known size
  – e.g., initialize PRNG with a maximum sample size, and stop stratifying
    after this number has been reached
• Not something you’d ever do for cryptographic applications… great for
  Monte Carlo! :-)
  [figure: stratified samples of f(x) on an interval]
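One possible way to “bake in” 1D stratification, as a hedged sketch: the sampler below assumes the batch size N is known up front and hands out one jittered sample per stratum (the class name and the use of Python’s random module are my own choices):

import random

class StratifiedSampler:
    # the i-th of N samples is uniform on the stratum [i/N, (i+1)/N);
    # only unbiased if the whole batch of N samples is actually used
    def __init__(self, N):
        self.N, self.i = N, 0
    def next(self):
        u = (self.i + random.random()) / self.N
        self.i = (self.i + 1) % self.N   # wrap after N draws
        return u

s = StratifiedSampler(8)
print(sorted(s.next() for _ in range(8)))   # one sample in each of 8 strata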
Low-Discrepancy Sequences
Low-Discrepancy Sequences
• Also discussed quasi-Monte Carlo when discussing variance reduction
• Here, replace random numbers with low-discrepancy sequences
– e.g., Halton or Hammersley sequences
• Algorithmically, generation of low-discrepancy points not much different
from pseudorandom number generators (even similar cost)
– goal is different: achieve low discrepancy, rather than modeling uniform
distribution
• How do we actually generate these points algorithmically?
(see Owen chapter on Quasi Monte Carlo for more in-depth treatment)
Review: Discrepancy
• What makes a set of sample points P_N := {p1, …, pN} “good”?
• One feature we might like: number of points covered by a “random” region
  is proportional to the area of the region.
• Several ways to formalize—e.g., for points P in the unit cube C_d := [0,1]^d,

  D(P_N) := sup_{B∈J} | #(B ∩ P_N)/N − vol(B) |

  for some family of subsets J of C_d
  – e.g., J could be all intervals, half intervals, …
  [figure: a box B covering some of the points p_i in the unit square]
Low Discrepancy Sequences—Warm Up
• How can we deterministically generate “well-spaced” points on [0,1]?
• One idea: just subdivide [0,1] into equal size intervals (midpoint rule!)
  – unfortunately not progressive…
• Instead, suppose we recursively subdivide the interval into pieces:
  “fill in the largest remaining gap”
  [figure: insertion order 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
• How do we do this algorithmically?
Radical Inverse
• Many low-discrepancy sequences built up using radical inverse ϕ_b(i), also
  known (more or less…) as the van der Corput sequence
  – pick a base b, “reflect” digits of integer i across the decimal place
  – more precisely, if i = ∑_{k=0}^∞ i_k b^k, then ϕ_b(i) := ∑_{k=0}^∞ i_k b^{−1−k}
• Example. Consider integer i = 6. In base 2 (binary), i has digits 110, i.e.,
  two plus four. Radical inverse in binary is 0.011. In decimal, then, the
  base 2 radical inverse of 6 is one fourth plus one eighth, or 1/4 + 1/8 = 3/8
  [figure: digits 110 with place values 4, 2, 1 reflected by ϕ_b into 0.011
  with place values 1/2, 1/4, 1/8]
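A short sketch of the radical inverse (the function name is mine), which reproduces the worked example above, ϕ₂(6) = 3/8:

def radical_inverse(i, b):
    # mirror the base-b digits of i across the radix point
    inv, f = 0.0, 1.0 / b
    while i > 0:
        inv += (i % b) * f   # digit i_k contributes i_k * b^(-1-k)
        i //= b
        f /= b
    return inv

print(radical_inverse(6, 2))   # 0.375 = 3/8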
Halton Sequence
• Halton sequence is sequence of d-dimensional points
  x_i := (ϕ_{b1}(i), …, ϕ_{bd}(i)) where bases b_k are relatively prime
  (avoids correlation between dimensions)
  – most typically: b1, …, bd are just the first d primes
  – e.g., in 2D x_i = (ϕ_2(i), ϕ_3(i))
  [figure: 2D point sets for bases (2,2), (2,3), (2,4), (2,5)]
• Halton sequences are naturally stratified
  – E.g., in 2D if we generate N = 2^a 3^b Halton points, they will fall into
    x-intervals of size 1/2^a and y-intervals of size 1/3^b.
  [figure: stratification for (a,b) = (2,1), (3,2), (5,3)]
Hammersley Sequence
• Hammersley sequence also builds low-discrepancy sample points x_1, …, x_n
  from the radical inverse
  – in particular, x_{i1} := (i − 1)/n, and x_{ij} := ϕ_{p_{j−1}}(i − 1) for
    the remaining j, where p_j is the jth prime number
  – achieves lower discrepancy than Halton
  – …but, requires that total number of samples n be fixed in advance
  [figure: Halton vs. Hammersley point sets]
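A small self-contained sketch of 2D Halton and Hammersley points built from the radical inverse (repeated here so the snippet stands alone; the function names are my own):

def radical_inverse(i, b):
    inv, f = 0.0, 1.0 / b
    while i > 0:
        inv += (i % b) * f
        i //= b
        f /= b
    return inv

def halton_2d(n):
    # i-th point uses the relatively prime bases 2 and 3
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, n + 1)]

def hammersley_2d(n):
    # first coordinate (i-1)/n, so the total count n must be fixed in advance
    return [((i - 1) / n, radical_inverse(i - 1, 2)) for i in range(1, n + 1)]

print(halton_2d(4))
print(hammersley_2d(4))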
Special Distributions
Special Distributions—Overview
• Many continuous distributions beyond uniform come up repeatedly:
– normal, exponential, …
– or uniform distribution on other domains (sphere, simplex, …)
• Well-known formulas for sampling from these
• Basic approach:
– sample uniform random numbers
– apply closed-form expression that “warps” uniform to nonuniform
• Can more generally apply warping to any source of randomness
– stratified, low-discrepancy, …
• Also need to sample discrete distributions (integers, permutations…)
Sampling Integers
• Q: Suppose I have a random number generator rand() that provides integers
  uniformly sampled from the interval [0, M]. How can I get a random integer
  in the range [0, a), where a < M?
• A: Just compute rand()%a. — wrong
• No! “Obvious” strategy doesn’t sample uniformly.
• Q: Why not? :-)
• A: Consider the case M = 10, a = 7…
  0%7 = 0   5%7 = 5
  1%7 = 1   6%7 = 6
  2%7 = 2   7%7 = 0
  3%7 = 3   8%7 = 1
  4%7 = 4   9%7 = 2
  (0, 1, 2 appear twice as often!)
• Q: How can we get it right? (two-word answer)
• A: Rejection sampling. — right
  (Or just call a (trusted!) library function.)

# uniformly sample integer from the range [0,a-1],
# assuming rand() gives uniform values in [0,M]
def uniformRand( a ):
   m = a * floor(M/a)
   while True:
      x = rand()
      if x < m:
         return x % a
Sampling Real Values
• As you might imagine, you also have to
think carefully about numerical issues
when sampling other distributions (e.g.,
floating point values in [0,1]).
• Lots of stuff written on how to do it
right…
• Rule of thumb: trust, but verify!

“The conventional method for generating pseudorandom floating point values is to divide a
pseudorandom integer by a floating-point constant. The problems with this approach are (1) the
possible outcomes are only about 7% of representable floating point values (2) subsequent
transformations may not yield the expected distributions (e.g., applying a log transformation to a
uniform distribution does not yield an exponential distribution).”
Sampling a Random Permutation
• For many algorithms, need to sample random permutations
  – e.g., Latin hypercube sampling
• Recall that a permutation of a discrete set Ω is a bijection η : Ω → Ω
  – intuitively, a “shuffling” of the elements
  [figure: a permutation η of Ω = {1, 2, 3, 4, 5}]
• For a set with n elements, there are n! distinct permutations
• Q: How do we sample permutations uniformly at random, i.e., such that we
  have a 1/n! chance of sampling each distinct permutation?
• Short answer: draw “tiles” 1–n randomly from a bag until none remain!
Slow Shuffle
• One algorithm: starting with the list Ω = (1,…, n), pick a number k
  uniformly from 1,…, |Ω|; remove the kth element from Ω; repeat until Ω
  is empty.
  [figure: removing elements of (1, 2, 3, 4, 5) at k = 3, 2, 3, 1, 1 yields
  the permutation (3, 2, 5, 1, 4)]
• Q: How much does this algorithm cost?
• A: To locate/remove the kth element takes O(n) time; we repeat this
  process n times ⟹ O(n²)
  – same cost for array, linked list, …
• Q: How can we do better?
Fisher-Yates Shuffle
• Fisher-Yates shuffle*:
  – start with list Ω = (1,…, n)
  – iterate i from left to right
  – swap element i with element j uniformly sampled from i, …, n
  [figure: successive swaps turn (1, 2, 3, 4, 5) into (4, 3, 5, 1, 2)]
• Q: What’s the cost?
• A: O(n): one swap, one call to PRNG per iteration
• Q: Why does this generate uniformly sampled permutations?
• A: Each time, we uniformly sample from the remaining values (same as
  “slow shuffle”)
*Warning: assuming 1-based indexing!
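A minimal 0-indexed Python version of the shuffle described above (the slide’s description assumes 1-based indexing):

import random

def fisher_yates(items):
    # O(n): one swap and one PRNG call per element
    a = list(items)
    n = len(a)
    for i in range(n - 1):
        j = random.randint(i, n - 1)   # uniform over remaining positions i..n-1
        a[i], a[j] = a[j], a[i]
    return a

print(fisher_yates([1, 2, 3, 4, 5]))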
Sampling Continuous Distributions
One approach to generating samples from a continuous distribution:
– sample points from the uniform distribution
– apply “warping” function to get target distribution
Warping other collections of samples
Can of course apply warping to any other collection of samples:
  [figure: uniform, stratified, Halton, and blue noise samples before and
  after warping]
May or may not preserve strengths of a given sampling strategy
(e.g., points may no longer be nicely stratified after warping).
Review: Normal Distribution
A random variable X on the real line (Ω = ℝ) is normally distributed with
mean µ and standard deviation σ if its probability density function is a
Gaussian of the form

  p(x) = 1/(σ√(2π)) · exp( −(x − µ)² / (2σ²) )

(the leading factor ensures it integrates to 1; easily generalizes to n
dimensions)
Sampling from the Normal Distribution
The Box-Muller transformation provides a simple way to warp uniformly-
distributed samples (u, v) ∈ [0,1]² into normally-distributed ones:

  (x, y) = √(−2 ln v) · (cos(2πu), sin(2πu))

(each component follows a 1D normal distribution)
Box-Muller Distribution
• Box-Muller gives samples from the standard normal distribution, with mean
  μ = 0 and standard deviation σ = 1
  – to get other distributions, we can just scale & shift

def SampleNormal( μ, σ ):
   u = uniform(0,1) * 2*pi
   v = uniform(0,1)
   return σ*sqrt(-2*log(v))*cos(u) + μ

  [figure: histogram of 10k samples, for μ=5, σ=2]
Multivariate Normal Distribution
• In 2D, the standard multivariate normal distribution is given by

  p(x, y) = (1/2π) · exp( −(x² + y²)/2 )

• Q: How can we sample from it?
• A: Sample x, y from two independent 1D standard normal distributions
• Q: Why does this work?
• A: One way to see it: the marginal distribution with respect to both x and
  y is the standard 1D normal distribution:

  ∫_{−∞}^{∞} p(x, y) dy = (1/√(2π)) · exp( −x²/2 )
  ∫_{−∞}^{∞} p(x, y) dx = (1/√(2π)) · exp( −y²/2 )

• To sample a d-dimensional distribution, just sample X1, …, Xd ∼ 𝒩(0,1)
Sampling the Unit Circle
• Plenty of other standard / simple distributions we need to sample from
• Q: Easy example: how do we draw samples uniformly from the unit circle
  S¹ := {x ∈ ℝ² : ∥x∥ = 1}?
• A: Just sample a value u uniformly from [-1,1], then evaluate
  (cos(πu), sin(πu))
• Q: How do you sample from [-1,1]? :-)
• A: Sample [0,1], multiply by 2, subtract 1
Sampling a 2D disk
• How do we get uniform samples on the unit disk D² := {x ∈ ℝ² : ∥x∥ ≤ 1}?
• First idea: uniformly sample r, u from [0,1], then plug in to
  r(cos(2πu), sin(2πu))
• Doesn’t work! (nonuniform—too many samples near the origin…)
• Simple modification: √u (cos(2πv), sin(2πv)) for u, v ∼ U[0,1]
  – we’ll derive this formula later!
Sampling a d-Dimensional Sphere
• Q: How do we uniformly sample the d-dimensional sphere
  S^d := {x ∈ ℝ^{d+1} : ∥x∥ = 1}?
  – Hint: since the L2 norm √(x1² + ⋯ + x_{d+1}²) doesn’t depend on the
    choice of coordinate system, a multidimensional Gaussian e^{−∥x∥²}
    exhibits spherical symmetry
• A: Just generate d+1 normally-distributed coordinates
  x1, …, x_{d+1} ∼ 𝒩(0,1) and normalize them (y := x/∥x∥).
Sampling a d-Dimensional Ball
• First sample a point y uniformly from the d-dimensional sphere
• Then draw a value a uniformly from [0,1]
• Uniform point on the ball is then a^{1/d} y
• Food for thought: can you derive this formula (using the techniques we’ll
  see later)?
Sampling a Random Rotation
• In general, normal distribution is starting point for sampling many other
  basic distributions
• E.g., to sample a d-dimensional rotation matrix Q (QᵀQ = I, det(Q) > 0):
  – sample a d × d matrix A where each entry is sampled from the standard
    (1D) normal distribution
  – compute the spectral decomposition A = QΛQᵀ
  – then Q is a matrix uniformly sampled* from the rotation group
*with respect to the Haar measure
Beyond Basic Distributions
Lots of neat tricks we can play to sample basic distributions…
…but just like integrals, there are only so many distributions we can sample
in closed form.
How can we handle more general distributions p(x)?
Discrete Distributions
Sampling Discrete Distributions
• Consider a discrete random variable X with outcomes x1, …, xn, and
  associated probabilities
  – p1, …, pn ∈ [0,1]
  – p1 + ⋯ + pn = 1
  [figure: bar chart of probabilities 0.29, 0.22, 0.13, 0.14, 0.11, 0.11]
• How do we construct an algorithmic procedure that models this random
  variable? I.e., probability of sampling xi is pi
  – repeatedly running the procedure N times yields each outcome xi with
    fraction approaching pi as N → ∞
• Several methods
  – inversion, rejection, resampling, alias table…
Review: Cumulative Distribution
For any discrete random variable X, the cumulative distribution function
P(x) is the monotonically increasing function given by P(x) := Pr(X ≤ x);
each cumulative probability P_k is just the sum of the first k
probabilities p_i.
  [figure: a CDF rising from 0 to 1 over outcomes 2 through 12]
Computing the CDF
• Want to compute P_k := ∑_{i=1}^k p_i for k ∈ 1,…, n
• Q: What’s the asymptotic cost of evaluating all n sums directly?
• A: Have to sum one term for the 1st sum, two terms for the 2nd…
  – 1 + 2 + ⋯ + n = (n² + n)/2 ⟹ O(n²)
• Q: Can you do it faster, in O(n) time?
• A: Sure, just store the “partial sums”
  – P1 = p1, P2 = P1 + p2, …
  – known as cumulative sum/prefix sum/scan
  [figure: stacking p1, …, p6 one at a time to build P1, …, P6 = 1]
• Q: Can you do it even faster, in O(log n) time?
• A: A parallel scan can build the CDF in O(log n) time
Inverting the CDF
Suppose you’ve already computed the CDF (array with values P1, …, Pn).
Q: How do you draw samples from the discrete distribution p?
A: Two steps:
  – sample u uniformly from [0,1]
  – find the first k such that Pk > u
(Then just return event xk)
Q: Why does this work?
A: Interval [0,1] is split into pieces proportional to probabilities pk
  [figure: a uniform value u on [0,1] landing in the piece of height pk
  between Pk−1 and Pk]
Cost of Inversion Method
• Q: Given u, what’s the cost of locating the corresponding event?
– O(n): at worst, have to search whole list
• Q: How can you do better?
– use binary search O(log n) per sample
• Q: What’s the overall cost to take a single sample?
– O(n log n + log n) = O(n log n)
– But since we need only build CDF table once, cost is shared by all
subsequent samples
– In particular, for N ≫ n, approach amortized cost of O(log n) per sample
If distribution is large (n big), or need just a few samples
(N small), might want to take a different approach…
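Putting the last few slides together, a sketch of the full discrete inversion pipeline: build the CDF once with a prefix sum, then draw each sample with a binary search (the use of Python’s bisect and accumulate is my own choice):

import random
from bisect import bisect_right
from itertools import accumulate

def build_cdf(p):
    return list(accumulate(p))        # O(n) prefix sum: P_k = p_1 + ... + p_k

def sample_discrete(cdf):
    u = random.random()
    return bisect_right(cdf, u)       # first index k with P_k > u, in O(log n)

p = [0.1, 0.3, 0.4, 0.2]
cdf = build_cdf(p)
counts = [0] * len(p)
for _ in range(100000):
    counts[sample_discrete(cdf)] += 1
print([c / 100000 for c in counts])   # should approach p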
Alias Table
• Can get O(1) amortized sampling via alias table
• Construct via Robin Hood algorithm:
  – first compute the mean probability p̄ := (p1 + ⋯ + pn)/n
  – anybody above p̄ is “rich”, anyone below is “poor”, anyone equal is “fair”
  – iterate over events from left-to-right
  – then just rob from the rich, give to the poor…
  – keep track of who you robbed from
  – Note: rich can become poor!
  [figure: columns of height pk, each topped up to the mean p̄ using mass
  taken from a “rich” column]
• Key idea: since area covered by region k is proportional to pk, can just
  sample uniformly from this rectangle.
• To sample from alias table:
  – sample index k uniformly from 1, …, n
  – use biased coin flip to pick event in kth column
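For reference, a compact sketch of alias-table construction and sampling in the “rob from the rich” spirit above (this particular bookkeeping follows Vose’s variant; helper names are mine):

import random

def build_alias_table(p):
    n = len(p)
    scaled = [n * pi for pi in p]                  # compare against the mean probability
    prob, alias = [0.0] * n, [0] * n
    poor = [i for i, s in enumerate(scaled) if s < 1.0]
    rich = [i for i, s in enumerate(scaled) if s >= 1.0]
    while poor and rich:
        p_i, r_i = poor.pop(), rich.pop()
        prob[p_i], alias[p_i] = scaled[p_i], r_i   # top up a poor column from a rich one
        scaled[r_i] -= 1.0 - scaled[p_i]           # note: the rich donor can become poor!
        (poor if scaled[r_i] < 1.0 else rich).append(r_i)
    for i in poor + rich:                          # leftovers are exactly "fair"
        prob[i] = 1.0
    return prob, alias

def sample_alias(prob, alias):
    k = random.randrange(len(prob))                        # uniform column
    return k if random.random() < prob[k] else alias[k]    # biased coin flip

prob, alias = build_alias_table([0.1, 0.3, 0.4, 0.2])
counts = [0, 0, 0, 0]
for _ in range(100000):
    counts[sample_alias(prob, alias)] += 1
print([c / 100000 for c in counts])   # O(1) per sample, approaches the target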
Inversion
Inversion Method—Overview
• How do we get formulas for special distributions (normal, etc.)?
• Can be via continuous version of the inversion method
• Inversion method for continuous distributions:
  – start with probability distribution p(x) we want to sample from
  – construct the associated cumulative distribution P(x) via integration
  – find an explicit form for the inverse CDF, P⁻¹(y)
  – apply P⁻¹ to uniformly-distributed samples
• Basic idea is to “warp” uniform samples into nonuniform ones
Inversion Method—1D
• More explicitly, consider a probability density p : [a, b] → [0,1]
• Corresponding cumulative distribution function given by
  P(x) := ∫_a^x p(y) dy
• The random variable Y := P⁻¹(U), for U ∼ U[0,1], then has probability
  density p
  – i.e., draw samples uniformly, then apply inverse CDF
• Quite rare to be able to do everything in closed form!
Inversion Method—Example
Consider the probability density p(x) := (π/2) sin(πx) on x ∈ [0,1]

Step I: integrate PDF to get the CDF

  P(x) = ∫_0^x (π/2) sin(πy) dy = [−(1/2) cos(πy)]_0^x = −(1/2) cos(πx) + 1/2

Step II: invert the CDF

  y = (1/2)(1 − cos(πx))
  ⟺ cos(πx) = 1 − 2y
  ⟺ x = arccos(1 − 2y)/π

Now just plug in random values y ∼ U[0,1]
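Turning the example above into code, a brief sketch: draw y ∼ U[0,1] and push it through the inverse CDF x = arccos(1 − 2y)/π:

import math, random

def sample_sin_density():
    # samples p(x) = (pi/2) sin(pi x) on [0,1] via the inverse CDF derived above
    y = random.random()
    return math.acos(1.0 - 2.0 * y) / math.pi

samples = [sample_sin_density() for _ in range(100000)]
print(sum(samples) / len(samples))   # ~0.5, since the density is symmetric about 1/2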
Inversion Method—2D
Can extend inversion method to multiple dimensions, via the multivariable
CDF. E.g., to sample from a two-dimensional region Ω ⊂ ℝ²:

Step I: define a map to the region
  ϕ : [0,1]² → Ω

Step II: measure the change in density
  ρ(u, v) := |det Jϕ(u, v)|

Step III: define 2D cumulative distributions
  F(s) := ∫_0^s ∫_0^1 ρ(u, v) du dv / ∫_0^1 ∫_0^1 ρ(u, v) du dv
    (what fraction of target area is covered by 2D rectangle ϕ([0,s] × [0,1])?)
  G_s(t) := ∫_0^t ρ(s, v) dv / ∫_0^1 ρ(s, v) dv
    (in “column” s, what fraction is covered by 1D interval [0,t]?)

Step IV: invert the distributions
  f(u) := F⁻¹(u)
  g(u, v) := G_u⁻¹(v)

Then “warp” uniform samples: ϕ(f(u), g(u, v)), u, v ∼ U[0,1]

see: Arvo, “Stratified Sampling of 2-Manifolds” (2001)
Inversion Method—2D Disk
Example. Consider the 2D unit disk Ω := {x ∈ ℝ² : ∥x∥ ≤ 1}.

Step I. Define map from unit square (u, v) ∈ [0,1]² to unit disk:
  ϕ(u, v) = v(cos(2πu), sin(2πu))

Step II. Calculate change in density due to “squashing”:
  Jϕ = [ −2πv sin(2πu)   cos(2πu) ]
       [  2πv cos(2πu)   sin(2πu) ]
  ρ(u, v) = |det Jϕ| = 2πv

Step III. Integrate CDFs:
  F(s) = s
  G_s(t) = t²

Step IV. Invert CDFs:
  f(u) = u
  g(u, v) = √v

Step V. Compose with mapping:
  ϕ(f(u), g(u, v)) = √v (cos(2πu), sin(2πu))

  [figure: points under the naive warp ϕ(u, v) vs. the corrected warp
  ϕ(f(u), g(u, v))]
Inversion Method—2D Disk
  [figure: nonuniform warping ϕ, uniform warping ϕ∘η, reparameterization η]
Q: Does this story remind you of anything?
A: It’s what we ideally want to do with importance sampling! Maybe gives a
sense of why that’s hard to do…
Rejection Sampling
Rejection Sampling—Overview
• Inversion method is great when it works…
– …can’t always invert & integrate a distribution!
• In general, will need more universal techniques
• Rejection sampling
– stupid but simple
– essentially just (repeatedly) “guess and check”
– example of Las Vegas algorithm: always yields
correct result, but run time is nondeterministic
Rejection Sampling—Region
• Suppose we want to rejection sample a region Ω ⊂ ℝ²
  – only requirement: must be able to answer the query “is a point x inside Ω?”
• Let A ⊃ Ω be a region that we know how to (efficiently) sample uniformly
  – e.g., a rectangle bounding Ω
• Until we find a point x ∈ Ω:
  – draw a point x ∼ U_A
Example—Unit Disk
• Suppose our region is the unit disk Ω := {x ∈ ℝ² : ∥x∥ ≤ 1}
  [figure: Ω inscribed in the bounding box A = [−1,1]², with corner (1,1)]
• Q: If I have a function rand() that gives uniform values in [0,1], how do
  I uniformly sample the (smallest) box around Ω?
• A: Evaluate u,v = [2*rand()-1, 2*rand()-1]
• Q: How do I check if this point is in the region Ω?
• A: Check if u*u + v*v <= 1
Rejection Sampling—Distribution
• Rather than a region Ω, suppose our goal is to draw samples from a given
  probability distribution p : Ω → [0,1]
  – must be able to sample Ω uniformly
  – must have upper bound pmax on p
• Until we pass the test:
  – sample a point x ∼ U_Ω
  – sample a value u ∼ U[0,pmax]
  – test whether u < p(x)
• “Taller” regions are more likely to be sampled
• Basically just applying “region rejection sampling” to the region under
  the graph!
  [figure: proposed samples (x, u) under the graph of p are accepted; those
  above it are rejected]
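A sketch of this loop for the same density p(x) = (π/2) sin(πx) used in the inversion example, whose maximum on [0,1] is pmax = π/2 (function names are mine):

import math, random

def p(x):
    return 0.5 * math.pi * math.sin(math.pi * x)   # target density on [0,1]

p_max = 0.5 * math.pi                              # upper bound on p

def rejection_sample():
    while True:
        x = random.random()                        # uniform point in the domain
        u = random.uniform(0.0, p_max)             # uniform height
        if u < p(x):                               # accept points under the graph
            return x

print(sum(rejection_sample() for _ in range(100000)) / 100000)   # ~0.5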
Rejection Sampling—Examples
For which of these examples will rejection sampling be least efficient?
  [figure: three regions Ω inside bounding regions A]
For which of these examples will rejection sampling be most efficient?
  [figure: two densities p(x) on [0,1] with upper bound pmax]
Analysis of Rejection Sampling
• How many trials should we expect rejection sampling to take on average?
  – each trial costs us more PRNG invocations
• Q: What’s probability p1 that we accept the 1st sample?
• A: p1 = |Ω| / |A|
• Q: What’s the probability we accept the 2nd sample?
• A: p2 = (1 − p1)p1
• Q: What’s the probability we accept the kth sample?
• A: pk = (1 − p1)^{k−1} p1
• Q: What’s the expected number of trials?
• A: ∑_{k=1}^∞ k pk = ∑_{k=1}^∞ k(1 − p1)^{k−1} p1 = 1/p1
  – e.g., for the unit disk: 1/(π/4) ≈ 1.27
Disk—Rejection Sampling vs. Inversion
• Earlier, saw how to sample from the disk using inversion
• Which approach is better: rejection or inversion?
• Rejection
  – evaluate x² + y² multiple times (~1.27 on average)
  – call PRNG multiple times (~2 × 1.27 = 2.54 on average)
• Inversion
  – evaluate square root, cosine, and sine once
  – call PRNG twice (radius and angle)
• Winner: …depends on a lot of factors!
  – speed of PRNG, cost of special functions, multithreading…
• Universal rule: profile, profile, profile!
  – i.e., which gives best results for actual application?
  [figure: points generated by inversion vs. rejection]
Rejection Sampling the n-Dimensional Ball
• Still, overall, rejection sampling is looking pretty attractive for 2D disk
• But what if we go up in dimension?
• Fraction of d-dimensional box covered by d-dimensional ball goes up… then
  back down!
• Hence, number of trials goes up dramatically
  [figure: ball-to-box volume ratio and expected number of trials as a
  function of dimension d]
• In general there is no silver bullet! Must think about your problem, pick
  the right tool.
Resampling
Importance Resampling
• Suppose we want samples from a “tough” distribution p(x) but:
  – we don’t know how to invert it analytically
  – we don’t know how to rejection sample it efficiently
• Importance resampling combines discrete & continuous strategies:
  – take samples of “easy” continuous distribution
  – construct discrete distribution from these samples
  – sample from discrete distribution
• PROS: always works, trivial to implement, just as good as sampling from
  p(x) directly (e.g., in terms of variance reduction)
• CONS: adds a bit of bias to any resulting estimator, due to
  discretization error
Importance Resampling
• Suppose we want to generate N samples Xi from a distribution p : Ω → [0,1]
• Let q : Ω → [0,1] be a distribution we can (efficiently) draw samples from
  – ideally, q should approximate p
• Step I. Draw M ≫ N samples Yi from q
• Step II. Compute weights wi = p(Yi)/q(Yi)
• Step III. Draw N samples Xi from the discrete distribution with events Yi
  and normalized weights Wi := wi / ∑j wj.
  [figure: target p(x) and proposal q(x); sample weights; histograms of the
  resampled X versus the proposal samples Y]
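A sketch of Steps I–III above (often called sampling-importance-resampling), using a uniform proposal q on [0,1] to approximate the same p(x) = (π/2) sin(πx); the choice of q and the function names are mine:

import math, random

def p(x): return 0.5 * math.pi * math.sin(math.pi * x)   # "tough" target density
def q(x): return 1.0                                     # "easy" proposal: U[0,1]

def importance_resample(N, M):
    Y = [random.random() for _ in range(M)]       # Step I: draw M >> N proposal samples
    w = [p(y) / q(y) for y in Y]                  # Step II: importance weights
    total = sum(w)
    W = [wi / total for wi in w]                  # normalized weights
    return random.choices(Y, weights=W, k=N)      # Step III: resample the discrete distribution

X = importance_resample(N=1000, M=20000)
print(sum(X) / len(X))   # ~0.5, approximately following p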
Summary
Takeaway—Monte Carlo Integration
If you forget everything else from this class, remember this:

If you ever encounter a really tricky integral—or just want to quickly
prototype something—99% of the time you can just write some “quick and
dirty” code that does:

  rejection sampling + basic Monte Carlo

Pick a point x at random. Check if it’s in the domain. If so, add f(x) to
your average. Repeat.

Everything else is just optimization!
Thanks!

MONTE CARLO METHODS AND APPLICATIONS
Carnegie Mellon University | 21-387 | 15-327/627 | 15-860 | Fall 2023
