Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explanation of speed difference between random_walk and random_walk_faster looks misleading #101

Open
gu1p opened this issue Feb 6, 2021 · 4 comments

Comments

@gu1p
Copy link

gu1p commented Feb 6, 2021

I was reading the Introduction of the book and I found myself trying to understand why random_walk_faster is ~7 times faster than random_walk. That's a huge difference!

Those are the functions:

def random_walk(n):
    position = 0
    walk = [position]
    for i in range(n):
        position += 2*random.randint(0, 1)-1
        walk.append(position)
    return walk

def random_walk_faster(n=1000):
    from itertools import accumulate
    # Only available from Python 3.6
    steps = random.choices([-1,+1], k=n)
    return [0]+list(accumulate(steps))

Running some tests, I figure out that most of the difference in speed is related to the way we compute the next random step, not because we are using a "vectorized approach" instead of a procedural, as stated in the explanation:

In fact, we've just vectorized our function. Instead of looping for picking sequential steps and add them to the current position, we first generated all the steps at once and used the accumulate function to compute all the positions.

We got rid of the loop and this makes things faster.

This last statement is also not accurate. Actually, we are just replacing an explicit loop for other loops hidden in the machinery of Python (that can, of course, rely on faster functions implemented in C).

Looking at the implementation of random.choices, the trick to gain speed is to use something like population[floor(random.random() * len(population)] to compute each next step.

Applying this in the random_walk, we have:

from math import floor

def random_walk(n):
    population = [-1,+1]
    position = 0
    walk = [position]
    for i in range(n):
        position += population[floor(random.random() * 2)]
        walk.append(position)
    return walk

Now the difference of speed for 10k steps is ~1.6.

@rougier
Copy link
Owner

rougier commented Feb 8, 2021

Yes, when I say we get rid of the loop, I meant that the loop has been externalized at the C level (hopefully) and this is supposed to be faster because we do not have all the Python machinery for loop. To have a fair comparion, we should used the exact same random generator in two cases:

for i in range(1_000_000):
    Z = Z + random.choices([-1,1])[0]

Z = sum(random.choices([-1,1],k=1_000_000))

@gu1p
Copy link
Author

gu1p commented Feb 8, 2021

When I said "hidden in the machinery of Python" I meant vanilla Python code, but that comes from the standard library (not necessarily implemented in C). Most of the difference in speed execution (7 times faster vs 1.6 faster) between both functions is just because we're using two different ways of computing the next random step: 2*random.randint(0, 1)-1 vs [-1, +1][math.floor(random.random() * 2)]. The last one is basically hidden in random.choices (that is a vanilla Python function) that is used in random_walk_faster. So we are making an unfair comparison and my whole point is that we should use the same underlying functions to calculate the next random step in both functions in order to make a fair comparison . random_walk_faster will continue to be faster, but not 7 times faster.

Sorry if all this looks nitpicking. The book is great!

@rougier
Copy link
Owner

rougier commented Feb 8, 2021

Thank and no problem with nitpicking. When comparing the exact same method with an without loop, I still find the 7x factor:

In [12]: %timeit for i in range(1_000_000): random.choices([-1,1])
1.01 s ± 15.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [13]: %timeit random.choices([-1,1], k=1_000_000)
146 ms ± 1.45 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

@gu1p
Copy link
Author

gu1p commented Feb 8, 2021

If you see how random.choices works you will notice that what you actually doing is approximately the following:

population = [-1, 1]

In [39]: %timeit for i in range(1_000_000): [population[floor(random.random() * len(population))] for i in range(1)]
535 ms ± 24.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [40]: %timeit for i in range(1_000_000): population[floor(random.random() * 2)]
173 ms ± 3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

That's obviously faster. But C doesn't play much role here. It's just Python doing more (unnecessary) stuff in one case vs Python doing less stuff in the another. But OK, I think at this point I made myself clear enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants