Speed up Python without effort using generator expressions

Here are two ways to count the numbers from 1 to 1000000 in Python. First, the classical way:

sum([i for i in xrange(1000000)])

Runs 0.8s on an old Linux box. It uses quite a bit of memory.

Second, the better way:

sum((i for i in xrange(1000000)))

Runs 0.2s on the same box. It uses a tiny amount of memory.

That is right. The second option is 4 times faster! According to my tests, the second option remains better even if you only sum the numbers between 1 and 1000, though the benefits are then tiny.

Note: Yes, I am aware that I could simply do 1000000*(1000000-1)/2.

Source: Parand.

3 thoughts on “Speed up Python without effort using generator expressions”

  1. Not only that, but expression 2 doesn’t leak a variable!

    $ python
    Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53)
    [GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
    Type “help”, “copyright”, “credits” or “license” for more information.
    >>> i
    Traceback (most recent call last):
    File “”, line 1, in
    NameError: name ‘i’ is not defined
    >>> sum([i for i in xrange(100000)])
    4999950000L
    >>> i
    99999
    >>> sum(j for j in xrange(100000))
    4999950000L
    >>> j
    Traceback (most recent call last):
    File “”, line 1, in
    NameError: name ‘j’ is not defined
    >>>

  2. Running code through cProfile is not fair since it adds an overhead for each function call.

    Run these commands instead:

    echo “sum([x for x in xrange(1000000)])” > t1.py

    echo “sum((x for x in xrange(1000000)))” > t2.py

    python t1.py

    python t2.py

  3. Hi Daniel!
    Reading this post I try myself to do it. The results were not the same as you, I post it here. I work on python and I think that is correct that this happens, because generators in some operations are more expensive that simple list. when you have to read a big file generators are the solution, but in other cases lists are the solution!

    cProfile.run(‘sum((x for x in xrange(1000000)))’)
    1000004 function calls in 1.305 CPU seconds

    Ordered by: standard name

    ncalls tottime percall cumtime percall filename:lineno(function)
    1000001 0.565 0.000 0.565 0.000 :1()
    1 0.000 0.000 1.305 1.305 :1()
    1 0.000 0.000 0.000 0.000 {method ‘disable’ of ‘_lsprof.Profiler’ objects}
    1 0.740 0.740 1.305 1.305 {sum}

    cProfile.run(‘sum([x for x in xrange(1000000)])’)
    3 function calls in 0.540 CPU seconds

    Ordered by: standard name

    ncalls tottime percall cumtime percall filename:lineno(function)
    1 0.357 0.357 0.540 0.540 :1()
    1 0.000 0.000 0.000 0.000 {method ‘disable’ of ‘_lsprof.Profiler’ objects}
    1 0.182 0.182 0.182 0.182 {sum}

Leave a Reply

Your email address will not be published. Required fields are marked *