Getting in the Python Mindset: What is Pythonic?

scott_s · on April 10, 2008

He presented the idiomatic C way of applying a function to a list. The idiomatic C++ way for STL style containers is:

  for_each(spam.begin(), spam.end(), eggs);

And for native arrays:

  for_each(spam[0], spam[spam_length], eggs);

avinashv · on April 10, 2008

Thanks for the tip. I'll update the article.

Edit: updated.

Edit 2: apparently there's some controversy. I'm looking for a definitive answer.

anewaccountname · on April 10, 2008

Are you sure about that? Say spam is:

    int spam[] = {1,2,3};

How will for_each know what to do with this argument list? You would be passing in: 1, segfault, and a function object named eggs?

scott_s · on April 10, 2008

Actually, it should be:

  for_each(&spam[0], &spam[spam_length-1], eggs);

I need to take the addresses of the array elements, not their contents, and adjust the end index if I'm going to use the same definition of spam_length as the author. This kind of mistake is why one should avoid native arrays in C++ - the abstractions provided by the STL are more expressive and no more expensive.

The STL algorithms have no concept of the length of a sequence; they only deal with iterators. This is the definition of for_each that Stroustrup provides in The C++ Programming Language:

  template<class In, class Op> Op for_each(In first, In last, Op f)
  {
    while (first != last) f(*first++);
    return f;
  }

Notice that first and last can be pointers or STL style iterators, and that f can be a function object or an actual function. The template parameter In resolves to int* at compile time.

ovi256 · on April 10, 2008

Why not use map?

  spam = map(eggs, spam)

Even shorter, more concise and clearer. What is not to like, as Seinfeld would say?

scott_s · on April 10, 2008

Sometimes it is, sometimes it's not. If you allow me to generalize this to list comprehensions vs. standard functional techniques, sometimes the list comprehension is cleaner. Take the list comprehension

  [f for f in os.listdir('.') if f.endswith('.csv') and not f.endswith('_avg.csv')]

The lambda alternative (which would make a great blog name) is

  filter(lambda f: f.endswith('.csv') and not f.endswith('_avg.csv'), os.listdir('.'))

I find the list comprehension cleaner because it introduces less noise.
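To make the comparison concrete, here is a minimal sketch that runs both forms on the same input. The list of file names is hypothetical and stands in for os.listdir('.'), and the list() wrapper around filter assumes Python 3, where filter returns a lazy iterator.

```python
# Hypothetical file names standing in for os.listdir('.').
names = ["a.csv", "a_avg.csv", "notes.txt", "b.csv"]

# List comprehension form.
by_comprehension = [
    f for f in names
    if f.endswith(".csv") and not f.endswith("_avg.csv")
]

# filter + lambda form; list() is needed on Python 3 to get a list back.
by_filter = list(
    filter(lambda f: f.endswith(".csv") and not f.endswith("_avg.csv"), names)
)

print(by_comprehension)  # ['a.csv', 'b.csv']
print(by_filter)         # ['a.csv', 'b.csv']
```

Both produce the same list; the difference is only in how much syntax surrounds the condition.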

avinashv · on April 10, 2008

Agreed.

> The lambda alternative (which would make a great blog name)

I'm stealing that.

inklesspen · on April 11, 2008

Dammit!

avinashv · on April 10, 2008

I mentioned it in the article: map was going to be deprecated in Python 3000, but now it will return an iterator instead. Contextually, not what you want.

etal · on April 10, 2008

Py3K goes wild with generators. Basically, everything that used to return a list (e.g. range()) now returns an iterable, and to get a list you just apply the list() constructor to that. It's about lazy evaluation. So these will be equivalent:

  dinner = [eggs(x) for x in spam]
  dinner = list(eggs(x) for x in spam)
  dinner = list(map(eggs, spam))
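One practical consequence of the lazy behavior, sketched under the assumption of Python 3 semantics: the object returned by map is an iterator, so it can be consumed only once. The eggs function here is a placeholder that doubles its argument.

```python
def eggs(x):
    # Placeholder transformation, chosen only for illustration.
    return x * 2

spam = [1, 2, 3]

lazy = map(eggs, spam)    # nothing is computed yet
first_pass = list(lazy)   # [2, 4, 6]
second_pass = list(lazy)  # [] because the iterator is already exhausted

print(first_pass, second_pass)
```

If the result is needed more than once, materializing it with list() up front (or using a list comprehension) avoids the surprise.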

The real reason list comprehensions are favored over map() and filter() is that the latter two require another Python function call, while list comprehensions are evaluated directly by the interpreter (a bit like rewriting the loop in C or using Pyrex) -- better performance.
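Performance claims like this are easy to check rather than take on faith. The following is a rough sketch using the standard timeit module; the eggs function and the input size are arbitrary assumptions, and the relative timings may differ between interpreter versions and between named functions and lambdas.

```python
import timeit

def eggs(x):
    # Arbitrary cheap function used only for measurement.
    return x + 1

spam = list(range(1000))

def with_comprehension():
    return [eggs(x) for x in spam]

def with_map():
    return list(map(eggs, spam))

# Run each variant the same number of times and compare wall-clock totals.
comprehension_time = timeit.timeit(with_comprehension, number=200)
map_time = timeit.timeit(with_map, number=200)

print("comprehension: %.4fs, map: %.4fs" % (comprehension_time, map_time))
```

The gap tends to be most visible when the comprehension can inline an expression that map would need a lambda for, but measuring on your own workload is the only reliable answer.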

jjguy · on April 11, 2008

The insight re: tight integration of data structures into the syntax is pretty brilliant. I've struggled to articulate that several times, but never managed to get it right. It makes C look laughable, and highlights that C++'s improvements were bolt-ons and that Java is verbose.

I do wish regexes were as tightly integrated, a la Perl.

hbien · on April 10, 2008

After a few months of Python programming, I started using list comprehensions more and more. I really think they're an elegant solution for dealing with lists.

danohuiginn · on April 10, 2008

They're the best thing since sliced lists.

That said, I've started to consciously cut back on my use of list comprehensions. They are powerful, but there's a temptation to cram too much into one line - which makes debugging harder, and reduces readability. Breaking things out into a for loop often makes things clearer again.

ken · on April 11, 2008

This is actually one of my big gripes with Python: loops don't scale.

If I've got something small, it looks great to say [f(x) for x in Y], and then I add "...if x.z > m", and then I add something else, and 10 minutes later I say "ugh, too much!", and have to rewrite it as a for-loop. If I want to add a "print" in there while I'm experimenting, I'm completely SOL (yes, I know about 3.0a). The final form of a loop might not look anything like its initial form, even though they're nearly identical, both to me and to the computer.
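The rewrite being described looks roughly like this. It is a sketch with made-up values: f, Y, and m are placeholders, and the condition is simplified to x > m rather than the attribute lookup x.z > m.

```python
def f(x):
    # Placeholder for whatever the real transformation is.
    return x * x

Y = [1, 2, 3, 4, 5]
m = 2

# Compact form: fine while it stays this small.
short = [f(x) for x in Y if x > m]

# Expanded form: same result, but there is room for a debugging print
# or extra statements inside the loop body.
long_form = []
for x in Y:
    if x > m:
        print("keeping", x)
        long_form.append(f(x))

print(short)      # [9, 16, 25]
print(long_form)  # [9, 16, 25]
```

The two forms compute the same list, yet moving from one to the other means restructuring the code rather than appending to it, which is the scaling complaint.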

In Common Lisp, in contrast, I can start with (loop for x in Y collect (f x)), and then add clauses all day -- LOOP, for all of its flaws, scales great from a couple words of pseudo-English surrounded by a pair of parens, up to just about anything that can be expressed as a loop. (CL-ITERATE might be even better, but I haven't gotten around to learning it yet.)

Maybe if I had an Emacs function to convert between list comprehensions and for-loops this wouldn't bug me so much.

etal · on April 10, 2008

True. I think it might also be generally possible to pull a mapping function and a filtering function out of any list comprehension or generator expression. That helps with debugging, too, without hurting performance:

  def crazy_mapping(var): ...
  def crazy_filter(var): ...
  foo = [crazy_mapping(b) for b in bar if crazy_filter(b)]
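A filled-in version of that pattern might look like the following. The function bodies and the contents of bar are invented for illustration; only the names crazy_mapping, crazy_filter, foo, and bar come from the snippet above.

```python
def crazy_mapping(var):
    # Hypothetical body: normalize a string.
    return var.strip().upper()

def crazy_filter(var):
    # Hypothetical body: keep only strings with non-whitespace content.
    return bool(var.strip())

bar = ["  spam ", "", "eggs", "   "]  # made-up input

foo = [crazy_mapping(b) for b in bar if crazy_filter(b)]
print(foo)  # ['SPAM', 'EGGS']
```

Because the logic lives in named functions, each one can be tested on its own or given a temporary print without touching the comprehension.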

scott_s · on April 10, 2008

I'm stealing that sliced lists line.

dood · on April 11, 2008

I don't think there is anything wrong with splitting list comps into multiple lines to improve readability.
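For example, a comprehension with a two-part condition can be laid out one clause per line. This is a sketch with a hypothetical list of file names; Python allows the line breaks because the expression is inside brackets.

```python
# Hypothetical input, standing in for a directory listing.
names = ["a.csv", "a_avg.csv", "notes.txt", "b.csv"]

csv_files = [
    name
    for name in names
    if name.endswith(".csv")
    and not name.endswith("_avg.csv")
]

print(csv_files)  # ['a.csv', 'b.csv']
```

Each clause gets its own line, so adding or removing a condition is a one-line change.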