Format Python Code Using YAPF (leimao.github.io)
47 points by keyboardman on Feb 9, 2020 | 65 comments


I've used Yapf for a year, then switched to Black when it appeared.

Seriously, use Black. In my experience, and to my taste, it works perfectly all the time, and the result is beautiful.

It's also pep8 compliant, for the parts of pep8 that concern a reformatter.


From the experience in our code base, yapf produces higher-quality results, but is much slower.

Black removes any additional parentheses added to enable a line split at a reasonable position (such as "and"). With those removed, black has to split at weird locations like at a function call.

But switching to black just for the speed increase is reasonable.


My perfect formatter would be ridiculous to implement and maintain.

Thus I "like" black's opinionated take.

"I see some rude code and I want to paint it black, " -- Sir Mick Jagger, probably.


Black produces code with more lines than yapf. Sure, sometimes yapf makes very weird decisions (especially when using a dictionary literal as an argument to a function), but at least it does not use 4 lines for a list with 2 items that could have fit on one line.


A somewhat-undocumented feature of black is that in recent versions, by giving the list a trailing comma, you're telling the formatter to always put it on multiple lines (e.g. `[1,2,]` will always wrap onto 4 lines). If you remove the trailing comma from a multi-item list/dict/whatever, black will try to compact the literal to a single line (and if it fails because the line is too long, it will add the trailing comma back).
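To make the two layouts concrete, here is a minimal sketch. It does not call Black at all; the variable names and the `same_program` helper are made up for illustration. It only checks that the one-line form and the exploded form with a trailing comma parse to the same AST, so choosing between them is purely a layout decision:

```python
import ast

# Compact form: no trailing comma, so a formatter is free to keep it on one line.
compact = "items = [1, 2]\n"

# Exploded form: one element per line, ending with a trailing comma.
exploded = "items = [\n    1,\n    2,\n]\n"


def same_program(a: str, b: str) -> bool:
    """Compare two sources by AST; ast.dump omits line/column info by default."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(same_program(compact, exploded))  # True
```

The trailing comma is safe in a list or dict literal; the one place to be careful is a parenthesised single value, where adding a comma turns it into a tuple.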


I’m pretty sure Black doesn’t do that, it will fit them onto a single line if possible, then it will move the kv pairs onto their own line, and failing that, one kv pair per line. In any case, I’m fine with more lines. Clarity is much more important than minimizing line count.


I've seen Black do this (the dict needs to be too long to inline):

  call(
      {
          "key": "value",
          "key": "value",
      }
  )


yeah, can be annoying, but also encourages this, which is nice when the number of values grows:

  items = {
      "key": "value",
      "key": "value",
  }
  call(items)
it's a bit annoying with exception messages, but again, writing long args before works great, and i've become a fan of this (unintended?) nudge:

  if error:
      raise ValueError(
          "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas vel ligula nec eros finibus metus."
      )

  if error:
      msg = (
          "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
          "Maecenas vel ligula nec eros finibus metus."
      )
      raise ValueError(msg)
just to be clear, i don't think working around a formatter is good. in this case, i feel like the uncompromising rules were exposing a bit of an anti-pattern. obviously, your opinion on this may vary wildly.


I’m honestly fine with putting the dict in the function call in most cases. It’s not a big deal either way. As for your exceptions, you can just do

    raise ValueError(
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
        "Maecenas vel ligula nec eros finibus metus."
    )
No need for the extra variable.
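For anyone unsure why this works: adjacent string literals are joined by the Python parser, so the wrapped form produces exactly the same message as the long one-liner. A small sketch to check that (the names `single_line`, `wrapped` and `raise_wrapped` are only for illustration):

```python
# The message written as one long literal.
single_line = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas vel ligula nec eros finibus metus."

# The same message split across two adjacent literals; note the trailing
# space inside the first literal, which is easy to forget.
wrapped = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
    "Maecenas vel ligula nec eros finibus metus."
)


def raise_wrapped():
    """Raise with the wrapped literal passed directly, no extra variable."""
    raise ValueError(
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
        "Maecenas vel ligula nec eros finibus metus."
    )


print(wrapped == single_line)  # True
```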


This puts the lie to the common claim that formatters end formatting decisions. Actually they only move the problem from "how shall I format my code" to "how shall I write my code so that I will like the way my autoformatter formats my code".


You misunderstand the point of formatters. They aren’t meant to automate your personal preference; they automate a standard format so your team doesn’t have to waste time deciding on and enforcing a coding standard and so you don’t have to manually implement the standard. Implicit in using a code formatter is the decision to stop navel gazing and put the team first.


Right, but the parent claims it will do that when it would fit on a single line, unless I misunderstood something.


I dislike a lot of black decisions and still use it for everything.

My aesthetic taste is less important than getting things done.

If you are not using it, stop arguing and nit picking about your preferences, we, as a community, have more important things to do. It will hurt only a little, I promise.


My team has a handful of junior data scientists that really don't have a lot of experience of coding "well". Black was super helpful in (1) getting people used to a decent coding format and (2) removing any effort or discussion around formatting in PRs. I added the formatting to pre-commit and the rest is effortless.


The only successful formatters are the ones that don't have any knobs to tune. The reason a formatter exists is to make all codebases look consistent, not to cater to your personal style.

This is one area Golang got it right with gofmt.


This is the reason we switched to black for Python about a year ago. There are no options to tweak. At first I had to deal with a lot of complaining developers, but now everyone has gotten accustomed to black's formatting style and we moved on.


Same. It looked weird at first, and everyone had specific complaints with it, but they were all different complaints. Now we all accept that it’s just what Python code looks like, and can stop thinking about it all.


It has a questionable default of 88 characters per line, which is configurable. I wish they had chosen 79 characters as PEP8 suggests.


That is one of the few things you can set, put it in your pyproject.toml, done.

Oh you don't have a pyproject.toml? Well then we have a different problem :)


> The reason a formatter exists is to make all codebases look consistent

I'm not sure how this is related to not having any knobs to tune. Aren't there many ways for code to be consistent with pep8? Isn't advocating for only one version of pep8-consistent code in essence an attempt to supersede pep8?


They are most likely talking about the codebase being consistent within a company / project. Not consistent with pep8.

If there are no knobs to turn (like in gofmt) everyone just has the same settings and you don't have to make sure everyone sets the knobs to the same values.


Black formats to a subset of pep8 so I guess it's an attempt to "subcede" pep8.


I think that thing X superseding thing Y is equivalent to thing Y "subceding" to thing X.

In this case, if black becomes the standard for defining appropriate style then it is superseding pep8 as the standard for style (many styles consistent with pep8 are not black outputs, so if black's style becomes mandatory, it is a ruling against these previously accepted alternatives).


We used YAPF[0] before we used Black[1]

I really liked it, a lot. What set Black above is that most of the decisions it makes (if not all, really) are how we had set up YAPF anyway. I do think it does some small things better, like reformatting function arguments in certain cases (as is highlighted elsewhere in this thread).

If you need configurability, YAPF is the best choice, in my opinion. We still use isort though, because it sorts imports in a much more readable way.

I just wish I could find a suitable replacement for C# development. StyleCop is okay, but I find we have to use `<NoWarn></NoWarn>` .csproj settings on so many little rules and it doesn't auto format (insofar as I can tell). If your editor supports it, it will use it as a formatting source of truth, though. I just want something that also has a runnable console binary we can use in CI. Maybe I haven't looked at it closely enough.

[0]https://github.com/google/yapf

[1]https://black.readthedocs.io/en/stable/


I use black and couldn't be happier. Very fast and produces very readable code.


Some editors or IDEs have pretty good automatic "formatting as you type" systems that can largely remove the need for stand-alone formatters for new code.

Sometimes you can convince them to process an existing file as if it is being typed in and so apply the "as you type" formatter, effectively giving you a stand-alone formatter.

I used to do this with Emacs for C formatting. I don't remember how since I'm a vim user who only figured out enough Emacs for this one thing, and it was a long time ago, but I remember it worked very well. The Emacs C "as you type" formatter was very configurable and I was able to make it almost perfectly match my employer's style.

How's Emacs "as you type" Python formatting?


I thought the community standardized on black


Black is just slightly too opinionated. It gives you configuration options for benign things such as numeric separators, but it flat out refuses to allow tab indent, which a fairly significant part of the python community uses.

I tried to PR a --use-tabs flag but the PR was rejected without comments. Had to fork Black to be able to use it. Tan is a drop-in replacement that allows --use-tabs (and use-tabs = true in pyproject.toml).

https://github.com/jleclanche/tan


The reason Black is fast becoming standard is precisely because it is so opinionated. Like the other code formatters that have successfully become 'normal', such as gofmt. An option means that there isn't a standard, but several standards. And the people pushing Black in their projects see the benefit of a global standard, even though it is rarely 100% their personal preferences. It is a consensus.


Prettier is more widespread than black, is very opinionated, and still has an option for tab indent. This is a strawman.


How is this argument a strawman?


Prettier's opinion then is that tabs are irrelevant to the formatting. Which is probably fine for all the languages Prettier seems to support. Unlike Python, where tabs vs spaces matter. Mixing tabs and spaces is a problem, and a formatter cannot automatically support both, or it risks changing the meaning of code.


I don't know how you arrive at this conclusion… prettier never mixes tabs and spaces, it would be a bug if it did. And there's no meaningful difference between tabs and spaces in Python either. Python has significant indent stops, that's all.

Furthermore, prettier has a python plugin which does support tabs.
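Both halves of this disagreement can be checked with the standard library alone. The sketch below (the helper names `tree` and `is_rejected` are made up for the example) shows that a block indented consistently with tabs parses to the same AST as one indented with spaces, while a block that mixes the two is rejected by Python 3 with a `TabError` rather than silently reinterpreted:

```python
import ast

# The same two-statement block, indented three different ways.
spaces_only = "if True:\n    x = 1\n    y = 2\n"
tabs_only = "if True:\n\tx = 1\n\ty = 2\n"
mixed = "if True:\n\tx = 1\n        y = 2\n"  # a tab, then eight spaces


def tree(src: str) -> str:
    """Return the AST as text, without line/column information."""
    return ast.dump(ast.parse(src))


def is_rejected(src: str) -> bool:
    """True if Python refuses the source for inconsistent tabs/spaces."""
    try:
        compile(src, "<example>", "exec")
    except TabError:
        return True
    return False


print(tree(spaces_only) == tree(tabs_only))  # True
print(is_rejected(mixed))  # True
```

So the risk for a formatter is in mixing the two within one file, not in picking either one consistently.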


That's a shame, especially as I've heard tabs are more accessible: https://www.reddit.com/r/javascript/comments/c8drjo/nobody_t...


Precisely why I use them and why I'm uncomfortable with Black refusing to add the option (https://github.com/psf/black/pull/513).


This argument sounds quite bizarre to me. Why would a visually impaired programmer not manage to set their editor up to show indenting spaces the way they want, or failing that, collaborate with others to do so?

It's not like they can be effective at their job either without the ability to read 3rd party code, which overwhelmingly will be space indented.


How to set an editor to use big visual indents for spaces?


Depends on the editor, of course, but e.g.

https://www.emacswiki.org/emacs/redshift-indent.el

(Haven't tried the above, but I've done fairly extensive display hacks with emacs in the distant past, so I have a fair amount of confidence that it's not hard).


While I prefer tabs whenever I can, I can see why it doesn't support it. If you have to share code, you're going to use spaces. For some reason everyone including Guido insists on spaces. I would rather not have to waste time thinking about formatting or much less which formatting style to use. New languages like Go and Rust solved this problem.


I disagree with a few of Black's decisions, so I forked it to Lavender [0], which I keep up to date with the latest stable Black release (and use actively in all my Python projects).

> Differences from Black

> - The default line length is 99 instead of 88 (configurable with --line-length).

> - Single quoted strings are preferred (configurable with --string-normalization none/single/double).

> - Empty lines between classes and defs are treated no differently from other code. The old behavior, which sometimes inserts double empty lines between them, remains available via --special-case-def-empty-lines.

> - The Vim plugin configuration variable for line length is named g:lavender_line_length instead of g:lavender_linelength, for consistency with the other configuration variable names.

[0] https://github.com/spinda/lavender


Nice to hear. The single quote issue is the main thing that keeps me from using black. Double quotes are just too noisy.


Black can be configured to use single quotes. In fact, I think it's the only style configuration you can make.



I think it's the `--skip-string-normalization` flag which means "Don't normalize string quotes or prefixes".


But I want them normalized. Normalized to single quotes.


Yes, this. I have early-stage cataracts, which by and large is not an issue on my 40 inch monitor. But it does mean that my astigmatism drifts around monthly. My prescription changes faster than I can get new lenses cut. The BIGGEST ergonomic issue for me with Python is single versus double quotes. Double quotes are the difference between happily coding all day versus a headache at lunch time and 5% less done. Fuck Black.


Can you say more about how single quotes vs double quotes is helpful for you? I rarely write python, but I have a passion for accessibility and I'm curious about this.


It doesn't really matter which standard you use imho, as long as you automate the tedious task of code formatting to arbitrary standards. If it's something the computer can do, let it do it; don't waste your time on it. This goes especially for pull requests, where dedicated focus on trivial things like formatting often distracts from the real things that need review, like code architecture.


It would be nice if things converged on a single standard, however.


True, but the order of things will never allow this to happen: either the old standard lags behind, some people disagree on something (just look at the tabs vs spaces discussion), or some people just have inherently different personal preferences. Then there is always the ebb and flow of convergence and deviation. That's why I've given up on pushing my preferred standard in a team and just enforce the rule that it must be automated, so I don't have to spend time on it no matter which way the current standard wind blows.


black is opinionated. If you disagree with black's opinions then a negotiable formatter like YAPF might be better.

I personally love black though.


Black is opinionated enough that you will occasionally mumble about it, but good enough that you will keep using it anyway. That's a remarkable achievement for a formatter. For me it just took a little while to get used to double-quoted strings.

Rust's "cargo fmt" tool is similarly good.


I hear you on the double-quoted strings, and that's why the only black configuration I allow myself is "skip-string-normalization = true" in my pyproject.toml


I'd say that black is way too opinionated, especially since it doesn't produce good output a lot of the time. At my company we consistently have to adjust the way we write code in order for black not to mangle it.


A bad formatter is worse than no formatter, and unfortunately YAPF proves this out in my experience, being unstable, inconsistent, and tricky to configure.

Black however could fall into this category of worse than no formatter. On my team we have a strong style guide with a lot of well-reasoned, detailed, and consistent rules. One of the main differences to other style guides is that we design our style to make review easier. One of the primary ways of doing this is minimising diff noise.

While Black's vision is to reduce diff noise and design for easier review through a consistent style, it creates more diff noise and has a less consistent style than our style guide, and so we've had many discussions internally about whether it's right for us.

I have no doubt that Black is better than weak/no style guide, and for open source projects the automation it brings would absolutely be the right choice. I just wish it was better at what it sets out to do.

Edit: to address some of the questions raised:

- Yes Black does save time over code review picking on style details, but we already have automated linters for most things we'd raise about style anyway, which negates some of the time saving.

- The easiest example of where Black differs from our style guide and falls down on its promises is formatting lists/function calls/definitions.

For example:

  foo = [bar, baz, quux]
When reaching the line length limit, we will turn this into:

  foo = [
    bar,
    baz,
    quux,
  ]
However Black will format this first as:

  foo = [
    bar, baz, quux
  ]
Only when it goes on a few more characters does it then format into the way we'd go straight to. This means that there's more diff noise more of the time, and when reading code there are 3 forms of this construction that one must be aware of, rather than the 2 forms that we have, meaning the code is less consistently formatted.

This is picky, yes, but the point of Black is to be picky, and in a team where we can have a very good shared understanding of a style, and where we do already have that style, Black is much less convincing.


> and so we've had many discussions internally about whether it's right for us

Many discussions? I've never been on a project where code style required anything more than 10 minutes. The tech lead would ask: "everybody okay with the defaults of this linter/editor/whatever" and we'd reply "sure".

How much time did those many discussions take, and what were the reasons you needed to put in that effort?


This is a strange comment. It sounds like you’re using black wrong. First of all, when you pick Black, you throw away your own style guide (at least where black is concerned; keep the bits about function naming, etc). Secondly, you should only have one noisy commit, that being the very first commit in which you integrate Black. Thereafter, Black runs before code is committed so all code in code review is already formatted. Yes, if you already have comprehensive style linters and a team that agrees on style, you won’t see as much benefit, but you won’t need to maintain your custom linters or style guide or have all of these conversations about style (you also don’t need to iterate with a linter).


I have hated every single formatter I've tried, other than Black. It seems to be the only one that's not just trying to put style into PEP8 compliance, but also keep it readable and maintainable.

I'm also curious about the issues you've had with it.


The output of Black is PEP8 compliant, just not what flake8 thinks PEP8 mandates.


Weird. I also use flake8 and the only mismatch between the two has been the default line length of black, which I changed to 79. All is good now.


I hate black so much. I know it's somewhat irrational, but it's incredibly ugly and violates established usage in most Python projects.


Any formatter will violate established usage in most Python projects. I find black at least makes your code readable, compared with pprint-esque styles.


Have you got any specific examples of different rules/styles? Have you brought any of them up on the issue tracker?

I’m a very big fan of black, and had disliked a few choices. But it’s saved our team literally dozens of hours of nit picking and style fixes. It’s so good to never have to critique style and just focus on function.


Opening an issue on Black for anything but a true bug is chasing after the wind.


How, specifically, do you minimise diff noise? I can imagine a couple of ways:

- Split to as many lines as possible in order to avoid suddenly going from one line to many lines and vice versa when a line crosses a specified line length

- Add trailing commas wherever possible
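The trailing-comma point is easy to measure. As a rough sketch (the list contents and the `changed_lines` helper are invented for the example, and `difflib` stands in for whatever diff tool the review system uses), appending one element to a one-item-per-line list touches three diff lines without a trailing comma and only one with it:

```python
import difflib


def changed_lines(before, after):
    """Count added or removed lines in a unified diff, skipping file headers."""
    diff = difflib.unified_diff(before, after, lineterm="")
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


# Appending `quux` when the last element has no trailing comma:
# the `baz` line must be rewritten as well.
no_comma_before = ["foo = [", "    bar,", "    baz", "]"]
no_comma_after = ["foo = [", "    bar,", "    baz,", "    quux", "]"]

# The same edit when every element already ends with a comma.
comma_before = ["foo = [", "    bar,", "    baz,", "]"]
comma_after = ["foo = [", "    bar,", "    baz,", "    quux,", "]"]

print(changed_lines(no_comma_before, no_comma_after))  # 3
print(changed_lines(comma_before, comma_after))  # 1
```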



