They do, and the order of the passes matters. Sometimes optimizations are missed because they require a pass ordering different from the one your compiler uses.
On higher optimization levels, many passes run multiple times. However, as far as I know, compilers don't repeatedly run passes until they've reached an optimum; instead, they run a fixed series of passes. I don't know why; maybe someone can chime in.
It's a long-standing problem in compilers, often referred to as the "phase ordering problem". In general, forward dataflow optimizations can be combined if they are monotonic (meaning they never make the code worse, or at least never undo a previous step). It's possible to run such forward dataflow problems together repeatedly to a fixpoint. In TurboFan, a general graph reduction algorithm is [1] instantiated with a number of reducers, and then a fixpoint is run. The technique of trying to combine multiple passes has been tried a number of times. What doesn't seem so obvious is how to run optimizations that are not traditional forward dataflow problems, or are indeed backward dataflow problems (like DCE), together with other transformations.
Generally, compilers get tuned by running them on lots of different kinds of code, often benchmarks, then tinkering with the order of passes and other heuristics (loop unroll factors, thresholds for inlining, etc.) and seeing what works best.
[1] was? TurboFan seems to have splintered into a number of pieces being reused in different ways these days.
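To make the fixpoint idea concrete, here's a minimal sketch in Python; the dict-shaped nodes and the two reducers are made up for illustration and are not TurboFan's actual API:

# Minimal sketch of running several "reducers" to a fixpoint.
# The node representation and reducers are hypothetical.

def run_to_fixpoint(nodes, reducers):
    changed = True
    while changed:                  # keep applying reducers until nothing changes
        changed = False
        for node in nodes:
            for reduce in reducers:
                if reduce(node):    # a reducer returns True if it rewrote the node
                    changed = True
    return nodes

def fold_constants(node):
    if node.get("op") == "add" and all(isinstance(a, int) for a in node["args"]):
        node.update(op="const", args=[], value=sum(node["args"]))
        return True
    return False

def strength_reduce(node):
    if node.get("op") == "mul" and 2 in node.get("args", []):
        node["op"] = "shl"          # toy rewrite: x * 2 -> x << 1
        return True
    return False

nodes = [{"op": "add", "args": [1, 2]}, {"op": "mul", "args": ["x", 2]}]
print(run_to_fixpoint(nodes, [fold_constants, strength_reduce]))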
Even if you need your home to live in, you can still borrow against your property value, essentially "eating the bricks".
Alternatively, you can use the value of your house as an emergency fund: If you desperately need money, you can move into something smaller, or more distant from the city, and cash out.
If you do neither of these things, your children will inherit the value of your house. Either way, the money you gain is real money, and you actually gain it.
> Even if you need your home to live in, you can still borrow against your property value, essentially "eating the bricks".
This is not a source of money, it's only a source of collateral. Anything you borrow not only has to be paid back, you also have to pay interest on it. And interest rates are higher than they used to be, and the standard deduction is now large enough that most people don't get to deduct the interest anymore.
Moreover, it only means anything if the difference in value would have made a difference. If you want to borrow $20,000 then a house with $100,000 in equity is quite sufficient and another $100,000 in equity isn't doing much for you.
> If you desperately need money, you can move into something smaller, or more distant from the city, and cash out.
Houses are much worse for this than e.g. stocks, because they're hard (and extremely inconvenient when you live in them) to sell in a hurry unless you want to lose a lot of value. A lot of people also can't do this anymore because they got a fixed-rate mortgage before rates went up, and now they can't move or they'll have to refinance at the higher rate, which would eat up most, if not more than all, of the difference in value.
> If you do neither of these things, your children will inherit the value of your house.
But then they need a place to live, so either they can't sell it because they're living in it, or they get the money from selling it but pay that much again to acquire a different place to live.
It's a housing version of the national debt philosophy, where debt doesn't matter as long as you're outpacing it with growth. If last decade you had a $25k HELOC on a $400k house and today you have a $50k HELOC on an $800k house, your finances have clearly improved.
Except that you still can't sell the house or you won't have a place to live, so they've only improved on paper before you account for the opportunity cost of the higher imputed rent, i.e. the higher cost of living. Meanwhile you could have gotten the $50k HELOC against the $400k house, which was the only part doing anything that would actually affect your life.
If you sell your house and then rent, you'd be paying rent and therefore have direct negative exposure to high housing costs. That also doesn't create any new supply. You're still living somewhere and therefore still need somewhere to live. For someone else to have a housing unit while you still have one, you have to create more, not just play musical chairs.
When you are old, maybe you don't need to live in a house that was made for a family? Or you can give your house to your children and then let them pay your rent; they're already paying their own rent.
Sure, but the only thing available anywhere in your neighbourhood is big houses made for families.
That's the whole missing-middle thing: there are luxury pieds-à-terre sold to millionaires downtown, and then there's the endless stretch of large single-family houses in the suburbs.
>the standard deduction is now large enough that most people don't get to deduct the interest anymore.
They don't get to itemize their interest deduction, but they still get to deduct from their taxable income an amount equal to or greater than the interest they paid.
The standard deduction was not significantly increased in order to reduce total deductions; it was simply meant to remove the need to itemize them as often. (And, incidentally, to replace the personal exemption deduction, which was removed.)
This is in reference to changes to U.S. income tax beginning in 2017.
> They don't get to itemize their interest deduction, but they still get to deduct from their taxable income an amount equal to or greater than the interest they paid.
But they get to deduct that amount regardless of whether they paid any interest, so if they take the loan they're paying all of the interest themselves relative to what happens if they don't take the loan.
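To make that concrete, a toy calculation (the dollar amounts are made up for illustration, not actual tax figures):

# Made-up numbers, purely to illustrate the point above.
standard_deduction = 14_600   # assumed standard deduction
mortgage_interest = 8_000     # interest actually paid on the loan

# Without the loan, you take the standard deduction anyway.
deduction_without_loan = standard_deduction

# With the loan, itemizing only helps if the itemized total beats the standard deduction.
deduction_with_loan = max(standard_deduction, mortgage_interest)

# The loan bought no extra deduction, so the interest is a pure out-of-pocket cost.
print(deduction_with_loan - deduction_without_loan)   # 0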
I know this solution has you interacting with the disgusting poors, but if you have multiple bedrooms, you can rent them out and have roommates defray the cost of the mortgage and property tax, possibly for a profit. Crazy idea, I know, but just something to keep in mind, should one find themselves in that situation.
That's an interesting definition, but it does have some issues.
Is an infertile animal (which can't reproduce) dead? What about a nerve cell (which has differentiated too far to become a reproductive cell)? Or a red blood cell (which has no genome)?
From the other end, is a genetic algorithm alive? What about a manuscript? Manuscripts are copied (so they reproduce), and have frequent copying errors, which propagate.
A Code of Conduct cannot stop someone from doing something.
It’s just a document.
However, in this case, the presence of the code of conduct has made it trivially easy to point out that the language is wrong, in a way that whoever wrote this for Zig cannot refute.
How is it working? The post is still there, referring to people as "losers" and "monkeys". Was the author of the post chastised? Have they edited the post and apologized?
Heh. You've rediscovered Critical Race Theory, which was a graduate-level theory about how rules/laws are systematically applied to minorities/the powerless, and not applied to the powerful/project leaders.
Holding the powerful to the law is, unfortunately, a separate issue from whether it's worth it to have written rules/laws in the first place.
A CoC could still be better than no CoC, even if it fails to rein in abuse from the top.
Not the same thing at all. There are consequences for murder, and absolutely none for not abiding by this CoC, as clearly seen by the fact that the post remains as is.
IQ scores are calibrated to be normally distributed with a standard deviation of 15, so 15 points is one standard deviation. That's the difference between being average and being in the smartest 16% of the population, or between being in the smartest 16% and being in the smartest 2%.
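You can check those figures with Python's stdlib, assuming the usual mean-100, SD-15 calibration:

from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
print(1 - iq.cdf(115))   # ~0.16: one SD above the mean, smartest ~16%
print(1 - iq.cdf(130))   # ~0.02: two SDs above the mean, smartest ~2%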
Excellent article - except that the author probably should have gated their substantiation of the claim behind a cliffhanger, as other commenters have mentioned.
The author's priorities are sensible, and indeed with that set of priorities, it makes sense to end up near R. However, they're not universal among data scientists. I've been a data scientist for eight years, and have found that this kind of plotting and dataframe wrangling is only part of the work. I find there is usually also some file juggling, parsing, and what the author calls "logistics". And R is terrible at logistics. It's also bad at writing maintainable software.
If you care more about logistics and maintenance, your conclusion is pushed towards Python - which still does okay in the dataframes department. If you're ALSO frequently concerned about speed, you're pushed towards Julia.
None of these are wrong priorities. I wish Julia was better at being R, but it isn't, and it's very hard to be both R and useful for general programming.
Edit: Oh, and I should mention: I also teach and supervise students, and I KEEP seeing students use pandas to solve non-table problems, like trying to represent a graph as a dataframe. Apparently some people are heavily drawn to use dataframes for everything - if you're one of those people, reevaluate your tools, but also, R is probably for you.
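For instance, a plain adjacency mapping is usually a better fit than a dataframe of edge rows (a minimal sketch with made-up node names):

from collections import defaultdict

edges = [("a", "b"), ("b", "c"), ("a", "c")]   # made-up example edges

# Represent the graph as an adjacency mapping rather than a dataframe.
adjacency = defaultdict(set)
for u, v in edges:
    adjacency[u].add(v)
    adjacency[v].add(u)

print(adjacency["a"])   # neighbours of "a": {'b', 'c'}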
Except it's not. Data science in Python pretty much requires you to use NumPy, so his example of mean/variance code is a dumb comparison. NumPy has mean and variance functions built in for arrays.
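For example (made-up mass values, just to show the built-ins):

import numpy as np

body_mass_g = np.array([3750.0, 3800.0, 3250.0, 3450.0, 3650.0])   # made-up values

print(body_mass_g.mean())        # arithmetic mean
print(body_mass_g.var(ddof=1))   # sample variance
print(body_mass_g.std(ddof=1))   # sample standard deviation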
Even when using raw Python in his example, some syntax can be condensed quite a bit:
from collections import defaultdict

# group body masses by (species, island) in one pass
groups = defaultdict(list)
[groups[(row['species'], row['island'])].append(row['body_mass_g']) for row in filtered]
It takes the same amount of mental effort to learn Python/NumPy as it does R. The difference is that the former allows you to integrate your code into any other application.
> Numpy has mean and variance functions built in for arrays.
Even outside of NumPy, the stdlib has the statistics package, which provides mean, variance, population/sample standard deviation, and other statistics functions for normal iterables. The attempt to make Python out-of-the-box code look bad was either deliberately constructed to exaggerate the problems complained of, or was the product of a very convenient ignorance of the applicable parts of Python and its stdlib.
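For example, with nothing but the stdlib (made-up values again):

from statistics import mean, variance, stdev, pstdev

masses = [3750, 3800, 3250, 3450, 3650]   # made-up values

print(mean(masses))       # arithmetic mean
print(variance(masses))   # sample variance
print(stdev(masses))      # sample standard deviation
print(pstdev(masses))     # population standard deviation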
I dunno. NumPy has its own data types, its own collections, its own semantics, which are all different enough from Python that I think it's fair to consider it a DSL on its own. It'd be one thing if it were just operator overloading to provide broadcasting for Python, but NumPy's whole existence is to patch the various shortcomings Python has in DS.
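A small example of how far apart the semantics are (plain lists vs. NumPy arrays):

import numpy as np

xs = [1, 2, 3]
arr = np.array(xs)

print(xs * 2)      # list semantics: repetition -> [1, 2, 3, 1, 2, 3]
print(arr * 2)     # NumPy semantics: elementwise broadcast -> [2 4 6]
print(xs + xs)     # list concatenation -> [1, 2, 3, 1, 2, 3]
print(arr + arr)   # elementwise addition -> [2 4 6]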
>I find there is usually also some file juggling, parsing, [...]
I'd say I'm 50/50 Python/R for exactly this reason: I write Python code on HPC or a server to parse many, many files, then I get some kind of MB-scale summary data I analyse locally in R.
R is not good at looping over hundreds of files in the gigabytes, and Python is not good at making pretty insights from the summary. A tool for every task.
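The Python half of that workflow usually looks something like this (hypothetical paths and column layout, just to show the shape of it):

import csv
import glob

# Hypothetical: reduce a directory of large per-sample TSV files to one
# MB-scale summary CSV that can then be analysed comfortably in R.
with open("summary.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["file", "n_records", "mean_value"])
    for path in glob.glob("data/*.tsv"):           # assumed input layout
        total, count = 0.0, 0
        with open(path) as handle:
            for line in handle:                    # stream, never load the whole file
                fields = line.rstrip("\n").split("\t")
                total += float(fields[1])          # assumed numeric column
                count += 1
        writer.writerow([path, count, total / count if count else ""])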
The function on that slide is dominated by the call to rand, which uses quite different implementations in Julia and Python, so may not be the best example.
Julia is compiled, and for simple code like that example it will have performance on par with C, Rust, etc.
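For what it's worth, a vectorized NumPy sketch of the same estimator (not the slide's code) also sidesteps most of the per-call rand overhead on the Python side:

import numpy as np

rng = np.random.default_rng()

def monte_carlo_pi_numpy(n):
    # Draw all the points at once instead of calling random() n times in a loop.
    x = rng.random(n)
    y = rng.random(n)
    inside = np.count_nonzero(x * x + y * y <= 1.0)
    return 4.0 * inside / n

print(monte_carlo_pi_numpy(10_000_000))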
I tested how PyPy performs on that. Just changing the implementation of Python drops the runtime from ~16.5s to ~3.5s on my computer, approximately a 5x speedup:
xxxx@xxxx:~
$ python3 -VV
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0]
xxxx@xxxx:~
$ pypy3 -VV
Python 3.9.16 (7.3.11+dfsg-2+deb12u3, Dec 30 2024, 22:36:23)
[PyPy 7.3.11 with GCC 12.2.0]
xxxx@xxxx:~
$ cat original_benchmark.py
#-------------------------------------------
import random
import time
def monte_carlo_pi(n):
    inside = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
#-------------------------------------------
xxxx@xxxx:~
$ python3 original_benchmark.py
Time: 16.487 seconds
Estimated pi: 3.14177012
xxxx@xxxx:~
$ pypy3 original_benchmark.py
Time: 3.357 seconds
Estimated pi: 3.14166756
xxxx@xxxx:~
$ python3 -c "print(round(16.487/3.357, 1))"
4.9
I changed the code to take advantage of some basic performance tips that are commonly given for CPython (taking advantage of the standard library - itertools, math - and preferring comprehensions/generator expressions to loose for loops), and was able to get the CPython numbers to improve by ~1.3x. But then the PyPy numbers took a hit:
xxxx@xxxx:~
$ cat mod_benchmark.py
#-------------------------------------------
from itertools import repeat
from math import hypot
from random import random
import time
def monte_carlo_pi(n):
    inside = sum(hypot(random(), random()) <= 1.0 for i in repeat(None, n))
    return 4.0 * inside / n
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
#-------------------------------------------
xxxx@xxxx:~
$ python3 mod_benchmark.py
Time: 12.998 seconds
Estimated pi: 3.14149268
xxxx@xxxx:~
$ pypy3 mod_benchmark.py
Time: 12.684 seconds
Estimated pi: 3.14160844
xxxx@xxxx:~
$ python3 -c "print(round(16.487/12.684, 1))"
1.3
I tested staying in CPython but jitting the main function with numba (no code changes beyond adding the jit decorator with the expected type signature, and adding the same jit warm-up call before the benchmark that the Julia version uses), and it's about an 11x speedup. Code:
import random
import time
from numba import jit, int32, float64
@jit(float64(int32), nopython=True)
def monte_carlo_pi(n):
    inside = 0
    for i in range(n):
        x = random.random()
        y = random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n
# Warm up (compile)
monte_carlo_pi(100)
# Benchmark
start = time.time()
result = monte_carlo_pi(100_000_000)
elapsed = time.time() - start
print(f"Time: {elapsed:.3f} seconds")
print(f"Estimated pi: {result}")
Base version (using the unmodified Python code from the slide):
$ python -m monte
Time: 13.758 seconds
Estimated pi: 3.14159524
That's an overly cynical take. Obviously it means that a criminal organization would need to recruit officials before they could issue fake passports. Which is already pretty hard.
And maybe they would need to recruit multiple officials across multiple agencies. And if these agencies have internal policing, then even if they manage to do that, they now have another vulnerability through which the criminal operation can be discovered and sabotaged.
It also has poor tooling compared to Python. Julia's package manager is good, and so are its tools for performance optimisation, but for type checking, app/CLI creation, semver checking, and IDE integration, the tooling is quite bad.
Also, the compile times are shit, and the type system makes it very hard to make a type checker in the first place.
I worked with Nanopore data about four years ago, and I found that that's mostly true, but for some reason, at some sites there were systematic errors where more than half of the reads were wrong.
I can't 100% prove it wasn't a legit mutation, but our lab did several tests where we sequenced the same sample with both Illumina and Nanopore, and found Nanopore to be less than perfect even with extreme depth. Like, our depth was so high we routinely experienced overflow bugs in the assembly software because it stored the depth in a UInt16.
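(For reference, a UInt16 counter tops out at 65535 and then silently wraps; the NumPy equivalent shows the same behaviour:)

import numpy as np

depth = np.array([65535], dtype=np.uint16)   # uint16 max
print(depth + 1)                             # silently wraps around to [0]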
What was the DNA source? At the same time (4 years ago) there were issues with specific species - birds and some metagenome species were the worst, if I remember correctly.