
Can someone share insight into what was technically done to enable this? What replaced the global lock? Does the GC stop all threads during collection, or is there another locking mechanism?


The most interesting idea in my opinion is biased reference counting [0].

An oversimplified (and possibly wrong) explanation goes like this:

problem:

- each object needs a reference counter, because of how memory management in Python works

- we cannot modify ref counters concurrently because it will lead to incorrect results

- we cannot make each ref counter atomic because atomic operations carry too much performance overhead

Therefore, we need the GIL.
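
To make the "incorrect results" point concrete, here is a small deterministic sketch of a lost update. It does not use real threads; the function name run_interleaved and the "load"/"store" step names are made up for illustration, and the schedule is replayed by hand so the outcome is reproducible.

```python
def run_interleaved(schedule):
    """Replay a fixed schedule of non-atomic increments on one shared counter.

    Each increment is split into a 'load' and a 'store' step, the way a plain
    (non-atomic) refcount += 1 is a read followed by a write.
    """
    shared = {"refcount": 1}
    local = {}  # per-thread "register" holding the value that thread loaded
    for thread, step in schedule:
        if step == "load":
            local[thread] = shared["refcount"]
        elif step == "store":
            shared["refcount"] = local[thread] + 1
    return shared["refcount"]


# Thread A finishes its increment before thread B starts.
serial = [("A", "load"), ("A", "store"), ("B", "load"), ("B", "store")]
# Both threads load the old value before either one stores.
racy = [("A", "load"), ("B", "load"), ("A", "store"), ("B", "store")]

print(run_interleaved(serial))  # 3
print(run_interleaved(racy))    # 2
```

In the racy schedule one increment is lost, so the object would look like it has fewer references than it really does and could be freed too early.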

Solution, proposed in [0]:

- let's have two ref counters for each object: one normal, the other atomic

- the normal counter tracks references held by the thread that created the object; the atomic counter tracks references held by other threads

- empirically, objects are mostly accessed from the thread that created them, so this avoids paying the atomic-operation penalty most of the time
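
A toy sketch of the two-counter idea, in Python. This is not how CPython or the paper implements it: the class BiasedRefCount and its fields are hypothetical, a threading.Lock stands in for an atomic instruction, and the harder parts (decrements, merging the two counts, deciding when to free) are left out.

```python
import threading


class BiasedRefCount:
    """Toy model of a two-counter reference count (names are hypothetical)."""

    def __init__(self):
        self.owner = threading.get_ident()    # thread that created the object
        self.local = 1                        # owner-only counter, no synchronization
        self.shared = 0                       # counter for every other thread
        self._shared_lock = threading.Lock()  # stand-in for an atomic instruction

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                   # fast path: plain increment
        else:
            with self._shared_lock:           # slow path: synchronized increment
                self.shared += 1

    def total(self):
        with self._shared_lock:
            return self.local + self.shared


rc = BiasedRefCount()   # created on the main thread, so local starts at 1
rc.incref()
rc.incref()             # two more owner references: local is now 3


def borrow(count):
    # Runs on another thread, so every incref takes the slow path.
    for _ in range(count):
        rc.incref()


t = threading.Thread(target=borrow, args=(3,))
t.start()
t.join()

print(rc.local, rc.shared, rc.total())  # 3 3 6
```

The owner thread never touches the lock, which is the whole point: the common case stays as cheap as a plain increment.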

Anyway, that's what I understood from the articles/papers. See my other comment [1] for the links to write-ups by people who actually know what they're talking about.

[0] https://dl.acm.org/doi/10.1145/3243176.3243195

[1] https://news.ycombinator.com/item?id=42059605


AFAIK the initial prototype, called nogil, was developed by Sam Gross, who also wrote a detailed article [0] about it.

He also met with the Python core team. Łukasz Langa's notes from that meeting [1] give a more high-level overview, so I think they are a good starting point.

[0] https://docs.google.com/document/u/0/d/18CXhDb1ygxg-YXNBJNzf...

[1] https://lukasz.langa.pl/5d044f91-49c1-4170-aed1-62b6763e6ad0...


The key enabling tech is thread-safe reference counting. Sam Gross solved many other problems to make it happen, but reference counting was one of the major blockers.


Is this implemented with lockless programming? Is that the reason for the performance drop in single-threaded code?

Does it eliminate the need for a GC pause completely?


You should probably just read the PEP, which explains these things:

https://peps.python.org/pep-0703/#reference-counting

If by GC you mean the cyclic GC, free-threaded Python currently stops all threads while the cyclic GC is running.
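
For anyone unsure what "cyclic GC" refers to here: reference counting alone cannot free objects that point at each other, so CPython has a separate collector for cycles. A minimal sketch using the standard gc and weakref modules (the Node class is just an illustration):

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


gc.disable()  # keep automatic collections from running mid-example

a, b = Node(), Node()
a.other, b.other = b, a   # reference cycle: each object keeps the other alive
probe = weakref.ref(a)    # lets us observe whether the object still exists

del a, b                  # refcounts never reach zero because of the cycle
alive_before = probe() is not None

gc.collect()              # the cycle collector finds and frees the pair
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)  # True False
```

It is this collection pass, not ordinary refcount-driven deallocation, that stops the other threads.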


Thank you :)


Lots of little locks littered all over the place.
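
As a rough picture of what "little locks" means, here is a toy container that carries its own lock, so threads working on different objects never contend with each other. The LockedList class is hypothetical and only illustrates per-object locking in general; it is not CPython's internal mechanism.

```python
import threading


class LockedList:
    """Toy list wrapper with one lock per instance (hypothetical example)."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()   # guards only this object

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def __len__(self):
        with self._lock:
            return len(self._items)


def fill(target, count):
    for i in range(count):
        target.append(i)


a, b = LockedList(), LockedList()
# Two threads write to each list; threads on `a` never wait for threads on `b`.
threads = [threading.Thread(target=fill, args=(lst, 1000)) for lst in (a, b, a, b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(a), len(b))  # 2000 2000
```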



