Removing GIL naively would decrease single thread performance. Every project aim...

rich_sasha · on May 18, 2022

Sam Gross worked on this; it made the rounds recently in the Python world: https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsD...

It's a fork of Python 3.9 that takes out the GIL and introduces optimisations to speed up both single- and multi-threaded execution (since the bar set by PSF is that a no-GIL implementation must be at least as fast as the GIL build on single-threaded programs). He ends up with a net 10% speed improvement.

If he applies these optimisations but keeps the GIL, the performance boost is even larger. So, depending on how you look at it, it's either:

- A bunch of optimisations, plus a GILectomy which slows Python down, or

- A bulk change that removes the GIL and speeds things up

Since these improvements were in a similar ballpark, my fear was that the optimisations would be taken from the branch while the GIL was left in place...

sumtechguy · on May 18, 2022

Removing the GIL is one idea (and, as you point out, not one that works very well). When optimizing, don't depend on 'that one cool trick' to fix everything. In this case it looks like they are removing extra work, and doing work once and keeping a copy around (caching).

riyadparvez · on May 18, 2022

Why would it decrease single-thread performance? How is Python different from other languages that support native, full-fledged multi-threading, e.g. Java, Go, C#?

kortex · on May 18, 2022

A big part is that Python uses reference counting for GC. Java, Go, C# all use tracing GC. Py_INCREF and Py_DECREF are responsible for incrementing and decrementing the reference count, and they are not atomic. The GIL ensures refcount safety by allowing only one thread at a time to change a refcount. The naive approach to parallelization would require locking every refcount inc/dec. There are some more sophisticated approaches (thanks to work by Sam Gross et al) that avoid a mutex hit for every inc/dec.
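
To make the "lock every inc/dec" idea concrete, here's a toy sketch in pure Python. It is not how CPython implements anything: `LockedRefCount` and `hammer` are names I made up, and a `threading.Lock` stands in for whatever mutex or atomic a real runtime would use. The point is just that every single refcount change pays for a lock acquire and release, even when only one thread is running.

```python
import threading


class LockedRefCount:
    """Toy object header whose refcount is guarded by a per-object mutex."""

    def __init__(self):
        self.count = 1  # the creating reference
        self._lock = threading.Lock()

    def incref(self):
        # Every increment takes the lock, contended or not.
        with self._lock:
            self.count += 1

    def decref(self):
        # Returns True when the last reference is dropped.
        with self._lock:
            self.count -= 1
            return self.count == 0


def hammer(obj, n):
    # Balanced inc/dec pairs, like a thread repeatedly borrowing the object.
    for _ in range(n):
        obj.incref()
        obj.decref()


obj = LockedRefCount()
threads = [threading.Thread(target=hammer, args=(obj, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = obj.count  # balanced pairs leave only the creating reference
```

The count stays correct, but the lock traffic is exactly the single-thread cost being discussed: code that never shares the object still pays for it on every reference change.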

Tracing GC does not run into this problem. Why Python doesn't use tracing GC is not something I am qualified to answer.

Sam Gross' work: https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsD...

The GIL code: https://github.com/python/cpython/blob/main/Python/ceval_gil...

Py_INCREF: https://github.com/python/cpython/blob/a4460f2eb8b9db46a9bce...

kaba0 · on May 18, 2022

I am by no means knowledgeable enough on the topic, but Swift has a similar problem domain, and afaik it only uses atomic ref counts for objects that “escape” from a given thread. Is there a reason something like that wouldn’t work for Python as well?
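
For what it's worth, the nogil design takes a related route called biased reference counting: the thread that created an object updates a plain local count, and only other threads go through a synchronized shared count. Here's a rough pure-Python sketch of that split. The class name and fields are hypothetical, a `threading.Lock` stands in for an atomic integer, and the hard parts (merging the two counts, deciding when to deallocate, what happens when the owner thread exits) are left out entirely.

```python
import threading


class BiasedRefCount:
    """Toy split refcount: cheap path for the owner, synchronized path for others."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local = 1                      # owner-only count, no locking
        self.shared = 0                     # other threads' net count
        self._shared_lock = threading.Lock()  # stand-in for an atomic int

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1  # fast path: no synchronization
        else:
            with self._shared_lock:
                self.shared += 1

    def decref(self):
        if threading.get_ident() == self.owner:
            self.local -= 1
        else:
            with self._shared_lock:
                self.shared -= 1

    def total(self):
        # Simplification: a real implementation cannot just read both fields.
        with self._shared_lock:
            return self.local + self.shared


obj = BiasedRefCount()
for _ in range(3):
    obj.incref()  # owner thread: only `local` changes


def other_thread():
    obj.incref()
    obj.incref()
    obj.decref()  # non-owner thread: only `shared` changes


t = threading.Thread(target=other_thread)
t.start()
t.join()

counts = (obj.local, obj.shared, obj.total())
```

Objects that never leave their creating thread only ever touch `local`, which is why this kind of scheme can keep single-threaded code close to the non-atomic cost.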

adgjlsfhk1 · on May 18, 2022

Python made its C API visible, so things like reference counting are widely observed by C libraries that interop with Python. This makes changes much harder, since you can't change the implementation in ways that break what those programs rely on.