
In 1994 at the second WWW conference we presented "An API to Mosaic". It was TCL embedded inside the (only![1]) browser at the time - Mosaic. The functionality available was substantially similar to what Javascript ended up providing. We used it in our products especially for integrating help and preferences - for example HTML text could be describing color settings, you could click on one, select a colour from the chooser and the page and setting in our products would immediately update. In another demo we were able to print multiple pages of content from the start page, and got a standing ovation! There is an alternate universe where TCL could have become the browser language.

For those not familiar with TCL, the C API is flavoured like main(): callbacks take an argv-style array of strings and an argc count. TCL is stringly typed, which sounds bad, but the data comes from strings in the HTML and script blocks, and the page HTML is also text, so it fits nicely and the C callbacks are easy to write.

[1] Mosaic Netscape 0.9 was released the week before


Another excellent GUI is gitg. You can select specific lines for staging, but also for discarding. The latter is especially useful for temporary debug-only changes that you want to throw away.

Heap allocation also allows other things. For example stack frames (PyFrame) are also heap allocated. When there is an exception, the C stack gets unwound but the heap allocated frames are retained, forming the traceback. You can then examine the values of the local variables at each level of the traceback.
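A small standard-library illustration of that: the frames referenced by a traceback outlive the unwound C stack, so their local variables remain inspectable after the exception is caught.

```python
import sys

def inner():
    secret = 42
    raise ValueError("boom")

try:
    inner()
except ValueError:
    tb = sys.exc_info()[2]
    # Walk to the innermost frame; it is heap allocated, so it survives
    # the C stack unwind and still holds its local variables.
    while tb.tb_next:
        tb = tb.tb_next
    print(tb.tb_frame.f_locals["secret"])  # → 42
```

This is exactly what debuggers and tools like traceback formatters rely on.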

And it also allows async functions, since state is held off the C stack, so frames can be easily switched when returning to the event loop.
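A minimal sketch of that: each coroutine's frame lives on the heap, so the event loop can suspend one and resume another freely at each await point.

```python
import asyncio

async def worker(n):
    # The await suspends this frame; its state is held on the heap, not
    # the C stack, so the loop can switch to the other worker meanwhile.
    await asyncio.sleep(0)
    return n * 2

async def main():
    return await asyncio.gather(worker(1), worker(2))

results = asyncio.run(main())
print(results)  # → [2, 4]
```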

The other thing made easy is C extension authoring. Compile CPython without free lists and with an address sanitizer, and getting reference counting wrong shows up immediately.


Note that free threaded compatible doesn't necessarily mean the package supports free threading (concurrent execution), just that it can be loaded into a free threaded interpreter.

This is the case with my own package (apsw), which is on the hugovk list, and which will cause the GIL to be re-enabled if you load it into a free threaded Python. The reason I provide a binary wheel is so that you don't have to keep separate GIL and free threaded interpreters around. They have different ABIs, so you can't use extensions compiled against one with the other.

Free threading is at the beginning of its journey. There is a *lot* of work to do on all C code that works with Python objects, and the current documentation and tools are immature. It is especially the case that anyone doing concurrent mutation of Python objects can cause corruption and crashes if they try, and that more auditing and locking need to be done in the C code. Even modules in the standard library have only been partially updated.

You can see a lot of details and discussion in the comments at https://news.ycombinator.com/item?id=45633311


Why do you provide it at all then if it's not working as intended yet?


As I stated:

> so that you don't have to keep separate GIL full and free threaded interpreters around

It means the user doesn't have to keep two Pythons around, install packages in both of them, etc.

It is also possible with the free threaded Python to keep the GIL disabled even if a package such as mine says it needs the GIL. And my package will indeed work just fine, until you supply it with mutable data and concurrently modify it in another thread.


But the users install the free threaded python to do free threaded stuff. The second they use your package they have a GIL again, which entirely defeats the point.

Wouldn't it be much better to just not support it if it's not supported?


What is better:

1) Saying your package supports free threading, but it isn't safe - ie concurrent mutation can result in corruption and crashes

2) Allowing the package to be loaded into a free threaded Python, which immediately enables the GIL. Concurrent mutation does not result in corruption and crashes because of the GIL. The user doesn't have to maintain two Python installations. They can set the environment variable PYTHON_GIL=0 or start Python with -Xgil=0 which will keep the GIL disabled, and they will be fine if they avoid concurrent mutation.

I chose 2. The stdlib json package (along with many others) picked 1. Heck, I'll guarantee that most packages that picked 1 aren't 100% safe either, because doing the changes is hard work, *every* case has to be covered, and tools like thread sanitizers don't work.

The reason I chose 2 is because I care about data integrity. I will eventually reach 1, but only once I can be certain the code is correct.


3) Saying your package doesn't support free threading, instead of adding a GIL, and forcing the users to stick to regular Python.


You aren't forced to use a GIL as I keep stating. You can set an environment variable or a command line flag to Python and the GIL will remain disabled. My package will work just fine if you do that, unless you provide it with data you concurrently modify in which case you can get corruption and crashes.
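If you want to verify which mode you actually got, 3.13+ exposes a runtime check; a sketch, guarded since older interpreters lack the helper:

```python
import sys

# sys._is_gil_enabled() exists on 3.13+.  On a free threaded build
# started with PYTHON_GIL=0 it reports False; on a regular GIL build
# it always reports True.
check = getattr(sys, "_is_gil_enabled", None)
if check is not None:
    print("GIL enabled:", check())
else:
    print("no GIL status check before Python 3.13")
```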


Do you at least warn users? This sounds like madness.


Yes. The interpreter warns by default, and requires steps to disable the warning. My release notes say that the GIL will be enabled when the package is loaded.

Is it madness that other packages claim they support running without the GIL, yet it is possible to cause corruption and crashes just by writing concurrent Python code? That is the case with the standard library. Compiler thread sanitizers don't work with free threaded Python. Diligent code inspection by humans is the only way to update C code so far.

Free threading is at the beginning of the project. It works. You get slowdowns in single threaded performance due to extra locking, and speedups in concurrent performance due to threading. But it is possible to cause corruption and crashes via Python code. Don't expose it to untrusted data and code.

But do investigate it and see what works well and what doesn't. See what code patterns are now possible. Help improve the tools and documentation. It had to start somewhere, and the current state is somewhere.


I think what you are doing is hiding problems. I think crashes and bugs are preferable at this point so that the issues get found. People who want the safe option will run the regular Python.


I provide 3 options:

1) Use regular GIL Python and you get the highest levels of integrity and correctness of operation of my package

2) Use a free threaded Python, the GIL will be enabled at load time, and you get the highest levels of integrity and correctness

3) Use a free threaded Python, and set $PYTHON_GIL=0 or start with -Xgil=0 to keep the GIL disabled, and providing you do not do concurrent mutation of data provided to my package, you get the highest levels of integrity and correctness

BTW I did not randomly choose to provide the free threaded builds. I specifically asked the setuptools maintainers (under the Python Packaging Authority) how to prevent free threaded builds for PyPI. They encouraged me to do the free threaded builds so that a user doesn't have to maintain parallel regular and free threaded Python installations. And it allows option 3 above.


C code needs to be updated to be safe in a GIL free execution environment. It is a lot of work! The pervasive problem is that mutable data structures (lists, dicts etc) could change at any arbitrary point while the C code is working with them, and reference counts could drop to zero if *anyone* is using a borrowed reference (common for performance in CPython APIs). Previously the GIL protected where those changes could happen. In simple cases the fix is adding a critical section, but often there are multiple data structures in play. As an example these are the changes that had to be done to the standard library json module:

https://github.com/python/cpython/pull/119438/files#diff-efe...

This is how much of the standard library has been audited:

https://github.com/python/cpython/issues/116738

The json changes above are in Python 3.15, not the just released 3.14.

The consequences of the C changes not being made are crashes and corruption if unexpected mutation or object freeing happens. Web services are exposed to adversaries so be *very* careful.

It would be a big help if CPython released a tool that could at least scan a C code base to detect free threaded issues, and ideally verify it is correct.


I think Java got this mostly right. On the threading front, very little is thread-safe or atomic (x += 1 is not thread-safe), so as soon as you expose something to threads, you have to think about safe access. For interacting with C code, your choices are either shared buffers or copying data between C and Java. It's painful, but it's needed for memory safety.


The core Python data structures are atomic to Python developers. eg there is no way you can corrupt a list or dictionary no matter how much concurrency you try to use. This was traditionally done under the protection of the global interpreter lock which ensured that only one piece of C code at a time was operating with the internals of those objects. C code can also release the GIL eg during I/O, or operations in other libraries that aren't interacting with Python objects, allowing concurrency.
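For example, list.append is atomic at the Python level; hammering one list from several threads never corrupts it or drops an element, with or without the GIL:

```python
import threading

items = []

def append_many(count):
    for i in range(count):
        # Each append is atomic: protected by the GIL on regular builds,
        # or by a per-object lock on free threaded builds.
        items.append(i)

threads = [threading.Thread(target=append_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(items))  # → 40000, never fewer, never a corrupted list
```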

The free threaded implementation adds what amounts to individual object locks at the C level (critical sections). This still means developers writing Python code can do whatever they want, and they will not experience corruption or crashes. The base objects have all been updated.

Python is popular because of many extensions written in C, including many in the standard library. Every single piece of that code must be updated to operate correctly in free threaded mode. That is a lot of work and is still in progress in the standard library. But in order to make the free threaded interpreter useful at this point, some have been marked as free thread safe, when that is not the case.


So it's the worst of all possible worlds then. It has the poorest performance due to forced locking even when not necessary, and if you load a library in another language (C), then you can still get corruption. If you really care about performance, probably best to avoid Python entirely, even when it's compiled like it is in CPython.

PS For extra fun, learn what the LD_PRELOAD environment variable does and how it can be used to abuse CPython (or other things that dynamically load shared objects).


It is multiple fine grained locking versus a single global lock. The latter lets you do less locking, but only have a single thread of execution at a time. The former requires more locking but allows multiple concurrent threads of execution. There is no free lunch. But hardware has become parallel so something has to be done to take advantage of that. The default Python remains the GIL version.

The locking is all about reading and writing Python objects. It is not applicable to outside things like external libraries. Python objects are implemented in C code, but Python users do not need to know or care about that.

As a Python user you cannot corrupt or crash things by code you write no matter how hard you try with mutation and concurrency. The locking ensures that. Another way of looking at Python is that it is a friendly syntax for calling code written in C, and that is why people use it - the C code can be where all the performance is, while retaining the ergonomic access.

C code has to opt in to free threading - see my response to this comment

https://news.ycombinator.com/item?id=45706331

It is true that more fine grained locking can end up being done than is strictly necessary, but user's code is loaded at runtime, so you don't know in advance what could be omitted. And this is the beginning of the project - things will get better.

Aside: Yes you can use ctypes to crash things, other compiled languages can be used, concurrency is hard


It depends on how you define "corruption". You can't get a torn read or write, or mess up a collection to the point where attempts to use it will segfault, sure. You can still end up with corrupt data in a sense of not upholding the expected logic invariants, which is to say, it's still corrupt for any practical purpose (and may in turn lead to taking code paths that are not supposed to ever happen etc).


A library written in another language would have a Python extension module wrapping it, which would still hold the GIL for the duration of the native call (it can be released, but this is opt-in not opt-out), so that is usually not the issue with this arrangement.

The bigger problem is that it teaches people dangerously misguided notions such as "I don't need to synchronize if I work with built-in Python collections". Which, of course, is only true if a single guaranteed-atomic operation on the collection actually corresponds to a single logical atomic operation in your algorithm. What often happens is people start writing code without locks and it works, so they keep doing it until at some point they do something that actually requires locking (like atomic remove from one collection & add to another) without realizing that they have crossed a line.
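A sketch of that crossed line: every individual dict operation here is atomic, but the read-modify-write sequence as a whole is not, so updates can be silently lost without any crash or visible corruption (whether losses occur on a given run is timing dependent):

```python
import sys
import threading

sys.setswitchinterval(1e-6)  # encourage frequent thread switches

counter = {"n": 0}

def bump(times):
    for _ in range(times):
        # Three separate atomic steps: read counter["n"], add 1, store it
        # back.  A switch between them loses the other thread's update.
        counter["n"] += 1

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The dict itself is intact, but the total may be below 400000.
print(counter["n"])
```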

Interestingly, we've been there before, multiple times even. The original design of Java collections entailed implicit locking on every operation, with the same exact outcome. Then .NET copied that design in its own collections. Both frameworks dropped it pretty fast, though - Java in v1.2 and .NET in v2.0. But, of course, they could do it because the locking was already specific to collections - it wasn't a global lock used for literally every language object, as in Python.


> If you really care about performance, probably best to avoid Python entirely

This has been true forever. Nothing more needs to be said. Please, avoid Python.

On the other hand, I’ve never had issues with Python performance, in 20 years of using it, for all the reasons that have been beaten to death.

It’s great that some people want to do some crazy stuff to CPython, but honestly, don’t hold your breath. Please don’t use Python if Python interpreter performance is your top concern.


It’s another step in the right direction. These things take time.


Arguably, it's a step in the wrong direction. "Share memory by communicating" is already doable in Python with Pipe() and Queue(), and sidesteps the issue entirely.


Is compound assignment atomic in any major language?


Python and Javascript (in the browser) due to their single threaded nature. C++ too as long as you have a std::atomic on the left hand side (since they overload the operator).


it has been in Python due to the GIL.


It's not atomic even with the GIL, though: another thread can run in between the bytecode's load and increment, right?

The GIL's guarantees didn't extend to this.


There is NB_INPLACE_ADD... but I'm struggling to find enough details to be truly confident :\ possibly its existence is misleading other people (and thus me) into thinking += is a single operation in bytecode.

Or, on further reading, maybe it applies to anything that implements `__iadd__` in C. Which does not appear to include native longs: https://github.com/python/cpython/blob/main/Objects/longobje...
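The dis module settles it: += compiles to a separate load, add, and store, with room for a thread switch between each (opcode names vary by version; 3.11+ shows BINARY_OP where older interpreters show INPLACE_ADD):

```python
import dis

def inc(x):
    x += 1
    return x

# += is not one opcode: LOAD_FAST x, LOAD_CONST 1, an in-place add,
# then STORE_FAST x back into the local.
ops = [ins.opname for ins in dis.Bytecode(inc)]
print(ops)
```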


no... but some languages may disallow simultaneously holding a reference in different execution threads


> x += 1 is not thread-safe

Nit, that's true iff x is a primitive without the volatile modifier. That's not true for a volatile primitive.


Even with volatile it’s a load and then a store no? It may not be undefined behaviour, but I don’t think it will be atomic.


You're correct. If you have:

  public int someField;
 
  public void inc() {
    someField += 1;
  }
that still compiles down to:

  GETFIELD [someField]
  ICONST_1
  IADD
  PUTFIELD [someField]
whether 'someField' is volatile or not. The volatile just affects the load/store semantics of the GETFIELD/PUTFIELD ops. For atomic increment you have to go through something like AtomicInteger that will internally use an Unsafe instance to ensure it emits a platform-specific atomic increment instruction.


> It would be a big help if CPython released a tool that could at least scan a C code base to detect free threaded issues, and ideally verify it is correct.

Create or extend a list of answers to:

What heuristics predict that code will fail in CPython's nogil "free threaded" mode?


Some of that is already around, but scattered across multiple locations. For example there is a list in the Python doc:

https://docs.python.org/3/howto/free-threading-extensions.ht...

And a dedicated web site:

https://py-free-threading.github.io/

But as an example neither includes PySequence_Fast, which is in the json.c changes I pointed to. The folks doing the auditing of the stdlib do have an idea of what they are looking for, and so would be best suited to keep a list (and tool) up to date with what is needed.


A list of Issue and PR URLs that identify and fix free threading issues would likely also be of use for building a 2to3-like tool to lint and fix C extensions to work with CPython free threading nogil mode


> “at least scan a C code base to detect free threaded issues”

if such a thing were possible, thread coordination would not have those issues in the first place


Some examples of what it could do when using the C Python APIs:

* Point out using APIs that return borrowed references

* Suggest assertions that critical sections are held when operating on objects

* Suggest alternate APIs

* Recognise code patterns that are similar to those done during the stdlib auditing work

The compiler thread sanitizers didn't work the last time I checked - so get them working.

Edit: A good example of what can be done is Coccinelle used in the Linux kernel which can detect problematic code (locking is way more complex!) as well as apply source transformations. https://www.kernel.org/doc/html/v6.17/dev-tools/coccinelle.h...


I agree and honestly it may as well be considered a form of ABI incompatibility. They should make this explicit such that existing C extensions need to be updated to use some new API call for initialization to flag that they are GILless-ready, so that older extensions cannot even successfully be loaded when GIL is disabled.


This has already been done. There is a 't' suffix in the ABI tag.

You have to explicitly compile the extension against a free threaded interpreter in order to get that ABI tag in your extension and even be able to load the extension. The extension then has to opt-in to free threading in its initialization.

If it does not opt-in then a message appears saying the GIL has been enabled, and the interpreter continues to run with the GIL.

This may seem a little strange but is helpful. It means the person running Python doesn't have to keep regular and free threaded Python around, and duplicate sets of extensions etc. They can just have the free threaded one, anything loaded that requires the GIL gives you the normal Python behaviour.
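You can see which ABI your interpreter uses from sysconfig; a sketch (the exact suffix varies by platform and version):

```python
import sysconfig

# e.g. '.cpython-314t-x86_64-linux-gnu.so' on a free threaded build,
# '.cpython-314-x86_64-linux-gnu.so' on a regular one - note the 't'.
suffix = sysconfig.get_config_var("EXT_SUFFIX")
print(suffix)

# Py_GIL_DISABLED is 1 on free threaded builds, 0 or absent otherwise.
print("free threaded build:", bool(sysconfig.get_config_var("Py_GIL_DISABLED")))
```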

What is a little more problematic is that some of the standard library is marked as supporting free threading, even though they still have the audit and update work outstanding.

Also the last time I checked, the compiler thread sanitizers can't work with free threaded Python.


the problem with that is it affects the entire application and makes the whole thing free-threading incompatible.

it's quite possible to make a Python app that requires libraries A and B to be loadable into a free-threaded application, but which doesn't actually do any unsafe operations with them. we need to be able to let people load these libraries, but say: this thing may not be safe, add your own mutexes or whatever


Apparently using Linux does the trick too. I have no idea what technical limitation exists to prevent the code from working on Linux.


SQLite has a builtin session extension that can be used to record and replay groups of changes, with all the necessary handling. I don't necessarily recommend session as your solution, but it is at least a good idea to see how it compares to others.

https://sqlite.org/sessionintro.html

That provides a C level API. If you know Python and want to do some prototyping and exploration then you may find my SQLite wrapper useful as it supports the session extension. This is the example giving a feel for what it is like to use:

https://rogerbinns.github.io/apsw/example-session.html


Unless you compile SQLite yourself, you'll find the maximum mmap size is 2GB. ie even with your pragma above, only the first 2GB of the database are memory mapped. It is defined by the SQLITE_MAX_MMAP_SIZE compile time constant. You can use pragma compile_options to see what the value is.

https://sqlite.org/compile.html#max_mmap_size

Ubuntu system pragma compile_options:

    MAX_MMAP_SIZE=0x7fff0000
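You can check your own build the same way from Python's bundled sqlite3 module (an option only appears in the list if it was explicitly set at compile time):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# PRAGMA compile_options reports the compile-time settings of the
# linked SQLite library; filter for the mmap ceiling.
opts = [row[0] for row in con.execute("PRAGMA compile_options")]
print([o for o in opts if "MMAP" in o])
```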


That seems like a holdover from 32-bit days. I wonder why this is still the default.


SQLite has 32 bit limits. For example the largest string or blob it can store is 2GB. That could only be addressed by an incompatible file format change. Many APIs also use int in places, again making the limits 32 bit, although there are also a smattering of 64 bit APIs.
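The per-value ceiling can be queried at runtime; a sketch using Connection.getlimit, which needs Python 3.11+ and so is guarded here:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLITE_LIMIT_LENGTH caps the largest string/blob in bytes; the hard
# upper bound is 2**31 - 1 regardless of configuration.
if hasattr(con, "getlimit"):
    print("max string/blob bytes:", con.getlimit(sqlite3.SQLITE_LIMIT_LENGTH))
else:
    print("Connection.getlimit needs Python 3.11+")
```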

Changing this default requires knowing it is a 64 bit platform when the C preprocessor runs, and would surprise anyone who was ok with the 2GB value.

There are two downsides of mmap - I/O errors can't be caught and handled by SQLite code, and buggy stray writes by other code in the process could corrupt the database.

It is best practice to directly include the SQLite amalgamation into your own projects, which allows you to control version updating and configuration.


>There are two downsides of mmap - I/O errors can't be caught and handled by SQLite code,

True. https://www.sqlite.org/mmap.html lists 3 other issues as well.

> and buggy stray writes by other code in the process could corrupt the database.

Not true: "SQLite uses a read-only memory map to prevent stray pointers in the application from overwriting and corrupting the database file."


All great points. Thank you!


The general test suite is not proprietary, and is a standard part of the code. You can run make test. It uses TCL to run the testing, and covers virtually everything.

There is a separate TH3 test suite which is proprietary. It generates C code for the tests so you can run the testing in embedded and similar environments, as well as get coverage of more obscure test cases.

https://sqlite.org/th3.html


Why is that? Surely that leads to conversations with open source contributors like 'this fails the test suite, but I can't show you, please fix it'?


This isn't an issue as SQLite doesn't accept contributions because they don't want to risk someone submitting proprietary code and lying about its origin.

I've never understood why other large open-source projects are willing to accept contributions from just anyone. What's the plan when someone copy-pastes code from some proprietary codebase and the rights holder finds it?


Partly why they have CLAs I suppose?

If someone sells me something they stole, I'm not on the hook for the theft.


The "plan" is to take out the contaminated code and rewrite it.


If the rights holder is particularly litigious then I could see them suing even if you agreed to take out their code, under the argument that you've distributed it and profited from it. I don't know if there have been any cases of this historically, but I'd be surprised if there haven't been.


Every open source project has the possibility of litigation. Can't always live in fear of the bogeyman


The same issue is present with the use of LLMs. Are you absolutely sure it didn't just repeat some copyrighted code at you?


SQLite doesn’t accept contributions


I'd love to see an analysis of byte ordering impact on CPU implementation. Does little vs big endian make any difference to the complexity of the algorithms and circuits?

