Make implicitly_convertable sub-interpreter and free-threading safe (#5777)

* Replace static bool with thread-specific-storage to make this code sub-interpreter and free-threading safe.

* Make sure there is only one tss value in existence for this.

The previous code had multiple (one for every type pair, as this is a template function), which may have posed a problem for some platforms.

* Make set_flag in implicitly_convertible() non-copyable/movable

set_flag is an RAII guard for a thread-specific reentrancy flag.
Copying or moving it would risk double-resetting or rearming the flag,
breaking the protection. Disable copy/move constructors and assignment
operators to make this explicit.

* Minor cleanup to avoid venturing into UB territory.

* Experiment: Disable `~thread_specific_storage()` body when using GraalPy.

* Try the suggestion to only call TSS_free if the python interpreter is still active.

* Add IsFinalizing check

* Put this back to having a per-template-instance static

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rwgkio@gmail.com>
This commit is contained in:
b-pass
2025-08-01 23:04:44 -04:00
committed by GitHub
parent 9af8adb712
commit 780ec11360
2 changed files with 23 additions and 11 deletions

View File

@@ -90,14 +90,21 @@ public:
}
~thread_specific_storage() {
// This destructor could be called *after* Py_Finalize(). That *SHOULD BE* fine. The
// following details what happens when PyThread_tss_free is called.
// PYBIND11_TLS_FREE is PyThread_tss_free on python 3.7+. On older python, it does
// This destructor is often called *after* Py_Finalize(). That *SHOULD BE* fine on most
// platforms. The following details what happens when PyThread_tss_free is called in
// CPython. PYBIND11_TLS_FREE is PyThread_tss_free on python 3.7+. On older python, it does
// nothing. PyThread_tss_free calls PyThread_tss_delete and PyMem_RawFree.
// PyThread_tss_delete just calls TlsFree (on Windows) or pthread_key_delete (on *NIX).
// Neither of those have anything to do with CPython internals. PyMem_RawFree *requires*
// that the `key` be allocated with the CPython allocator (as it is by
// PyThread_tss_create).
// However, in GraalPy (as of v24.2 or older), TSS is implemented by Java and this call
// requires a living Python interpreter.
#ifdef GRAALVM_PYTHON
if (!Py_IsInitialized() || _Py_IsFinalizing()) {
return;
}
#endif
PYBIND11_TLS_FREE(key_);
}

View File

@@ -3148,17 +3148,22 @@ typing::Iterator<ValueType> make_value_iterator(Type &value, Extra &&...extra) {
template <typename InputType, typename OutputType>
void implicitly_convertible() {
static int tss_sentinel_pointee = 1; // arbitrary value
struct set_flag {
bool &flag;
explicit set_flag(bool &flag_) : flag(flag_) { flag_ = true; }
~set_flag() { flag = false; }
thread_specific_storage<int> &flag;
explicit set_flag(thread_specific_storage<int> &flag_) : flag(flag_) {
flag = &tss_sentinel_pointee; // trick: the pointer itself is the sentinel
}
~set_flag() { flag.reset(nullptr); }
// Prevent copying/moving to ensure RAII guard is used safely
set_flag(const set_flag &) = delete;
set_flag(set_flag &&) = delete;
set_flag &operator=(const set_flag &) = delete;
set_flag &operator=(set_flag &&) = delete;
};
auto implicit_caster = [](PyObject *obj, PyTypeObject *type) -> PyObject * {
#ifdef Py_GIL_DISABLED
thread_local bool currently_used = false;
#else
static bool currently_used = false;
#endif
static thread_specific_storage<int> currently_used;
if (currently_used) { // implicit conversions are non-reentrant
return nullptr;
}