1653 Commits

Author SHA1 Message Date
Itamar Oren
0a45af2531 gh-5991: Fix segfault during finalization related to function_record (#6010)
* gh-5991: Fix segfault during finalization related to function_record

This patch was developed with assistance from Claude Code Opus 4.6

Here's Claude's explanation of the crash mechanism, with some reasoning about why the crash is difficult to reproduce:

`tp_dealloc_impl` calls `cpp_function::destruct` which:
1. Calls `std::free()` on function_record string members (`name`, `doc`, `signature`)
2. Calls `arg.value.dec_ref()` on default argument values
3. Calls `delete rec` on the function_record

But it never calls `PyObject_Free(self)` or `Py_DECREF(Py_TYPE(self))`, which are
required for heap types.

During `_Py_Finalize`, final GC collects the heap types (which survive module dict
clearing via `tp_mro` self-references). This triggers a massive cascade:
`type_dealloc → property_dealloc → meth_dealloc → tp_dealloc_impl → destruct`.

At scale (~1,200+ function_records), the volume of `delete`/`free` calls corrupts
heap metadata, causing subsequent `std::free()` to receive garbage pointers → SEGV.

* Add detail::py_is_finalizing() wrapper to deduplicate version-guarded #ifdef blocks

Also fixes clang-tidy readability-implicit-bool-conversion warnings.

Made-with: Cursor

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2026-03-23 20:59:26 -07:00
Henry Schreiner
1c72409f7f chore: some minor CPython API cleanup (#6005)
* chore: use PyType_GetFlags

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>

* chore: use public VectorCall in 3.9+

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>

---------

Signed-off-by: Henry Schreiner <henryfs@princeton.edu>
2026-03-22 14:27:20 -07:00
pre-commit-ci[bot]
4d51aefc2c chore(deps): update pre-commit hooks (#6002)
* chore(deps): update pre-commit hooks

updates:
- [github.com/pre-commit/mirrors-clang-format: v21.1.8 → v22.1.0](https://github.com/pre-commit/mirrors-clang-format/compare/v21.1.8...v22.1.0)
- [github.com/astral-sh/ruff-pre-commit: v0.14.14 → v0.15.4](https://github.com/astral-sh/ruff-pre-commit/compare/v0.14.14...v0.15.4)
- [github.com/adhtruong/mirrors-typos: v1.42.3 → v1.44.0](https://github.com/adhtruong/mirrors-typos/compare/v1.42.3...v1.44.0)
- [github.com/PyCQA/pylint: v4.0.4 → v4.0.5](https://github.com/PyCQA/pylint/compare/v4.0.4...v4.0.5)
- [github.com/python-jsonschema/check-jsonschema: 0.36.1 → 0.37.0](https://github.com/python-jsonschema/check-jsonschema/compare/0.36.1...0.37.0)

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2026-03-22 14:23:00 -07:00
Aaron Gokaslan
ac286c932f internals: optimize std::unordered_map internals with noexcept (#5960) 2026-02-16 23:00:56 -08:00
Scott Wolchok
ccb7129f54 Improve performance of enum_ operators by going back to specific implementation (#5887)
* Improve performance of enum_ operators by going back to specific implementation

test_enum needs a patch because ops are now overloaded and this affects their docstrings.

* outline call_impl to save on code size

This does cause more move constructions, as shown by the needed update to test_copy_move. Up to reviewers whether they want more code size or more moves.

* add function_ref.h to PYBIND11_HEADERS.

* Update test_copy_move tests with C++17 passing values just so we can see mostly-not-red tests

* Remove stray TODO

* fix clang-tidy

* fix clang-tidy again. add function_ref.h to test_files.py

* Add static assertion for function_ref lifetime safety in call_impl

Add a static_assert to document and enforce that function_ref is
trivially copyable, ensuring safe pass-by-value usage. This also
documents the lifetime safety guarantees: function_ref is created
from cap->f which lives in the capture object, and is only used
synchronously within call_impl without being stored beyond its scope.
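
The lifetime argument above can be illustrated with a minimal non-owning callable reference. This is only a sketch and not the contents of function_ref.h: `toy_function_ref` and `call_twice` are hypothetical names, and the real class may differ in its constructor constraints. The point is that the wrapper holds just a pointer to the callable plus a function pointer, so it is trivially copyable and cheap to pass by value, but it must not outlive the callable it refers to.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical, simplified non-owning reference to a callable.
template <typename Signature>
class toy_function_ref;

template <typename R, typename... Args>
class toy_function_ref<R(Args...)> {
public:
    // Binds to an lvalue callable; the callable must outlive this object.
    // The enable_if keeps this template from hijacking ordinary copies.
    template <typename F,
              typename = typename std::enable_if<
                  !std::is_same<typename std::decay<F>::type, toy_function_ref>::value>::type>
    toy_function_ref(F &f)
        : obj_(const_cast<void *>(static_cast<const void *>(std::addressof(f)))),
          thunk_([](void *obj, Args... args) -> R {
              return (*static_cast<F *>(obj))(std::forward<Args>(args)...);
          }) {}

    R operator()(Args... args) const { return thunk_(obj_, std::forward<Args>(args)...); }

private:
    void *obj_;                   // not owned
    R (*thunk_)(void *, Args...); // restores the erased type and calls it
};

// Two pointers and nothing else: safe and cheap to pass by value.
static_assert(std::is_trivially_copyable<toy_function_ref<int(int)>>::value,
              "toy_function_ref must stay trivially copyable");

// Uses the reference synchronously and never stores it beyond the call.
inline int call_twice(toy_function_ref<int(int)> f, int x) { return f(f(x)); }
```

A caller would keep the callable alive for the duration of the call, for example `auto add = [offset](int v) { return v + offset; }; call_twice(add, 1);`.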

* Add #undef cleanup for enum operator macros

Undefine all enum operator macros after their last use to prevent
macro pollution and follow the existing code pattern. This matches
the cleanup pattern used for the previous enum operator macros.
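
As a small illustration of the cleanup pattern described above, the sketch below defines a helper macro, uses it to stamp out operators, and undefines it right after its last use. The macro name `TOY_ENUM_OP_CMP` and the `Color` enum are hypothetical and are not the macros touched by this commit.

```cpp
#include <cassert>

enum class Color : int { Red = 1, Green = 2 };

// Hypothetical helper macro, only needed while the operators are defined.
#define TOY_ENUM_OP_CMP(op)                                                   \
    inline bool operator op(Color lhs, int rhs) {                             \
        return static_cast<int>(lhs) op rhs;                                  \
    }

TOY_ENUM_OP_CMP(==)
TOY_ENUM_OP_CMP(<)

// Undefine after the last use so the name cannot leak into code that
// includes this header.
#undef TOY_ENUM_OP_CMP

#ifdef TOY_ENUM_OP_CMP
#    error "helper macro leaked past its intended scope"
#endif
```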

* Rename PYBIND11_THROW to PYBIND11_ENUM_OP_THROW_TYPE_ERROR

Rename the macro to be more specific and avoid potential clashes with
public macros. The new name clearly indicates it's scoped to enum
operations and describes its purpose (throwing a type error).

* Clarify comments in function_ref.h

Replace vague comments about 'extensions to <functional>' and 'functions'
with a clearer description that this is a header-only class template
similar to std::function but with non-owning semantics. This makes it
clear that it's template-only and requires no additional library linking.

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2026-02-16 23:00:21 -08:00
Michael Carlstrom
e8e8d6ab22 Expand float and complex strict mode to allow ints and ints/float (for PEP 484 compatibility). (#5879)
* init

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* Add constexpr to is_floating_point check

This is known at compile time so it can be constexpr

* Allow noconvert float to accept int

* Update noconvert documentation

* Allow noconvert complex to accept int and float

* Add complex strict test

* style: pre-commit fixes

* Update unit tests so that int becomes double.

* style: pre-commit fixes

* remove if (constexpr)

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* fix spelling error

* bump order in #else

* Switch order in c++11 only section

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* ci: trigger build

* ci: trigger build

* Allow casting from float to int

The int type caster allows anything that implements __int__ with explicit exception of the python float. I can't see any reason for this.
This modifies the int casting behaviour to accept a float.
If the argument is marked as noconvert() it will only accept int.

* tests for py::float into int

* Update complex_cast tests

* Add SupportsIndex to int and float

* style: pre-commit fixes

* fix assert

* Update docs to mention other conversions

* fix pypy __index__ problems

* style: pre-commit fixes

* extract out PyLong_AsLong __index__ deprecation

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* style: pre-commit fixes

* Add back env.deprecated_call

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* remove note

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* remove untrue comment

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* fix noconvert_args

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* resolve error

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* Add comment

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* [skip ci]

tests: Add overload resolution test for float/int breaking change

Add test_overload_resolution_float_int() to explicitly test the breaking
change where int arguments now match float overloads when registered first.

The existing tests verify conversion behavior (int -> float, int/float -> complex)
but do not test overload resolution when both float and int overloads exist.
This test fills that gap by:

- Testing that float overload registered before int overload matches int(42)
- Testing strict mode (noconvert) overload resolution breaking change
- Testing complex overload resolution with int/float/complex overloads
- Documenting the breaking change explicitly

This complements existing tests which verify 'can it convert?' by testing
'which overload wins when multiple can convert?'

* Add test to verify that custom __index__ objects (not PyLong) work correctly with complex conversion. These should be consistent across CPython, PyPy, and GraalPy.

* Improve comment clarity for PyPy __index__ handling

Replace cryptic 'So: PYBIND11_INDEX_CHECK(src.ptr())' comment with
clearer explanation of the logic:

- Explains that we need to call PyNumber_Index explicitly on PyPy
  for non-PyLong objects
- Clarifies the relationship to the outer condition: when convert
  is false, we only reach this point if PYBIND11_INDEX_CHECK passed
  above

This makes the code more maintainable and easier to understand
during review.

* Undo inconsequential change to regex in test_enum.py

During merge, HEAD's regex pattern was kept, but master's version is preferred.
The order of ` ` and `\|` in the character class is arbitrary. Keep master's order
(already fixed in PR #5891; sorry I missed looking back here when working on 5891).

* test_methods_and_attributes.py: Restore existing `m.overload_order(1.1)` call and clearly explain the behavior change.

* Reject float → int conversion even in convert mode

Enabling implicit float → int conversion in convert mode causes
silent truncation (e.g., 1.9 → 1). This is dangerous because:

1. It's implicit - users don't expect truncation when calling functions
2. It's silent - no warning or error
3. It can hide bugs - precision loss is hard to detect

This change restores the explicit rejection of PyFloat_Check for integer
casters, even in convert mode. This is more in line with Python's behavior
where int(1.9) must be explicit.

Note that the int → float conversion in noconvert mode is preserved,
as that's a safe widening conversion.
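
The asymmetry described above can be modelled with a toy loader pair. This is a sketch under stated assumptions, not the code in cast.h: `toy_value`, `load_int`, and `load_float` are hypothetical, and the model only distinguishes Python ints from Python floats, ignoring `__index__` and `__int__` objects.

```cpp
#include <cassert>

// Hypothetical stand-in for a Python value: either an int or a float.
struct toy_value {
    bool is_float;
    long i;
    double f;
};

inline toy_value make_int(long v) { return {false, v, 0.0}; }
inline toy_value make_float(double v) { return {true, 0, v}; }

// Integer loader: a float is rejected whether or not `convert` is set,
// so 1.9 can never be silently truncated to 1.
inline bool load_int(const toy_value &src, bool convert, long &out) {
    (void) convert; // in this toy model the flag cannot re-enable float -> int
    if (src.is_float) {
        return false;
    }
    out = src.i;
    return true;
}

// Float loader: an int is accepted even in noconvert mode, because
// widening int -> float is treated as safe.
inline bool load_float(const toy_value &src, bool convert, double &out) {
    (void) convert;
    out = src.is_float ? src.f : static_cast<double>(src.i);
    return true;
}
```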

* Revert test changes that sidestepped implicit float→int conversion

This reverts all test modifications that were made to accommodate
implicit float→int conversion in convert mode. With the production
code change that explicitly rejects float→int conversion even in
convert mode, these test workarounds are no longer needed.

Changes reverted:
- test_builtin_casters.py: Restored cant_convert(3.14159) and
  np.float32 conversion with deprecated_call wrapper
- test_custom_type_casters.py: Restored TypeError expectation for
  m.ints_preferred(4.0)
- test_methods_and_attributes.py: Restored TypeError expectation
  for m.overload_order(1.1)
- test_stl.py: Restored float literals (2.0) that were replaced with
  strings to avoid conversion
- test_factory_constructors.py: Restored original constructor calls
  that were modified to avoid float→int conversion

Also removes the unused avoid_PyLong_AsLong_deprecation fixture
and related TypeVar imports, as all uses were removed.

* Replace env.deprecated_call() with pytest.deprecated_call()

The env.deprecated_call() function was removed, but two test cases
still reference it. Replace with pytest.deprecated_call(), which is
the standard pytest context manager for handling deprecation warnings.

Since we already require pytest>=6 (see tests/requirements.txt), the
compatibility function is obsolete and pytest.deprecated_call() is
available.

* Update test expectations for swapped NoisyAlloc overloads

PR 5879 swapped the order of NoisyAlloc constructor overloads:
- (int i, double) is now placement new (comes first)
- (double d, double) is now factory pointer (comes second)

This swap is necessary because pybind11 tries overloads in order
until one matches. With int → float conversion now allowed:

- create_and_destroy(4, 0.5): Without the swap, (double d, double)
  would match first (since int → double conversion is allowed),
  bypassing the more specific (int i, double) overload. With the
  swap, (int i, double) matches first (exact match), which is
  correct.

- create_and_destroy(3.5, 4.5): (int i, double) fails (float → int
  is rejected), then (double d, double) matches, which is correct.

The swap ensures exact int matches are preferred over double matches
when an int is provided, which is the expected overload resolution
behavior.

Update the test expectations to match the new overload resolution
order.
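
The effect of registration order can be sketched with a toy dispatcher that, like the description above, tries overloads in order and returns the first match. Everything here is hypothetical (`toy_value`, `toy_overload`, `resolve`); it models only the rule "int converts to double, float does not convert to int".

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a Python argument: an int or a float.
struct toy_value {
    bool is_float;
    double number;
};

struct toy_overload {
    std::string name;
    std::function<bool(const toy_value &)> accepts;
};

// First overload whose loader accepts the argument wins.
inline std::string resolve(const std::vector<toy_overload> &overloads, const toy_value &arg) {
    for (const auto &ov : overloads) {
        if (ov.accepts(arg)) {
            return ov.name;
        }
    }
    return "TypeError";
}

inline bool accepts_int(const toy_value &v) { return !v.is_float; } // float -> int rejected
inline bool accepts_double(const toy_value &) { return true; }      // int -> double allowed

inline std::vector<toy_overload> int_first() {
    std::vector<toy_overload> v;
    v.push_back(toy_overload{"(int, double)", accepts_int});
    v.push_back(toy_overload{"(double, double)", accepts_double});
    return v;
}

inline std::vector<toy_overload> double_first() {
    std::vector<toy_overload> v;
    v.push_back(toy_overload{"(double, double)", accepts_double});
    v.push_back(toy_overload{"(int, double)", accepts_int});
    return v;
}
```

With `double_first()`, an int argument never reaches the `(int, double)` overload, which is the situation the swap avoids.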

* Resolve clang-tidy error:

/__w/pybind11/pybind11/include/pybind11/cast.h:253:46: error: repeated branch body in conditional chain [bugprone-branch-clone,-warnings-as-errors]
  253 |         } else if (PyFloat_Check(src.ptr())) {
      |                                              ^
/__w/pybind11/pybind11/include/pybind11/cast.h:258:10: note: end of the original
  258 |         } else if (convert || PYBIND11_LONG_CHECK(src.ptr()) || PYBIND11_INDEX_CHECK(src.ptr())) {
      |          ^
/__w/pybind11/pybind11/include/pybind11/cast.h:283:16: note: clone 1 starts here
  283 |         } else {
      |                ^

* Add test coverage for __index__ and __int__ edge cases: incorrectly returning float

These tests ensure that:
- Invalid return types (floats) are properly rejected
- The fallback from __index__ to __int__ works correctly in convert mode
- noconvert mode correctly prevents fallback when __index__ fails

* Minor comment-only changes: add PR number, for easy future reference

* Ensure we are not leaking a Python error if something is wrong elsewhere (e.g. UB, or bug in Python beta testing).

See also: https://github.com/pybind/pybind11/pull/5879#issuecomment-3521099331

* [skip ci] Bump PYBIND11_INTERNALS_VERSION to 12 (for PRs 5879, 5887, 5960)

---------

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
Co-authored-by: gentlegiantJGC <gentlegiantJGC@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2026-02-16 23:00:01 -08:00
Ralf W. Grosse-Kunstleve
2448bc5853 [skip ci] Bump version to v3.1.0a0 (#5987) 2026-02-16 21:12:56 -08:00
Ralf W. Grosse-Kunstleve
45fab4087e Update version number to v3.0.2 (final) and set release date in changelog.md to February 16, 2026 (#5985) 2026-02-16 20:31:46 -08:00
Xuehai Pan
3ae5a173c5 Add fallback implementation of PyCriticalSection_BeginMutex for Python 3.13t (#5981)
* Add fallback implementation of `PyCriticalSection_BeginMutex` for Python 3.13t

* Add comment for Python version

* Use `_PyCriticalSection_BeginSlow`

* Add forward declaration

* Fix forward declaration

* Remove always true condition `defined(PY_VERSION_HEX)`

* Detect musllinux

* Add manylinux test

* Use direct mutex locking for Python 3.13t

`_PyCriticalSection_BeginSlow` is a private CPython function not exported
on Linux. For Python < 3.14.0rc1, use direct `mutex.lock()`/`mutex.unlock()`
instead of critical section APIs.

* Empty commit to trigger CI

* Empty commit to trigger CI

* Empty commit to trigger CI

* Run apt update before apt install

* Remove unnecessary prefix

* Add manylinux test with Python 3.13t

* Simplify pycritical_section with std::unique_lock fallback for Python < 3.14

* Fix potential deadlock in make_iterator_impl for Python 3.13t

Refactor pycritical_section into a unified class with internal version
checks instead of using a type alias fallback. Skip locking in
make_iterator_impl for Python < 3.14.0rc1 to avoid deadlock during
type registration, as pycritical_section cannot release the mutex
during Python callbacks without PyCriticalSection_BeginMutex.

* Add reference for xfail message
2026-02-09 21:14:07 -08:00
Daniel Simon
5f2c678916 Add helpers to array that return the size and strides as a std::span (#5974)
* Add helper functions to pybind11::array to return the shape and strides as a std::span. These functions are hidden with macros unless PYBIND11_CPP20 is defined and the <span> include has been found.

* style: pre-commit fixes

* tests: Add unit tests for shape_span() and strides_span()

Add comprehensive unit tests for the new std::span helper functions:
- Test 0D, 1D, 2D, and 3D arrays
- Verify spans match regular shape()/strides() methods
- Test that spans can be used to construct new arrays
- Tests are conditionally compiled only when PYBIND11_HAS_SPAN is defined

* Use __cpp_lib_span feature test macro instead of __has_include

Replace __has_include(<span>) check with __cpp_lib_span feature test macro
to resolve ambiguity where some pre-C++20 systems might have a global
header called <span> that isn't the C++20 std::span.

The check is moved after <version> is included, consistent with how
__cpp_lib_char8_t is handled.
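
A minimal sketch of the feature-test approach follows. It is not the pybind11 header: `toy_array`, `TOY_HAS_SPAN`, and `element_count` are hypothetical names chosen for illustration. The span accessor only exists when the standard library reports `__cpp_lib_span`, and the code falls back to pointer-plus-size access otherwise, so it compiles in pre-C++20 modes as well.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// <version> defines the library feature-test macros when it is available.
#if defined(__has_include)
#    if __has_include(<version>)
#        include <version>
#    endif
#endif

// Only trust <span> if the library itself says std::span is provided.
#if defined(__cpp_lib_span)
#    include <span>
#    define TOY_HAS_SPAN 1
#endif

class toy_array {
public:
    explicit toy_array(std::vector<std::ptrdiff_t> shape) : shape_(std::move(shape)) {}

    const std::ptrdiff_t *shape() const { return shape_.data(); }
    std::size_t ndim() const { return shape_.size(); }

#ifdef TOY_HAS_SPAN
    // Non-owning view over the same storage that shape() points to.
    std::span<const std::ptrdiff_t> shape_span() const {
        return {shape_.data(), shape_.size()};
    }
#endif

private:
    std::vector<std::ptrdiff_t> shape_;
};

// Product of all extents; uses the span accessor when it exists.
inline std::ptrdiff_t element_count(const toy_array &a) {
    std::ptrdiff_t n = 1;
#ifdef TOY_HAS_SPAN
    for (std::ptrdiff_t extent : a.shape_span()) {
        n *= extent;
    }
#else
    for (std::size_t i = 0; i < a.ndim(); ++i) {
        n *= a.shape()[i];
    }
#endif
    return n;
}
```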

Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix: Use py::ssize_t instead of ssize_t in span tests

On Windows/MSVC, ssize_t is not available in the standard namespace
without proper includes. Use py::ssize_t (the pybind11 typedef) instead
to ensure cross-platform compatibility.

Fixes compilation errors on:
- Windows/MSVC 2022 (C++20)
- GCC 10 (C++20)

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-08 15:04:46 -08:00
Sam Gross
4d7d02a8e5 Fix race condition with py::make_key_iterator in free threading (#5971)
* Fix race condition with py::make_key_iterator in free threading

The creation of the iterator class needs to be synchronized.

* style: pre-commit fixes

* Use PyCriticalSection_BeginMutex instead of recursive mutex

* style: pre-commit fixes

* Make pycritical_section non-copyable and non-movable

The pycritical_section class is a RAII wrapper that manages a Python
critical section lifecycle:
- Acquires the critical section in the constructor via
  PyCriticalSection_BeginMutex
- Releases it in the destructor via PyCriticalSection_End
- Holds a reference to a pymutex

Allowing copy or move operations would be dangerous:

1. Copy: Both the original and copied objects would call
   PyCriticalSection_End on the same PyCriticalSection object in their
   destructors, leading to double-unlock and undefined behavior.

2. Move: The moved-from object's destructor would still run and attempt
   to end the critical section, while the moved-to object would also try
   to end it, again causing double-unlock.

This follows the same pattern used by other RAII lock guards in the
codebase, such as gil_scoped_acquire and gil_scoped_release, which also
explicitly delete copy/move operations to prevent similar issues.

By explicitly deleting these operations, we prevent accidental misuse
and ensure the critical section is properly managed by a single RAII
object throughout its lifetime.
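
The reasoning above applies to any RAII lock guard. The sketch below uses `std::mutex` as a stand-in for the `PyCriticalSection_BeginMutex` / `PyCriticalSection_End` pair so that it compiles without Python headers; `toy_critical_section` and `guarded_increment` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Hypothetical RAII guard: acquire in the constructor, release in the
// destructor, and forbid copy/move so that exactly one object releases.
class toy_critical_section {
public:
    explicit toy_critical_section(std::mutex &m) : mutex_(m) { mutex_.lock(); }
    ~toy_critical_section() { mutex_.unlock(); }

    toy_critical_section(const toy_critical_section &) = delete;
    toy_critical_section &operator=(const toy_critical_section &) = delete;
    toy_critical_section(toy_critical_section &&) = delete;
    toy_critical_section &operator=(toy_critical_section &&) = delete;

private:
    std::mutex &mutex_;
};

static_assert(!std::is_copy_constructible<toy_critical_section>::value,
              "copying would unlock twice");
static_assert(!std::is_move_constructible<toy_critical_section>::value,
              "a moved-from guard would still unlock in its destructor");

// The guard is released when it goes out of scope at the end of the call.
inline int guarded_increment(std::mutex &m, int &counter) {
    toy_critical_section guard(m);
    return ++counter;
}
```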

* Drop Python 3.13t support from CI

Python 3.13t was experimental, while Python 3.14t is not. This PR
uses PyCriticalSection_BeginMutex which is only available in Python
3.14+, making Python 3.13t incompatible with the changes.

Removed all Python 3.13t CI jobs:
- ubuntu-latest, 3.13t (standard-large matrix)
- macos-15-intel, 3.13t (standard-large matrix)
- windows-latest, 3.13t (standard-large matrix)
- manylinux job testing 3.13t

This aligns with the decision to drop Python 3.13t support as
discussed in PR #5971.

* Add Python 3.13 (default) replacement jobs for removed 3.13t jobs

After removing Python 3.13t support (incompatible with PyCriticalSection_BeginMutex
which requires Python 3.14+), we're adding replacement jobs using Python 3.13
(default) to maintain test coverage in key dimensions:

1. ubuntu-latest, Python 3.13: C++20 + DISABLE_HANDLE_TYPE_NAME_DEFAULT_IMPLEMENTATION
   - Replaces: ubuntu-latest, 3.13t with same config
   - Maintains coverage for this specific configuration combination

2. macos-15-intel, Python 3.13: C++11
   - Replaces: macos-15-intel, 3.13t with same config
   - Maintains macOS coverage for Python 3.13

3. manylinux (musllinux), Python 3.13: GIL testing
   - Replaces: manylinux, 3.13t job
   - Maintains manylinux/musllinux container testing coverage

These additions are proposed to get feedback on which jobs should be kept
to maintain appropriate test coverage without the experimental 3.13t builds.

* ci: run in free-threading mode a bit more on 3.14

* Revert "ci: run in free-threading mode a bit more on 3.14"

This reverts commit 91189c9242.

Reason: https://github.com/pybind/pybind11/pull/5971#issuecomment-3831321903

* Reapply "ci: run in free-threading mode a bit more on 3.14"

This reverts commit f3197de975.

After #5972 is/was merged, tests should pass (already tested under #5980).

See also https://github.com/pybind/pybind11/pull/5972#discussion_r2752674989

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rwgkio@gmail.com>
2026-02-01 22:02:50 -08:00
Xuehai Pan
e7754de037 Revert internals destruction and add test for internals recreation (#5972)
* Bump internals version

* Prevent internals destruction before all pybind11 types are destroyed

* Use Py_XINCREF and Py_XDECREF

* Hold GIL before decref

* Use weakrefs

* Remove unused code

* Move code location

* Move code location

* Move code location

* Try add tests

* Fix PYTHONPATH

* Fix PYTHONPATH

* Skip tests for subprocess

* Revert to leak internals

* Revert to leak internals

* Revert "Revert to leak internals"

This reverts commit c5ec1cf886.
This reverts commit 72c2e0aa9b.

* Revert internals version bump

* Reapply to leak internals

This reverts commit 8f25a254e8.

* Add re-entrancy detection for internals creation

Prevent re-creation of internals after destruction during interpreter
shutdown. If pybind11 code runs after internals have been destroyed,
fail early with a clear error message instead of silently creating
new empty internals that would cause type lookup failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix C++11/C++14 support

* Add lock under multiple interpreters

* Try fix tests

* Try fix tests

* Try fix tests

* Update comments and assertion messages

* Update comments and assertion messages

* Update comments

* Update lock scope

* Use original pointer type for Windows

* Change hard error to warning

* Update lock scope

* Update lock scope to resolve deadlock

* Remove scope release of GIL

* Update comments

* Lock pp on reset

* Mark content created after assignment

* Update comments

* Simplify implementation

* Update lock scope when delete unique_ptr

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:54:05 -08:00
Franz Pöschel
5a6edc9998 Exclude MSVC up to 19.16 from using std::launder (#5968)
* Exclude further MSVC versions from std::launder

Versions 19.4, 19.5 and 19.6 now also excluded. Error seen with 19.6, error triggered by this commit:
57b9a0af81

_deps\fetchedpybind11-src\include\pybind11\pybind11.h(3008): fatal error C1001: An internal error has occurred in the compiler. [C:\projects\openpmd-api\build\openPMD.py.vcxproj]
  (compiler file 'd:\agent\_work\8\s\src\vctools\compiler\utc\src\p2\main.c', line 187)
   To work around this problem, try simplifying or changing the program near the locations listed above.
  Please choose the Technical Support command on the Visual C++
   Help menu, or open the Technical Support help file for more information

* Add minimal comment // See PR #5968

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2026-01-20 21:49:59 -08:00
b-pass
a8e223d0cd Directly check if/which interpreter is active before doing CLEAR in destructor (#5965)
* Directly check if/which interpreter is active before doing CLEAR in destructors.

Py_IsFinalizing only applies to the main interpreter.

* Backward compatibility fixes

* Make clang-tidy happy

* Add nullptr checks to istate as Cursor suggested
2026-01-20 06:51:25 -05:00
b-pass
da6e071084 Destruct internals during interpreter finalization (#5958)
* Add a shutdown method to internals.

shutdown can safely DECREF Python objects owned by the internals.

* Actually free internals during interpreter shutdown (instead of after)

* Make sure python is alive before DECREFing

If something triggers internals to be created during finalization, it might end up being destroyed after finalization. We don't want to do the DECREF at that point; we need the leaky behavior.

* make clang-tidy happy

* Check IsFinalizing and use Py_CLEAR, make capsule creation safe if the capsule already exists.

* oops, put TLS destructor back how it was.

* Oops, proper spelling of unstable _Py_IsFinalizing

* Add cleanup step to CI workflow

Added a step to clean out unused files to save space during CI.

* Accept suggested comment

* Avoid recreating internals during type deallocation at shutdown.

---------

Co-authored-by: Henry Schreiner <HenrySchreinerIII@gmail.com>
2026-01-18 13:24:34 -05:00
Xuehai Pan
cc551ada33 Appease MSVC Warning C4866: compiler may not enforce left-to-right evaluation order (#5955) 2026-01-10 10:46:01 -08:00
Xuehai Pan
d36f5dd5a4 Appease MSVC Warning C4866: compiler may not enforce left-to-right evaluation order (#5953)
* Appease MSVC Warning C4866: compiler may not enforce left-to-right evaluation order

* Remove const qualifier

* Reword comment to be self-explanatory without PR context

The previous comment referenced the MSVC warning but didn't explain
why the code is structured as two statements. The revised comment
clarifies the intent: fetching the value first ensures well-defined
evaluation order.
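
The two-statement shape can be illustrated generically. The code changed by this PR is not shown here, so the example below is an assumption-laden sketch with a hypothetical `register_name` helper: it stores the current table size under a new key, where evaluating the subscript first would change the size being read.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: assigns each new name the number of entries that
// existed before it was added.
inline std::size_t register_name(std::map<std::string, std::size_t> &table,
                                 const std::string &name) {
    // One-statement form, shown only as a comment:
    //     table[name] = table.size();
    // table[name] may insert an entry, so the result depends on which side
    // is evaluated first, and that was unspecified before C++17.
    const std::size_t index = table.size(); // fetch the value first
    table[name] = index;                    // then perform the insertion
    return index;
}
```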

* chore(deps): switch typos to mirror repo

Switch from crate-ci/typos to adhtruong/mirrors-typos because
pre-commit autoupdate confuses tags in the upstream repo, selecting
the mutable `v1` tag instead of pinned versions like `v1.41.0`.

See https://github.com/crate-ci/typos/issues/390

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2026-01-07 16:28:38 -08:00
T.Yamada
745e5688e7 Fix longdouble complex docstring to clongdouble (#5952) 2026-01-06 23:04:17 -08:00
b-pass
10f8708979 Change function calls to use vectorcall (#5948)
* Make argument_vector re-usable for other types.

* Attempt to collect args into array for vectorcall

* Revert "Attempt to collect args into array for vectorcall"

This reverts commit 418a034195.

* Implement vectorcall args collector

* pre-commit fixes

* Checkpoint in moving to METH_FASTCALL

* pre-commit fixes

* Use the names tuple directly, cleaner code and less reference counting

* Fix unit test, the code now holds more references

It cannot re-use the incoming tuple as before, because it is no longer a tuple at all.  So a new tuple must be created, which then holds references for each member.

* Make clangtidy happy

* Oops, _v is C++14

* style: pre-commit fixes

* Minor code cleanup

* Fix signed conversions

* Fix args expansion

This would be easier with `if constexpr`

* style: pre-commit fixes

* Code cleanup

* fix(tests): Install multiple-interpreter test modules into wheel

The `mod_per_interpreter_gil`, `mod_shared_interpreter_gil`, and
`mod_per_interpreter_gil_with_singleton` modules were being built
but not installed into the wheel when using scikit-build-core
(SKBUILD=true). This caused iOS (and potentially Android) CIBW
tests to fail with ModuleNotFoundError.

Root cause analysis:
- The main test targets have install() commands (line 531)
- The PYBIND11_MULTIPLE_INTERPRETERS_TEST_MODULES were missing
  equivalent install() commands
- For regular CMake builds, this wasn't a problem because
  LIBRARY_OUTPUT_DIRECTORY places the modules next to pybind11_tests
- For wheel builds, only targets with explicit install() commands
  are included in the wheel

This issue was latent until commit fee2527d changed the test imports
from `pytest.importorskip()` (graceful skip) to direct `import`
statements (hard failure), which exposed the missing modules.

Failing tests:
- test_multiple_interpreters.py::test_independent_subinterpreters
- test_multiple_interpreters.py::test_dependent_subinterpreters

Error: ModuleNotFoundError: No module named 'mod_per_interpreter_gil'

* tests: Pin numpy 2.4.0 for Python 3.14 CI tests

Add numpy==2.4.0 requirement for Python 3.14 (both default and
free-threaded builds). NumPy 2.4.0 is the first version to provide
official PyPI wheels for Python 3.14:

- numpy-2.4.0-cp314-cp314-manylinux_2_27_x86_64...whl (default)
- numpy-2.4.0-cp314-cp314t-manylinux_2_27_x86_64...whl (free-threaded)

Previously, CI was skipping all numpy-dependent tests for Python 3.14
because PIP_ONLY_BINARY was set and no wheels were available:

  SKIPPED [...] test_numpy_array.py:8: could not import 'numpy':
  No module named 'numpy'

With this change, the full numpy test suite will run on Python 3.14,
providing better test coverage for the newest Python version.

Note: Using exact pin (==2.4.0) rather than compatible release (~=2.4.0)
to ensure reproducible CI results with the first known-working version.

* tests: Add verbose flag to CIBW pytest command

Add `-v` to the pytest command in tests/pyproject.toml to help
diagnose hanging tests in CIBW jobs (particularly iOS).

This will show each test name as it runs, making it easier to
identify which specific test is hanging.

* tests: Skip subinterpreter tests on iOS, add pytest timeout

- Add `IOS` platform constant to `tests/env.py` for consistency with
  existing `ANDROID`, `LINUX`, `MACOS`, `WIN`, `FREEBSD` constants.

- Skip `test_multiple_interpreters.py` module on iOS. Subinterpreters
  are not supported in the iOS simulator environment. These tests were
  previously skipped implicitly because the modules weren't installed
  in the wheel; now that they are (commit 6ed6d5a8), we need an
  explicit skip.

- Change pytest timeout from 0 (disabled) to 120 seconds. This provides
  a safety net to catch hanging tests before the CI job times out after
  hours. Normal test runs complete in 33-55 seconds total (~1100 tests),
  so 120 seconds per test is very generous.

- Add `-v` flag for verbose output to help diagnose any future issues.

* More cleanups in argument vector, per comments.

* Per Cursor, move all versions to Vectorcall since it has been supported since 3.8.

This means getting rid of simple_collector, we can do the same with a constexpr if in the unpacking_collector.

* Switch to a bool vec for the used_kwargs flag...

This makes more sense and saves a sort, and the small_vector implementation means it will actually take less space than a vector of size_t elements.

The most common case is that all kwargs are used.
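
The flag-per-kwarg idea can be sketched as follows. This is not the pybind11 implementation: `unused_kwargs` is a hypothetical function, and `std::vector<bool>` stands in for the small_vector-based storage mentioned above. One flag is kept per keyword argument, in the same order as the names, so no sorting of consumed indices is needed to find the leftovers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the keyword names that no parameter consumed, in original order.
inline std::vector<std::string> unused_kwargs(const std::vector<std::string> &kwarg_names,
                                              const std::vector<std::string> &param_names) {
    std::vector<bool> used(kwarg_names.size(), false); // one flag per kwarg

    for (const auto &param : param_names) {
        for (std::size_t i = 0; i < kwarg_names.size(); ++i) {
            if (kwarg_names[i] == param) {
                used[i] = true;
            }
        }
    }

    std::vector<std::string> leftovers;
    for (std::size_t i = 0; i < kwarg_names.size(); ++i) {
        if (!used[i]) {
            leftovers.push_back(kwarg_names[i]);
        }
    }
    return leftovers;
}
```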

* Fix signedness for clang

* Another signedness issue

* tests: Disable pytest-timeout for Pyodide (no signal.setitimer)

Pyodide runs in a WebAssembly sandbox without POSIX signals, so
`signal.setitimer` is not available. This causes pytest-timeout to
crash with `AttributeError: module 'signal' has no attribute 'setitimer'`
when timeout > 0.

Override the test-command for Pyodide to keep timeout=0 (disabled).

* Combine temp storage and args into one vector

It's a good bit faster at the cost of this one scary reinterpret_cast.

* Phrasing

* Delete incorrect comment

At 6, the struct is 144 bytes (not 128 bytes as the comment said).

* Fix push_back

* Update push_back in argument_vector.h

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>

* style: pre-commit fixes

* Use real types for these instead of object

They can be null if you "steal" a null handle.

* refactor: Replace small_vector<object> with ref_small_vector for explicit ownership

Introduce `ref_small_vector` to manage PyObject* references in `unpacking_collector`,
replacing the previous `small_vector<object>` approach.

Primary goals:

1. **Maintainability**: The previous implementation relied on
   `sizeof(object) == sizeof(PyObject*)` and used a reinterpret_cast to
   pass the object array to vectorcall. This coupling to py::object's
   internal layout could break if someone adds a debug field or other
   member to py::handle/py::object in the future.

2. **Readability**: The new `push_back_steal()` vs `push_back_borrow()`
   API makes reference counting intent explicit at each call site,
   rather than relying on implicit py::object semantics.

3. **Intuitive code**: Storing `PyObject*` directly and passing it to
   `_PyObject_Vectorcall` without casts is straightforward and matches
   what the C API expects. No "scary" reinterpret_cast needed.

Additional benefits:
- `PyObject*` is trivially copyable, simplifying vector operations
- Batch decref in destructor (tight loop vs N individual object destructors)
- Self-documenting ownership semantics

Design consideration: We considered folding the ref-counting functionality
directly into `small_vector` via template specialization for `PyObject*`.
We decided against this because:
- It would give `small_vector<PyObject*, N>` a different interface than the
  generic `small_vector<T, N>` (steal/borrow vs push_back)
- Someone might want a non-ref-counting `small_vector<PyObject*, N>`
- The specialization behavior could surprise users expecting uniform semantics

A separate `ref_small_vector` type makes the ref-counting behavior explicit
and self-documenting, while keeping `small_vector` generic and predictable.
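
To illustrate the steal/borrow distinction described above, here is a minimal sketch. It is not the actual `ref_small_vector`: `FakeObject`, `incref`, `decref`, and `ref_vector_sketch` are hypothetical stand-ins so that the example compiles without Python headers, and a plain `std::vector` replaces the inline small-buffer storage.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for PyObject: only a reference count.
struct FakeObject {
    long refcnt = 1;
};

inline void incref(FakeObject *o) { if (o != nullptr) { ++o->refcnt; } }
inline void decref(FakeObject *o) { if (o != nullptr) { --o->refcnt; } }

// Owns exactly one reference per stored pointer and releases them all on destruction.
class ref_vector_sketch {
public:
    ref_vector_sketch() = default;
    ref_vector_sketch(const ref_vector_sketch &) = delete;
    ref_vector_sketch &operator=(const ref_vector_sketch &) = delete;

    // Takes over a reference the caller already owns (no incref).
    void push_back_steal(FakeObject *o) { items_.push_back(o); }

    // The caller keeps its reference; the vector acquires its own.
    void push_back_borrow(FakeObject *o) {
        incref(o);
        items_.push_back(o);
    }

    // Raw pointer array that can be handed to a C-style API without any cast.
    FakeObject *const *data() const { return items_.data(); }
    std::size_t size() const { return items_.size(); }

    ~ref_vector_sketch() {
        for (FakeObject *o : items_) { decref(o); } // batch release in one tight loop
    }

private:
    std::vector<FakeObject *> items_; // assumption: the real type uses small-buffer storage
};
```

Because the container stores raw pointers, `data()` already has the shape a C API expects, which is the point of goal 3 above.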

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2026-01-06 16:32:57 -05:00
Ralf W. Grosse-Kunstleve
b93c0f7ed8 Fix ambiguous str(handle) constructor for object-derived types (#5949)
* Fix ambiguous `str(handle)` constructor for object-derived types

Templatize `str(handle h)` with SFINAE to exclude types derived from
`object`, resolving ambiguity with `str(const object&)` when calling
`py::str()` on types like `kwargs`, `dict`, etc.

The template now only accepts `handle` or `PyObject*`, while all
`object`-derived types use the `str(const object&)` overload.
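
A minimal sketch of the overload split described above, assuming simplified stand-in types (`handle`, `object`, `dict`, and `str_sketch` here are illustrative and not the pybind11 classes). The template constructor is removed from overload resolution for anything derived from `object`, so those arguments can only select the `const object &` overload.

```cpp
#include <type_traits>

// Simplified stand-ins for the pybind11 class hierarchy.
struct handle {};
struct object : handle {};
struct dict : object {};

struct str_sketch {
    int chosen; // records which constructor ran: 1 = object overload, 2 = handle template

    // Overload A: every object-derived argument lands here.
    explicit str_sketch(const object &) : chosen(1) {}

    // Overload B: disabled via SFINAE for types derived from object,
    // so it only participates for a plain handle.
    template <typename T,
              typename std::enable_if<
                  !std::is_base_of<object, typename std::decay<T>::type>::value
                      && std::is_convertible<T, handle>::value,
                  int>::type = 0>
    explicit str_sketch(T &&) : chosen(2) {}
};

inline int overload_for_dict() { return str_sketch(dict{}).chosen; }

inline int overload_for_handle() {
    handle h;
    return str_sketch(h).chosen;
}
```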

* fix(tests): CIBW test fixes from b-pass→vectorcall branch

- Install multiple-interpreter test modules into wheel (CMakeLists.txt)
  The mod_per_interpreter_gil, mod_shared_interpreter_gil, and
  mod_per_interpreter_gil_with_singleton modules were being built
  but not installed into the wheel when using scikit-build-core.

- Pin numpy 2.4.0 for Python 3.14 CI tests (requirements.txt)
  NumPy 2.4.0 is the first version with official Python 3.14 wheels.

- Add IOS platform constant to tests/env.py

- Skip subinterpreter tests on iOS (test_multiple_interpreters.py)
  Subinterpreters are not supported in the iOS simulator environment.

- Enable pytest timeout of 120s for CIBW tests (pyproject.toml)
  Provides a safety net to catch hanging tests before CI job timeout.

- Disable pytest-timeout for Pyodide (no signal.setitimer)
  Pyodide runs in WebAssembly without POSIX signals.

- Add -v flag for verbose pytest output in CIBW tests
2025-12-30 04:54:44 -08:00
Xuehai Pan
fee2527dfa Fix concurrency consistency for internals_pp_manager under multiple-interpreters (#5947)
* Add per-interpreter storage for `gil_safe_call_once_and_store`

* Disable thread local cache for `internals_pp_manager`

* Disable thread local cache for `internals_pp_manager` for multi-interpreter only

* Use anonymous namespace to separate these type_ids from other tests with the same class names.

* style: pre-commit fixes

* Revert internals_pp_manager changes

* This is the crux of the fix for the subinterpreter_before_main failure.

The pre_init needs to check whether it is in a subinterpreter. But in 3.13+ this static initializer runs in the main interpreter, so we need to check this later, during the exec phase.

* Continue to do the ensure in both places, there might be a reason it was where it was...

Should not hurt anything to do it extra times here.

* Change get_num_interpreters_seen to a boolean flag instead.

The count itself was not used; it was only checked for > 1, which we now accomplish by setting the flag.

* Spelling typo

* Work around older python versions, only need this check for newish versions

* Add more comments for test case

* Add more comments for test case

* Stop traceback propagation

* Re-enable subinterpreter support on ubuntu 3.14 builds

Was disabled in e4873e8

* As suggested, don't use an anonymous namespace.

* Typo in test assert format string

* Use a more appropriate function name

* Fix mod_per_interpreter_gil* output directory on Windows/MSVC

On Windows with MSVC (multi-configuration generators), CMake uses
config-specific properties like LIBRARY_OUTPUT_DIRECTORY_DEBUG when
set, otherwise falls back to LIBRARY_OUTPUT_DIRECTORY/<Config>/.

The main test modules (pybind11_tests, etc.) correctly set both
LIBRARY_OUTPUT_DIRECTORY and the config-specific variants (lines
517-528), so they output directly to tests/.

However, the mod_per_interpreter_gil* modules only copied the base
LIBRARY_OUTPUT_DIRECTORY property, causing them to be placed in
tests/Debug/ instead of tests/.

This mismatch caused test_import_in_subinterpreter_concurrently and
related tests to fail with ModuleNotFoundError on Windows Python 3.14,
because the test code sets sys.path based on pybind11_tests.__file__
(which is in tests/) but tries to import mod_per_interpreter_gil_with_singleton
(which ended up in tests/Debug/).

This bug was previously masked by @pytest.mark.xfail decorators on
these tests. Now that the underlying "Duplicate C++ type registration"
issue is fixed and the xfails are removed, this path issue surfaced.

The fix mirrors the same pattern used for main test targets: also set
LIBRARY_OUTPUT_DIRECTORY_<CONFIG> for each configuration type.

* Remove unneeded `pytest.importorskip`

* Remove comment

---------

Co-authored-by: b-pass <b-pass@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2025-12-26 13:59:11 -05:00
Xuehai Pan
0057e4945d Add per-interpreter storage for gil_safe_call_once_and_store (#5933)
* Add new argument to `gil_safe_call_once_and_store::call_once_and_store_result`

* Add per-interpreter storage for `gil_safe_call_once_and_store`

* Make `~gil_safe_call_once_and_store` a no-op

* Fix C++11 compatibility

* Improve thread-safety and add default finalizer

* Try fix thread-safety

* Try fix thread-safety

* Add a warning comment

* Simplify `PYBIND11_INTERNALS_VERSION >= 12`

* Try fix thread-safety

* Try fix thread-safety

* Revert get_pp()

* Update comments

* Move call-once storage out of internals

* Revert internal version bump

* Cleanup outdated comments

* Move atomic_bool alias into pybind11::detail namespace

The `using atomic_bool = ...` declaration was at global scope,
polluting the global namespace. Move it into pybind11::detail
to avoid potential conflicts with user code.

* Add explicit #include <unordered_map> for subinterpreter support

The subinterpreter branch uses std::unordered_map but relied on
transitive includes. Add an explicit include for robustness.

* Remove extraneous semicolon after destructor definition

Style fix: remove trailing semicolon after ~call_once_storage()
destructor body.

* Add comment explaining unused finalize parameter

Clarify why the finalize callback parameter is intentionally ignored
when subinterpreter support is disabled: the storage is process-global
and leaked to avoid destructor calls after interpreter finalization.

* Add comment explaining error_scope usage

Clarify why error_scope is used: to preserve any existing Python
error state that might be cleared or modified by dict_getitemstringref.

* Improve exception safety in get_or_create_call_once_storage_map()

Use std::unique_ptr to hold the newly allocated storage map until
the capsule is successfully created. This prevents a memory leak
if capsule creation throws an exception.
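
The pattern can be sketched as follows. This is an illustration under stated assumptions, not the pybind11 code: `storage_map`, `fake_capsule`, `make_capsule`, and `create_storage` are hypothetical names, and a throwing function stands in for capsule creation.

```cpp
#include <memory>
#include <stdexcept>
#include <unordered_map>

using storage_map = std::unordered_map<int, int>;

// Hypothetical stand-in for a capsule: it takes over `payload` only if creation succeeds.
struct fake_capsule {
    storage_map *payload;
};

inline fake_capsule make_capsule(storage_map *p, bool fail) {
    if (fail) {
        throw std::runtime_error("capsule creation failed");
    }
    return fake_capsule{p};
}

inline storage_map *create_storage(bool fail) {
    // The unique_ptr owns the map while capsule creation can still fail.
    std::unique_ptr<storage_map> guard(new storage_map());
    fake_capsule cap = make_capsule(guard.get(), fail); // on throw, guard frees the map
    (void) cap;
    // Success: ownership now belongs to the capsule. release() gives it up without
    // deleting; reset() here would destroy the map the capsule still points to.
    return guard.release();
}
```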

* Add timeout-minutes: 3 to cpptest workflow steps

Add a 3-minute timeout to all C++ test (cpptest) steps across all
platforms to detect hangs early. This uses GitHub Actions' built-in
timeout-minutes property which works on Linux, macOS, and Windows.

* Add progress reporter for test_with_catch Catch2 runner

Add a custom Catch2 streaming reporter that prints one line per test
case as it starts and ends, with immediate flushing to keep CI logs
current. This makes it easy to see where the embedded/interpreter
tests are spending time and to pinpoint which test case is stuck
when builds hang (e.g., free-threading issues).

The reporter:
- Prints "[ RUN      ]" when each test starts
- Prints "[       OK ]" or "[  FAILED  ]" when each test ends
- Prints the Python version once at the start via Py_GetVersion()
- Uses StreamingReporterBase for immediate output (not buffered)
- Is set as the default reporter via CATCH_CONFIG_DEFAULT_REPORTER

This approach gives visibility into all tests without changing their
behavior, turning otherwise opaque 90-minute CI timeouts into
locatable issues in the Catch output.

* clang-format auto-fix (overlooked before)

* Disable "Move Subinterpreter" test on free-threaded Python 3.14+

This test hangs in Py_EndInterpreter() when the subinterpreter is
destroyed from a different thread than it was created on.

The hang was observed:
- Intermittently on macOS with Python 3.14.0t
- Predictably on macOS, Ubuntu, and Windows with Python 3.14.1t and 3.14.2t

Root cause analysis points to an interaction between pybind11's
subinterpreter creation code and CPython's free-threaded runtime,
specifically around PyThreadState_Swap() after PyThreadState_DeleteCurrent().

See detailed analysis: https://github.com/pybind/pybind11/pull/5933

* style: pre-commit fixes

* Add test for gil_safe_call_once_and_store per-interpreter isolation

This test verifies that gil_safe_call_once_and_store provides separate
storage for each interpreter when subinterpreter support is enabled.

The test caches the interpreter ID in the main interpreter, then creates
a subinterpreter and verifies it gets its own cached value (not the main
interpreter's). Without per-interpreter storage, the subinterpreter would
incorrectly see the main interpreter's cached object.
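
The isolation property being tested can be sketched with a map keyed by interpreter id. This is only an illustration: `per_interpreter_once` is a hypothetical name, a `std::mutex` stands in for whatever synchronization the real code uses, and the real storage lives elsewhere (a later commit in this PR mentions a per-storage capsule).

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_map>

using interpid_t = std::int64_t;

template <typename T>
class per_interpreter_once {
public:
    // Runs `make` at most once per interpreter id and returns the stored value.
    template <typename Fn>
    T &get_or_init(interpid_t interp, Fn &&make) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = values_.find(interp);
        if (it == values_.end()) {
            it = values_.emplace(interp, make()).first;
        }
        // References into std::unordered_map stay valid across later insertions.
        return it->second;
    }

private:
    std::mutex mutex_;
    std::unordered_map<interpid_t, T> values_;
};
```

With a single process-global slot, the second interpreter would see the first interpreter's value; keying by id gives each interpreter its own slot.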

* Add STARTING/DONE timestamps to test_with_catch output

Print UTC timestamps at the beginning and end of the test run to make
it immediately clear when tests started and whether they ran to
completion. The DONE message includes the Catch session result value.

Example output:
  [ STARTING ] 2025-12-21 03:23:20.497Z
  [ PYTHON   ] 3.14.2 ...
  [ RUN      ] Threads
  [       OK ] Threads
  [ DONE     ] 2025-12-21 03:23:20.512Z (result 0)

* Disable stdout buffering in test_with_catch

Ensure test output appears immediately in CI logs by disabling stdout
buffering. Without this, output may be lost if the process is killed
by a timeout, making it difficult to diagnose which test was hanging.

* EXPERIMENT: Re-enable hanging test to verify CI log buffering fix

This is a temporary commit to verify that the unbuffered stdout fix
makes the hanging test visible in CI logs. REVERT THIS COMMIT after
confirming the output appears.

* Revert "Disable stdout buffering in test_with_catch"

This reverts commit 0f8f32a92a.

* Use USES_TERMINAL for cpptest to show output immediately

Ninja buffers subprocess output until completion. When a test hangs,
the output is never shown, making it impossible to diagnose which test
is hanging. USES_TERMINAL gives the command direct terminal access,
bypassing ninja's buffering.

This explains why Windows CI showed test progress but Linux/macOS did
not - Windows uses MSBuild which doesn't buffer the same way.

* Fix clang-tidy performance-avoid-endl warning

Use '\n' instead of std::endl since USES_TERMINAL now handles
output buffering at the CMake level.

* Add SIGTERM handler to show when test is killed by timeout

When a test hangs and is killed by `timeout`, Catch2 marks it as failed
but the process exits before printing [ DONE ]. This made it unclear
whether the test failed normally or was terminated.

The signal handler prints a clear message when SIGTERM is received,
making timeout-related failures obvious in CI logs.

* Fix typo: atleast -> at_least

* Fix GCC warn_unused_result error for write() in signal handler

Assign the return value to a variable to satisfy GCC's warn_unused_result
attribute, then cast to void to suppress unused variable warning.
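
A minimal POSIX-only sketch of such a handler, combining the SIGTERM message with the `write()` return-value workaround. The message text, `g_got_sigterm`, `on_sigterm`, and `install_sigterm_handler` are illustrative assumptions; unlike a real timeout handler, this one only records the signal instead of terminating.

```cpp
#include <csignal>
#include <unistd.h>

namespace {

volatile std::sig_atomic_t g_got_sigterm = 0;

void on_sigterm(int) {
    static const char msg[] = "\n[ SIGNAL ] received SIGTERM\n";
    // write() is async-signal-safe, unlike stdio or iostreams.
    // Storing the result satisfies warn_unused_result; the void cast
    // then silences the unused-variable warning.
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void) ignored;
    g_got_sigterm = 1;
}

} // namespace

inline bool install_sigterm_handler() {
    return std::signal(SIGTERM, on_sigterm) != SIG_ERR;
}
```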

* Add USES_TERMINAL to other C++ test targets

Apply the same ninja output buffering fix to test_cross_module_rtti
and test_pure_cpp targets. Also add explanatory comments to all
USES_TERMINAL usages.

* Revert "EXPERIMENT: Re-enable hanging test to verify CI log buffering fix"

This reverts commit a3abdeea89.

* Update comment to reference PR #5940 for Move Subinterpreter fix

* Add alias `interpid_t = std::int64_t`

* Add isolation and gc test for `gil_safe_call_once_and_store`

* Add thread local cache for gil_safe_call_once_and_store

* Revert "Add thread local cache for gil_safe_call_once_and_store"

This reverts commit 5d6681956d2d326fe74c7bf80e845c8e8ddb2a7c.

* Revert changes according to code review

* Relocate multiple-interpreters tests

* Add more tests for multiple interpreters

* Remove copy constructor

* Apply suggestions from code review

* Refactor to use per-storage capsule instead

* Update comments

* Update singleton tests

* Use interpreter id type for `get_num_interpreters_seen()`

* Suppress unused variable warning

* HACKING

* Revert "HACKING"

This reverts commit 534235ea55.

* Try fix concurrency

* Test even harder

* Reorg code to avoid duplicates

* Fix unique_ptr::reset -> unique_ptr::release

* Extract reusable functions

* Fix indentation

* Appease warnings for MSVC

* Appease warnings for MSVC

* Appease warnings for MSVC

* Try fix concurrency by not using `get_num_interpreters_seen() > 1`

* Try fix tests

* Make Python path handling more robust

* Update comments and assertion messages

* Revert changes according to code review

* Disable flaky tests

* Use `@pytest.mark.xfail` rather than `pytest.skip`

* Retrigger CI

* Retrigger CI

* Revert file moves

* Refactor atomic_get_or_create_in_state_dict: improve API and fix on_fetch_ bug

Three improvements to atomic_get_or_create_in_state_dict:

1. Return std::pair<Payload*, bool> instead of just Payload*
   - The bool indicates whether storage was newly created (true) or
     already existed (false), following std::map::insert convention.
   - This fixes a bug where on_fetch_ was called even for newly created
     internals, when it should only run for fetched (existing) ones.
     (Identified by @b-pass in code review)

2. Change LeakOnInterpreterShutdown from template param to runtime arg
   - Renamed to `clear_destructor` to describe what it does locally,
     rather than embedding assumptions about why it's used.
   - Reduces template instantiations (header-only library benefits).
   - The check is in the slow path (create) anyway, so negligible cost.

3. Remove unnecessary braces around the fast-path lookup
   - The braces created a nested scope but declared no local variables
     that would benefit from scoping.
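
The return convention in point 1 can be sketched like this. It is a simplified illustration: `payload`, `state_dict`, `get_or_create`, and `get_with_fetch_hook` are hypothetical names, the dictionary is a plain `std::unordered_map`, and the locking that makes the real function atomic is omitted.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

struct payload {
    int fetch_count = 0;
};

using state_dict = std::unordered_map<std::string, std::unique_ptr<payload>>;

// Returns {pointer, true} if the entry was created, {pointer, false} if it already
// existed, following the std::map::insert convention.
inline std::pair<payload *, bool> get_or_create(state_dict &d, const std::string &key) {
    auto it = d.find(key);
    if (it != d.end()) {
        return {it->second.get(), false};
    }
    auto inserted = d.emplace(key, std::unique_ptr<payload>(new payload()));
    return {inserted.first->second.get(), true};
}

inline payload *get_with_fetch_hook(state_dict &d, const std::string &key) {
    std::pair<payload *, bool> r = get_or_create(d, key);
    if (!r.second) {
        ++r.first->fetch_count; // stand-in for on_fetch_: runs only for existing entries
    }
    return r.first;
}
```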

* Remove unused PYBIND11_MULTIPLE_INTERPRETERS_TEST_FILES variable

This variable was defined but never used.

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-12-24 23:33:02 -08:00
KenLee
7ae61bfb82 Avoid LNK2001 in c++20 when /GL (Whole program optimization) is on with MSVC Update pybind11.h (#5939) 2025-12-22 19:07:33 -08:00
Yuanyuan Chen
5b379161aa Apply clang-tidy fixes to subinterpreter support code (#5929)
* Fix PyObject_HasAttrString return value

Signed-off-by: cyy <cyyever@outlook.com>

* Tidy unchecked files

Signed-off-by: cyy <cyyever@outlook.com>

* [skip ci] Handle PyObject_HasAttrString error when probing __notes__

PyObject_HasAttrString may return -1 to signal an error and set a
Python exception. The previous logic only checked for "!= 0", which
meant that the error path was treated the same as "attribute exists",
causing two problems: misreporting the presence of __notes__ and
leaving a spurious exception pending.

The earlier PR tightened the condition to "== 1" so that only a
successful lookup marks the error string as [WITH __notes__], but it
still left the -1 case unhandled. In the context of
error_fetch_and_normalize, we are already dealing with an active
exception and only want to best-effort detect whether normalization
attached any __notes__. If the attribute probe itself fails, we do not
want that secondary failure to affect later C-API calls or the error
we ultimately report.

This change stores the PyObject_HasAttrString return value, treats
"== 1" as "has __notes__", and explicitly calls PyErr_Clear() when
it returns -1. That way, we avoid leaking a secondary error while
still preserving the original exception information and hinting
[WITH __notes__] only when we can determine it reliably.
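
The tri-state handling can be sketched without Python headers. Everything here is a mock: `fake_error_state` and `fake_has_attr` are hypothetical stand-ins that mimic a probe returning 1, 0, or -1, and clearing the `pending` flag plays the role of `PyErr_Clear()`.

```cpp
// Hypothetical stand-in for the interpreter's pending-exception state.
struct fake_error_state {
    bool pending = false;
};

// Mimics a tri-state attribute probe: 1 = present, 0 = absent,
// -1 = the probe itself failed and left an error pending.
inline int fake_has_attr(fake_error_state &err, int outcome) {
    if (outcome == -1) {
        err.pending = true;
    }
    return outcome;
}

inline bool has_notes(fake_error_state &err, int outcome) {
    const int rc = fake_has_attr(err, outcome);
    if (rc == -1) {
        err.pending = false; // drop the secondary error so it cannot leak into later calls
    }
    return rc == 1; // only a definite "present" counts; "!= 0" would also accept -1
}
```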

* Run clang-tidy with -DPYBIND11_HAS_SUBINTERPRETER_SUPPORT

* [skip ci] Revert "Run clang-tidy with -DPYBIND11_HAS_SUBINTERPRETER_SUPPORT"

This reverts commit bb6e751de4.

---------

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2025-12-13 16:20:26 -08:00
Yuanyuan Chen
032e73d563 Replace C-style casts to static_cast and reinterpret_cast (#5930)
* Replace C-style casts to static_cast and reinterpret_cast

Signed-off-by: cyy <cyyever@outlook.com>

* style: pre-commit fixes

---------

Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-12-12 21:42:36 -08:00
Yuanyuan Chen
3ebbecb8af Add more readability tidy rules (#5924)
* Apply clang-tidy readability fixes

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* Add checks

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>

* More fixes

Signed-off-by: cyy <cyyever@outlook.com>

---------

Signed-off-by: Yuanyuan Chen <cyyever@outlook.com>
Signed-off-by: cyy <cyyever@outlook.com>
2025-12-08 09:36:51 -08:00
Ralf W. Grosse-Kunstleve
55e4bb9135 Work around GCC -Warray-bounds false positive in argument_vector (#5908) 2025-11-30 10:01:36 -08:00
gentlegiantJGC
af796d0a99 Don't allow keep_alive or call_guard on properties (#5533)
* Don't allow keep_alive or call_guard on properties

The def_property family blindly ignores the keep_alive and call_guard arguments passed to it, which makes them confusing to use.
This adds a static_assert if either is passed to make it clear it doesn't work.
I would prefer this to be a compiler warning but I can't find a way to do that. Is that even possible?
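
A minimal sketch of the compile-time check, assuming simplified tag types in place of the real `keep_alive` and `call_guard` templates. The names `keep_alive_tag`, `call_guard_tag`, `doc_tag`, `contains`, and `def_property_sketch`, as well as the assertion messages, are hypothetical.

```cpp
#include <type_traits>

// Simplified stand-ins for the extra arguments a binding call can receive.
struct keep_alive_tag {};
struct call_guard_tag {};
struct doc_tag {};

// True if T appears in the pack (C++11-compatible recursion).
template <typename T, typename... Extra>
struct contains : std::false_type {};

template <typename T, typename First, typename... Rest>
struct contains<T, First, Rest...>
    : std::conditional<std::is_same<T, First>::value,
                       std::true_type,
                       contains<T, Rest...>>::type {};

template <typename... Extra>
int def_property_sketch(const Extra &...) {
    static_assert(!contains<keep_alive_tag, Extra...>::value,
                  "keep_alive is not supported on properties");
    static_assert(!contains<call_guard_tag, Extra...>::value,
                  "call_guard is not supported on properties");
    return static_cast<int>(sizeof...(Extra)); // number of accepted extras
}
```

Calling `def_property_sketch(keep_alive_tag{})` fails to compile with the first message, which turns the silently ignored argument into a hard error.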

* style: pre-commit fixes

* Re-run tests

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-11-15 08:53:15 -08:00
Michael Carlstrom
42cda7570e Fix *args/**kwargs return types. Add type hinting to py::make_tuple (#5881)
* Type hint make_tuple / fix *args/**kwargs return type

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* add back commented out panic

* ignore return std move clang

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* fix for mingw

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

* added missing case

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>

---------

Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
2025-11-13 21:03:53 -08:00
Rostan
8ecf10e8cc Fix crash in gil_scoped_acquire (#5828)
* Add a test reproducing the #5827 crash

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>

* Fix #5827

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>

* Rename PYBIND11_HAS_BARRIER and move it to common.h

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>

* In test_thread.{cpp,py}, rename has_barrier

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>

---------

Signed-off-by: Rostan Tabet <rtabet@nvidia.com>
2025-11-13 16:29:02 -08:00
Ralf W. Grosse-Kunstleve
9f1187f97c Add typing.SupportsIndex to int/float/complex type hints (#5891)
* Add typing.SupportsIndex to int/float/complex type hints

This corrects a mistake where these types were supported but the type
hint was not updated to reflect that SupportsIndex objects are accepted.

To track the resulting test failures:

The output of

"$(cat PYROOT)"/bin/python3 $HOME/clone/pybind11_scons/run_tests.py $HOME/forked/pybind11 -v

is in

~/logs/pybind11_pr5879_scons_run_tests_v_log_2025-11-10+122217.txt

* Cursor auto-fixes (partial) plus pre-commit cleanup. 7 test failures left to do.

* Fix remaining test failures, partially done by cursor, partially manually.

* Cursor-generated commit: Added the Index() tests from PR 5879.

Summary:

  Changes Made

  1. **C++ Bindings** (`tests/test_builtin_casters.cpp`)

  • Added complex_convert and complex_noconvert functions needed for the tests

  2. **Python Tests** (`tests/test_builtin_casters.py`)

  `test_float_convert`:
  • Added Index class with __index__ returning -7
  • Added Int class with __int__ returning -5
  • Added test showing Index() works with convert mode: assert pytest.approx(convert(Index())) == -7.0
  • Added test showing Index() doesn't work with noconvert mode: requires_conversion(Index())
  • Added additional assertions for int literals and Int() class

  `test_complex_cast`:
  • Expanded the test to include convert and noconvert functionality
  • Added Index, Complex, Float, and Int classes
  • Added test showing Index() works with convert mode: assert convert(Index()) == 1 and assert isinstance(convert(Index()), complex)
  • Added test showing Index() doesn't work with noconvert mode: requires_conversion(Index())
  • Added type hint assertions matching the SupportsIndex additions

  These tests demonstrate that custom __index__ objects work with float and complex in convert mode, matching the typing.SupportsIndex type hint added in PR
  5891.

* Reflect behavior changes going back from PR 5879 to master. This diff will have to be reapplied under PR 5879.

* Add PyPy-specific __index__ handling for complex caster

Extract PyPy-specific __index__ backporting from PR 5879 to fix PyPy 3.10
test failures in PR 5891. This adds:

1. PYBIND11_INDEX_CHECK macro in detail/common.h:
   - Uses PyIndex_Check on CPython
   - Uses hasattr check on PyPy (workaround for PyPy 7.3.3 behavior)

2. PyPy-specific __index__ handling in complex.h:
   - Handles __index__ objects on PyPy 7.3.7's 3.8 which doesn't
     implement PyLong_*'s __index__ calls
   - Mirrors the logic used in numeric_caster for ints and floats

This backports __index__ handling for PyPy, matching the approach
used in PR 5879's expand-float-strict branch.
2025-11-10 20:26:50 -08:00
Joshua Oreman
e6984c805e native_enum: add capsule containing enum information and cleanup logic (#5871)
* native_enum: add capsule containing enum information and cleanup logic

* style: pre-commit fixes

* Updates from code review

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-18 11:07:00 -06:00
daltairwalter
15943963b3 fix dangling thread state issue (#5870)
* fix dangling thread state issue

* formatting rules

* use tstate .set(nullptr) to pass clang-tidy check

* fix spelling mistake

* improve comments for maintainability
2025-10-16 08:40:50 -07:00
Scott Wolchok
1e5bc66e38 Factor out readable function signatures to avoid duplication (#5857)
* Centralize readable function signatures to avoid duplication

This seems to reduce the size cost of adding enum_-specific implementations of dunder methods, and should also provide a nice-to-have size optimization for programs that use pybind11 in general.

* gate disabling of -Wdeprecated-redundant-constexpr-static-def to clang 17+

* fix gating to include Apple Clang 15

* Make GCC happy with types

* fix apple clang gating again. suppress -Wdeprecated for GCC

* Gate warning suppressions to C++17. Suppress -Wdeprecated for clang as well.

* hopefully fix last straggler CI job

* attempt to address readability review feedback from @rwgk

* drop warning suppressions and instead just gate compilation of the pre-C++17 compat code
2025-10-15 21:12:44 -07:00
Joshua Oreman
cc36ac51a0 type_caster_generic: fix compiler error when casting a T that is implicitly convertible from T* (#5873)
* type_caster_generic: fix compiler error when casting a T that is implicitly convertible from T*

* style: pre-commit fixes

* Placate clang-tidy

* Expand NOLINT to specify Clang-Tidy check names

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2025-10-15 09:10:50 -07:00
Joshua Oreman
a2c59711b2 type_caster_generic: add cast_sources abstraction (#5866)
* type_caster_generic: add cast_sources abstraction

* Respond to code review comments

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-14 18:42:04 -06:00
Scott Wolchok
fc423c948a Fix dangling pointer in internals::registered_types_cpp_fast from #5842 (#5867)
* Fix dangling pointer in internals::registered_types_cpp_fast from #5842

@oremanj pointed out in a comment on #5842 that I missed part
of the nanobind PR I was porting in such a way that we could have
dangling pointers in internals::registered_types_cpp_fast. This PR
adds a test that reproduced the bug and then fixes the bug.

* review feedback, attempt to fix -Werror in CI

* use const ref, skip test on python 3.13 free-threaded

* Skip test on 3.13t more robustly

* style: pre-commit fixes

* CI fix

---------

Co-authored-by: Joshua Oreman <oremanj@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-14 16:52:03 -06:00
Joshua Oreman
c7b4f66a73 type_caster_generic: add set_foreign_holder method for subclasses to implement (#5862)
* type_caster_generic: add set_foreign_holder method for subclasses to implement

* style: pre-commit fixes

* Rename try_shared_from_this -> set_via_shared_from_this to avoid confusion against try_get_shared_from_this

* Add comment explaining the limits of the test

* CI

* style: pre-commit fixes

* Fixes from code review

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-13 18:00:28 -06:00
Sam Gross
9f75202191 Fix thread-safety in get_local_type_info() (#5856)
Fixes potential thread-safety issues if types are concurrently
registered while `get_local_type_info()` is called in free threaded
Python.

Use the `internals` mutex to also protect `local_internals`. This
keeps the locking strategy simpler, and we already follow this pattern
in some places, such as `pybind11_meta_dealloc`.
2025-10-12 14:37:48 -07:00
pre-commit-ci[bot]
aa4259b4f8 chore(deps): update pre-commit hooks (#5820)
* chore(deps): update pre-commit hooks

updates:
- [github.com/pre-commit/mirrors-clang-format: v20.1.8 → v21.1.2](https://github.com/pre-commit/mirrors-clang-format/compare/v20.1.8...v21.1.2)
- [github.com/astral-sh/ruff-pre-commit: v0.12.7 → v0.13.3](https://github.com/astral-sh/ruff-pre-commit/compare/v0.12.7...v0.13.3)
- [github.com/pre-commit/mirrors-mypy: v1.17.1 → v1.18.2](https://github.com/pre-commit/mirrors-mypy/compare/v1.17.1...v1.18.2)
- [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0)
- [github.com/adamchainz/blacken-docs: 1.19.1 → 1.20.0](https://github.com/adamchainz/blacken-docs/compare/1.19.1...1.20.0)
- [github.com/shellcheck-py/shellcheck-py: v0.10.0.1 → v0.11.0.1](https://github.com/shellcheck-py/shellcheck-py/compare/v0.10.0.1...v0.11.0.1)
- [github.com/PyCQA/pylint: v3.3.7 → v3.3.9](https://github.com/PyCQA/pylint/compare/v3.3.7...v3.3.9)
- [github.com/python-jsonschema/check-jsonschema: 0.33.2 → 0.34.0](https://github.com/python-jsonschema/check-jsonschema/compare/0.33.2...0.34.0)

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-11 10:23:50 -07:00
Joshua Oreman
1cf0948d34 Avoid a heap allocation on every legacy py::enum_ load (#5860) 2025-10-11 10:19:52 -07:00
Scott Wolchok
3262000195 Add fast_type_map, use it authoritatively for local types and as a hint for global types (ABI breaking) (#5842)
* Add fast_type_map, use it authoritatively for local types and as a hint for global types

nanobind has a similar two-level lookup strategy, added and explained
by
b515b1f7f2

In this PR I've ported this approach to pybind11. To avoid an ABI
break, I've kept the fast maps in the `local_internals`. I think this
should be safe because any particular module should see its
`local_internals` reset at least as often as the global `internals`,
and misses in the fast "hint" map for global types fall back to the
global `internals`.
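
The two-level lookup can be sketched as follows. This is a simplified illustration, not the pybind11 data structure: `type_record` and `two_level_registry` are hypothetical names, and the sketch assumes the fast level is keyed by the address of the `std::type_info` object while the slow level is keyed by `std::type_index`.

```cpp
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct type_record {
    int id;
};

class two_level_registry {
public:
    void register_type(const std::type_info &ti, type_record *rec) {
        slow_[std::type_index(ti)] = rec;
    }

    type_record *lookup(const std::type_info &ti) {
        // Level 1: hash the address of the type_info object (cheap pointer hash).
        auto fast_it = fast_.find(&ti);
        if (fast_it != fast_.end()) {
            ++fast_hits;
            return fast_it->second;
        }
        // Level 2: compare by type identity, which also matches a different
        // type_info object that describes the same type.
        auto slow_it = slow_.find(std::type_index(ti));
        if (slow_it == slow_.end()) {
            return nullptr;
        }
        fast_.emplace(&ti, slow_it->second); // remember the hint for next time
        return slow_it->second;
    }

    int fast_hits = 0; // exposed only so the sketch can show the fast path being taken

private:
    std::unordered_map<const std::type_info *, type_record *> fast_;
    std::unordered_map<std::type_index, type_record *> slow_;
};
```

A miss in the fast map is never authoritative in this sketch; it only means the slower identity-based lookup has to run.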

Performance seems to have improved. Using my patched fork of
pybind11_benchmark
(https://github.com/swolchok/pybind11_benchmark/tree/benchmark-updates,
specifically commit hash b6613d12607104d547b1c10a8145d1b3e9937266), I
run bench.py and observe the MyInt case. Each time, I do 3 runs and
just report all 3.

master, Mac: 75.9, 76.9, 75.3 nsec/loop
this PR, Mac: 73.8, 73.8, 73.6 nsec/loop
master, Linux box: 188, 187, 188 nsec/loop
this PR, Linux box: 164, 165, 164 nsec/loop

Note that the "real" percentage improvement is larger than implied by the
above because master does not yet include #5824.

* simplify unsafe_reset_local_internals in test

* pre-implement PYBIND11_INTERNALS_VERSION 12

* use PYBIND11_INTERNALS_VERSION 12 on Python 3.14 per suggestion

* Implement reviewer comments: revert PY_VERSION_HEX change, fix REVIEW comment, add two-level lookup comments. ci.yml coming separately

* Use the inplace build to smoke test ABI bump?

* [skip ci] Remove "smoke" from comment. This is full testing, just only on a few platforms.

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2025-10-05 11:07:25 -07:00
Sam Gross
9ea197627d Use new 3.14 C APIs when available (#5854)
* Use new 3.14 C APIs when available

Use the new "unstable" C APIs for the functions added in #5494.

* style: pre-commit fixes

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-10-05 09:58:13 -07:00
Ralf W. Grosse-Kunstleve
4dc33d6524 Fix smart_holder multiple/virtual inheritance bugs in shared_ptr and unique_ptr to-Python conversions (#5836)
* ChatGPT-generated diamond virtual-inheritance test case.

* Report "virtual base at offset 0" but don't skip test.

* Remove Left/Right virtual default dtors, to resolve clang-tidy errors:

```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:44:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
   44 |     virtual ~Left() = default;
      |     ~~~~~~~ ^
      |                     override
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:48:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
   48 |     virtual ~Right() = default;
      |     ~~~~~~~ ^
      |                      override
```

* Add assert(ptr) in register_instance_impl, deregister_instance_impl

* Proper bug fix

* Also exercise smart_holder_from_unique_ptr

* [skip ci] ChatGPT-generated bug fix: smart_holder::from_unique_ptr()

* Exception-safe ownership transfer from unique_ptr to shared_ptr

ChatGPT:

* shared_ptr’s ctor can throw (control-block alloc). Using get() keeps unique_ptr owning the memory if that happens, so no leak.

* Only after the shared_ptr is successfully constructed do you release(), transferring ownership exactly once.

* [skip ci] Rename alias_ptr to mi_subobject_ptr to distinguish from trampoline code (which often uses the term "alias", too)

* [skip ci] Also exercise smart_holder::from_raw_ptr_take_ownership

* [skip ci] Add st.first comments (generated by ChatGPT)

* [skip ci] Copy and extend (raw_ptr, unique_ptr) reproducer from PR #5796

* Some polishing: comments, add back Left/Right dtors for consistency within test_class_sh_mi_thunks.cpp

* explicitly default copy/move for VBase to silence -Wdeprecated-copy-with-dtor

* Resolve clang-tidy error:

```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:67:5: error: 'auto ptr' can be declared as 'auto *ptr' [readability-qualified-auto,-warnings-as-errors]
   67 |     auto ptr = new Diamond;
      |     ^~~~
      |     auto *
```

* Expand comment in `smart_holder::from_unique_ptr()`

* Better Left/Right padding to make it more likely that we avoid "all at offset 0". Clarify comment.

* Give up on `alignas(16)` to resolve MSVC warning:

```
       "D:\a\pybind11\pybind11\build\ALL_BUILD.vcxproj" (default target) (1) ->
       "D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj" (default target) (13) ->
       (ClCompile target) ->
         D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(70,17): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
         D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(80,43): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
         C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: 'std::_Ref_count_obj2<_Ty>': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
       C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316:         with [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
       C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316:         [ [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
       C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316:             _Ty=test_class_sh_mi_thunks::Diamond [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
       C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316:         ] [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
         D:\a\pybind11\pybind11\include\pybind11\detail\init.h(77,21): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
```

The warning came from alignas(16) making Diamond over-aligned, while regular new/make_shared aren’t guaranteed to return 16-byte aligned memory on MSVC (hence C4316). I’ve removed the explicit alignment and switched to asymmetric payload sizes (char[4] vs char[24]), which still nudges MI layout without relying on over-alignment. This keeps the test goal and eliminates the warning across all MSVC builds. If we ever want to stress over-alignment explicitly, we can add aligned operator new/delete under __cpp_aligned_new, but that’s more than we need here.

* Rename test_virtual_base_at_offset_0() → test_virtual_base_not_at_offset_0() and replace pytest.skip() with assert. Add helpful comment for future maintainers.
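
For readers unfamiliar with the layout issue behind these tests, here is a minimal sketch of a diamond with a virtual base. The type names and payload sizes are hypothetical (loosely modeled on the description above, not copied from `test_class_sh_mi_thunks.cpp`), and whether the virtual base lands at offset 0 is ABI-dependent, so the helper only reports it. What the standard does guarantee is that `dynamic_cast<const void *>` on a polymorphic base pointer yields the address of the most-derived object.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct VBase {
    VBase() = default;
    // Copy/move explicitly defaulted, since a destructor is declared.
    VBase(const VBase &) = default;
    VBase(VBase &&) = default;
    VBase &operator=(const VBase &) = default;
    VBase &operator=(VBase &&) = default;
    virtual ~VBase() = default;
    virtual int ping() const { return 1; }
    int vbase_tag = 42;
};

struct Left : virtual VBase {
    char pad_l[4] = {}; // asymmetric payloads instead of alignas(16)
};

struct Right : virtual VBase {
    char pad_r[24] = {};
};

struct Diamond : Left, Right {
    int ping() const override { return 7; }
};

// True if the VBase subobject does not start at the address of the
// complete Diamond object. The answer depends on the ABI.
inline bool vbase_is_displaced(const Diamond &d) {
    const void *complete = static_cast<const void *>(&d);
    const void *sub = static_cast<const void *>(static_cast<const VBase *>(&d));
    return complete != sub;
}

// Recover the address of the most-derived object from a base pointer.
inline const void *most_derived_address(const VBase *vb) {
    return dynamic_cast<const void *>(vb);
}
```

When `vbase_is_displaced` returns true, a holder that stores the base-subobject pointer and later treats it as the address of the whole object is working with the wrong address, which is the general class of bug the commit title refers to.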
2025-10-01 11:21:47 -07:00
gentlegiantJGC
8ed0dab67f Add float type caster and revert type hint changes to int_ and float_ (#5839)
* Revert type hint changes to int_ and float_

These two types do not support casting from int-like and float-like types.

* Fix tests

* Add a custom py::float_ caster

The default py::object caster only works if the object is an instance of the type.
py::float_ should accept python int objects as well as float.
This caster will pass through float as usual and cast int to float.
The caster handles the type name so the custom one is not required.

* style: pre-commit fixes

* Fix name

* Fix variable

* Try satisfying the formatter

* Rename test function

* Simplify type caster

* Fix reference counting issue

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-09-27 09:13:21 -07:00
Scott Wolchok
30748f863f Avoid heap allocation for function calls with a small number of args (#5824)
* Avoid heap allocation for function calls with a small number of arguments

We don't have access to llvm::SmallVector or similar, but given the
limited subset of the `std::vector` API that
`function_call::args{,_convert}` need and the "reserve-then-fill"
usage pattern, it is relatively straightforward to implement custom
containers that get the job done.

Seems to improve the time to call the collatz function in
pybind/pybind11_benchmark significantly; numbers are a little noisy
but there's a clear improvement from "about 60 ns per call" to "about
45 ns per call" on my machine (M4 Max Mac), as measured with
`timeit.repeat('collatz(4)', 'from pybind11_benchmark import
collatz')`.

* clang-tidy

* more clang-tidy

* clang-tidy NOLINTBEGIN/END instead of NOLINTNEXTLINE

* forgot to increase inline size after removing std::variant

* constexpr arg_vector_small_size, use move instead of swap to hopefully clarify second_pass_convert

* rename test_embed to test_low_level

* rename test_low_level to test_with_catch

* Be careful to NOINLINE slow paths

* rename array/vector members to iarray/hvector. Move comment per request. Add static_asserts for our untagged union implementation per request.

* drop is_standard_layout assertions; see https://github.com/pybind/pybind11/pull/5824#issuecomment-3308616072
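
The "reserve-then-fill" container idea from the first item above can be sketched roughly as follows. This is not pybind11's implementation: the class name and members are hypothetical, and for simplicity the sketch keeps an inline array and a `std::vector` side by side rather than using the untagged union mentioned in the commit. It assumes `T` is default-constructible and copy-assignable, and that `reserve` is called once before any `push_back`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container; names are illustrative, not pybind11's.
template <typename T, std::size_t InlineSize>
class small_arg_vector {
public:
    // Must be called once, before any push_back ("reserve-then-fill").
    void reserve(std::size_t n) {
        assert(size_ == 0);
        if (n > InlineSize) {
            use_heap_ = true; // slow path: fall back to std::vector
            heap_.reserve(n);
        }
    }

    void push_back(const T &value) {
        if (use_heap_) {
            heap_.push_back(value);
        } else {
            assert(size_ < InlineSize); // reserve() decided inline storage is enough
            inline_[size_] = value;
        }
        ++size_;
    }

    const T &operator[](std::size_t i) const { return use_heap_ ? heap_[i] : inline_[i]; }
    std::size_t size() const { return size_; }
    bool on_heap() const { return use_heap_; }

private:
    std::array<T, InlineSize> inline_{}; // used when the reserved size fits
    std::vector<T> heap_;                // only populated on the slow path
    bool use_heap_ = false;
    std::size_t size_ = 0;
};
```

With a small reserved size, every element lands in the inline array and `heap_` is never asked to store anything; only calls with more than `InlineSize` arguments take the `std::vector` path.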
2025-09-19 13:44:40 -07:00
b-pass
326b10637a Use thread_local instead of thread_specific_storage for internals (#5834)
* Use thread_local instead of thread_specific_storage for internals management

thread_local is faster.

* Make the pp manager a singleton.

Strictly speaking, since the members are static, the instances must also be singletons or this wouldn't work. They already are, but we can make the class enforce it to be more 'self-documenting'.
2025-09-14 09:07:08 -07:00
Scott Wolchok
937552f0ad Use thread_local for loader_life_support to improve performance (#5830)
* Use thread_local for loader_life_support to improve performance

As explained in a new code comment, `loader_life_support` needs to be
`thread_local` but does not need to be isolated to a particular
interpreter because any given function call is already going to only
happen on a single interpreter by definition.

Performance before:
- on M4 Max using pybind/pybind11_benchmark unmodified repo:
```
> python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
5000000 loops, best of 5: 63.8 nsec per loop
```

- Linux server:
```
python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
2000000 loops, best of 5: 120 nsec per loop
```

After:
- M4 Max:
```
python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
5000000 loops, best of 5: 53.1 nsec per loop
```

- Linux server:
```
> python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
2000000 loops, best of 5: 101 nsec per loop
```

A quick profile with perf shows that pthread_setspecific and pthread_getspecific are gone.

Open questions:

- How do we determine whether we can safely use `thread_local`? I see
  concerns about old iOS versions on
  https://github.com/pybind/pybind11/pull/5705#issuecomment-2922858880
  and https://github.com/pybind/pybind11/pull/5709; is there anything
  else?
- Do we have a test that covers "function called in one interpreter
  calls a C++ function that causes a function call in another
  interpreter"? I think it's fine, but can it happen?
- Are we happy with what we think will happen in the case where
  multiple extensions compiled with and without this PR interoperate?
  I think it's fine -- each dispatch pushes and cleans up its own
  state -- but a second opinion is certainly welcome.

* Remove PYBIND11_CAN_USE_THREAD_LOCAL

* clarify comment

* Simplify loader_life_support TLS storage

Replace the `fake_thread_specific_storage` struct with a direct
thread-local pointer managed via a function-local static:

    static loader_life_support *& tls_current_frame()

This retains the "stack of frames" behavior via the `parent` link. It also
reduces indirection and clarifies intent.

Note: this form is C++11-compatible; once pybind11 requires C++17, the
helper can be simplified to:

    inline static thread_local loader_life_support *tls_current_frame = nullptr;

* loader_life_support: avoid duplicate tls_current_frame() calls

Replace repeated calls with a single local reference:

    auto &frame = tls_current_frame();

This ensures the thread_local initialization guard is checked only once
per constructor/destructor call site, avoids potential clang-tidy
complaints, and makes the code more readable. Functional behavior is
unchanged.

* Add REMINDER for next version bump in internals.h
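
The `tls_current_frame()` pattern described above (a function-local `static thread_local` pointer, a `parent` link forming a stack of frames, and a single local reference per constructor/destructor) can be sketched as below. `frame_guard` is a hypothetical stand-in; it models only the frame bookkeeping, not what `loader_life_support` actually stores.

```cpp
#include <cassert>

// Hypothetical stand-in for loader_life_support; only the TLS frame
// stack is sketched here.
class frame_guard {
public:
    frame_guard() {
        auto &frame = tls_current_frame(); // look up the thread-local slot once
        parent_ = frame;                   // remember the enclosing frame
        frame = this;                      // this frame is now the innermost one
    }

    ~frame_guard() {
        auto &frame = tls_current_frame();
        assert(frame == this); // frames are expected to unwind in LIFO order
        frame = parent_;       // restore the enclosing frame
    }

    frame_guard(const frame_guard &) = delete;
    frame_guard &operator=(const frame_guard &) = delete;

    // Number of frames currently active on the calling thread.
    static int depth() {
        int n = 0;
        for (frame_guard *f = tls_current_frame(); f != nullptr; f = f->parent_) {
            ++n;
        }
        return n;
    }

private:
    // C++11-compatible form: a function-local static thread_local pointer,
    // returned by reference so callers can read and update the slot.
    static frame_guard *&tls_current_frame() {
        static thread_local frame_guard *frame_ptr = nullptr;
        return frame_ptr;
    }

    frame_guard *parent_ = nullptr;
};
```

Because the pointer is `thread_local`, each thread sees its own independent stack of frames, and nested guards on one thread simply chain through `parent_`.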

---------

Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
2025-09-12 14:37:01 -07:00
Thomas Köppe
a6581eee89 pytypes.h: constrain accessor::operator= templates so that they do not obscure special members (#5832)
* pytypes.h: constrain accessor::operator= templates so that they do not match calls that should use the special member functions.

Found by an experimental, new clang-tidy check. While we may not know the exact design decisions now, it seems unlikely that the special members were deliberately meant to not be selected (for otherwise they could have been defined differently to make this clear). Rather, it seems like an oversight that the operator templates win in overload resolution, and we should restore the intended resolution.

* Use C++11-compatible facilities

* Use C++11-compatible facilities

* style: pre-commit fixes
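
The overload-resolution problem described in the first item above can be reproduced in isolation. The two structs below are hypothetical and only mimic the shape of the issue (they are not pybind11's `accessor`): a forwarding `operator=` template is an exact match for a non-const lvalue of the same type, so it beats the copy assignment operator unless it is constrained. The constraint uses only C++11 facilities.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types that only mimic the overload-resolution issue.
struct unconstrained {
    int copy_assigned = 0;
    int template_assigned = 0;

    unconstrained() = default;
    unconstrained(const unconstrained &) = default;

    unconstrained &operator=(const unconstrained &) {
        ++copy_assigned;
        return *this;
    }

    // For a non-const lvalue argument, T deduces to unconstrained&, which is
    // a better match than const unconstrained&, so this template wins.
    template <typename T>
    unconstrained &operator=(T &&) {
        ++template_assigned;
        return *this;
    }
};

// C++11-compatible constraint: disable the template when T decays to Self.
template <typename T, typename Self>
using enable_if_not_self =
    typename std::enable_if<!std::is_same<typename std::decay<T>::type, Self>::value,
                            int>::type;

struct constrained {
    int copy_assigned = 0;
    int template_assigned = 0;

    constrained() = default;
    constrained(const constrained &) = default;

    constrained &operator=(const constrained &) {
        ++copy_assigned;
        return *this;
    }

    // Removed from overload resolution for arguments of type constrained.
    template <typename T, enable_if_not_self<T, constrained> = 0>
    constrained &operator=(T &&) {
        ++template_assigned;
        return *this;
    }
};
```

Assigning one non-const `unconstrained` object to another goes through the template, while the same assignment on `constrained` selects the copy assignment operator; assigning an unrelated type such as `int` still uses the template in both cases.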

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-09-08 11:52:41 -07:00
MoonE
3878c23f8d Fix typo in error message (#5817)
2025-08-30 23:07:03 -07:00