* Improve C++ test infrastructure and disable hanging test
This commit improves the C++ test infrastructure to ensure test output
is visible in CI logs, and disables a test that hangs on free-threaded
Python 3.14+.
Changes:
## CI/test infrastructure improvements
- .github/workflows: Added `timeout-minutes: 3` to all C++ test steps
to prevent indefinite hangs.
- tests/**/CMakeLists.txt: Added `USES_TERMINAL` to C++ test targets
(cpptest, test_cross_module_rtti, test_pure_cpp) to ensure output is
shown immediately rather than buffered and possibly lost on crash/timeout.
- tests/test_with_catch/catch.cpp: Added a custom Catch2 progress reporter
with timestamps, Python version info, and a SIGTERM handler to make test
execution and failures clearly visible in CI logs.
## Disabled hanging test
- The "Move Subinterpreter" test is disabled on free-threaded Python 3.14+
due to a hang in Py_EndInterpreter() when the subinterpreter is destroyed
from a different thread than it was created on. Work on fixing the
underlying issue will continue under PR #5940.
Context: We were in the dark for months (since we started testing with
Python 3.14t) because CI logs gave no clue about the root cause of hangs.
This led to ignoring intermittent hangs (mostly on macOS). Our hand was
forced only with the Python 3.14.1 release, when hangs became predictable
on all platforms.
For the full development history of these changes, see PR #5933.
* Add test summary to progress reporter
Print the total number of test cases and assertions at the end of the
test run, making it easy to spot if tests are disabled or added.
Example output:
[ PASSED ] 20 test cases, 1589 assertions.
* Add PYBIND11_CATCH2_SKIP_IF macro to skip tests at runtime
Catch2 v2 doesn't have native skip support (v3 does with SKIP()).
This macro allows tests to be skipped with a visible message while
still appearing in the test list.
Use this for the Move Subinterpreter test on free-threaded Python 3.14+
so it shows as skipped rather than being conditionally compiled out.
Example output:
[ RUN ] Move Subinterpreter
[ SKIPPED ] Skipped on free-threaded Python 3.14+ (see PR #5940)
[ OK ] Move Subinterpreter
* Fix clang-tidy bugprone-macro-parentheses warning in PYBIND11_CATCH2_SKIP_IF
* Limit busy-wait loops in per-subinterpreter GIL test
Add explicit timeouts to the busy-wait coordination loops in the
Per-Subinterpreter GIL test in tests/test_with_catch/test_subinterpreter.cpp.
Previously those loops spun indefinitely waiting for shared atomics like
`started` and `sync` to change, which is fine when CPython's free-threading
and per-interpreter GIL behavior matches the test's expectations but becomes
pathologically bad when that behavior regresses: the `test_with_catch`
executable can then hang forever, causing our 3.14t CI jobs to time out
after 90 minutes.
This change keeps the structure and intent of the test but adds a
std::chrono::steady_clock deadline to each of the coordination loops,
using a conservative 10 second bound. Worker threads record a failure and
return if they hit the timeout, while the main thread fails the test via
Catch2 instead of hanging. That way, if future CPython free-threading
patches change the semantics again, the test will fail quickly and
produced a diagnosable error instead of wedging the CI job.
* Revert "Limit busy-wait loops in per-subinterpreter GIL test"
This reverts commit 7847adacda.
* Add progress reporter for test_with_catch Catch runner
Introduce a custom Catch2 reporter for tests/test_with_catch that prints a
simple one-line status for each test case as it starts and ends, and wire the
cpptest CMake target to invoke test_with_catch with -r progress. This makes
it much easier to see where the embedded/interpreter test binary is spending
its time in CI logs, and in particular to pinpoint which test case is stuck
when the free-threading builds hang.
Compared to adding ad hoc timeouts around potentially infinite busy-wait
loops in individual tests, a progress reporter is a more general and robust
approach: it gives visibility into all tests (including future ones) without
changing their behavior, and turns otherwise opaque 90-minute timeouts into
locatable issues in the Catch output.
* Temporarily limit CI to Python 3.14t free-threading jobs
* Temporarily remove non-CI GitHub workflow files
* Temporarily disable AppVeyor builds via skip_commits
* Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter")
* Add Python version banner to Catch progress reporter
Print the CPython version once at the start of the Catch-based
interpreter tests using Py_GetVersion(). This makes it trivial to
confirm which free-threaded build a failing run is using when
inspecting CI or local logs.
* Revert "Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter")"
This reverts commit ad3e1c34ce.
* Pin CI free-threaded runs to CPython 3.14.0t
Update the standard-small and standard-large GitHub Actions jobs to
request python-version 3.14.0t instead of 3.14t. This forces setup-python
to use the last-known-good 3.14.0 free-threaded build rather than the
newer 3.14.1+ builds where subinterpreter finalization regressed.
* Revert "Pin CI free-threaded runs to CPython 3.14.0t"
This reverts commit 5281e1c20c.
* Revert "Temporarily disable AppVeyor builds via skip_commits"
This reverts commit ed11292636.
* Revert "Temporarily remove non-CI GitHub workflow files"
This reverts commit 0fe6a42a04.
* Revert "Temporarily limit CI to Python 3.14t free-threading jobs"
This reverts commit 60ae0e8f74.
* Pin CI free-threaded runs to CPython 3.14.0t
Update the standard-small and standard-large GitHub Actions jobs to
request python-version 3.14.0t instead of 3.14t. This forces setup-python
to use the last-known-good 3.14.0 free-threaded build rather than the
newer 3.14.1+ builds where subinterpreter finalization regressed.
* Switch NVHPC job to ubuntu-24.04 and disable AppVeyor
* Temporarily trim workflows to focus on NVHPC job
* First restore ci.yml from test-with-catch-timeouts branch, then delete all jobs except ubuntu-nvhpc7
* Change runner to ubuntu-24.04
* Use nvhpc-25-11
* Undo ALL changes relative to master (i.e. this branch is now an exact copy of master)
* Change runner to ubuntu-24.04
* Use nvhpc-25-11
* Remove misleading 7 from job name (i.e. ubuntu-nvhpc7 → ubuntu-nvhpc)
* Fix PyObject_HasAttrString return value
Signed-off-by: cyy <cyyever@outlook.com>
* Tidy unchecked files
Signed-off-by: cyy <cyyever@outlook.com>
* [skip ci] Handle PyObject_HasAttrString error when probing __notes__
PyObject_HasAttrString may return -1 to signal an error and set a
Python exception. The previous logic only checked for "!= 0", which
meant that the error path was treated the same as "attribute exists",
causing two problems: misreporting the presence of __notes__ and
leaving a spurious exception pending.
The earlier PR tightened the condition to "== 1" so that only a
successful lookup marks the error string as [WITH __notes__], but it
still left the -1 case unhandled. In the context of
error_fetch_and_normalize, we are already dealing with an active
exception and only want to best-effort detect whether normalization
attached any __notes__. If the attribute probe itself fails, we do not
want that secondary failure to affect later C-API calls or the error
we ultimately report.
This change stores the PyObject_HasAttrString return value, treats
"== 1" as "has __notes__", and explicitly calls PyErr_Clear() when
it returns -1. That way, we avoid leaking a secondary error while
still preserving the original exception information and hinting
[WITH __notes__] only when we can determine it reliably.
* Run clang-tidy with -DPYBIND11_HAS_SUBINTERPRETER_SUPPORT
* [skip ci] Revert "Run clang-tidy with -DPYBIND11_HAS_SUBINTERPRETER_SUPPORT"
This reverts commit bb6e751de4.
---------
Signed-off-by: cyy <cyyever@outlook.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* adding windows arm test
- excluding numpy 2.2.0 for arm64 builds
* adding windows arm msys2 test
* testing mingw python
* unnamed namespace test on windows arm with clang and mingw
* Revert "unnamed namespace test on windows arm with clang and mingw"
This reverts commit 08abf889ae.
* bumping c++ version
* commenting out other tests
* Ignore unnmaed namespace on arm windows with mingw
- Updatig XFAIL condition to expect a failure on windows arm with
mingw and clang
- setting python home and path variables in c++ tests
* Revert "commenting out other tests"
This reverts commit dc75243963.
* removing windows-11-arm from big test, push
* removing redundant shell
* removing trailing whitespace
* Clarify Windows ARM clang job naming
Rename the Windows ARM clang jobs to windows_arm_clang_msvc and windows_arm_clang_msys2 and adjust their display names to clang-msvc / clang-msys2. The first runs clang against the MSVC/Windows SDK toolchain and python.org CPython for Windows ARM, while the second runs clang inside the MSYS2/MinGW-w64 CLANGARM64 environment with MSYS2 Python.
Using clearly distinguished job names makes it easier to discuss failures and behavior in each environment without ambiguity, both in logs and in PR review discussions.
* Reduce Windows ARM clang matrix size
Limit the windows_arm_clang_msvc job to Python 3.13 and the windows_arm_clang_msys2 job to Python 3.12 to stay within our constrained GitHub Actions resources. Keep both jobs using a matrix over os and python so their structure stays aligned and it remains easy to expand coverage when needed.
* Remove matrix.python from windows_arm_clang_msys2 job: the Python version is determined by the msys2/setup-msys2 action and cannot be changed
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Apply ruff/flake8-implicit-str-concat rule ISC003
ISC003 Explicitly concatenated string should be implicitly concatenated
* Enforce ruff/flake8-implicit-str-concat rules (ISC)
* Apply ruff/Perflint rule PERF102
PERF102 When using only the values of a dict use the `values()` method
* Apply ruff/Perflint rule PERF401
PERF401 Use a list comprehension to create a transformed list
* Enforce ruff/Perflint rules (PERF)
* [skip ci] Point main README.rst to CI for supported platforms and compilers
- **Replace** the hard-coded “Supported compilers” and “Supported platforms” lists in `README.rst`.
- **Point** readers to the current GitHub Actions matrix as the source of truth for tested platforms, compilers, and Python/C++ versions.
- **Clarify** that the matrix evolves over time and that configurations users care about can be kept working via contributions.
- **Avoid stale documentation**: Enumerating specific compiler and platform versions in the README is both burdensome and error-prone, and tends to drift out of sync with reality.
- **Align “supported” with “tested”**: In practice, the CI configuration is the only place where we can say with confidence which combinations are exercised. Nearby versions (e.g., adjacent compiler minor releases) will often work, but we cannot test every variant.
- **Reflect actual maintenance capacity**: pybind11 is maintained by a small, volunteer-based community, so support is necessarily best-effort. Pointing to CI and inviting contributions better matches how support is provided in practice.
- **No behavior change**: This PR updates documentation only.
- **Living source of truth**: As CI jobs are added or removed, the linked Actions view will automatically reflect the set of configurations we actively test. Keeping a configuration in CI is the best way to keep it “supported”.
* [skip ci] Slight rewording: point out GitHub's limits on concurrent jobs under the free tier (rather than free minutes).
* Remove enum from bold in doc
* [skip ci] Remove bold formatting around (see #5528)
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Don't allow keep_alive or call_guard on properties
The def_property family blindly ignore the keep_alive and call_guard arguments passed to them making them confusing to use.
This adds a static_assert if either is passed to make it clear it doesn't work.
I would prefer this to be a compiler warning but I can't find a way to do that. Is that even possible?
* style: pre-commit fixes
* Re-run tests
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Type hint make_tuple / fix *args/**kwargs return type
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* add back commented out panic
* ignore return std move clang
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* fix for mingmw
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
* added missing case
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
---------
Signed-off-by: Michael Carlstrom <rmc@carlstrom.com>
Since we already require pytest>=6 (see tests/requirements.txt), the
old compatibility function is obsolete and pytest.deprecated_call() can
be used directly.
Extracted from PR #5879
Co-authored-by: Michael Carlstrom <rmc@carlstrom.com>
Co-authored-by: gentlegiantJGC <gentlegiantJGC@users.noreply.github.com>
* Enhance: edit doc py::native_enum feature in upgrade.rst
Added information about the inclusion requirement for py::native_enum feature.
* [skip ci] Polish wording
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Add typing.SupportsIndex to int/float/complex type hints
This corrects a mistake where these types were supported but the type
hint was not updated to reflect that SupportsIndex objects are accepted.
To track the resulting test failures:
The output of
"$(cat PYROOT)"/bin/python3 $HOME/clone/pybind11_scons/run_tests.py $HOME/forked/pybind11 -v
is in
~/logs/pybind11_pr5879_scons_run_tests_v_log_2025-11-10+122217.txt
* Cursor auto-fixes (partial) plus pre-commit cleanup. 7 test failures left to do.
* Fix remaining test failures, partially done by cursor, partially manually.
* Cursor-generated commit: Added the Index() tests from PR 5879.
Summary:
Changes Made
1. **C++ Bindings** (`tests/test_builtin_casters.cpp`)
• Added complex_convert and complex_noconvert functions needed for the tests
2. **Python Tests** (`tests/test_builtin_casters.py`)
`test_float_convert`:
• Added Index class with __index__ returning -7
• Added Int class with __int__ returning -5
• Added test showing Index() works with convert mode: assert pytest.approx(convert(Index())) == -7.0
• Added test showing Index() doesn't work with noconvert mode: requires_conversion(Index())
• Added additional assertions for int literals and Int() class
`test_complex_cast`:
• Expanded the test to include convert and noconvert functionality
• Added Index, Complex, Float, and Int classes
• Added test showing Index() works with convert mode: assert convert(Index()) == 1 and assert isinstance(convert(Index()), complex)
• Added test showing Index() doesn't work with noconvert mode: requires_conversion(Index())
• Added type hint assertions matching the SupportsIndex additions
These tests demonstrate that custom __index__ objects work with float and complex in convert mode, matching the typing.SupportsIndex type hint added in PR
5891.
* Reflect behavior changes going back from PR 5879 to master. This diff will have to be reapplied under PR 5879.
* Add PyPy-specific __index__ handling for complex caster
Extract PyPy-specific __index__ backporting from PR 5879 to fix PyPy 3.10
test failures in PR 5891. This adds:
1. PYBIND11_INDEX_CHECK macro in detail/common.h:
- Uses PyIndex_Check on CPython
- Uses hasattr check on PyPy (workaround for PyPy 7.3.3 behavior)
2. PyPy-specific __index__ handling in complex.h:
- Handles __index__ objects on PyPy 7.3.7's 3.8 which doesn't
implement PyLong_*'s __index__ calls
- Mirrors the logic used in numeric_caster for ints and floats
This backports __index__ handling for PyPy, matching the approach
used in PR 5879's expand-float-strict branch.
* Centralize readable function signatures to avoid duplication
This seems to reduce size costs of adding enum_-specific implementations of dunder methods, but also should provide a nice to have size optimization for programs that use pybind11 in general.
* gate disabling of -Wdeprecated-redundant-constexpr-static-def to clang 17+
* fix gating to include Apple Clang 15
* Make GCC happy with types
* fix apple clang gating again. suppress -Wdeprecated for GCC
* Gate warning suppressions to C++17. Suppress -Wdeprecated for clang as well.
* hopefully fix last straggler CI job
* attempt to address readability review feedback from @rwgk
* drop warning suppressions and instead just gate compilation the pre-C++17 compat code
* type_caster_generic: fix compiler error when casting a T that is implicitly convertible from T*
* style: pre-commit fixes
* Placate clang-tidy
* Expand NOLINT to specify Clang-Tidy check names
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Fix dangling pointer in internals::registered_types_cpp_fast from #5842
@oremanj pointed out in a comment on #5842 that I missed part
of the nanobind PR I was porting in such a way that we could have
dangling pointers in internals::registered_types_cpp_fast. This PR
adds a test that reproed the bug and then fixes the test.
* review feedback, attempt to fix -Werror in CI
* use const ref, skip test on python 3.13 free-threaded
* Skip test on 3.13t more robustly
* style: pre-commit fixes
* CI fix
---------
Co-authored-by: Joshua Oreman <oremanj@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Fixes potential thread-safety issues if types are concurrently
registered while `get_local_type_info()` is called in free threaded
Python.
Use the `internals` mutex to also protect `local_internals`. This
keeps the locking strategy simpler, and we already follow this pattern
in some places, such as `pybind11_meta_dealloc`.
* Add fast_type_map, use it authoritatively for local types and as a hint for global types
nanobind has a similar two-level lookup strategy, added and explained
by
b515b1f7f2
In this PR I've ported this approach to pybind11. To avoid an ABI
break, I've kept the fast maps to the `local_internals`. I think this
should be safe because any particular module should see its
`local_internals` reset at least as often as the global `internals`,
and misses in the fast "hint" map for global types fall back to the
global `internals`.
Performance seems to have improved. Using my patched fork of
pybind11_benchmark
(https://github.com/swolchok/pybind11_benchmark/tree/benchmark-updates,
specifically commit hash b6613d12607104d547b1c10a8145d1b3e9937266), I
run bench.py and observe the MyInt case. Each time, I do 3 runs and
just report all 3.
master, Mac: 75.9, 76.9, 75.3 nsec/loop
this PR, Mac: 73.8, 73.8, 73.6 nsec/loop
master, Linux box: 188, 187, 188 nsec/loop
this PR, Linux box: 164, 165, 164 nsec/loop
Note that the "real" percentage improvement is larger than implied by the
above because master does not yet include #5824.
* simplify unsafe_reset_local_internals in test
* pre-implement PYBIND11_INTERNALS_VERSION 12
* use PYBIND11_INTERNALS_VERSION 12 on Python 3.14 per suggestion
* Implement reviewer comments: revert PY_VERSION_HEX change, fix REVIEW comment, add two-level lookup comments. ci.yml coming separately
* Use the inplace build to smoke test ABI bump?
* [skip ci] Remove "smoke" from comment. This is full testing, just only on a few platforms.
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* Use new 3.14 C APIs when available
Use the new "unstable" C APIs for the functions added in #5494.
* style: pre-commit fixes
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* ChatGPT-generated diamond virtual-inheritance test case.
* Report "virtual base at offset 0" but don't skip test.
* Remove Left/Right virtual default dtors, to resolve clang-tidy errors:
```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:44:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
44 | virtual ~Left() = default;
| ~~~~~~~ ^
| override
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:48:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
48 | virtual ~Right() = default;
| ~~~~~~~ ^
| override
```
* Add assert(ptr) in register_instance_impl, deregister_instance_impl
* Proper bug fix
* Also exercise smart_holder_from_unique_ptr
* [skip ci] ChatGPT-generated bug fix: smart_holder::from_unique_ptr()
* Exception-safe ownership transfer from unique_ptr to shared_ptr
ChatGPT:
* shared_ptr’s ctor can throw (control-block alloc). Using get() keeps unique_ptr owning the memory if that happens, so no leak.
* Only after the shared_ptr is successfully constructed do you release(), transferring ownership exactly once.
* [skip ci] Rename alias_ptr to mi_subobject_ptr to distinguish from trampoline code (which often uses the term "alias", too)
* [skip ci] Also exercise smart_holder::from_raw_ptr_take_ownership
* [skip ci] Add st.first comments (generated by ChatGPT)
* [skip ci] Copy and extend (raw_ptr, unique_ptr) reproducer from PR #5796
* Some polishing: comments, add back Left/Right dtors for consistency within test_class_sh_mi_thunks.cpp
* explicitly default copy/move for VBase to silence -Wdeprecated-copy-with-dtor
* Resolve clang-tidy error:
```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:67:5: error: 'auto ptr' can be declared as 'auto *ptr' [readability-qualified-auto,-warnings-as-errors]
67 | auto ptr = new Diamond;
| ^~~~
| auto *
```
* Expand comment in `smart_holder::from_unique_ptr()`
* Better Left/Right padding to make it more likely that we avoid "all at offset 0". Clarify comment.
* Give up on `alignas(16)` to resolve MSVC warning:
```
"D:\a\pybind11\pybind11\build\ALL_BUILD.vcxproj" (default target) (1) ->
"D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj" (default target) (13) ->
(ClCompile target) ->
D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(70,17): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(80,43): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: 'std::_Ref_count_obj2<_Ty>': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: with [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: [ [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: _Ty=test_class_sh_mi_thunks::Diamond [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: ] [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
D:\a\pybind11\pybind11\include\pybind11\detail\init.h(77,21): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
```
The warning came from alignas(16) making Diamond over-aligned, while regular new/make_shared aren’t guaranteed to return 16-byte aligned memory on MSVC (hence C4316). I’ve removed the explicit alignment and switched to asymmetric payload sizes (char[4] vs char[24]), which still nudges MI layout without relying on over-alignment. This keeps the test goal and eliminates the warning across all MSVC builds. If we ever want to stress over-alignment explicitly, we can add aligned operator new/delete under __cpp_aligned_new, but that’s more than we need here.
* Rename test_virtual_base_at_offset_0() → test_virtual_base_not_at_offset_0() and replace pytest.skip() with assert. Add helpful comment for future maintainers.
Python may leave behind temporary `.pyc.*` files inside `__pycache__`
on some filesystems (e.g. WSL2 mounts). Adding `__pycache__/` ensures
these directories and any leftover files are consistently ignored.
Background: Python writes bytecode to a temp file with an extra suffix
before renaming it to `.pyc`. If the process is interrupted or the
filesystem rename isn’t fully atomic, those temp files may remain.
See: https://docs.python.org/3/library/py_compile.html#py_compile.compile
Occasionally a test will get stuck and run for 6 hours until Github cancels the workflow.
This reduces the timeout to 90 minutes to not waste resources.
Pybind11's tests seem to run in 30 minutes so this should be plenty of time.
* Revert type hint changes to int_ and float_
These two types do not support casting from int-like and float-like types.
* Fix tests
* Add a custom py::float_ caster
The default py::object caster only works if the object is an instance of the type.
py::float_ should accept python int objects as well as float.
This caster will pass through float as usual and cast int to float.
The caster handles the type name so the custom one is not required.
* style: pre-commit fixes
* Fix name
* Fix variable
* Try satisfying the formatter
* Rename test function
* Simplify type caster
* Fix reference counting issue
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Avoid heap allocation for function calls with a small number of arguments
We don't have access to llvm::SmallVector or similar, but given the
limited subset of the `std::vector` API that
`function_call::args{,_convert}` need and the "reserve-then-fill"
usage pattern, it is relatively straightforward to implement custom
containers that get the job done.
Seems to improves time to call the collatz function in
pybind/pybind11_benchmark significantly; numbers are a little noisy
but there's a clear improvement from "about 60 ns per call" to "about
45 ns per call" on my machine (M4 Max Mac), as measured with
`timeit.repeat('collatz(4)', 'from pybind11_benchmark import
collatz')`.
* clang-tidy
* more clang-tidy
* clang-tidy NOLINTBEGIN/END instead of NOLINTNEXTLINE
* forgot to increase inline size after removing std::variant
* constexpr arg_vector_small_size, use move instead of swap to hopefully clarify second_pass_convert
* rename test_embed to test_low_level
* rename test_low_level to test_with_catch
* Be careful to NOINLINE slow paths
* rename array/vector members to iarray/hvector. Move comment per request. Add static_asserts for our untagged union implementation per request.
* drop is_standard_layout assertions; see https://github.com/pybind/pybind11/pull/5824#issuecomment-3308616072
* Use thread_local instead of thread_specific_storage for internals mangement
thread_local is faster.
* Make the pp manager a singleton.
Strictly speaking, since the members are static, the instances must also be singletons or this wouldn't work. They already are, but we can make the class enforce it to be more 'self-documenting'.
* Revert "s/windows-2022/windows-latest/ in .github/workflows/{ci,pip}.yml (#5826)"
This reverts commit 852a4b5010.
* Add module-level skip for Windows build >= 26100 in test_iostream.py
* Changes suggested by at-henryiii
* Use thread_local for loader_life_support to improve performance
As explained in a new code comment, `loader_life_support` needs to be
`thread_local` but does not need to be isolated to a particular
interpreter because any given function call is already going to only
happen on a single interpreter by definiton.
Performance before:
- on M4 Max using pybind/pybind11_benchmark unmodified repo:
```
> python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
5000000 loops, best of 5: 63.8 nsec per loop
```
- Linux server:
```
python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)' (pytorch)
2000000 loops, best of 5: 120 nsec per loop
```
After:
- M4 Max:
```
python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)'
5000000 loops, best of 5: 53.1 nsec per loop
```
- Linux server:
```
> python -m timeit --setup 'from pybind11_benchmark import collatz' 'collatz(4)' (pytorch)
2000000 loops, best of 5: 101 nsec per loop
```
A quick profile with perf shows that pthread_setspecific and pthread_getspecific are gone.
Open questions:
- How do we determine whether we can safely use `thread_local`? I see
concerns about old iOS versions on
https://github.com/pybind/pybind11/pull/5705#issuecomment-2922858880
and https://github.com/pybind/pybind11/pull/5709; is there anything
else?
- Do we have a test that covers "function called in one interpreter
calls a C++ function that causes a function call in another
interpreter"? I think it's fine, but can it happen?
- Are we happy with what we think will happen in the case where
multiple extensions compiled with and without this PR interoperate?
I think it's fine -- each dispatch pushes and cleans up its own
state -- but a second opinion is certainly welcome.
* Remove PYBIND11_CAN_USE_THREAD_LOCAL
* clarify comment
* Simplify loader_life_support TLS storage
Replace the `fake_thread_specific_storage` struct with a direct
thread-local pointer managed via a function-local static:
static loader_life_support *& tls_current_frame()
This retains the "stack of frames" behavior via the `parent` link. It also
reduces indirection and clarifies intent.
Note: this form is C++11-compatible; once pybind11 requires C++17, the
helper can be simplified to:
inline static thread_local loader_life_support *tls_current_frame = nullptr;
* loader_life_support: avoid duplicate tls_current_frame() calls
Replace repeated calls with a single local reference:
auto &frame = tls_current_frame();
This ensures the thread_local initialization guard is checked only once
per constructor/destructor call site, avoids potential clang-tidy
complaints, and makes the code more readable. Functional behavior is
unchanged.
* Add REMINDER for next version bump in internals.h
---------
Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>
* pytypes.h: constrain accessor::operator= templates so that they do not match calls that should use the special member functions.
Found by an experimental, new clang-tidy check. While we may not know the exact design decisions now, it seems unlikely that the special members were deliberately meant to not be selected (for otherwise they could have been defined differently to make this clear). Rather, it seems like an oversight that the operator templates win in overload resolution, and we should restore the intended resolution.
* Use C++11-compatible facilities
* Use C++11-compatible facilities
* style: pre-commit fixes
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>