Commit Graph

5 Commits

Author SHA1 Message Date
Ralf W. Grosse-Kunstleve
b13e218bfa Revert "Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter")"
This reverts commit ad3e1c34ce.
2025-12-13 23:26:04 -08:00
Ralf W. Grosse-Kunstleve
ad3e1c34ce Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter") 2025-12-13 20:21:35 -08:00
Ralf W. Grosse-Kunstleve
32725f761b Revert "Limit busy-wait loops in per-subinterpreter GIL test"
This reverts commit 7847adacda.
2025-12-13 19:04:35 -08:00
Ralf W. Grosse-Kunstleve
7847adacda Limit busy-wait loops in per-subinterpreter GIL test
Add explicit timeouts to the busy-wait coordination loops in the
Per-Subinterpreter GIL test in tests/test_with_catch/test_subinterpreter.cpp.
Previously those loops spun indefinitely waiting for shared atomics like
`started` and `sync` to change, which is fine when CPython's free-threading
and per-interpreter GIL behavior matches the test's expectations but becomes
pathologically bad when that behavior regresses: the `test_with_catch`
executable can then hang forever, causing our 3.14t CI jobs to time out
after 90 minutes.

This change keeps the structure and intent of the test but adds a
std::chrono::steady_clock deadline to each of the coordination loops,
using a conservative 10 second bound. Worker threads record a failure and
return if they hit the timeout, while the main thread fails the test via
Catch2 instead of hanging. That way, if future CPython free-threading
patches change the semantics again, the test will fail quickly and
produced a diagnosable error instead of wedging the CI job.
2025-12-13 17:05:10 -08:00
Scott Wolchok
30748f863f Avoid heap allocation for function calls with a small number of args (#5824)
* Avoid heap allocation for function calls with a small number of arguments

We don't have access to llvm::SmallVector or similar, but given the
limited subset of the `std::vector` API that
`function_call::args{,_convert}` need and the "reserve-then-fill"
usage pattern, it is relatively straightforward to implement custom
containers that get the job done.

Seems to improves time to call the collatz function in
pybind/pybind11_benchmark significantly; numbers are a little noisy
but there's a clear improvement from "about 60 ns per call" to "about
45 ns per call" on my machine (M4 Max Mac), as measured with
`timeit.repeat('collatz(4)', 'from pybind11_benchmark import
collatz')`.

* clang-tidy

* more clang-tidy

* clang-tidy NOLINTBEGIN/END instead of NOLINTNEXTLINE

* forgot to increase inline size after removing std::variant

* constexpr arg_vector_small_size, use move instead of swap to hopefully clarify second_pass_convert

* rename test_embed to test_low_level

* rename test_low_level to test_with_catch

* Be careful to NOINLINE slow paths

* rename array/vector members to iarray/hvector. Move comment per request. Add static_asserts for our untagged union implementation per request.

* drop is_standard_layout assertions; see https://github.com/pybind/pybind11/pull/5824#issuecomment-3308616072
2025-09-19 13:44:40 -07:00