pybind11

mirror of https://github.com/pybind/pybind11.git synced 2026-05-12 01:10:34 +00:00

Author	SHA1	Message	Date
Ralf W. Grosse-Kunstleve	b13e218bfa	Revert "Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter")" This reverts commit `ad3e1c34ce`.	2025-12-13 23:26:04 -08:00
Ralf W. Grosse-Kunstleve	48725893c6	Add Python version banner to Catch progress reporter Print the CPython version once at the start of the Catch-based interpreter tests using Py_GetVersion(). This makes it trivial to confirm which free-threaded build a failing run is using when inspecting CI or local logs.	2025-12-13 23:25:33 -08:00
Ralf W. Grosse-Kunstleve	ad3e1c34ce	Add DEBUG_LOOK in TEST_CASE("Move Subinterpreter")	2025-12-13 20:21:35 -08:00
Ralf W. Grosse-Kunstleve	179a66f606	Add progress reporter for test_with_catch Catch runner Introduce a custom Catch2 reporter for tests/test_with_catch that prints a simple one-line status for each test case as it starts and ends, and wire the cpptest CMake target to invoke test_with_catch with -r progress. This makes it much easier to see where the embedded/interpreter test binary is spending its time in CI logs, and in particular to pinpoint which test case is stuck when the free-threading builds hang. Compared to adding ad hoc timeouts around potentially infinite busy-wait loops in individual tests, a progress reporter is a more general and robust approach: it gives visibility into all tests (including future ones) without changing their behavior, and turns otherwise opaque 90-minute timeouts into locatable issues in the Catch output.	2025-12-13 19:08:49 -08:00
Ralf W. Grosse-Kunstleve	32725f761b	Revert "Limit busy-wait loops in per-subinterpreter GIL test" This reverts commit `7847adacda`.	2025-12-13 19:04:35 -08:00
Ralf W. Grosse-Kunstleve	7847adacda	Limit busy-wait loops in per-subinterpreter GIL test Add explicit timeouts to the busy-wait coordination loops in the Per-Subinterpreter GIL test in tests/test_with_catch/test_subinterpreter.cpp. Previously those loops spun indefinitely waiting for shared atomics like `started` and `sync` to change, which is fine when CPython's free-threading and per-interpreter GIL behavior matches the test's expectations but becomes pathologically bad when that behavior regresses: the `test_with_catch` executable can then hang forever, causing our 3.14t CI jobs to time out after 90 minutes. This change keeps the structure and intent of the test but adds a std::chrono::steady_clock deadline to each of the coordination loops, using a conservative 10 second bound. Worker threads record a failure and return if they hit the timeout, while the main thread fails the test via Catch2 instead of hanging. That way, if future CPython free-threading patches change the semantics again, the test will fail quickly and produce a diagnosable error instead of wedging the CI job.	2025-12-13 17:05:10 -08:00
Scott Wolchok	3262000195	Add fast_type_map, use it authoritatively for local types and as a hint for global types (ABI breaking) (#5842) * Add fast_type_map, use it authoritatively for local types and as a hint for global types nanobind has a similar two-level lookup strategy, added and explained by `b515b1f7f2`. In this PR I've ported this approach to pybind11. To avoid an ABI break, I've kept the fast maps to the `local_internals`. I think this should be safe because any particular module should see its `local_internals` reset at least as often as the global `internals`, and misses in the fast "hint" map for global types fall back to the global `internals`. Performance seems to have improved. Using my patched fork of pybind11_benchmark (https://github.com/swolchok/pybind11_benchmark/tree/benchmark-updates, specifically commit hash b6613d12607104d547b1c10a8145d1b3e9937266), I run bench.py and observe the MyInt case. Each time, I do 3 runs and just report all 3. master, Mac: 75.9, 76.9, 75.3 nsec/loop this PR, Mac: 73.8, 73.8, 73.6 nsec/loop master, Linux box: 188, 187, 188 nsec/loop this PR, Linux box: 164, 165, 164 nsec/loop Note that the "real" percentage improvement is larger than implied by the above because master does not yet include #5824. * simplify unsafe_reset_local_internals in test * pre-implement PYBIND11_INTERNALS_VERSION 12 * use PYBIND11_INTERNALS_VERSION 12 on Python 3.14 per suggestion * Implement reviewer comments: revert PY_VERSION_HEX change, fix REVIEW comment, add two-level lookup comments. ci.yml coming separately * Use the inplace build to smoke test ABI bump? * [skip ci] Remove "smoke" from comment. This is full testing, just only on a few platforms. --------- Co-authored-by: Ralf W. Grosse-Kunstleve <rgrossekunst@nvidia.com>	2025-10-05 11:07:25 -07:00
Scott Wolchok	30748f863f	Avoid heap allocation for function calls with a small number of args (#5824) * Avoid heap allocation for function calls with a small number of arguments We don't have access to llvm::SmallVector or similar, but given the limited subset of the `std::vector` API that `function_call::args{,_convert}` need and the "reserve-then-fill" usage pattern, it is relatively straightforward to implement custom containers that get the job done. Seems to improve time to call the collatz function in pybind/pybind11_benchmark significantly; numbers are a little noisy but there's a clear improvement from "about 60 ns per call" to "about 45 ns per call" on my machine (M4 Max Mac), as measured with `timeit.repeat('collatz(4)', 'from pybind11_benchmark import collatz')`. * clang-tidy * more clang-tidy * clang-tidy NOLINTBEGIN/END instead of NOLINTNEXTLINE * forgot to increase inline size after removing std::variant * constexpr arg_vector_small_size, use move instead of swap to hopefully clarify second_pass_convert * rename test_embed to test_low_level * rename test_low_level to test_with_catch * Be careful to NOINLINE slow paths * rename array/vector members to iarray/hvector. Move comment per request. Add static_asserts for our untagged union implementation per request. * drop is_standard_layout assertions; see https://github.com/pybind/pybind11/pull/5824#issuecomment-3308616072	2025-09-19 13:44:40 -07:00
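
Note on commit 7847adacda (later reverted by 32725f761b): the message describes bounding each busy-wait loop with a std::chrono::steady_clock deadline. A minimal sketch of that general pattern follows. It is not the code from test_subinterpreter.cpp; the helper name `wait_for_flag` and its signature are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not part of pybind11): spin until `flag` becomes true
// or until `timeout` has elapsed on the monotonic steady clock.
// Returns true if the flag was observed set, false if the deadline passed.
inline bool wait_for_flag(const std::atomic<bool> &flag,
                          std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!flag.load(std::memory_order_acquire)) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false; // caller records a failure instead of hanging
        }
        std::this_thread::yield(); // give other threads a chance to run
    }
    return true;
}
```

A caller would check the return value and report a test failure on `false`, which is the behavior the commit message describes for worker threads and the main thread.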

8 Commits
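
Note on commit 3262000195: the message describes a two-level lookup in which a fast map serves as a hint and misses fall back to the global registry. The sketch below illustrates that idea only. All names (`type_record`, `two_level_registry`, `fast_hint`, `slow`) are hypothetical, and keying the fast map by `std::type_info` address is an assumption made for this example, not something the commit message states.

```cpp
#include <cassert>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical stand-in for whatever per-type data a binding layer stores.
struct type_record {
    std::string name;
};

struct two_level_registry {
    // Level 1 (assumption): keyed by type_info address, cheap to hash and
    // compare, but may miss for a type that is registered in `slow`.
    std::unordered_map<const std::type_info *, type_record *> fast_hint;
    // Level 2: keyed by std::type_index, the authoritative registry here.
    std::unordered_map<std::type_index, type_record *> slow;

    void register_type(const std::type_info &ti, type_record *rec) {
        slow[std::type_index(ti)] = rec;
    }

    type_record *find(const std::type_info &ti) {
        auto hit = fast_hint.find(&ti);
        if (hit != fast_hint.end()) {
            return hit->second; // fast path
        }
        auto it = slow.find(std::type_index(ti));
        if (it == slow.end()) {
            return nullptr; // unknown type
        }
        fast_hint.emplace(&ti, it->second); // remember for the next lookup
        return it->second;
    }
};
```

Because a miss in the hint map always falls through to the authoritative map, the hint map can be cleared at any time without affecting correctness, which matches the safety argument given in the commit message.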