composable_kernel

mirror of https://github.com/ROCm/composable_kernel.git synced 2026-03-22 08:07:38 +00:00

Author	SHA1	Message	Date
Robin Voetter	42048bdb7d	[CK_BUILDER] Integrate CKB validation with CK verification (#3649) * ck-builder: tensor copy function This function copies one tensor to another, so that the memory layout can be changed between them. * ck-builder: fix ck::bhalf literals These types don't work properly. * ck-builder: abstract compare_elements in gpu_verification.hpp and make builder use it This reduces the amount of duplicated code a bit. * ck-builder: add flat tensor iterator This "iterator" type pretends to be a pointer, useful for passing tensors to functions expecting pointer-like types. * ck-builder: integrate validation with ck gpu verification By templating the gpu_verify function over iterators, we can use the new FlatTensorIterator to adapt the function to multi-dimensional tensors without changing either implementation too much. * ck-builder: add check_by_accumulations This changes the gpu_verification.hpp code to also accept "iterator" types for the relevant gpu_verify and gpu_reduce_max functions. * ck: fix test_gpu_verification GenerateRandomData for bhalf is_integer_it<bhalf_t> yields true, but it is not actually an integer. * ck: make gpu_verification kernels be proper persistent kernels Previously these were using a hardcoded value for the grid size. This commit changes that so that the grid size is automatically derived from the kernel's occupancy and the number of multiprocessors on the GPU. * ck: clean up gpu_verification.hpp using block_reduce This implements a small generic block reduce function, and rewrites the rest of gpu_verification.hpp using that function to clean it up a bit. * ck-builder: doc typos * ck-builder: update testing readme with validation interface. * ck-builder: rebase fixes + review comments * ck-builder: fix device integer generation with float types Passing bfloat here causes NaNs due to type_convert performing a bitcast. * ck: another bhalf_t bug CK expects that int-generation with ck::bhalf_t yields bhalf integers, not unsigned integers. This makes the logic of FillUniformRandInteger compatible with GeneratorTensor_2<InDataType>, however idiotic that may be.	2026-01-28 17:41:02 +01:00
Robin Voetter	cc75948d1c	[CK_BUILDER] conv bwd weight testing (#3618) * ck-builder: restructure testing conv In order to prepare for bwd of conv testing, this commit moves some files and types around so that we can reuse ckt::Args for both forward and backwards convolution. * ck-builder: decouple fwd_ck.hpp and fwd_reference.hpp from fwd.hpp This will allow us to more easily include fwd.hpp from backwards definitions, which is required for initializing bwd values. * ck-builder: fix layout of test_ckb_conv_bwd_weight_xdl_cshuffle_v3 Turns out that the supplied layout isn't actually supported... * ck-builder: ck and reference conv integration for bwd weight * ck-builder: ck bwd weight execution test * ck-builder: ckt::run support for ck-tile bwd weight * ck-builder: ck tile bwd weight execution test * ck-builder: extra debug printing in MatchesReference * ck-builder: make ckt::run return RunResult This type is more convenient than std::tuple, as it will allow us to use google test matchers with this in the future. * ck-builder: RunResult matcher Using EXPECT_THAT(..., SuccessfulRun()) will generate a check and a nice error message about how and why running an algorithm failed. * ck-builder: doc fixes * ck-builder: add missing headers	2026-01-26 23:50:15 +01:00
Robin Voetter	b352a68606	ck-builder: tensor input/output reflection (#3536) This adds some utilities to automatically generate UniqueInputs, UniqueOutputs, alloc_inputs, alloc_outputs, and validate, based on an Inputs::reflect() and Outputs::reflect().	2026-01-12 09:45:53 +01:00
Robin Voetter	6219b12730	[CK_BUILDER] convolution testing (#3267) * Add README.md for testing * Add tensor_memory_manager. * ck-builder: tensor memory manager rebase fixes This fixes some issues caused by the API being changed recently. Also, this streamlines the ckt namespace to always be ck_tile::builder::test, as this is already being used by other tests. Really, this commit should be squashed into the previous, but I'm keeping it separate for brevity. * ck-builder: test arguments initial prototype * ck-builder: test system initial prototype * ck-builder: fix non-standardized copyright comments * ck-builder: new prototype * ck-builder: group testing inputs/outputs into a separate structure This is basically the return of the tensor memory manager after all, except that the design is more closely tied to the actual operation. Using a struct allows us to add additional input/output tensors without breaking code (by defaulting those new parameters). Note that the tensors are split into a separate inputs/outputs because we usually want to allocate the output _twice_: once for the real computation and once for the reference computation. * ck-builder: simplify prototype naming; start docs * ck-builder: update testing readme * ck-builder: testing documentation * ck-builder: HipStatusMatcher This matcher can be used to check HIP status codes and provide nice and readable error messages. * ck-builder: tensor_buffer.hpp tests * ck-builder: conv_fwd.hpp tests * ck-builder: add example end-to-end test in conv fwd 2d fp16 * ck-builder: simplify extent usage * ck-builder: update testing doc * ck-builder: skip end to end test on non-gfx9 * fix check_copyright_year interpreter /bin/bash is not guaranteed to exist on Linux. Signed, a NixOS user * ck-builder: fix copyrights * ck-builder: reduce conv fwd testing size This test allocated 24GB of memory, too much for 16GB cards. --------- Co-authored-by: John Shumway <jshumway@amd.com>	2025-12-13 15:33:41 +01:00

4 Commits