mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-04-19 22:39:03 +00:00
[CK_BUILDER] Integrate CKB validation with CK verification (#3649)
* ck-builder: tensor copy function
  This function copies one tensor to another, so that the memory layout can be changed between them.
* ck-builder: fix ck::bhalf literals
  These types don't work properly.
* ck-builder: abstract compare_elements in gpu_verification.hpp and make builder use it
  This reduces the amount of duplicated code a bit.
* ck-builder: add flat tensor iterator
  This "iterator" type pretends to be a pointer, useful for passing tensors to functions expecting pointer-like types.
* ck-builder: integrate validation with ck gpu verification
  By templating the gpu_verify function over iterators, we can use the new FlatTensorIterator to adapt the function to multi-dimensional tensors without changing either implementation too much.
* ck-builder: add check_by_accumulations
  This changes the gpu_verification.hpp code to also accept "iterator" types for the relevant gpu_verify and gpu_reduce_max functions.
* ck: fix test_gpu_verification GenerateRandomData for bhalf
  is_integer_it<bhalf_t> yields true, but it is not actually an integer.
* ck: make gpu_verification kernels be proper persistent kernels
  Previously these were using a hardcoded value for the grid size. This commit changes that so that the grid size is automatically derived from the kernel's occupancy and the number of multiprocessors on the GPU.
* ck: clean up gpu_verification.hpp using block_reduce
  This implements a small generic block reduce function, and rewrites the rest of gpu_verification.hpp using that function to clean it up a bit.
* ck-builder: doc typos
* ck-builder: update testing readme with validation interface.
* ck-builder: rebase fixes + review comments
* ck-builder: fix device integer generation with float types
  Passing bfloat here causes NaNs because type_convert performs a bitcast.
* ck: another bhalf_t bug
  CK expects that int-generation with ck::bhalf_t yields bhalf integers, not unsigned integers.
  This makes the logic of FillUniformRandInteger compatible with GeneratorTensor_2<InDataType>, however idiotic that may be.
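The "flat tensor iterator" idea from the commit message can be illustrated with a small host-side sketch. This is not the CK `FlatTensorIterator`; the name `FlatView`, its constructor, and its members are hypothetical, and the sketch only assumes that a tensor is described by per-dimension lengths and strides. It shows how a pointer-like object can translate a flat, row-major element index into a strided memory offset, so that code written against `ptr[i]` can walk a tensor with an arbitrary layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical pointer-like view over a strided tensor (illustration only).
template <typename T, std::size_t Rank>
class FlatView
{
public:
    FlatView(T* base,
             std::array<std::size_t, Rank> lengths,
             std::array<std::size_t, Rank> strides,
             std::size_t start = 0)
        : base_(base), lengths_(lengths), strides_(strides), start_(start)
    {
        for(std::size_t d = 0; d < Rank; ++d)
            assert(lengths_[d] > 0 && "every dimension must be non-empty");
    }

    // Map a flat row-major index to a memory offset, last dimension fastest.
    T& operator[](std::size_t i) const
    {
        std::size_t flat   = start_ + i;
        std::size_t offset = 0;
        for(std::size_t d = Rank; d-- > 0;)
        {
            offset += (flat % lengths_[d]) * strides_[d];
            flat /= lengths_[d];
        }
        return base_[offset];
    }

    // Pointer-style dereference and advance, so the view can stand in for T*.
    T& operator*() const { return (*this)[0]; }
    FlatView operator+(std::size_t n) const
    {
        return FlatView(base_, lengths_, strides_, start_ + n);
    }

private:
    T* base_;
    std::array<std::size_t, Rank> lengths_;
    std::array<std::size_t, Rank> strides_;
    std::size_t start_;
};
```

With a 2x3 tensor stored column-major (strides `{1, 2}`), `view[1]` refers to element (0, 1), which lives at memory offset 2. A comparison routine templated on the iterator type can therefore compare two tensors element by element even when their layouts differ.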
@@ -98,8 +98,10 @@ TEST(ConvFwdTesting, Validate)
         [&]([[maybe_unused]] std::string_view name,
             const auto& desc,
             void* ckt::Outputs<SIGNATURE>::*ptr) {
-            ckt::clear_tensor_buffer(desc, a.get().*ptr, ck::bhalf_t{123});
-            ckt::clear_tensor_buffer(desc, b.get().*ptr, ck::bhalf_t{123});
+            ckt::clear_tensor_buffer(
+                desc, a.get().*ptr, ck::type_convert<ck::bhalf_t, float>(123));
+            ckt::clear_tensor_buffer(
+                desc, b.get().*ptr, ck::type_convert<ck::bhalf_t, float>(123));
         });
 
     const auto report = ckt::validate(ARGS, a.get(), b.get());
@@ -115,8 +117,10 @@ TEST(ConvFwdTesting, Validate)
             const auto& desc,
             void* ckt::Outputs<SIGNATURE>::*ptr) {
             ++field_count;
-            ckt::clear_tensor_buffer(desc, a.get().*ptr, ck::bhalf_t{2});
-            ckt::clear_tensor_buffer(desc, b.get().*ptr, ck::bhalf_t{1});
+            ckt::clear_tensor_buffer(
+                desc, a.get().*ptr, ck::type_convert<ck::bhalf_t, float>(2));
+            ckt::clear_tensor_buffer(
+                desc, b.get().*ptr, ck::type_convert<ck::bhalf_t, float>(1));
         });
 
     const auto report = ckt::validate(ARGS, a.get(), b.get());
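The diff replaces brace-initialized `ck::bhalf_t` literals with an explicit conversion from `float`. One plausible reading, offered here as an assumption rather than something this page confirms, is that `ck::bhalf_t` is backed by a 16-bit unsigned integer, so `ck::bhalf_t{123}` stores the raw bit pattern 123 instead of the value 123.0. The sketch below reproduces that effect in plain C++ with a hypothetical `bf16_bits` alias and hypothetical helper functions; it does not use any CK API, and the truncating conversion is a simplification of what a real library would do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bfloat16 type held as raw 16-bit storage.
using bf16_bits = std::uint16_t;

// Value conversion: keep the upper 16 bits of the IEEE-754 float encoding.
// This truncates; a production conversion would typically round.
inline bf16_bits float_to_bf16(float x)
{
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof(u));
    return static_cast<bf16_bits>(u >> 16);
}

// Widen the 16 stored bits back to a float by zero-filling the low half.
inline float bf16_to_float(bf16_bits b)
{
    const std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
    float x;
    std::memcpy(&x, &u, sizeof(x));
    return x;
}

// Usage: an integer "literal" sets raw bits, a conversion sets the value.
inline void demo()
{
    const bf16_bits literal   = bf16_bits{123};        // bit pattern 0x007B
    const bf16_bits converted = float_to_bf16(123.0f); // encodes 123.0
    assert(bf16_to_float(converted) == 123.0f);
    assert(bf16_to_float(literal) != 123.0f); // a tiny subnormal instead
}
```

Under that assumption, the old test code filled both buffers with a near-zero subnormal rather than 123, which is why the change routes the constant through a float conversion.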