mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-02 20:51:23 +00:00
[CK_BUILDER] validation (#3471)
This pull request builds on #3267 by providing the "validation" infrastructure, the means to compare a set of `Outputs`.

The design of the validation infrastructure is relatively straightforward:
- Each SIGNATURE should come with a `validate()` implementation, which should be implemented in a similar way to the other functions/types from `testing.hpp`.
- `validate()` returns a `ValidationReport`, which is a structure that keeps all relevant information about comparing the tensors from two `Outputs`. Note that, crucially, `validate()` should not do any reporting by itself. Rather, glue logic should be implemented by the user to turn `ValidationReport` into a relevant error message.
- You can see this glue code for CK-Builder itself in `testing_utils.hpp`, in its `MatchesReference()`. This functionality is relatively barebones right now; it will be expanded upon in a different PR to keep the scope of this one down.

The comparison is done on the GPU (using an atomic for now), to keep tests relatively quick.

Some notable items from this PR:
- To help compare the tensors and with writing tests, I've written a generic function `tensor_foreach` which invokes a callback on every element of a tensor.
- For that it was useful that the `TensorDescriptor` has a rank which is known at compile-time, so I've changed the implementation of `TensorDescriptor` for that. I felt like it was a better approach than keeping it dynamic, for multiple reasons:
  - This is C++ and we should use static typing where possible and useful. This way, we don't have to implement runtime assertions about the tensor rank.
  - We already know the rank of tensors statically, as it can be derived from the SIGNATURE.
  - It simplifies the implementation of `tensor_foreach` and other comparison code.
- There are a lot of new tests for validating the validation implementation, validating validation validation tests (Only 3 recursive levels though...). For a few of those functions, I felt like it would be useful to expose them to the user.
- Doc comments everywhere.
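To illustrate why a compile-time rank simplifies a `tensor_foreach`-style helper, here is a minimal host-side sketch. It is not the code from this PR: `StaticDescriptor`, `foreach_index`, and `sum_padded_2x3` are hypothetical names invented for this example, and the real `TensorDescriptor` and `tensor_foreach` may look quite different (the PR runs the comparison on the GPU, whereas this sketch is a plain CPU loop).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a descriptor whose rank is a template parameter.
// This is NOT the CK-Builder `TensorDescriptor`; it only mirrors the idea.
template <std::size_t RANK>
struct StaticDescriptor
{
    std::array<std::size_t, RANK> lengths; // extent of each dimension
    std::array<std::size_t, RANK> strides; // element stride of each dimension

    // Total number of logical elements.
    std::size_t size() const
    {
        std::size_t n = 1;
        for(std::size_t len : lengths)
            n *= len;
        return n;
    }

    // Memory offset of a multi-dimensional index.
    std::size_t offset(const std::array<std::size_t, RANK>& index) const
    {
        std::size_t off = 0;
        for(std::size_t d = 0; d < RANK; ++d)
        {
            assert(index[d] < lengths[d]);
            off += index[d] * strides[d];
        }
        return off;
    }
};

// Call `f(index)` once per element, with the last dimension varying fastest.
// The index type has a fixed size, so no runtime rank checks are needed.
template <std::size_t RANK, typename F>
void foreach_index(const StaticDescriptor<RANK>& desc, F&& f)
{
    const std::size_t total = desc.size();
    for(std::size_t flat = 0; flat < total; ++flat)
    {
        std::array<std::size_t, RANK> index{};
        std::size_t rest = flat;
        for(std::size_t d = RANK; d-- > 0;)
        {
            index[d] = rest % desc.lengths[d];
            rest /= desc.lengths[d];
        }
        f(index);
    }
}

// Example: sum a 2x3 view stored with a padded row stride of 4.
inline int sum_padded_2x3()
{
    const StaticDescriptor<2> desc{{2, 3}, {4, 1}};
    const std::vector<int> data{1, 2, 3, -1, 4, 5, 6, -1}; // -1 marks padding
    int sum = 0;
    foreach_index(desc, [&](const std::array<std::size_t, 2>& idx) {
        sum += data[desc.offset(idx)];
    });
    return sum;
}
```

Because the rank is part of the type, a callback written for a rank-2 index cannot be passed a rank-3 descriptor; the mismatch is a compile error rather than a runtime assertion.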
@@ -5,6 +5,8 @@
#include <concepts>
#include "ck_tile/builder/testing/validation.hpp"
/// This file is the main header for the CK-Builder testing system. A high-level
/// description of this testing system is documented in
/// `ck_tile/builder/testing/README.md`. This file deals mainly deals with the
@@ -78,7 +80,7 @@ namespace ck_tile::builder::test {
/// that this structure is an aggregrate so that it can be initialized using C++20
/// designated initializers to keep the tests readable.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
template <auto SIGNATURE>
struct Args;

@@ -98,7 +100,7 @@ struct Args;
/// structure is an aggregrate so that it can be initialized using C++20
/// designated initializers to keep the tests readable.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
template <auto SIGNATURE>
struct Inputs;

@@ -118,7 +120,7 @@ struct Inputs;
/// structure is an aggregrate so that it can be initialized using C++20
/// designated initializers to keep the tests readable.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
template <auto SIGNATURE>
struct Outputs;

@@ -133,7 +135,7 @@ struct Outputs;
/// @note The easiest way to implement this type is to use the `DeviceBuffer`
/// type to allocate individual device buffers for each input tensor.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
///
/// @see alloc_inputs()
/// @see ValidUniqueInputs
@@ -152,7 +154,7 @@ struct UniqueInputs;
/// @note The easiest way to implement this type is to use the `DeviceBuffer`
/// type to allocate individual device buffers for each output tensor.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
///
/// @see alloc_outputs()
/// @see ValidUniqueOutputs
@@ -195,7 +197,9 @@ concept ValidUniqueOutputs = requires(UniqueOutputs<SIGNATURE>& inputs) {
/// amount of memory required and then allocate it on the device, for example
/// using `alloc_buffer` or `alloc_tensor_buffer`.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
///
/// @param args The run-time arguments of the operation.
///
/// @see Inputs
/// @see UniqueInputs
@@ -208,16 +212,18 @@ UniqueInputs<SIGNATURE> alloc_inputs(const Args<SIGNATURE>& args);
/// @brief Allocate inputs corresponding to a signature.
///
/// The `init_inputs()` function is used to initialize pseudo-random data
/// to the tensors specified in the Inputs structure.
/// to the tensors specified in the Inputs structure. Implementors should
/// fill each of the tensors in `inputs` with appropriate random data.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
///
/// @param args The run-time arguments of the operation.
/// @param inputs The operation inputs to initialize with random data.
///
/// @see Inputs
/// @see UniqueInputs
/// @see tensor_initialization
template <auto SIGNATURE>
requires ValidUniqueInputs<SIGNATURE>
void init_inputs(const Args<SIGNATURE>& args, UniqueInputs<SIGNATURE>& inputs);
void init_inputs(const Args<SIGNATURE>& args, Inputs<SIGNATURE> inputs);

/// @brief Allocate outputs corresponding to a signature.
///
@@ -226,7 +232,9 @@ void init_inputs(const Args<SIGNATURE>& args, UniqueInputs<SIGNATURE>& inputs);
/// amount of memory required and then allocate it on the device, for example
/// using `alloc_buffer` or `alloc_tensor_buffer`.
///
/// @tparam SIGNATURE the signature to specialize the structure for.
/// @tparam SIGNATURE The signature to specialize the structure for.
///
/// @param args The run-time arguments of the operation.
///
/// @see Outputs
/// @see UniqueOutputs
@@ -236,6 +244,29 @@ template <auto SIGNATURE>
requires ValidUniqueOutputs<SIGNATURE>
UniqueInputs<SIGNATURE> alloc_outputs(const Args<SIGNATURE>& args);

/// @brief Compare device operation outputs.
///
/// This function implements the main comparison functionality, used to compare
/// the output of one implementation for a particular `SIGNATURE` with that of
/// another. Usually, the `expected` output should be computed by a reference
/// implementation.
///
/// The implementation of this function generates a "report", which includes
/// detailed information about which tensors are different, how many elements
/// were incorrect, and where (a subset of) those elements are located within
/// the tensor. See `ValidationReport` for more information about the report.
///
/// @tparam SIGNATURE The signature to specialize the structure for.
///
/// @param args The run-time arguments of the operation.
/// @param actual The actual results, the results of the operation to-be-tested.
/// @param expected The expected results, the results of the reference implementation.
///
/// @see ValidationReport
template <auto SIGNATURE>
ValidationReport
validate(const Args<SIGNATURE>& args, Outputs<SIGNATURE> actual, Outputs<SIGNATURE> expected);

/// @brief Invoke a device operation created by CK Builder.
///
/// This is the main function used to invoke a particular device operation
@@ -257,7 +288,7 @@ UniqueInputs<SIGNATURE> alloc_outputs(const Args<SIGNATURE>& args);
/// @post The tensors in `outputs` are overwritten with the outputs of the device
/// operation.
///
/// @tparam SIGNATURE the signature to specialize this function for
/// @tparam SIGNATURE The signature to specialize this function for
/// @tparam Operation the kernel of the operation to invoke. This type should be
/// one that is created using the Builder API.
/// @param operation An instance of the operation to invoke.
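
The split described for `validate()` (the comparison only collects facts, and separate glue code turns them into a message) can be sketched on the host as follows. This is an illustration under stated assumptions, not the API from this PR: `SketchReport`, its `check()` and `passed()` members, `describe()`, and `example_message()` are hypothetical names, and the real `ValidationReport` and `MatchesReference()` may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified report; the real `ValidationReport` may differ.
struct SketchReport
{
    struct Entry
    {
        std::string tensor_name;
        std::size_t wrong_elements;
        std::size_t total_elements;
    };
    std::vector<Entry> entries;

    // Record the comparison of one tensor pair. Nothing is printed here.
    void check(const std::string& name,
               const std::vector<float>& actual,
               const std::vector<float>& expected,
               float tolerance)
    {
        assert(tolerance >= 0.0f);
        const std::size_t common = std::min(actual.size(), expected.size());
        const std::size_t total  = std::max(actual.size(), expected.size());
        std::size_t wrong        = total - common; // unmatched elements count as wrong
        for(std::size_t i = 0; i < common; ++i)
        {
            // Negated comparison so that NaN values are counted as wrong.
            if(!(std::fabs(actual[i] - expected[i]) <= tolerance))
                ++wrong;
        }
        entries.push_back({name, wrong, total});
    }

    // True when no recorded tensor has any incorrect element.
    bool passed() const
    {
        return std::all_of(entries.begin(), entries.end(), [](const Entry& e) {
            return e.wrong_elements == 0;
        });
    }
};

// Glue code owned by the caller: turn a report into a readable message.
inline std::string describe(const SketchReport& report)
{
    std::ostringstream os;
    for(const auto& e : report.entries)
    {
        if(e.wrong_elements != 0)
            os << e.tensor_name << ": " << e.wrong_elements << "/" << e.total_elements
               << " elements differ\n";
    }
    return os.str();
}

// Example: one of three elements is outside the tolerance.
inline std::string example_message()
{
    SketchReport report;
    report.check("output", {1.0f, 2.0f, 3.5f}, {1.0f, 2.0f, 3.0f}, 1e-3f);
    return describe(report);
}
```

Keeping the formatting outside the comparison lets a test framework decide how failures are presented without changing how tensors are compared.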