Add json dump support to output details from CK/CKTile Examples. (#2551)

* Adding RapidJson Library

* Adding Json Dumps in all CK_Tile Examples

Not verified yet

* Adding json to cktile Batched Transpose

* adding json dumps to layernorm2d_fwd

* Adding  json dump to flatmm_basic

* Adding RapidJson Library

* Adding Json Dumps in all CK_Tile Examples

Not verified yet

* Adding json to cktile Batched Transpose

* adding json dumps to layernorm2d_fwd

* Adding  json dump to flatmm_basic

* Adding json in 03_gemm

* Add json dump to 16_batched_gemm

* Add json dump to gemm_multi_d_fp16

* Add json dump to grouped_gemm

* fix fmha_bwd/fwd

* Fix clang-format errors

exclude include/rapidjson in jenkins as its a third-party library

* Saparating function and defination.

* Update Documentation of 03_gemm

* Refactoring as per code review

* Disable fp8 instances on unsupported targets (#2592)

* Restrict building of gemm_universal_preshuffle_f8 instances to specific targets in CMakeLists.txt

* Add condition to skip gemm_xdl_universal_preshuffle_f8 instances for unsupported targets in CMakeLists.txt

* Add conditions to skip unsupported targets for gemm_universal_preshuffle_f8 and gemm_xdl_universal_preshuffle_f8 instances in CMakeLists.txt

* Refine conditions to exclude gemm_universal_preshuffle_f8 instances for unsupported targets in CMakeLists.txt

---------

Co-authored-by: AviralGoelAMD <aviralgoel@amd.com>

* fix clang format

* remove duplicate lines of code from library/src/tensor_operation_instance/gpu/CMakeLists.txt

* Fixing Readme and unifying jsondumps

* adding moe_smoothquant

* adding fused_moe

* Fixing Readme for batched_gemm

* Fixing Readme for grouped_gemm

* adding flatmm

* adding gemm_multi_d_fp16

* adding elementwise

* adding File name when json is dumped

* Fixing Reduce after merge

* adding batched_transpose

* Adding Warptile in Gemm

* Fixing Clang Format

---------

Co-authored-by: Aviral Goel <aviral.goel@amd.com>
Co-authored-by: AviralGoelAMD <aviralgoel@amd.com>
Co-authored-by: illsilin_amdeng <Illia.Silin@amd.com>
This commit is contained in:
rahjain-amd
2025-09-03 12:01:29 +05:30
committed by GitHub
parent e1ab460d2d
commit 4d041837ad
88 changed files with 21219 additions and 856 deletions

View File

@@ -0,0 +1,71 @@
// Tencent is pleased to support the open source community by making RapidJSON available.
//
// Copyright (C) 2015 THL A29 Limited, a Tencent company, and Milo Yip.
//
// Licensed under the MIT License (the "License"); you may not use this file except
// in compliance with the License. You may obtain a copy of the License at
//
// http://opensource.org/licenses/MIT
//
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.
#ifndef RAPIDJSON_MEMORYSTREAM_H_
#define RAPIDJSON_MEMORYSTREAM_H_
#include "stream.h"
#ifdef __clang__
RAPIDJSON_DIAG_PUSH
RAPIDJSON_DIAG_OFF(unreachable-code)
RAPIDJSON_DIAG_OFF(missing-noreturn)
#endif
RAPIDJSON_NAMESPACE_BEGIN
//! Represents an in-memory input byte stream.
/*!
This class is mainly for being wrapped by EncodedInputStream or AutoUTFInputStream.
It is similar to FileReadBuffer but the source is an in-memory buffer instead of a file.
Differences between MemoryStream and StringStream:
1. StringStream has encoding but MemoryStream is a byte stream.
2. MemoryStream needs size of the source buffer and the buffer don't need to be null terminated. StringStream assume null-terminated string as source.
3. MemoryStream supports Peek4() for encoding detection. StringStream is specified with an encoding so it should not have Peek4().
\note implements Stream concept
*/
struct MemoryStream {
typedef char Ch; // byte
MemoryStream(const Ch *src, size_t size) : src_(src), begin_(src), end_(src + size), size_(size) {}
Ch Peek() const { return RAPIDJSON_UNLIKELY(src_ == end_) ? '\0' : *src_; }
Ch Take() { return RAPIDJSON_UNLIKELY(src_ == end_) ? '\0' : *src_++; }
size_t Tell() const { return static_cast<size_t>(src_ - begin_); }
Ch* PutBegin() { RAPIDJSON_ASSERT(false); return 0; }
void Put(Ch) { RAPIDJSON_ASSERT(false); }
void Flush() { RAPIDJSON_ASSERT(false); }
size_t PutEnd(Ch*) { RAPIDJSON_ASSERT(false); return 0; }
// For encoding detection only.
const Ch* Peek4() const {
return Tell() + 4 <= size_ ? src_ : 0;
}
const Ch* src_; //!< Current read position.
const Ch* begin_; //!< Original head of the string.
const Ch* end_; //!< End of stream.
size_t size_; //!< Size of the stream.
};
RAPIDJSON_NAMESPACE_END
#ifdef __clang__
RAPIDJSON_DIAG_POP
#endif
#endif // RAPIDJSON_MEMORYBUFFER_H_