mirror of
https://github.com/ROCm/composable_kernel.git
synced 2026-05-14 18:17:44 +00:00
* Update license year
* Add initial code to override decode problem
* Fix splitkv traits/args overriding error
* Reshape and transpose lse for decode
* Remove debug code
* Prettify example code
* Use better function name
* Add kMergeNumHeadGroupsSeqLenQ flag
  Kernel users can use this switch to turn the optimization on or off
  for some problem sizes
* Add missing flag declarations
* Turn off kMergeNumHeadGroupsSeqLenQ by default in codegen
* Group similar statements together
* Remove assumption of seqlen_q=1
* Remove kMergeNumHeadGroupsSeqLenQ from splitkv combine kernel
* Support kMergeNumHeadGroupsSeqLenQ=true in fmha splitkv kernel
* Run kMergeNumHeadGroupsSeqLenQ=true kernels when needed
* Fix group mode block skip logic
* Undo changes of normal fwd kernel
* Update GridSize() and use GridSize() for splitkv kernel (#1799)
---------
Co-authored-by: Qianfeng <qianfeng.zhang@amd.com>
[ROCm/composable_kernel commit: 24b12d04af]
14 lines
681 B
C++
// SPDX-License-Identifier: MIT
// Copyright (c) 2018-2025, Advanced Micro Devices, Inc. All rights reserved.

#pragma once

#include "ck_tile/ops/smoothquant/kernel/moe_smoothquant_kernel.hpp"
#include "ck_tile/ops/smoothquant/kernel/smoothquant_kernel.hpp"
#include "ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_default_policy.hpp"
#include "ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_one_pass.hpp"
#include "ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_problem.hpp"
#include "ck_tile/ops/smoothquant/pipeline/smoothquant_pipeline_two_pass.hpp"
#include "ck_tile/ops/common/generic_2d_block_shape.hpp"
#include "ck_tile/ops/common/tensor_layout.hpp"