1436 Commits

Author SHA1 Message Date
Illia Silin
11b7a4db00 re-enable the FMHA performance monitoring (#1539) 2024-10-01 13:17:55 -07:00
Illia Silin
8e4c3fb1bc [CK_TILE] add missing vector header (#1537)
* add missing vector header

* Re-format header using remod.py

---------

Co-authored-by: Po Yen, Chen <PoYen.Chen@amd.com>
2024-10-01 07:58:20 -07:00
Po Yen Chen
a1c07e8d91 [CK_TILE] Change output accum tensor layout of fmha fwd split-kv & combine kernels (#1527)
* Use same layout for o_acc and o tensor

* Use better param names in partitioner

* Remove redundant kargs 'max_seqlen_q'

* Use better param names in splitkv kernel

* Add comment for additional kernel arguments

* Sync empty loop early return logics between pipelines

* Pass more arguments to cmake in scripts

* Align backslashes

* Fix wrong o_acc tensor view strides

* Change o_acc layout if o_perm=0

* Handle whole row masked via attn_bias

* Use vector width = 1 for o_acc

* Use more even split sizes
2024-10-01 22:13:52 +08:00
M.Emin Ozturk
4cd1dc7f06 Complex Contraction CK Bilinear Example (#1061)
* complex type contraction

* bug fix

* update

* Tensor Contraction Complex Data Type is working

* 4D Kernel

* some change

* validation check in progress

* validation issue

* fp32 verification error is fixed

* fp32 and fp64 are done

* remove old files

* remove cmake files

* remove cmake files

* Readme

* img verification

* CMakeList

* number changed

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
Co-authored-by: Emin Ozturk <emin.ozturk@utah.edu>
2024-09-30 21:05:42 -06:00
Bartłomiej Kocot
de3e3b6424 [CK_TILE] Image to Column kernel (#1532)
* [CK_TILE] Image to Column kernel

* Fixes

* Vector loads and stores

* Fixes

* Fixes

* change test dir name
2024-09-27 22:57:38 +02:00
Dan Yao
9d69a099a4 [CK_TILE] Fix compiler related FA bwd issues (#1530)
* add barriers

* tail bias barriers

* adjust bf16/hd256 tol

* continue adjust bf16/hd256 tol
2024-09-26 12:18:39 -07:00
Illia Silin
42e6dceacc Fix compilation errors with Clang20.0. (#1533)
* fix clang20 compilation errors for gfx90a

* fix clang20 compilation errors for gfx11 targets
2024-09-25 13:45:38 -07:00
Illia Silin
65f8d1440f make CK CI use different git credentials (#1529) 2024-09-25 09:05:48 -07:00
dependabot[bot]
1c5a4d1b9f Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx (#1531)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.2/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.1...v1.8.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-25 06:59:25 -07:00
BrianHarrisonAMD
3528a523ff Add additional instances to device_mha_instance (#1522)
* Add additional instances to device_mha_instance

* Add comment to describe what receipt 3 option filters

---------

Co-authored-by: Po Yen Chen <PoYen.Chen@amd.com>
2024-09-24 10:15:30 -06:00
Illia Silin
f16ebf82d4 Add a daily CI build with legacy dockers. (#1525)
* add an option to build CK with legacy dockers

* change the custom docker settings

* add environment variable for custom docker

* use a new variable for legacy docker name

* new way to pass docker names for legacy OS

* add legacy docker check in the Build_CK function

* change groovy syntax

* add a check for legacy docker in getDockerImage

* make sure the legacy docker name is not empty

* remove the dumb-init call

* disable the tests in legacy OS dockers

* disable tests in legacy dockers

* use a different way to disable tests in legacy dockers

* rearrange the CI stages for legacy OS

* use different way to disable tests in legacy dockers

* update LD_LIBRARY_PATH for legacy dockers and add cron job

* update LD_LIBRARY_PATH at docker launch

* change the syntax for setting LD_LIBRARY_PATH
2024-09-23 09:03:55 -07:00
Po Yen Chen
770d2b7725 Early return if seqlen_k=0 on group mode (#1524) 2024-09-22 20:05:58 +08:00
Bartłomiej Kocot
4ba52b35dc Add support for NGCHW in grouped conv fwd (#1499)
* Support NGCHW in grouped conv fwd

* Remove not needed variable

* Fixes
2024-09-20 10:45:46 +02:00
Adam Osewski
0c39954da9 Remove unsupported (fp8) type from Add memory operation. (#1521)
The dynamic buffer doesn't support fp8 in the `Update` operation, so fp8 does not support `InMemoryDataOperation::Add`.
2024-09-20 09:40:45 +02:00
Thomas Ning
694c300145 Ck tile gemm padding dim (#1516)
* Support the N dimension padding

* Finished the padding feature for different dimension of K
2024-09-18 11:32:29 -07:00
dependabot[bot]
e84adec3ba Bump rocm-docs-core from 1.8.0 to 1.8.1 in /docs/sphinx (#1519)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.8.1/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.8.0...v1.8.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 07:00:26 -07:00
Illia Silin
1658c0dc11 Add rocm6.3_rc1 docker image (#1518)
* add image for rocm6.3_rc1

* fix deb package url
2024-09-17 15:59:26 -07:00
aledudek
a793afc961 Extend pool3d fwd avg, max operations by f8_t, int8_t types (#1483)
* Extend pool3d fwd avg, max operations by f8_t, int8_t types

* Pack MaxPool3dFwd params together

* Fix MaxPool3dFwd AVG instances

* Decrease verification precision for bf16

* Adjust tests + review changes

* Adjust threshold for F8

* Adjusted compute types for MAX op instances

* Fix ComputeDataType mismatch in tests and profiler for AVG

* Fix naming from max_pool3d_fwd to pool3d_fwd

* Adjust CMakeLists

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-17 15:57:10 +02:00
dependabot[bot]
8ec15e644e Bump rocm-docs-core from 1.7.2 to 1.8.0 in /docs/sphinx (#1517)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.7.2 to 1.8.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.2...v1.8.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 22:37:57 -07:00
Mateusz Ozga
6834e5ee74 This commit contains implementation of max pool2d for f8 type (#1506)
Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-16 10:15:06 +02:00
Thomas Ning
844f5a1712 Ck tile GPU verification sample develop & Add the CK TILE GEMM to the CI/CD test (#1505)
* Finished the feature of gpu verification

* Add the ck_tile_gemm test in the CI CD

* add the include of tensor_layout in reference_gemm

* Comment Addressed

* split ck_tile fmha and gemm tests into separate stages

* restructure the reference gemm

* restructure a new reference_gemm api that could read the device mem

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-09-14 21:08:40 +08:00
bibek
49e012dee1 Fix duplicate CMake tidy-target issue (#1513) 2024-09-13 21:15:04 -07:00
jakpiase
8f8a2ce396 Add pool2d int8 and fp8 instances (#1508)
* add pool2d fp8 and int8

* minor fixes

* add formatting

* add reviewer suggestions

* add reviewer suggestions
2024-09-13 10:18:21 -07:00
dependabot[bot]
a4982c3b86 Bump sphinxcontrib-bibtex from 2.6.2 to 2.6.3 in /docs/sphinx (#1511)
Bumps [sphinxcontrib-bibtex](https://github.com/mcmtroffaes/sphinxcontrib-bibtex) from 2.6.2 to 2.6.3.
- [Changelog](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/blob/develop/CHANGELOG.rst)
- [Commits](https://github.com/mcmtroffaes/sphinxcontrib-bibtex/compare/2.6.2...2.6.3)

---
updated-dependencies:
- dependency-name: sphinxcontrib-bibtex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 08:17:26 -07:00
Jun Liu
81bc1496b2 Customize filesystem in CK for legacy systems (#1509)
* Legacy support: customized filesystem

* Update cmakefile for python alternative path

* fix build issues

* CK has no boost dependency

* More fixes to issues found on legacy systems

* fix clang format issue

* Check if blob is correctly generated in cmake

* fix the python issues

* add a compiler flag for codegen when using alternative python

* use target_link_options instead of target_compile_options

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>
2024-09-13 07:51:07 -07:00
Illia Silin
e07f1108c0 make sure to rebuild compilers if they changed (#1504) 2024-09-12 07:49:55 -07:00
Mateusz Ozga
448c0f56d8 Pool2d max/avg kernel in the BWD version (#1494)
* Add pool2d instance BWD AVG

* Add pool2d instance BWD MAX

* Fix: avg review

* Fix review: part2

* Fix - enable test when type is compiled

* Fix review part3
2024-09-12 11:47:52 +02:00
jakpiase
e8d2887cb2 Rewrite pool2d fwd (#1462)
* added pool2d fwd

* add tests

* add reviewers changes

* Revert "Merge remote-tracking branch 'origin/develop' into jakpiase/pool2d_fwd_new"

This reverts commit 6b2ba7ff89, reversing
changes made to 22c82bea0c.

* Revert "add reviewers changes"

This reverts commit 22c82bea0c.

* added reviewers comments

* revert some old files

* add reviewers requests

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-11 15:21:00 +02:00
jakpiase
2a261afcdf Added structural sparsity blockwise gemm (#1435)
* Implemented smfmac xdlops

* Added smfmac blockwise xdlops

* fixes

* add reviewers suggestions

---------

Co-authored-by: Adam Osewski <19374865+aosewski@users.noreply.github.com>
2024-09-11 15:19:42 +02:00
Dan Yao
d09572e8c2 [CK_TILE] FA bwd repair (#1502)
* fix fa bwd

* revert kernelBlockSize in gemm_kernel.hpp
2024-09-10 10:45:32 -07:00
Thomas Ning
cf08df6b5e fix the unsupported scenario of Ali TestGemmUniversal (#1501) 2024-09-09 11:31:27 -07:00
Thomas Ning
caacd38830 Ck tile gemm example (#1488)
* Checkpoint: Finished with the tile example & kernel verification, working on the different matrix layout

* Finished the Matrix Layout feature set up. Note: Need to modify the inner block to solve the shuffle problem in the future.

* Fix: Clang Format, API fixed from fmha

* fix with better naming convention

* revert back the pipeline code of fmha

* Fixed: Addressed the comments and merge the GEMM shape of GEMM Operator and FMHA Operator to one.

* clang format with the reference_gemm file

* convert the clang format with the remod.py

* Changed the format and variable name of the kernel gemm_shape and partitioner

---------

Co-authored-by: thomasning <thomasning@banff-cyxtera-s70-4.ctr.dcgpu>
2024-09-07 16:23:32 +08:00
M.Emin Ozturk
8378855361 Modification to fix this issue "threadwise_tensor_slice_transfer_v5r1 issue #1279" (#1492)
* issue fix, one line changed for tmp

* clang

---------

Co-authored-by: Emin Ozturk <emin.ozturk@utah.edu>
Co-authored-by: Harisankar Sadasivan <135730918+hsadasiv@users.noreply.github.com>
2024-09-04 21:52:55 -07:00
Haocong WANG
5b10dae6a4 Add gemm universal bf16 instances (#1484)
* revert ckprofiler change

* temp save

* Add test and test pass

* test pass

* Fix bug inside rotating buffer when tensor is not packed

* bug fix

* clang format

---------

Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-09-04 20:58:54 -07:00
Rostyslav Geyyer
52410b49c7 Temporarily disable flaky test for all (#1495) 2024-09-04 07:36:57 -07:00
Illia Silin
8b95d9ad52 copy all fmha headers when building library (#1497)
* copy all fmha headers when building library

* fix the rocm_install call for mha headers
2024-09-04 07:36:41 -07:00
Illia Silin
841009c5ee Add an option to select an alternative python version during build. (#1496)
* locate a newer version of python when -DRHEL=ON flag is set

* allow setting python version on cmake command line
2024-09-04 07:36:27 -07:00
Bartłomiej Kocot
73b67f290f Add support for NGCHW in grouped conv bwd wei (#1491)
* Add support for NGCHW in grouped conv bwd wei

* Comments fixes

* navi fixes

* Update function names
2024-09-03 10:52:03 +02:00
Bartłomiej Kocot
a9b170b541 Revert "Revert "Revert Revert Support access per groups and filter2x3 in grouped conv fwd (#1382) (#1406) (#1415)" (#1455)" (#1490)
This reverts commit 5ff8eeebf9.
2024-09-02 10:39:49 +02:00
Dan Yao
b8addae293 [CK_TILE] float -> bf16 inline asm rtn (#1482)
* asm rtn

* add asm rtn macro

* reorder macro

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-08-30 15:38:09 +08:00
Po Yen Chen
461ec98d78 Enable scratch memory workaround on ROCm 6.2 (#1486)
Co-authored-by: carlushuang <carlus.huang@amd.com>
2024-08-30 10:40:00 +08:00
Po Yen Chen
8107ee6270 Add missing function and parameters (#1493) 2024-08-29 18:55:44 +08:00
Po Yen Chen
c156989298 [CK_TILE] Add PagedAttention kernels (#1387)
* Use dictionary to config all the functions

* Add init codegen logic for fmha fwd appendkv

* Call HIP_CHECK_ERROR() macro to get real source info

* Setup meaningful arguments

* Sync kernel name with the codegen

* Add knew/vnew tensors to the kernel argument

* Fix wrong K values after appending

* Fix vnew append error

* Extract common logics

* Fix Vnew tile dstr for row major case

* Conditionally add fwd_splitkv API in fmha_fwd example

* Conditionally add call to fmha_fwd_splitkv()

* Remove "EXAMPLE_" prefix of cmake variables

* Register API handlers automatically

* Early return if 0 < s_k_new is not supported

* Show message if we are ignoring option

* Unify CMakeLists.txt coding style

* Set num_splits=1 if split-kv is not supported

* Add length/stride getters for HostTensor

* Add RoPE example utilities

* Add reference_rotary_position_embedding() (not implemented)

* Finish reference_rotary_position_embedding() impl

* Fix typo of HostTensor<>::get_length()

* Fix compilation errors

* Fix wrong answer when interleaved=false

* Fix wrong answer when interleaved=true

* Append K/V in the host verification code

* Simplify K appending logics

* Simplify v_host_ref definition

* Reduce input/output dimensions

* Rename function: add "batched" prefix

* Apply RoPE on host side

* Rename RoPE utility function

* Fix wrong tensor size

* Avoid invoking deprecated method 'find_module'

* Pass RoPE kernel args

* Create Rotary Cos/Sin tile windows in kernel

* Add compute data type alias for RoPE

* Randomly generate seqlen_knew if needed

* Fix seqlen_knew enabling check logic

* Add minimum seqlen_k to generate compliance kvcache

* Fix compilation error in debug mode

* Fix wrong boundaries

* Fix wrong seqlen_k for kvcache

* Rename variables used in distribution encoding

* Fix rotary cos/sin tensor/tile size

* Add constraint to the rotary_dim option

* Remove unused inner namespace

* Add dram distribution for rotary_cos/rotary_sin (interleaved)

* Only apply interleaved RoPE on Knew for now

* Fix wrong thread starting offset

* Instantiate multiple kernels for RoPE approaches

* Clean-up pipeline

* Fix error in RoPE host reference

* Handle RoPE half-rotated logics

* Support 8x rotary_dim under half-rotated RoPE

* Add comment

* Apply elementwise function to the loaded tiles

* Unify parameter/variable naming style

* Remove constness from q_ptr

* Add code blocks for q_tile

* Apply RoPE to q_tile

* Remove debug print code in kernel

* Fix wrong knew/vnew appending positions

* Use better naming for tile indices

* Add make_tile_window() for adding distribution only

* Skip code if # of block is more than needed

* Move thread locating logics into policy

* Remove always true static_assert()

* Rename header

* Rename RotaryEmbeddingEnum

* Extract rotary embedding logic out

* Re-order parameters

* Align naming of some tile size constants

* Rename more tile size constants

* Fix wrong grid size

* Fix wrong shape of knew_host/vnew_host

* Fix wrong index into knew_host/vnew_host

* Fix wrong rotary_cos/rotary_sin memory size for Q

* Extract Q/Knew vector size to helper methods

* Use different rotary_cos/rotary_sin distr for Q/Knew

* Update host/device specifiers

* Fix wrong data type for Q rotary_cos/rotary_sin

* Remove RoPEComputeDataType type alias

* Shift rotary_cos/rotary_sin by cache_seqlen_k

* Add comment for why I just use 't' for all padding flags

* Align commit message to the real comment

* Fix wrong pipeline

* Rename utility function

* Disable host verification if API not exist

* Fix wrong rope key for fp8 pipeline

* Allow only apply RoPE on Q (without append KV)

* Add append-kv smoke tests

* Remove debug statements

* Remove more debug statements

* Re-arrange the 'set +x' command

* Remove no-longer used method in pipeline

* Add missing init code

* Refine pipeline padding settings

* Enlarge rotary_dim limit (8 -> 16)

* Enlarge KPerThread for rotary_interleaved=false

* Update rotary_dim range in smoke_test_fwd.sh

* Add template argument 'kIsPagedKV' for splitkv kernels

* Launch splitkv kernel if given page_block_size

* Fix wrong kernel name

* Fix seqlen_k_min for pre-fill case (1 -> 0)

* Add copy_const<> type trait

* Add another make_tile_window()

* Introduce 'TileWindowNavigator' types

* Simplify TileWindowNavigator interfaces

* Fix tile window navigation bugs

* Disable calling fmha_fwd()

* Remove unnecessary data members

* Simplify more make_tile_window() overloads

* Move V tile through TileWindowNavigator

* Fix uneven split checking logic

* Move code after decide seqlen_q/seqlen_k

* Make sure we always start reading complete tile

* Use 128 as minimum page_block_size

* Fix wrong origin for bias

* Add batch_stride_k/batch_stride_v in group mode

* Unify origin

* Add missing kernel arguments for group mode

* Add paged-kv codegen logic for appendkv kernels

* Add block_table kernel args for appendkv kernel

* Add tile navigators to the appendkv kernel

* Fix wrong tensor descriptor lengths

* Pass re-created tile window to pipeline

* Fix wrong strides for appendkv kernel

* Allow transit tile_window to another page-block

* Handle cross-page-block write

* Do not perform write again if already in last page-block

* Always add fmha_fwd() api

* Add missing group mode argument

* Remove debug macro usages

* Rename option s_k_new to s_knew

* Separate splitkv/non-splitkv args/traits

* Remove fmha_fwd_dispatch()

* Fix compilation errors

* Remove dropout code in splitkv kernel

* Allow problem types without define kHasDropout attr

* Use generic lambda to init traits objects

* Separate more non-splitkv & splitkv traits/args

* Display more info for specific kernels

* Show more detailed warning message

* Rename 'max_num_blocks' to 'max_num_page_blocks'

* Remove no-longer used pipeline files

* Wrap code by #if directives

* Move functors to the beginning of validation code

* Use generic lambda to init all the api traits/args

* Fix wrong seqlen for kvcache

* Add missing comment

* Rename TileWindowNavigator to PageBlockNavigator

* Only expose necessary methods (not attributes)

* Re-order pipeline parameters

* Refine smoke_test_fwd.sh

* Fix wrong argument count

* Make tile window directly via PageBlockNavigator

* Remove unused template parameter

* Remove group mode from appendkv kernel

* Fix skcheck logic

* Fix wrong syntax in skcheck expr

* Use meaningful options in smoke test

* Remove options

* Fix formatting

* Fix more format

* Re-organize bash functions

* Pass cache_batch_idx to kernels

* Support cache_batch_idx in example

* Fix compilation error

* Add more appendkv test

* Add more case for appendkv

* Fix nonexistent attribute

* Remove 0 < seqlen_knew constraint

* Clarify the case in warning message

* Remove macro checking

* Force batch mode when invoking appendkv & splitkv apis

* Fix mode overriding logics

* Fix wrong parameter name

* Randomize seqlen_k if use kvcache

* Use randomized seqlen_k for kvcache

* Avoid using too small rotary_cos & rotary_sin

* Rename parameter

* Add seqlen_q & seqlen_k rules

* Add comment

* Add more comments

* Fix compilation errors

* Fix typo in comment

* Remove type argument

* Avoid seqlen_k=0 for kvcache

* Revert "Avoid seqlen_k=0 for kvcache"

This reverts commit 21c4df89e4.

* Fix wrong uneven split checking logics

* Only randomize kvcache seqlen_k if 1 < batch

* Return earlier if split is empty

* Revert "Only randomize kvcache seqlen_k if 1 < batch"

This reverts commit b9a4ab0d7e.

* Re-order seqlen_k_start adjustment logics

* Fix compilation errors

* Re-format script

* Find executable from folder automatically

* Fix kvcache seqlen_k generating logic

* Make comment more clear

* Fix wrong knew/vnew appending logic on host

* Add s_barrier to sync threads

* Revert "Add s_barrier to sync threads"

This reverts commit d3f550f30c.

* Support only using 1 row of rotary_cos/rotary_sin

* Rotate Q in different way

* Unify tensor view creation logics

* Fix wrong argument

* Add mask to switch how we use the rotary_cos/sin

* Move attr from traits to problem

* Move has_mask to fmha_fwd_appendkv_args

* Support use uint32_t as SAD operand in Alibi<>

* Use sad_u32() in splitkv kernels

* Store tensor views in PageBlockNavigator

* Use stored tensor view to update tile windows

* Enlarge tensor view size

* Remove debug code

* Fix wrong tensor view size

* Wrap tensor view into PageBlockNavigator

* Add DataType member to PageBlockNavigator

* Remove unnecessary member functions

* Refine macro use

* Fix typo

* Add blank line between directives and actual code

* Re-format files

* Remove type in comment

---------

Co-authored-by: carlushuang <carlus.huang@amd.com>
Co-authored-by: rocking <ChunYu.Lai@amd.com>
2024-08-28 20:50:43 +08:00
Illia Silin
19d22e60c1 Enable daily ninja build traces. (#1487)
* add ninja trace to CI builds

* fix ninja trace logic

* update the ninja trace logic in jenkins file

* limit the number of threads to run ninja build

* use ninja for installation after build

* update the path to ninjatracing tool

* use ninja to run check when using build trace

* fix jenkins logic

* fix typos

* set proper setup_args for all stages

* fix ninja syntax

* replace ninja check with ninja test

* enable ninja tracing with mainline and staging compilers
2024-08-26 13:27:01 -07:00
Illia Silin
25935b57a0 fix codegen rtc lib build issue (#1485) 2024-08-23 15:11:47 -07:00
Illia Silin
0056e0bf4b disable bad fp8 test on gfx12 (#1481) 2024-08-22 15:05:20 -07:00
Illia Silin
d3fa00f14c disable codegen tests when INSTANCES_ONLY is set (#1480) 2024-08-22 09:50:17 -07:00
arai713
967b1f0fda Codegen INSTANCES_ONLY build (#1468)
* initial push - altering codegen build

* fix the codegen cmake

* enable codegen build for gfx908 and gfx90a

* enable building codegen with INSTANCES_ONLY=ON

* updating ck_rtc

* remove gpu targets for codegen and rename tests

* make codegen tests dependencies of tests and check targets

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-08-22 07:24:55 -07:00
dependabot[bot]
0d9bf9f154 Bump rocm-docs-core from 1.7.1 to 1.7.2 in /docs/sphinx (#1479)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.7.1 to 1.7.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.7.1...v1.7.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-21 22:40:49 -07:00
Illia Silin
1925b322eb fix the build errors with clang20 (#1478) 2024-08-21 21:29:48 -07:00