composable_kernel/cmake
John Shumway 3a0cb27966 Shard several of the most costly targets. (#2266)
* Shard several of the most costly targets.

Introduces a filter_tuple_by_modulo to break up tuples.

Drops the build time of the target below from 21 minutes to under 14
minutes with 64 build processes, or 11 minutes with 128 build processes.

time ninja -j 64 device_grouped_conv3d_fwd_instance

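The sharding idea above can be illustrated with a minimal sketch. It assumes the filter keeps the tuple elements whose index i satisfies i % NumShards == ShardIndex, so each shard can be instantiated in its own translation unit. The name filter_tuple_by_modulo_sketch, its template parameters, and the use of std::tuple are assumptions made for illustration; the actual filter_tuple_by_modulo in this PR may differ in signature and tuple type.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical helper (not the PR's code): picks elements at original
// indices ShardIndex, ShardIndex + NumShards, ShardIndex + 2*NumShards, ...
template <std::size_t NumShards, std::size_t ShardIndex, typename Tuple, std::size_t... Is>
constexpr auto filter_by_modulo_impl(const Tuple& t, std::index_sequence<Is...>)
{
    return std::make_tuple(std::get<ShardIndex + Is * NumShards>(t)...);
}

// Hypothetical sketch: returns the sub-tuple of elements whose index i
// satisfies i % NumShards == ShardIndex.
template <std::size_t NumShards, std::size_t ShardIndex, typename... Ts>
constexpr auto filter_tuple_by_modulo_sketch(const std::tuple<Ts...>& t)
{
    static_assert(NumShards > 0, "need at least one shard");
    static_assert(ShardIndex < NumShards, "shard index out of range");

    constexpr std::size_t n = sizeof...(Ts);
    // Number of indices in [0, n) that are congruent to ShardIndex.
    constexpr std::size_t count =
        (n > ShardIndex) ? (n - ShardIndex + NumShards - 1) / NumShards : 0;

    return filter_by_modulo_impl<NumShards, ShardIndex>(t, std::make_index_sequence<count>{});
}
```

With a five-element tuple and two shards, shard 0 would hold the elements at indices 0, 2, 4 and shard 1 the elements at indices 1, 3. Compiling each shard separately is what lets additional build processes work on the same target in parallel.
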
* fix clang format

* Fix build errors in instantiation code.

I wasn't sure how to test the header-only instantiation code on my
initial commit. From Jenkins CI test results, I see that there is a
test target that depends on these headers:

ninja -j 128 test_grouped_convnd_fwd

This allowed me to test the build locally. I found three mistakes I
made, mostly related to early experiments I tried on the code.
These were hard to find earlier because this PR is really too large.

I also discovered that there are five 2D convolution targets that now
dominate the compilation time. I will likely address those in a later
PR, rather than adding even more changes to this PR.

* Fix link errors from mismatched declarations.

Our pattern for instantiating MIOpen templates uses duplicate
declarations (instead of headers). This is fragile, and I didn't
notice that my last commit had a bunch of link errors. I fixed these
mistakes, and the bin/test_grouped_conv_fwd test target binary now links
correctly.

* Migrate the design to a code-generation approach.

Use a CMake function with template files to generate the source files that
instantiate the kernels and to generate the calling function.

* Shard the longest 2D convolution builds

Now that we have automated the shard instantiation, we can shard the 2D
convolution targets that take the longest to build. The target
test_grouped_conv2d_fwd now compiles in 15 minutes.

* Use PROJECT_SOURCE_DIR for submodule compatibility

I used CMAKE_SOURCE_DIR to refer to the top-level source directory in
the ShardInstantiation.cmake file, but this can cause issues with
git submodules.  Instead, we should use PROJECT_SOURCE_DIR to ensure
compatibility when this project is used as a submodule in another
project.

---------

Co-authored-by: illsilin <Illia.Silin@amd.com>
2025-06-13 03:58:50 -07:00