Files
composable_kernel/codegen/include/ck/host/stringutils.hpp
arai713 3e9711f0cb CK Instance Gen (#1145)
* Format

* Format

* Format

* Remove const

* Use the right template

* Format

* Format

* add row/col instances

* Add missing file

* fixed

* fixing block to etile error

* Format

* Updates

* Format

* fixed rrr layout

* generating a sample JSON file: currently contains includes, prologue/epilogue and instances

* version where the json is passed into the instances to generate a key

* updated run function to just launch kernel

* updated run function: only contains kernel object, json file is updated but still needs to be cleaned up, added front-end API to parse JSON into character buffer

* adding in testing files

* cleaned up comments, still need to work on including header files

* removed unneeded files

* removed/commented out JSON implementation

* added fusion(prologue/epilogue) into instance generation

* working on instance selection

* added instance selection, need to fix instance validation

* removed block2etile map validity check for testing purposes

* test running: failing due to incorrect files/input

* all grid descs/ptrs completed, but device file not found

* Update test and embed modules

* Restore older version

* added convolution operation, written test, debugging generated code for compilation

* attempting to include CK in host directory: _Float16 error

* CK header file issues

* slight fix

* don't crash when hip can't report total memory

* dump generated code to a file

* changing sizes

* creating tensor descriptors using CK methods: set up grid desc manually, also trying to set up an argument pointer - this needs to be fixed

* some fixes to call the device code

* separating test files for conv and gemm

* completed arg ptr, now have linking errors

* clang format fix

* resolved linker issues in conv test

* remove dependency on libutility from ck

* resolved num dim error

* properly passing arg ptr, errors with passing typenames: redefinition/redeclaration

* undo the commenting of device function

* hand created kernel code to find rtc issues

* dump the full src to file

* resolved redeclaration errors, cleaned up errors for Amber's kernel code

* debugging purposes: redeclaration error

* config files

* resolved errors for NumTensor and redeclaration, formatted version.h

* resolved most errors in manually added kernel and my own. error with calling kernel object: overloaded function type

* WIP: close to getting kernel compiled

* WIP: fixing rtc errors

* fixed sequence errors, formatting, still one error with run fcn

* yay: kernel compiles and runs

* updated templated/generated version to run and compile

* minor fixes

* working generated example, resolved memory access error due to padding

* adding in reference kernel, validation failing against reference

* debugging: printing kernel argsz

* reduced error in results

* debugged reference kernel and output errors, added to generated version, currently debugging prologue function issues

* working validation (using reference convolution) with prologue function for both hard-coded and generated version

* WIP: create an alt version that creates Argument on the device

* wip: added new duplicate files, fixed fusion templating errors from working example, setting up kernel arguments

* wip: making necessary methods device code

* added grid descs, working on grid pointers, errors with stl numerics

* wip: updating kernel args - issue, replacing some std functions

* replaced std::accumulate call with temp hardcoded version

* wip: args causing memory issue

* Construct Argument object inside the kernel and use it to call convolution device function. Code runs and verification passes

* adding object file dump

* temporary hardcoding of grid size, can remove device op inst + arg ptr

* minor fix for grid size

* added modified example where arg ptr is created on the device for generated version as well

* removed device op instance and arg ptr from modified examples

* moving device op file for testing purposes and to properly build CK

* commenting out print-outs

* adjust compiler args to produce a valid ELF file

* temporary removal of validation

* reverting compiler args back for working example

* retrieve necessary arguments from generated template parameters in correct format

* calculating grid size on host-side, still need to clean up process, pass parameters to host functions properly

* scaled up factory functions/wrapper structs to implement host-side launch parameter calculations using CK host side functions - in hard-coded example

* temporary change to generate ELF format binary object file

* removed unecessary code, added comments

* formatting fix

* cleaned up code, added new tests, restructured library: move helper into CK

* refactored launch parameter calculation to be more concise

* renamed files and variables for more clarity/uniformity

* more code cleaning, removed debug statements

* moved majority of my files into codegen directory, running properly

* updated Embed.cmake(string_view) in codegen directory

* updated host directory to match Embed.cmake as well

* added old tests in

* updated instance generation methods to be more concise

* removed layout from launch parameter calculation

* working test

* fixed issue with verification, all instances working

* updated verification in other tests

* removed duplicate matrix padder file, removed code dumps

* removed old hard-coded tests

* removed old host directory, all files in codegen directory now

* fixed copyright in files

* commenting out validation

* renamed files

* made changes for review: fixed copyright, renamed files for clarity, removed comments, refactored code

* updated headers

* removing duplicate file for fwd conv to gemm, merging with original file

* fix building codegen with clang++ directly

* resolving build error from conv_fwd_to_gemm

* fix for previous error

* renaming tests

* created common test file

* cleaned up code, added comments

* renamed device op

* fixed typos in comments

* removed extra space

* code cleanup: resolving Amber's comments

* removed wrapper struct for matrix padder, fixed template

* cleaned up if statements for better readability

---------

Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: Jing Zhang <jizha@amd.com>
Co-authored-by: M. Amber Hassaan <amber_474@yahoo.com>
Co-authored-by: illsilin <Illia.Silin@amd.com>
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
2024-06-25 16:37:35 -05:00

105 lines
3.2 KiB
C++

// SPDX-License-Identifier: MIT
// Copyright (c) 2018-2024, Advanced Micro Devices, Inc. All rights reserved.
#pragma once
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <unordered_map>
#include <vector>
namespace ck {
namespace host {
template <class F>
std::string trim(const std::string& s, F f)
{
auto start = std::find_if_not(s.begin(), s.end(), f);
auto last = std::find_if_not(s.rbegin(), std::string::const_reverse_iterator(start), f).base();
return {start, last};
}
inline std::string trim(const std::string& s)
{
return trim(s, [](unsigned char c) { return std::isspace(c); });
}
template <class Strings>
inline std::string JoinStrings(Strings strings, const std::string& delim)
{
auto it = strings.begin();
if(it == strings.end())
return "";
auto nit = std::next(it);
return std::accumulate(nit, strings.end(), *it, [&](std::string x, std::string y) {
return std::move(x) + delim + std::move(y);
});
}
template <class F>
inline std::string
InterpolateString(const std::string& input, F f, std::string start = "${", std::string end = "}")
{
std::string result = "";
result.reserve(input.size());
auto it = input.begin();
while(it != input.end())
{
auto next_start = std::search(it, input.end(), start.begin(), start.end());
auto next_end = std::search(next_start, input.end(), end.begin(), end.end());
result.append(it, next_start);
if(next_start == input.end())
break;
if(next_end == input.end())
{
throw std::runtime_error("Unbalanced brackets");
}
auto r = f(next_start + start.size(), next_end);
result.append(r.begin(), r.end());
it = next_end + end.size();
}
return result;
}
inline std::string InterpolateString(const std::string& input,
const std::unordered_map<std::string, std::string>& vars,
std::string start = "${",
std::string end = "}")
{
return InterpolateString(
input,
[&](auto start_it, auto last_it) {
auto key = trim({start_it, last_it});
auto it = vars.find(key);
if(it == vars.end())
throw std::runtime_error("Unknown key: " + key);
return it->second;
},
std::move(start),
std::move(end));
}
template <class Range, class F>
inline auto Transform(const Range& r, F f) -> std::vector<decltype(f(*r.begin()))>
{
std::vector<decltype(f(*r.begin()))> result;
std::transform(r.begin(), r.end(), std::back_inserter(result), f);
return result;
}
template <class Range1, class Range2, class F>
inline auto Transform(const Range1& r1, const Range2& r2, F f)
-> std::vector<decltype(f(*r1.begin(), *r2.begin()))>
{
std::vector<decltype(f(*r1.begin(), *r2.begin()))> result;
assert(std::distance(r1.begin(), r1.end()) == std::distance(r2.begin(), r2.end()));
std::transform(r1.begin(), r1.end(), r2.begin(), std::back_inserter(result), f);
return result;
}
} // namespace host
} // namespace ck