[rocm-libraries] ROCm/rocm-libraries#5977 (commit 794bea7)

[CK_TILE] Fix Windows build in FMHA head grouping

## Motivation

This is a follow-up fix for [PR
#5018](https://github.com/ROCm/rocm-libraries/pull/5018).

[PR #5018](https://github.com/ROCm/rocm-libraries/pull/5018) added
LLC-aware FMHA head grouping and head-major scheduling on RDNA, but it
also introduced Linux-only code, including an unconditional
`#include <dirent.h>`, which breaks Windows builds. This change guards
the Linux-specific LLC probing logic so that non-Linux platforms build
again.

## Technical Details

- Guard `<dirent.h>` with `#ifdef __linux__`
- Guard KFD sysfs traversal logic with `#if defined(__linux__)`
- On non-Linux platforms, return `0` from
`get_kfd_sysfs_llc_cache_bytes()`
- Preserve existing fallback behavior through:
  - `CK_TILE_FMHA_LLC_CACHE_MB`
  - arch-based default LLC sizes
  - no head grouping when no LLC size can be resolved
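
The guard-and-stub pattern and the fallback chain listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `probe_llc_bytes`, `llc_bytes_from_env`, `default_llc_bytes_for_arch`, and `resolve_llc_bytes` are hypothetical names, the sysfs path and the `"example_arch"` table entry are placeholders, and the precedence (environment override, then sysfs probe, then arch default) is an assumption, since the description does not state the order.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#ifdef __linux__
#include <dirent.h> // POSIX-only header, so it is included on Linux only
#endif

#if defined(__linux__)
// Hypothetical stand-in for the sysfs probe; reports 0 when the directory is missing.
inline std::size_t probe_llc_bytes(const char* sysfs_dir)
{
    DIR* dir = opendir(sysfs_dir);
    if(dir == nullptr)
        return 0;
    // A real probe would walk the entries with readdir() and parse cache sizes.
    closedir(dir);
    return 0;
}
#else
// Non-Linux stub: always "unknown", so callers fall through to the other sources.
inline std::size_t probe_llc_bytes(const char*) { return 0; }
#endif

// Reads CK_TILE_FMHA_LLC_CACHE_MB as a whole number of MiB; returns 0 if unset or invalid.
inline std::size_t llc_bytes_from_env()
{
    const char* value = std::getenv("CK_TILE_FMHA_LLC_CACHE_MB");
    if(value == nullptr || *value < '0' || *value > '9')
        return 0;
    char* end = nullptr;
    const unsigned long long mb = std::strtoull(value, &end, 10);
    if(*end != '\0')
        return 0;
    return static_cast<std::size_t>(mb) * 1024u * 1024u;
}

// Placeholder table; the real per-arch sizes are not given in this description.
inline std::size_t default_llc_bytes_for_arch(const std::string& arch)
{
    if(arch == "example_arch")
        return std::size_t{32} * 1024u * 1024u;
    return 0;
}

// Assumed precedence: env override, then sysfs probe, then arch default.
// A result of 0 means no LLC size was resolved, so head grouping stays off.
inline std::size_t resolve_llc_bytes(const std::string& arch)
{
    if(const std::size_t bytes = llc_bytes_from_env())
        return bytes;
    if(const std::size_t bytes = probe_llc_bytes("/sys/class/kfd/kfd/topology/nodes"))
        return bytes;
    return default_llc_bytes_for_arch(arch);
}
```

Because the non-Linux stub returns `0` rather than failing, callers need no platform checks of their own: an unresolved probe simply falls through to the remaining sources.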

## Submission Checklist

- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Author: Hosang Yoon
Date: 2026-03-30 14:19:19 +00:00
Committed by: assistant-librarian[bot]
Parent: 7968368d92
Commit: 2dcae9d173

@@ -8,13 +8,16 @@
 #include <algorithm>
 #include <cctype>
 #include <cstdio>
-#include <dirent.h>
 #include <fstream>
 #include <iostream>
 #include <limits>
 #include <optional>
 #include <string>
 
+#ifdef __linux__
+#include <dirent.h>
+#endif
+
 #ifndef CK_TILE_FMHA_ENABLE_HEAD_GROUPING
 #define CK_TILE_FMHA_ENABLE_HEAD_GROUPING 1
 #endif
@@ -70,6 +73,8 @@ inline std::optional<long long> read_property_value(const std::string& filepath,
     return std::nullopt;
 }
 
+#if defined(__linux__)
+
 struct kfd_device_location
 {
     int domain = 0;
@@ -176,6 +181,12 @@ inline size_t get_kfd_sysfs_llc_cache_bytes()
     return read_kfd_node_l3_bytes(*node);
 }
 
+#else
+
+inline size_t get_kfd_sysfs_llc_cache_bytes() { return 0; }
+
+#endif
+
 inline size_t get_default_llc_cache_bytes_for_arch(const std::string& arch);
 
 inline size_t resolve_llc_cache_bytes_uncached(const std::string& arch)