Add ck-rocprof: GPU profiling tool for rocprof-compute (#3627)

* Decouple configure/build/test tools from Docker

Create a two-layer tool architecture:
- Core tools (ck-configure, ck-build, ck-test): Environment-agnostic,
  work on any system with ROCm - no Docker dependency
- Container tools (ck-docker): Manage Docker containers and delegate
  to core tools via docker exec

Changes:
- Add ck-configure: New CMake configuration tool with preset support,
  native GPU detection, and flexible options
- Refactor ck-build: Remove Docker dependency, add --configure and
  --list options, call ninja directly
- Refactor ck-test: Remove Docker dependency, add CTest integration
  with --smoke/--regression/--all options
- Enhance common.sh: Add native GPU detection, build directory utils,
  and output helpers
- Update ck-docker: Add configure/build/test/exec commands that
  delegate to core tools inside container

This enables:
- Native development on ROCm hosts without Docker
- Simpler CI/CD integration
- Consistent behavior inside and outside containers

Co-Authored-By: Claude <noreply@anthropic.com>

* Add ck-rocprof: GPU profiling tool for rocprof-compute

Adds a command-line profiling tool to simplify GPU performance
analysis workflow using AMD rocprof-compute.

Features:
- Easy setup with automatic Python venv configuration
- Simple CLI: setup, run, analyze, compare, list
- Automatic GPU architecture detection
- Focus on LDS metrics (Block 12) for bank conflict analysis
- Comprehensive documentation with examples and troubleshooting

Usage:
  ck-rocprof setup                    # One-time environment setup
  ck-rocprof run <name> <executable>  # Profile executable
  ck-rocprof analyze <name> [block]   # Analyze metrics
  ck-rocprof compare <name1> <name2>  # Compare two runs
  ck-rocprof list                     # List available runs

* Make ck-rocprof documentation concise and improve Docker integration

- Streamlined documentation from 416 to 157 lines (62% reduction)
- Focused on essential commands, metrics, and workflows
- Enhanced script to run all operations inside Docker containers
- Fixed workload directory path and improved container management
- Added automatic rocprofiler-compute installation and dependency handling

* Add --no-roof flag to ck-rocprof profile command

Skip roofline analysis by default to speed up profiling. Roofline
analysis can add significant time to profiling runs but is not
needed for most LDS bank conflict analysis workflows.

* Make ck-rocprof work independently of Docker

Add native execution mode that runs rocprof-compute directly on the host
system when available, falling back to Docker mode when not.

Key changes:
- Auto-detect native mode when rocprof-compute is in PATH or common locations
- Add execution mode wrappers (exec_cmd, file_exists, dir_exists, etc.)
- Native mode stores venv at .ck-rocprof-venv in project root
- Native mode stores workloads at build/workloads/
- Support user-installed rocprofiler-compute (e.g., ~/.local/rocprofiler-compute)
- Add CK_FORCE_DOCKER env var to force Docker mode
- Update help message to show current execution mode
- Maintain full backward compatibility with existing Docker workflow

Tested successfully with rocprofiler-compute 3.4.0 installed from source
on MI300X GPU in native mode.
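The lookup order described above can be sketched as follows. This is a minimal, illustrative version (the real ROCPROF_CANDIDATES list in the diff below covers more ROCm versions):

```shell
#!/bin/bash
# Sketch of the native-mode auto-detection order:
# 1) explicit CK_ROCPROF_BIN override, 2) PATH, 3) common install locations.
find_rocprof() {
    if [ -n "${CK_ROCPROF_BIN:-}" ] && [ -x "${CK_ROCPROF_BIN}" ]; then
        echo "${CK_ROCPROF_BIN}"
        return 0
    fi
    if command -v rocprof-compute >/dev/null 2>&1; then
        command -v rocprof-compute
        return 0
    fi
    local candidate
    for candidate in "${HOME}/.local/rocprofiler-compute/3.4.0/bin/rocprof-compute" \
                     /opt/rocm/bin/rocprof-compute; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```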

Co-Authored-By: Claude <noreply@anthropic.com>

* Add clean/status commands and improve ck-rocprof robustness

- Add 'clean' command to remove profiling runs (supports --all)
- Add 'status' command to show configuration and environment info
- Add workload name validation to prevent path traversal attacks
- Fix uv installation to use pip instead of curl for reliability
- Add cross-platform stat support for macOS compatibility
- Consolidate ROCPROF_CANDIDATES to avoid code duplication
- Expand help documentation with all profiling block descriptions
- Fix Docker wrapper script escaping issues
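The workload name validation amounts to a strict whitelist regex; a name passes only if it is purely alphanumeric, hyphens, and underscores, which already excludes "/", "..", and whitespace. A minimal sketch:

```shell
# Reject any workload name that could traverse paths or break shell quoting.
# The whitelist regex alone excludes ".", "..", "/", and spaces.
validate_workload_name() {
    local name="$1"
    [[ "$name" =~ ^[a-zA-Z0-9_-]+$ ]]
}
```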

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix analyze command to use correct workload path

rocprof-compute stores results directly in the workload directory
(pmc_perf.csv) rather than in a GPU architecture subdirectory.
Updated find_workload_path to detect this correctly.
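A minimal sketch of the detection logic, assuming the layout described above (the in-tree find_workload_path may differ in detail): check the workload directory itself first, then fall back to a per-architecture subdirectory.

```shell
# Locate profiling results: pmc_perf.csv sits directly in the workload dir,
# with an arch subdirectory (e.g. gfx950/) as a legacy fallback.
find_workload_path() {
    local dir="$1"
    if [ -f "${dir}/pmc_perf.csv" ]; then
        echo "${dir}"
        return 0
    fi
    local sub
    for sub in "${dir}"/*/; do
        if [ -f "${sub}pmc_perf.csv" ]; then
            echo "${sub%/}"
            return 0
        fi
    done
    return 1
}
```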

Co-Authored-By: Claude <noreply@anthropic.com>

* Address PR review security and robustness issues

Security fixes:
- Escape executable path in cmd_run to prevent shell injection
- Add workload name validation to cmd_analyze and cmd_compare
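cmd_run itself is not shown in this excerpt; the escaping technique is printf %q, roughly:

```shell
# Hypothetical sketch: escape a user-supplied executable path with printf %q
# before embedding it in a shell command string, so metacharacters cannot
# inject extra commands.
exe='./bin/app; rm -rf /tmp/x'    # hostile input, for illustration
safe=$(printf '%q' "$exe")        # backslash-escaped into a single word
cmd="cd /workspace/build && ${safe}"
echo "$cmd"                       # the ';' is now literal, not a separator
```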

Robustness improvements:
- Add error checking for uv package manager installation
- Use consistent project root detection (find_project_root || get_project_root)
- Use /opt/rocm instead of hardcoded /opt/rocm-7.0.1 in Docker mode
- Derive ROCM_REQUIREMENTS path from ROCPROF_BIN for flexibility
- Use gfx950 as fallback GPU consistent with common.sh

Documentation updates:
- Fix env var name GPU_TARGET -> CK_GPU_TARGET
- Update storage layout to reflect current structure (workloads/<name>/)
- Document clean and status commands
- Clarify native vs Docker default paths

Co-Authored-By: Claude <noreply@anthropic.com>

* Simplify ck-rocprof to native-only mode

Remove Docker mode from ck-rocprof. Docker users should run the tool
via `ck-docker exec ck-rocprof ...` instead.

This simplification:
- Removes ~210 lines of Docker-specific code
- Eliminates mode detection complexity
- Makes the script easier to maintain
- Provides clearer error messages when rocprof-compute is not found

The setup command now lists all searched locations when rocprof-compute
is not found, helping users understand how to install it.

Co-Authored-By: Claude <noreply@anthropic.com>

* Add rocprofiler-compute source installation fallback

When rocprof-compute is not found in system locations, automatically
install rocprofiler-compute 3.4.0 from source as a fallback. This
eliminates the hard dependency on system ROCm packages.

Implementation details:
- Clone rocprofiler-compute from GitHub to ~/.local/
- Install dependencies via requirements.txt (not editable install)
- Create wrapper that sets PYTHONPATH to source directory
- Execute source script directly rather than importing as module

This approach matches the project's development workflow and works
around the incomplete pyproject.toml that prevents editable installs.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Author: Max Podkorytov
Date: 2026-01-29 17:20:22 -08:00 (committed by GitHub)
Parent: 05ef93a69d
Commit: 83b6155354
7 changed files with 1528 additions and 275 deletions


@@ -2,7 +2,8 @@
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# CK Build - Build Composable Kernel targets in Docker
# CK Build - Build Composable Kernel targets
# Environment-agnostic: works natively on ROCm hosts or inside containers
set -e
set -o pipefail
@@ -12,46 +13,51 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/common.sh"
# Initialize configuration
PROJECT_ROOT=$(get_project_root "${SCRIPT_DIR}")
CONTAINER_NAME=$(get_container_name "${PROJECT_ROOT}")
PROJECT_ROOT=$(find_project_root "${SCRIPT_DIR}" || get_project_root "${SCRIPT_DIR}")
BUILD_DIR=$(get_build_dir "${PROJECT_ROOT}")
# Help message
show_help() {
cat << EOF
CK Build - Build Composable Kernel targets in Docker
CK Build - Build Composable Kernel targets
Usage: ck-build [options] [target...]
Options:
-h, --help Show this help message
--name <name> Specify container name
--reconfigure Reconfigure CMake before building
-j <N> Parallel jobs (passed to ninja)
-v, --verbose Verbose output
--build-dir <dir> Build directory (default: ./build)
--clean Clean before building
--configure Auto-configure if build.ninja missing
--list List available targets
Arguments:
target Target(s) to build (default: all)
Environment:
CK_CONTAINER_NAME - Override default container name
GPU_TARGET - Override GPU target detection (e.g., gfx950, gfx942)
CK_BUILD_DIR - Override build directory
CK_GPU_TARGET - Override GPU target for auto-configure
Examples:
ck-build # Build all targets
ck-build test_amdgcn_mma # Build specific target
ck-build test_amdgcn_mma test_gemm # Build multiple targets
ck-build --reconfigure # Reconfigure CMake and build all
ck-build --configure # Auto-configure and build all
ck-build --clean test_amdgcn_mma # Clean and build target
ck-build -j 8 test_amdgcn_mma # Build with 8 parallel jobs
ck-build --list # List available targets
EOF
}
# Parse arguments
targets=()
reconfigure=false
clean=false
parallel_jobs=""
verbose=false
clean=false
auto_configure=false
list_targets=false
while [[ $# -gt 0 ]]; do
case $1 in
@@ -59,21 +65,35 @@ while [[ $# -gt 0 ]]; do
show_help
exit 0
;;
--name)
CONTAINER_NAME="$2"
-j)
require_arg "$1" "${2:-}"
parallel_jobs="$2"
shift 2
;;
--reconfigure)
reconfigure=true
-j*)
parallel_jobs="${1#-j}"
shift
;;
-v|--verbose)
verbose=true
shift
;;
--build-dir)
require_arg "$1" "${2:-}"
BUILD_DIR="$2"
shift 2
;;
--clean)
clean=true
shift
;;
-j)
parallel_jobs="-j $2"
shift 2
--configure)
auto_configure=true
shift
;;
--list)
list_targets=true
shift
;;
*)
targets+=("$1")
@@ -82,62 +102,62 @@ while [[ $# -gt 0 ]]; do
esac
done
# Ensure container is running
if ! container_is_running "${CONTAINER_NAME}"; then
echo "Container '${CONTAINER_NAME}' not running. Starting..."
"${SCRIPT_DIR}/ck-start" "${CONTAINER_NAME}"
# Handle --list
if [ "$list_targets" = true ]; then
if ! is_build_configured "${BUILD_DIR}"; then
error "Build not configured. Run 'ck-configure' first or use --configure"
exit 1
fi
info "Available targets:"
cd "${BUILD_DIR}"
ninja -t targets 2>/dev/null | grep -E '^[a-zA-Z_][a-zA-Z0-9_-]*:' | cut -d: -f1 | sort | head -100
echo ""
echo "(Showing first 100 targets. Use 'ninja -t targets' for full list)"
exit 0
fi
# Configure CMake if needed or requested
if [ "$reconfigure" = true ] || ! docker exec "${CONTAINER_NAME}" test -f /workspace/build/build.ninja 2>/dev/null; then
echo "Detecting GPU target..."
GPU_TARGET_DETECTED=$(detect_gpu_target "${CONTAINER_NAME}")
if [ "$reconfigure" = true ]; then
echo "Reconfiguring CMake from scratch for GPU target: ${GPU_TARGET_DETECTED}"
# Auto-configure if needed
if ! is_build_configured "${BUILD_DIR}"; then
if [ "$auto_configure" = true ]; then
info "Build not configured. Running ck-configure..."
"${SCRIPT_DIR}/ck-configure" --build-dir "${BUILD_DIR}"
echo ""
else
echo "Configuring build with CMake for GPU target: ${GPU_TARGET_DETECTED}"
error "Build not configured. Run 'ck-configure' first or use --configure"
exit 1
fi
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace || exit 1
rm -rf /workspace/build
mkdir /workspace/build
cd /workspace/build || exit 1
cmake .. -GNinja \
-DGPU_TARGETS=${GPU_TARGET_DETECTED} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ \
-DBUILD_TESTING=ON 2>&1 | tail -30
"
echo ""
fi
# Clean if requested
if [ "$clean" = true ]; then
echo "Cleaning build directory..."
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace/build || exit 1
ninja clean
"
info "Cleaning build directory..."
cd "${BUILD_DIR}"
ninja clean
echo ""
fi
# Build targets
if [ ${#targets[@]} -eq 0 ]; then
echo "Building all configured targets..."
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace/build || exit 1
ninja ${parallel_jobs} 2>&1
"
else
echo "Building targets: ${targets[*]}"
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace/build || exit 1
ninja ${parallel_jobs} ${targets[*]} 2>&1
"
# Build ninja command
ninja_cmd=(ninja -C "${BUILD_DIR}")
if [ -n "$parallel_jobs" ]; then
ninja_cmd+=("-j" "$parallel_jobs")
fi
if [ "$verbose" = true ]; then
ninja_cmd+=(-v)
fi
# Add targets
ninja_cmd+=("${targets[@]}")
# Build targets
if [ ${#targets[@]} -eq 0 ]; then
info "Building all configured targets..."
else
info "Building targets: ${targets[*]}"
fi
"${ninja_cmd[@]}"
echo ""
echo "Build complete"
info "Build complete"

script/tools/ck-configure (new executable file, 187 lines)

@@ -0,0 +1,187 @@
#!/bin/bash
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# CK Configure - Configure CMake build for Composable Kernel
# Environment-agnostic: works natively on ROCm hosts or inside containers
set -e
set -o pipefail
# Find script directory and load common utilities
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/common.sh"
# Initialize configuration
PROJECT_ROOT=$(find_project_root "${SCRIPT_DIR}" || get_project_root "${SCRIPT_DIR}")
BUILD_DIR=$(get_build_dir "${PROJECT_ROOT}")
# Help message
show_help() {
cat << EOF
CK Configure - Configure CMake build for Composable Kernel
Usage: ck-configure [options]
Options:
-h, --help Show this help message
--preset <name> Use CMake preset (dev, dev-gfx908, dev-gfx90a, dev-gfx942, dev-gfx950)
--gpu <target> Override GPU_TARGETS (auto-detected if not specified)
--dtypes <types> Set DTYPES (e.g., fp16,fp32,bf16)
--build-type <type> CMAKE_BUILD_TYPE (default: Release)
--build-dir <dir> Build directory (default: ./build)
--clean Remove existing build directory before configuring
--list-presets List available CMake presets
-D <VAR>=<value> Pass additional CMake variable
Environment:
CK_GPU_TARGET - Override GPU target detection (e.g., gfx950, gfx942)
CK_BUILD_DIR - Override build directory
Examples:
ck-configure # Auto-detect GPU and configure
ck-configure --preset dev-gfx950 # Use CMake preset
ck-configure --gpu gfx942 # Configure for specific GPU
ck-configure --clean --preset dev # Clean and reconfigure
ck-configure -D BUILD_DEV=ON # Pass CMake variable
EOF
}
# Parse arguments
preset=""
gpu_target=""
dtypes=""
build_type="Release"
clean=false
list_presets=false
cmake_vars=()
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
show_help
exit 0
;;
--preset)
require_arg "$1" "${2:-}"
preset="$2"
shift 2
;;
--gpu)
require_arg "$1" "${2:-}"
gpu_target="$2"
shift 2
;;
--dtypes)
require_arg "$1" "${2:-}"
dtypes="$2"
shift 2
;;
--build-type)
require_arg "$1" "${2:-}"
build_type="$2"
shift 2
;;
--build-dir)
require_arg "$1" "${2:-}"
BUILD_DIR="$2"
shift 2
;;
--clean)
clean=true
shift
;;
--list-presets)
list_presets=true
shift
;;
-D)
require_arg "$1" "${2:-}"
cmake_vars+=("-D$2")
shift 2
;;
-D*)
cmake_vars+=("$1")
shift
;;
*)
error "Unknown option: $1"
echo ""
show_help
exit 1
;;
esac
done
# Handle --list-presets
if [ "$list_presets" = true ]; then
echo "Available CMake presets:"
presets=$(list_cmake_presets "${PROJECT_ROOT}" 2>/dev/null)
if [ -n "$presets" ]; then
echo "$presets" | sed 's/^/ /'
else
echo " (No CMakePresets.json found or jq not available)"
fi
exit 0
fi
# Clean build directory if requested
if [ "$clean" = true ]; then
if [ -d "${BUILD_DIR}" ]; then
info "Removing existing build directory: ${BUILD_DIR}"
rm -rf "${BUILD_DIR}"
fi
fi
# Create build directory
mkdir -p "${BUILD_DIR}"
# Change to project root for CMake
cd "${PROJECT_ROOT}"
# Build CMake command
cmake_cmd=(cmake -S . -B "${BUILD_DIR}" -GNinja)
# Use preset if specified
if [ -n "$preset" ]; then
cmake_cmd+=(--preset "${preset}")
info "Using CMake preset: ${preset}"
else
# Manual configuration
# Detect GPU target if not specified
if [ -z "$gpu_target" ]; then
gpu_target=$(detect_gpu_native)
info "Auto-detected GPU target: ${gpu_target}"
else
info "Using specified GPU target: ${gpu_target}"
fi
cmake_cmd+=(-DGPU_TARGETS="${gpu_target}")
cmake_cmd+=(-DCMAKE_BUILD_TYPE="${build_type}")
cmake_cmd+=(-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++)
cmake_cmd+=(-DBUILD_TESTING=ON)
# Add DTYPES if specified
if [ -n "$dtypes" ]; then
cmake_cmd+=(-DDTYPES="${dtypes}")
info "Using DTYPES: ${dtypes}"
fi
fi
# Add any additional CMake variables
for var in "${cmake_vars[@]}"; do
cmake_cmd+=("$var")
done
# Run CMake
info "Configuring build in: ${BUILD_DIR}"
echo "Running: ${cmake_cmd[*]}"
echo ""
"${cmake_cmd[@]}"
echo ""
info "Configuration complete. Build directory: ${BUILD_DIR}"
info "Next: run 'ck-build' to build targets"


@@ -22,25 +22,29 @@ CK Docker Tool - Build and test composable_kernel in Docker
Usage: ck-docker <command> [options]
Commands:
start [name] Start Docker container
build [target] [--reconfigure] Build target (optionally reconfigure CMake)
test <test> [options] Run test
shell [name] Open shell in container
status [name] Check container status
stop [name] Stop and remove container
Container Management:
start [name] Start Docker container
stop [name] Stop and remove container
status [name] Check container status
shell [name] Open shell in container
Build/Test (delegates to core tools inside container):
configure [opts] Run ck-configure in container
build [opts] Run ck-build in container
test [opts] Run ck-test in container
exec <cmd> Run arbitrary command in container
Examples:
ck-docker start
ck-docker configure --preset dev-gfx950
ck-docker build test_amdgcn_mma
ck-docker build --reconfigure test_amdgcn_mma
ck-docker test test_amdgcn_mma --gtest_filter=*Fp16*
ck-docker test test_amdgcn_mma --filter '*Fp16*'
ck-docker shell
ck-docker exec rocminfo
Environment:
CK_CONTAINER_NAME - Override default container name (default: ck_<username>_<branch>)
CK_DOCKER_IMAGE - Override Docker image (default: rocm/composable_kernel:ck_ub24.04_rocm7.0.1)
GPU_TARGET - Override GPU target detection (e.g., gfx950, gfx942)
EOF
}
@@ -77,126 +81,38 @@ cmd_start() {
docker exec "${name}" bash -c "echo 'Working directory:' && pwd"
}
# Build target
cmd_build() {
local target=""
local name="${CONTAINER_NAME}"
local reconfigure=false
while [[ $# -gt 0 ]]; do
case $1 in
--name)
name="$2"
shift 2
;;
--reconfigure)
reconfigure=true
shift
;;
*)
target="$1"
shift
;;
esac
done
# Check if container is running
if ! container_is_running "${name}"; then
echo "Container '${name}' not running. Starting..."
cmd_start "${name}"
fi
# Reconfigure CMake if requested or if build.ninja doesn't exist
if [ "$reconfigure" = true ] || ! docker exec "${name}" test -f /workspace/build/build.ninja 2>/dev/null; then
echo "Detecting GPU target..."
local gpu_target=$(detect_gpu_target "${name}")
if [ "$reconfigure" = true ]; then
echo "Reconfiguring CMake from scratch for GPU target: ${gpu_target}"
else
echo "Configuring build with CMake for GPU target: ${gpu_target}"
fi
docker exec "${name}" bash -c "
cd /workspace || exit 1
rm -rf /workspace/build
mkdir /workspace/build
cd /workspace/build || exit 1
cmake .. -GNinja \
-DGPU_TARGETS=${gpu_target} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ \
-DBUILD_TESTING=ON 2>&1 | tail -30
"
fi
if [ -z "$target" ]; then
echo "Building all configured targets..."
else
echo "Building target: ${target}"
fi
docker exec "${name}" bash -c "
cd /workspace/build || exit 1
ninja ${target} 2>&1
"
echo "Build complete"
# Configure (delegate to ck-configure in container)
cmd_configure() {
ensure_container_running "${CONTAINER_NAME}" "${SCRIPT_DIR}"
docker exec "${CONTAINER_NAME}" /workspace/script/tools/ck-configure "$@"
}
# Run test
# Build (delegate to ck-build in container)
cmd_build() {
ensure_container_running "${CONTAINER_NAME}" "${SCRIPT_DIR}"
docker exec "${CONTAINER_NAME}" /workspace/script/tools/ck-build "$@"
}
# Test (delegate to ck-test in container)
cmd_test() {
local test_name=""
local name="${CONTAINER_NAME}"
local -a test_options=()
ensure_container_running "${CONTAINER_NAME}" "${SCRIPT_DIR}"
docker exec "${CONTAINER_NAME}" /workspace/script/tools/ck-test "$@"
}
while [[ $# -gt 0 ]]; do
case $1 in
--name)
name="$2"
shift 2
;;
--gtest_*|--help)
test_options+=("$1")
shift
;;
*)
if [ -z "$test_name" ]; then
test_name="$1"
else
test_options+=("$1")
fi
shift
;;
esac
done
if [ -z "$test_name" ]; then
echo "Error: test_name required"
echo "Usage: ck-docker test <test_name> [--name container_name] [gtest_options]"
# Execute arbitrary command in container
cmd_exec() {
if [ $# -eq 0 ]; then
error "command required"
echo "Usage: ck-docker exec <command>"
return 1
fi
# Check if container is running
if ! container_is_running "${name}"; then
echo "Error: Container '${name}' not running"
echo "Start it with: ck-docker start --name ${name}"
return 1
fi
ensure_container_running "${CONTAINER_NAME}" "${SCRIPT_DIR}"
if ! docker exec "${name}" test -f "/workspace/build/bin/${test_name}" 2>/dev/null; then
echo "Test executable not found. Building ${test_name}..."
cmd_build "${test_name}" --name "${name}"
fi
local docker_flags=()
[ -t 0 ] && [ -t 1 ] && docker_flags+=("-it")
echo "Running: ${test_name} ${test_options[*]}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Build the command with proper quoting
local cmd="cd /workspace/build && ./bin/${test_name}"
for opt in "${test_options[@]}"; do
cmd="${cmd} $(printf '%q' "$opt")"
done
docker exec "${name}" bash -c "${cmd}"
docker exec "${docker_flags[@]}" "${CONTAINER_NAME}" "$@"
}
# Shell
@@ -220,7 +136,7 @@ cmd_status() {
if [ -z "$name" ]; then
echo "Composable Kernel Docker Containers:"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "---"
docker ps -a --filter "ancestor=${docker_image}" \
--format "table {{.Names}}\t{{.Status}}\t{{.CreatedAt}}" || echo "No containers found"
else
@@ -262,6 +178,10 @@ case "${1:-}" in
shift
cmd_start "$@"
;;
configure)
shift
cmd_configure "$@"
;;
build)
shift
cmd_build "$@"
@@ -270,6 +190,10 @@ case "${1:-}" in
shift
cmd_test "$@"
;;
exec)
shift
cmd_exec "$@"
;;
shell)
shift
cmd_shell "$@"

script/tools/ck-rocprof (new executable file, 806 lines)

@@ -0,0 +1,806 @@
#!/bin/bash
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# CK ROCProf Tool - Profile CK applications with rocprof-compute
# Native-only tool. For Docker usage, run via: ck-docker exec ck-rocprof ...
set -e
set -o pipefail
# Find script directory and load common utilities
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/common.sh"
# Initialize configuration
PROJECT_ROOT=$(find_project_root "${SCRIPT_DIR}" || get_project_root "${SCRIPT_DIR}")
# ============================================================================
# rocprof-compute detection
# ============================================================================
# Common rocprof-compute binary locations
# Order: user installs first, then system ROCm versions (newest first)
ROCPROF_CANDIDATES=(
"${HOME}/.local/rocprofiler-compute/3.4.0/bin/rocprof-compute"
"/opt/rocm/bin/rocprof-compute"
"/opt/rocm-7.2.0/bin/rocprof-compute"
"/opt/rocm-7.0.1/bin/rocprof-compute"
"/opt/rocm-6.2.0/bin/rocprof-compute"
"/opt/rocm-6.1.0/bin/rocprof-compute"
)
# Find rocprof-compute binary
find_rocprof_bin() {
# Check CK_ROCPROF_BIN first
if [ -n "${CK_ROCPROF_BIN:-}" ] && [ -f "${CK_ROCPROF_BIN}" ]; then
echo "${CK_ROCPROF_BIN}"
return 0
fi
# Check PATH
if command -v rocprof-compute &>/dev/null; then
command -v rocprof-compute
return 0
fi
# Check common ROCm locations and user installations
for bin in "${ROCPROF_CANDIDATES[@]}"; do
if [ -f "$bin" ]; then
echo "$bin"
return 0
fi
done
return 1
}
# Find ROCm requirements file
find_rocm_requirements() {
local rocprof_bin="${1:-$(find_rocprof_bin)}"
if [ -z "$rocprof_bin" ]; then
return 1
fi
# Requirements file is typically at ../libexec/rocprofiler-compute/requirements.txt
local rocm_dir
rocm_dir=$(dirname "$(dirname "$rocprof_bin")")
local req_file="${rocm_dir}/libexec/rocprofiler-compute/requirements.txt"
if [ -f "$req_file" ]; then
echo "$req_file"
return 0
fi
return 1
}
# ============================================================================
# Configuration
# ============================================================================
ROCPROF_BIN="${CK_ROCPROF_BIN:-$(find_rocprof_bin || echo "")}"
VENV_PATH="${CK_PROFILE_VENV:-${PROJECT_ROOT}/.ck-rocprof-venv}"
WORKLOAD_DIR="${CK_WORKLOAD_DIR:-$(get_build_dir "${PROJECT_ROOT}")/workloads}"
ROCM_REQUIREMENTS="${CK_ROCM_REQUIREMENTS:-$(find_rocm_requirements "${ROCPROF_BIN}" || echo "")}"
# ============================================================================
# Helper functions
# ============================================================================
# Get file/directory size
get_size() {
local path="$1"
du -sh "$path" 2>/dev/null | cut -f1
}
# Get file modification date (cross-platform: Linux and macOS)
get_date() {
local path="$1"
# Try GNU stat first (Linux), fall back to BSD stat (macOS)
if stat --version >/dev/null 2>&1; then
stat -c %y "$path" 2>/dev/null | cut -d' ' -f1
else
stat -f %Sm -t %Y-%m-%d "$path" 2>/dev/null
fi
}
# Help message
show_help() {
cat << EOF
CK ROCProf Tool - Profile CK applications with rocprof-compute
Usage: ck-rocprof <command> [options]
Commands:
setup One-time setup: create Python venv and install dependencies
run <name> <executable> [args] Profile executable and save results as <name>
analyze <name> [block] Analyze profiling results (default: block 12 - LDS metrics)
compare <name1> <name2> Compare two profiling runs
list List available profiling runs
clean <name> Remove a profiling run (use --all for all runs)
status Show current configuration and status
help Show this help message
Examples:
ck-rocprof setup
ck-rocprof run baseline ./bin/tile_example_gemm_universal
ck-rocprof analyze baseline
ck-rocprof analyze baseline 12
ck-rocprof compare baseline optimized
ck-rocprof list
ck-rocprof clean baseline
ck-rocprof status
Environment Variables:
CK_GPU_TARGET - Override GPU detection (e.g., gfx950, MI300X)
CK_PROFILE_VENV - Python venv path (default: \$PROJECT/.ck-rocprof-venv)
CK_ROCPROF_BIN - rocprof-compute binary path
CK_ROCM_REQUIREMENTS - Path to rocprofiler-compute requirements.txt
CK_WORKLOAD_DIR - Workload storage directory
Profiling Blocks (use with 'analyze <name> <block>'):
Block 2: System Speed-of-Light (SOL)
Block 6: Shader Engine (SE) utilization
Block 7: L2 Cache metrics
Block 11: Vector L1D Cache metrics
Block 12: LDS (Local Data Share) - DEFAULT
Block 16: Instruction mix statistics
Block 17: Compute Unit (CU) metrics
LDS Metrics (Block 12):
- 12.1.3: Bank Conflict Rate (% of peak)
- 12.2.9: Bank Conflicts/Access (conflicts/access)
- 12.2.12: Bank Conflict (cycles per kernel)
- 12.2.17: LDS Data FIFO Full Rate (cycles)
Notes:
- Workload names must be alphanumeric with hyphens/underscores only
- Profiling skips roofline analysis (--no-roof) for faster execution
- Results stored in workloads/<name>/
- For Docker usage, run via: ck-docker exec ck-rocprof ...
EOF
}
# Get rocprof-compute wrapper path
get_rocprof_wrapper() {
echo "${VENV_PATH}/bin/rocprof-compute"
}
# Validate workload name to prevent path traversal and shell injection
# Allowed: alphanumeric, hyphens, underscores
validate_workload_name() {
local name="$1"
if [[ ! "$name" =~ ^[a-zA-Z0-9_-]+$ ]]; then
error "Invalid workload name: '$name'"
echo "Names must contain only letters, numbers, hyphens, and underscores"
return 1
fi
# Prevent reserved names
if [[ "$name" == "." || "$name" == ".." ]]; then
error "Invalid workload name: '$name'"
return 1
fi
return 0
}
# Check if setup is complete
is_setup_complete() {
local wrapper
wrapper=$(get_rocprof_wrapper)
[ -d "${VENV_PATH}" ] && [ -f "${wrapper}" ]
}
# ============================================================================
# Source installation
# ============================================================================
# rocprofiler-compute source installation location
ROCPROF_SOURCE_VERSION="3.4.0"
ROCPROF_SOURCE_DIR="${HOME}/.local/rocprofiler-compute/${ROCPROF_SOURCE_VERSION}"
ROCPROF_SOURCE_BIN="${ROCPROF_SOURCE_DIR}/bin/rocprof-compute"
ROCPROF_REPO_URL="https://github.com/ROCm/rocprofiler-compute.git"
ROCPROF_REPO_BRANCH="release/rocprofiler-compute-v${ROCPROF_SOURCE_VERSION}"
# Install rocprofiler-compute from source
install_from_source() {
local install_dir="${ROCPROF_SOURCE_DIR}"
local src_dir="${install_dir}/src"
info "Installing rocprofiler-compute ${ROCPROF_SOURCE_VERSION} from source..."
echo "Install location: ${install_dir}"
echo ""
# Ensure uv is available
if ! command -v uv &>/dev/null; then
info "Installing uv package manager via pip..."
if ! python3 -m pip install --user uv; then
error "Failed to install uv package manager"
return 1
fi
export PATH="${HOME}/.local/bin:${PATH}"
if ! command -v uv &>/dev/null; then
error "uv installed but not found in PATH"
return 1
fi
fi
# Create installation directory
mkdir -p "${install_dir}"
# Clone repository
if [ -d "${src_dir}" ]; then
info "Source already exists, updating..."
git -C "${src_dir}" fetch --quiet
git -C "${src_dir}" checkout --quiet "${ROCPROF_REPO_BRANCH}" 2>/dev/null || \
git -C "${src_dir}" checkout --quiet "amd-mainline"
else
info "Cloning rocprofiler-compute repository..."
if ! git clone --quiet --branch "${ROCPROF_REPO_BRANCH}" --depth 1 "${ROCPROF_REPO_URL}" "${src_dir}" 2>/dev/null; then
# Fall back to amd-mainline if release branch doesn't exist
info "Release branch not found, using amd-mainline..."
git clone --quiet --branch "amd-mainline" --depth 1 "${ROCPROF_REPO_URL}" "${src_dir}"
fi
fi
# Create venv for source installation
local venv_dir="${install_dir}/venv"
if [ ! -d "${venv_dir}" ]; then
info "Creating Python virtual environment..."
uv venv "${venv_dir}"
fi
# Install dependencies from requirements.txt
info "Installing dependencies (this may take a minute)..."
uv pip install --python "${venv_dir}/bin/python" -r "${src_dir}/requirements.txt" --quiet
# Pin pandas to avoid CSV conversion bug
uv pip install --python "${venv_dir}/bin/python" 'pandas<3.0' --quiet
# Create bin directory and wrapper script
mkdir -p "${install_dir}/bin"
cat > "${ROCPROF_SOURCE_BIN}" << 'WRAPPER_EOF'
#!/bin/bash
# rocprof-compute wrapper for source installation
INSTALL_DIR="$(cd "$(dirname "$0")/.." && pwd)"
SRC_DIR="${INSTALL_DIR}/src/src"
VENV_DIR="${INSTALL_DIR}/venv"
# Set PYTHONPATH to source directory for module imports
export PYTHONPATH="${SRC_DIR}:${PYTHONPATH}"
# Execute rocprof-compute script with venv Python
exec "${VENV_DIR}/bin/python3" "${SRC_DIR}/rocprof-compute" "$@"
WRAPPER_EOF
chmod +x "${ROCPROF_SOURCE_BIN}"
info "rocprofiler-compute installed successfully!"
echo " Binary: ${ROCPROF_SOURCE_BIN}"
echo ""
}
# ============================================================================
# Commands
# ============================================================================
# Setup: Create Python venv and install rocprof-compute dependencies
cmd_setup() {
echo "Setting up rocprof-compute profiling environment..."
echo "==========================================="
# Check if rocprof-compute exists, install from source if not
if [ -z "${ROCPROF_BIN}" ] || [ ! -f "${ROCPROF_BIN}" ]; then
warn "rocprof-compute not found in standard locations"
echo ""
echo "Searched locations:"
for bin in "${ROCPROF_CANDIDATES[@]}"; do
echo " - $bin"
done
echo ""
# Check if we can install from source
if ! command -v git &>/dev/null; then
error "git is required to install from source"
return 1
fi
if ! command -v python3 &>/dev/null; then
error "python3 is required to install from source"
return 1
fi
echo "Installing rocprofiler-compute from source..."
echo ""
if ! install_from_source; then
error "Failed to install rocprofiler-compute from source"
return 1
fi
# Update configuration with source installation
ROCPROF_BIN="${ROCPROF_SOURCE_BIN}"
ROCM_REQUIREMENTS="${ROCPROF_SOURCE_DIR}/libexec/rocprofiler-compute/requirements.txt"
fi
info "Using rocprof-compute: ${ROCPROF_BIN}"
# Check requirements file (only needed for non-source installs that use separate venv)
if [ -z "${ROCM_REQUIREMENTS}" ] || [ ! -f "${ROCM_REQUIREMENTS}" ]; then
# For source installs, requirements are bundled
if [[ "${ROCPROF_BIN}" == "${ROCPROF_SOURCE_BIN}" ]]; then
ROCM_REQUIREMENTS="${ROCPROF_SOURCE_DIR}/libexec/rocprofiler-compute/requirements.txt"
else
error "ROCm requirements file not found"
local expected_path
expected_path="$(dirname "$(dirname "${ROCPROF_BIN}")")/libexec/rocprofiler-compute/requirements.txt"
echo "Expected at: ${expected_path}"
echo "Set CK_ROCM_REQUIREMENTS to override"
return 1
fi
fi
# Check GPU access
if [ ! -r /dev/kfd ]; then
warn "No read access to /dev/kfd - GPU profiling may fail"
warn "Add user to video/render group: sudo usermod -a -G video,render \$USER"
fi
# For source installations, the venv is already set up - just create wrapper
if [[ "${ROCPROF_BIN}" == "${ROCPROF_SOURCE_BIN}" ]]; then
# Source install already has everything set up
local wrapper
wrapper=$(get_rocprof_wrapper)
mkdir -p "$(dirname "${wrapper}")"
# For source install, wrapper just calls the source binary
cat > "${wrapper}" << WRAPPER_EOF
#!/bin/bash
# rocprof-compute wrapper (using source installation)
exec "${ROCPROF_BIN}" "\$@"
WRAPPER_EOF
chmod +x "${wrapper}"
info "Wrapper created at ${wrapper}"
# Create marker file for venv directory
mkdir -p "${VENV_PATH}/bin"
touch "${VENV_PATH}/.source-install"
else
# System install - need to set up venv with dependencies
# Install uv if needed
if ! command -v uv &>/dev/null; then
info "Installing uv package manager via pip..."
if ! python3 -m pip install --user uv; then
error "Failed to install uv package manager"
return 1
fi
export PATH="${HOME}/.local/bin:${PATH}"
if ! command -v uv &>/dev/null; then
error "uv installed but not found in PATH"
echo "Try adding ~/.local/bin to your PATH"
return 1
fi
fi
# Create venv
if [ -d "${VENV_PATH}" ]; then
info "Python venv already exists at ${VENV_PATH}"
else
info "Creating Python venv at ${VENV_PATH}..."
uv venv "${VENV_PATH}"
fi
# Install dependencies
info "Installing dependencies..."
uv pip install --python "${VENV_PATH}/bin/python" -r "${ROCM_REQUIREMENTS}"
uv pip install --python "${VENV_PATH}/bin/python" 'pandas<3.0'
# Create wrapper script
local wrapper
wrapper=$(get_rocprof_wrapper)
mkdir -p "$(dirname "${wrapper}")"
cat > "${wrapper}" << WRAPPER_EOF
#!/bin/bash
# rocprof-compute wrapper using venv Python
VENV_DIR="\$(cd "\$(dirname "\$0")/.." && pwd)"
exec "\${VENV_DIR}/bin/python" "${ROCPROF_BIN}" "\$@"
WRAPPER_EOF
chmod +x "${wrapper}"
info "Wrapper created at ${wrapper}"
fi
# Create workload directory
mkdir -p "${WORKLOAD_DIR}"
info "Workload directory: ${WORKLOAD_DIR}"
echo ""
info "Setup complete! You can now use:"
echo " ck-rocprof run <name> <executable>"
}
# Detect GPU architecture
detect_gpu_arch() {
# Allow override via environment variable
if [ -n "${CK_GPU_TARGET:-}" ]; then
echo "${CK_GPU_TARGET}"
return 0
fi
if command -v rocminfo &>/dev/null; then
# Try marketing name first (MI350, MI300X)
local marketing_name
marketing_name=$(rocminfo 2>/dev/null | grep 'Marketing Name:' | grep -oE 'MI[0-9]+[A-Z]*' | head -1)
if [ -n "$marketing_name" ]; then
echo "$marketing_name"
return 0
fi
# Fallback to gfx name
local gfx_name
gfx_name=$(rocminfo 2>/dev/null | grep -oE 'gfx[0-9a-z]+' | head -1)
if [ -n "$gfx_name" ]; then
echo "$gfx_name"
return 0
fi
fi
# Try existing workload directories
if [ -d "${WORKLOAD_DIR}" ]; then
local first_dir
first_dir=$(find "${WORKLOAD_DIR}" -maxdepth 2 -type d \( -name 'gfx*' -o -name 'MI*' \) 2>/dev/null | head -1)
if [ -n "$first_dir" ]; then
basename "$first_dir"
return 0
fi
fi
# Final fallback - use gfx950 consistent with common.sh
echo "gfx950"
}
# Run profiling
cmd_run() {
# Validate argument count before shifting
if [ $# -lt 2 ]; then
error "name and executable required"
echo "Usage: ck-rocprof run <name> <executable> [args]"
return 1
fi
local name="$1"
local executable="$2"
shift 2
local -a exe_args=("$@")
# Validate workload name (prevents path traversal)
if ! validate_workload_name "$name"; then
return 1
fi
# Check setup
if ! is_setup_complete; then
error "Profiling environment not set up"
echo "Run: ck-rocprof setup"
return 1
fi
# Check if executable exists
if [ ! -f "$executable" ]; then
error "Executable not found: $executable"
return 1
fi
local wrapper
wrapper=$(get_rocprof_wrapper)
local gpu_arch
gpu_arch=$(detect_gpu_arch)
echo "Profiling: $executable ${exe_args[*]}"
echo "Run name: $name"
echo "GPU arch: $gpu_arch"
echo "==========================================="
# Build command with proper escaping to prevent shell injection
# --no-roof skips roofline analysis to speed up profiling
local escaped_executable
escaped_executable=$(printf '%q' "$executable")
local escaped_workload_dir
escaped_workload_dir=$(printf '%q' "${WORKLOAD_DIR}/${name}")
local cmd="${wrapper} profile --no-roof --path ${escaped_workload_dir} --name ${name} -- ${escaped_executable}"
for arg in "${exe_args[@]}"; do
cmd="${cmd} $(printf '%q' "$arg")"
done
# Run profiling
bash -c "${cmd}"
echo ""
info "Profiling complete"
echo "Results saved to: ${WORKLOAD_DIR}/${name}/"
echo ""
echo "Analyze with: ck-rocprof analyze ${name}"
}
# Find workload path for a given run name
find_workload_path() {
local name="$1"
local run_dir="${WORKLOAD_DIR}/${name}"
if [ ! -d "$run_dir" ]; then
return 1
fi
# Check if profiling data exists
if [ -f "${run_dir}/pmc_perf.csv" ]; then
echo "$run_dir"
return 0
fi
return 1
}
# Analyze profiling results
cmd_analyze() {
local name="${1:-}"
local block="${2:-12}" # Default to block 12 (LDS metrics)
if [ -z "$name" ]; then
error "name required"
echo "Usage: ck-rocprof analyze <name> [block]"
return 1
fi
# Validate workload name (prevents path traversal)
if ! validate_workload_name "$name"; then
return 1
fi
# Check setup
if ! is_setup_complete; then
error "Profiling environment not set up"
echo "Run: ck-rocprof setup"
return 1
fi
local wrapper
wrapper=$(get_rocprof_wrapper)
local workload_path
workload_path=$(find_workload_path "${name}" || true)
if [ -z "$workload_path" ]; then
error "Profiling results not found for '${name}'"
echo ""
echo "Available runs:"
cmd_list
return 1
fi
echo "Analyzing: ${name} (Block ${block})"
echo "==========================================="
echo ""
"${wrapper}" analyze --path "${workload_path}" --block "${block}"
}
# Compare two profiling runs
cmd_compare() {
local name1="${1:-}"
local name2="${2:-}"
if [ -z "$name1" ] || [ -z "$name2" ]; then
error "two run names required"
echo "Usage: ck-rocprof compare <name1> <name2>"
return 1
fi
# Validate workload names (prevents path traversal)
if ! validate_workload_name "$name1"; then
return 1
fi
if ! validate_workload_name "$name2"; then
return 1
fi
# Check setup
if ! is_setup_complete; then
error "Profiling environment not set up"
echo "Run: ck-rocprof setup"
return 1
fi
# Verify both runs exist
local path1
path1=$(find_workload_path "${name1}" || true)
local path2
path2=$(find_workload_path "${name2}" || true)
if [ -z "$path1" ]; then
error "Profiling results not found for '${name1}'"
return 1
fi
if [ -z "$path2" ]; then
error "Profiling results not found for '${name2}'"
return 1
fi
echo "Comparing profiling runs:"
echo " Baseline: ${name1}"
echo " Optimized: ${name2}"
echo "==========================================="
echo ""
echo "=== ${name1} - Block 12 (LDS) ==="
cmd_analyze "${name1}" 12 2>/dev/null | head -40
echo ""
echo "=== ${name2} - Block 12 (LDS) ==="
cmd_analyze "${name2}" 12 2>/dev/null | head -40
echo ""
echo "==========================================="
echo "For detailed analysis, run:"
echo " ck-rocprof analyze ${name1} 12"
echo " ck-rocprof analyze ${name2} 12"
}
# List available profiling runs
cmd_list() {
if [ ! -d "${WORKLOAD_DIR}" ]; then
echo "No profiling runs found (workload directory doesn't exist)"
return 0
fi
local runs
runs=$(find "${WORKLOAD_DIR}" -maxdepth 1 -mindepth 1 -type d -exec basename {} \; 2>/dev/null | sort)
if [ -z "$runs" ]; then
echo "No profiling runs found in ${WORKLOAD_DIR}"
return 0
fi
echo "Available profiling runs:"
echo "==========================================="
while IFS= read -r run; do
local path
path=$(find_workload_path "$run" || true)
if [ -n "$path" ]; then
local size
size=$(get_size "$path")
local date
date=$(get_date "$path")
printf " %-25s [%s, %s]\n" "$run" "$size" "$date"
else
printf " %-25s [no data]\n" "$run"
fi
done <<< "$runs"
echo ""
echo "Analyze with: ck-rocprof analyze <name>"
}
# Clean (remove) profiling runs
cmd_clean() {
local name="${1:-}"
if [ -z "$name" ]; then
error "name required (or use --all to remove all runs)"
echo "Usage: ck-rocprof clean <name>"
echo " ck-rocprof clean --all"
return 1
fi
if [ "$name" = "--all" ]; then
# Remove all profiling runs
if [ ! -d "${WORKLOAD_DIR}" ]; then
echo "No profiling runs to clean"
return 0
fi
echo "This will remove ALL profiling runs in ${WORKLOAD_DIR}"
read -r -p "Are you sure? [y/N] " confirm
if [[ ! "$confirm" =~ ^[Yy]$ ]]; then
echo "Cancelled"
return 0
fi
rm -rf "${WORKLOAD_DIR:?}"/*
info "All profiling runs removed"
else
# Validate name
if ! validate_workload_name "$name"; then
return 1
fi
local run_dir="${WORKLOAD_DIR}/${name}"
if [ ! -d "$run_dir" ]; then
error "Profiling run not found: ${name}"
return 1
fi
rm -rf "${run_dir}"
info "Removed profiling run: ${name}"
fi
}
# Show status information
cmd_status() {
echo "CK ROCProf Status"
echo "==========================================="
echo ""
# rocprof-compute binary
if [ -n "${ROCPROF_BIN}" ] && [ -f "${ROCPROF_BIN}" ]; then
echo "rocprof-compute: ${ROCPROF_BIN}"
else
echo "rocprof-compute: not found"
fi
echo ""
# Paths
echo "Paths:"
echo " Venv: ${VENV_PATH}"
echo " Workloads: ${WORKLOAD_DIR}"
echo ""
# Setup status
echo "Setup status:"
if is_setup_complete; then
echo " Profiling environment: ready"
else
echo " Profiling environment: not configured (run 'ck-rocprof setup')"
fi
echo ""
# Workload count
if [ -d "${WORKLOAD_DIR}" ]; then
local count
count=$(find "${WORKLOAD_DIR}" -maxdepth 1 -mindepth 1 -type d 2>/dev/null | wc -l)
echo "Profiling runs: ${count}"
else
echo "Profiling runs: 0"
fi
}
# ============================================================================
# Main command dispatcher
# ============================================================================
case "${1:-}" in
setup)
cmd_setup
;;
run)
shift
cmd_run "$@"
;;
analyze)
shift
cmd_analyze "$@"
;;
compare)
shift
cmd_compare "$@"
;;
list)
cmd_list
;;
clean)
shift
cmd_clean "$@"
;;
status)
cmd_status
;;
help|--help|-h)
show_help
;;
*)
if [ -z "${1:-}" ]; then
show_help
else
echo "Unknown command: ${1}"
echo ""
show_help
exit 1
fi
;;
esac

script/tools/ck-rocprof.md

@@ -0,0 +1,167 @@
# CK ROCProf Tool
GPU performance profiling for Composable Kernel applications using AMD rocprof-compute.
**Note:** This is a native-only tool. For Docker usage, run it via `ck-docker exec ck-rocprof ...`.
## Quick Start
```bash
# One-time setup (requires rocprofiler-compute installed)
./script/tools/ck-rocprof setup
# Profile executable
cd build
../script/tools/ck-rocprof run baseline ./bin/tile_example_gemm_universal
# Analyze LDS metrics
../script/tools/ck-rocprof analyze baseline
# Compare optimizations
../script/tools/ck-rocprof run optimized ./bin/tile_example_gemm_universal
../script/tools/ck-rocprof compare baseline optimized
```
## Commands
### `setup`
One-time setup: creates Python venv, installs dependencies, configures rocprof-compute.
### `run <name> <executable> [args]`
Profile executable and save results.
```bash
# Basic profiling
ck-rocprof run baseline ./bin/gemm_example
# With arguments
ck-rocprof run large_matrix ./bin/gemm_example -m 8192 -n 8192 -k 4096
# Test filtering
ck-rocprof run unit_test ./bin/test_gemm --gtest_filter="*Fp16*"
```
### `analyze <name> [block]`
Display profiling metrics (default: Block 12 - LDS).
```bash
ck-rocprof analyze baseline # LDS metrics
ck-rocprof analyze baseline 2 # L2 Cache
ck-rocprof analyze baseline 7 # Instruction Mix
```
### `compare <name1> <name2>`
Side-by-side comparison of two runs.
### `list`
List all profiling runs with size and date.
### `clean <name>` / `clean --all`
Remove profiling runs. Use `--all` to remove all runs.
### `status`
Show current configuration: mode (native/Docker), paths, setup status.
## Key LDS Metrics (Block 12)
**Target Values:**
- Bank Conflicts/Access: <0.01 (1% conflict rate)
- LDS bandwidth: >90% of peak
**Critical Metrics:**
- **12.2.9 Bank Conflicts/Access**: Direct conflict measure
- Baseline (naive): ~0.04 (4% conflicts)
- Optimized: <0.005 (<0.5% conflicts)
- **12.2.12 Bank Conflict Cycles**: Wasted cycles per kernel
- **12.2.17 LDS Data FIFO Full**: Memory system pressure
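The same ratio can also be estimated straight from the raw counters in `pmc_perf.csv`. The sketch below uses a synthetic sample file; the counter column names (`SQ_LDS_BANK_CONFLICT`, `SQ_LDS_IDX_ACTIVE`) are assumptions and should be checked against the actual CSV header before use:

```bash
# Synthetic stand-in for pmc_perf.csv. The column names are assumptions -
# verify them against the header of your real profiling output.
printf 'KernelName,SQ_LDS_BANK_CONFLICT,SQ_LDS_IDX_ACTIVE\nk1,2,100\nk2,2,100\n' \
    > /tmp/pmc_perf_sample.csv

# Locate the two counter columns by name, sum them across all kernel rows,
# and report the aggregate conflicts-per-access ratio.
awk -F, '
NR == 1 { for (i = 1; i <= NF; i++) {
              if ($i == "SQ_LDS_BANK_CONFLICT") c = i
              if ($i == "SQ_LDS_IDX_ACTIVE")   a = i }
          next }
c && a  { conflicts += $c; accesses += $a }
END     { if (accesses > 0) printf "conflicts/access: %.4f\n", conflicts / accesses }
' /tmp/pmc_perf_sample.csv
```

On the sample data this prints `conflicts/access: 0.0200`, i.e. a 2% conflict rate, which by the targets above would flag the kernel for optimization.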
## Optimization Workflow
```bash
# 1. Baseline
ck-rocprof run baseline ./bin/my_kernel
# 2. Check conflicts
ck-rocprof analyze baseline
# Look for Bank Conflicts/Access > 0.02
# 3. Optimize code (XOR transforms, padding, etc.)
# ... edit source ...
# 4. Test optimization
ninja my_kernel
ck-rocprof run optimized ./bin/my_kernel
# 5. Verify improvement
ck-rocprof compare baseline optimized
# Target: 8-10x reduction in conflicts
```
## Environment Variables
- `CK_PROFILE_VENV`: Python venv path (default: `$PROJECT/.ck-rocprof-venv`)
- `CK_ROCPROF_BIN`: rocprof-compute binary path (auto-detected from PATH or /opt/rocm)
- `CK_ROCM_REQUIREMENTS`: Path to rocprofiler-compute requirements.txt (auto-detected)
- `CK_WORKLOAD_DIR`: Results directory (default: `$PROJECT/build/workloads`)
- `CK_GPU_TARGET`: Override GPU detection (e.g., `gfx950`, `MI300X`)
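The precedence the tool follows for `CK_GPU_TARGET` can be sketched as a small standalone function (`detect_target` is an illustrative name, not part of the tool's API):

```bash
# Illustrative sketch of the GPU detection order: explicit env override first,
# then rocminfo, then the hard-coded gfx950 fallback.
detect_target() {
    if [ -n "${CK_GPU_TARGET:-}" ]; then
        echo "${CK_GPU_TARGET}"      # 1. explicit override always wins
    elif command -v rocminfo &>/dev/null; then
        rocminfo 2>/dev/null | grep -oE 'gfx[0-9a-z]+' | head -1   # 2. probe hardware
    else
        echo "gfx950"                # 3. fallback when no GPU stack is present
    fi
}

CK_GPU_TARGET=gfx942 detect_target   # prints gfx942 regardless of hardware
```

Setting the override per-invocation, as in the last line, is handy when profiling results were collected on a different machine.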
## Interpreting Results
**Good Performance:**
```
Bank Conflicts/Access: <0.01
LDS bandwidth: >90% of peak
LDS Data FIFO Full: Minimal cycles
```
**Needs Optimization:**
```
Bank Conflicts/Access: >0.02
Bank Conflict Cycles: High MAX values
LDS Data FIFO Full: High memory pressure
```
## Troubleshooting
**"Profiling environment not set up"**
```bash
ck-rocprof setup
```
**"rocprof-compute not found"**
```bash
export CK_ROCPROF_BIN=/custom/path/rocprof-compute
ck-rocprof setup
```
**"Profiling results not found"**
```bash
ck-rocprof list # Check available runs
rocminfo | grep gfx # Verify GPU arch
export CK_GPU_TARGET=gfx950 # Override if needed
```
## Storage Layout
Results stored in `workloads/<name>/`:
- `pmc_perf.csv`: Performance counters (primary data file)
- `perfmon/`: Input metric files
- `out/`: Raw output data from profiler runs
- `log.txt`: Profiling log
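A run counts as complete once its `pmc_perf.csv` exists, which is the same check `list` performs. A minimal sketch of that check (the default `build/workloads` path is assumed):

```bash
# Minimal sketch: print only the runs that have profiling data, mirroring
# the pmc_perf.csv existence check behind the list command.
WORKLOADS="${CK_WORKLOAD_DIR:-build/workloads}"
for d in "${WORKLOADS}"/*/; do
    [ -f "${d}pmc_perf.csv" ] && basename "${d%/}"
done
```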
## Technical Details
- **Setup**: Creates isolated Python venv, installs dependencies
- **Profiling**: Runs `rocprof-compute profile --no-roof --path <dir> --name <name> -- <executable>`
- **Analysis**: Runs `rocprof-compute analyze --path <path> --block <block>`
- **GPU Support**: MI300/MI350 series, auto-detects architecture
## Related Tools
- `ck-docker`: Container management
- `rocprof-compute`: AMD GPU profiler v2
- `rocm-smi`: System monitoring
## License
Copyright (c) Advanced Micro Devices, Inc. SPDX-License-Identifier: MIT


@@ -2,7 +2,8 @@
# Copyright (c) Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
# CK Test - Build and test Composable Kernel in Docker
# CK Test - Run Composable Kernel tests
# Environment-agnostic: works natively on ROCm hosts or inside containers
set -e
set -o pipefail
@@ -12,155 +13,219 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/common.sh"
# Initialize configuration
PROJECT_ROOT=$(get_project_root "${SCRIPT_DIR}")
CONTAINER_NAME=$(get_container_name "${PROJECT_ROOT}")
PROJECT_ROOT=$(find_project_root "${SCRIPT_DIR}" || get_project_root "${SCRIPT_DIR}")
BUILD_DIR=$(get_build_dir "${PROJECT_ROOT}")
# Help message
show_help() {
cat << EOF
CK Test - Build and test Composable Kernel in Docker
CK Test - Run Composable Kernel tests
Usage: ck-test [options] <test_name> [test_options]
Usage: ck-test [options] [test_name] [-- gtest_options]
Options:
-h, --help Show this help message
--name <name> Specify container name
--reconfigure Reconfigure CMake before building
--build-dir <dir> Build directory (default: ./build)
--no-build Skip building, run test directly
--list List available tests
--smoke Run all smoke tests (via CTest -L SMOKE_TEST)
--regression Run all regression tests (via CTest -L REGRESSION_TEST)
--all Run all tests (via CTest)
--filter <pattern> Shorthand for --gtest_filter=<pattern>
Arguments:
test_name Name of test executable (required)
test_options Additional options passed to test (e.g., --gtest_filter=*)
test_name Name of test executable (optional for --smoke/--regression/--all)
gtest_options Additional options passed to test (after --)
Environment:
CK_CONTAINER_NAME - Override default container name
GPU_TARGET - Override GPU target detection (e.g., gfx950, gfx942)
CK_BUILD_DIR - Override build directory
Examples:
ck-test test_amdgcn_mma
ck-test test_amdgcn_mma --gtest_filter=*Fp16*
ck-test --name my_container test_amdgcn_mma
ck-test --reconfigure test_amdgcn_mma
ck-test test_amdgcn_mma # Build and run specific test
ck-test test_amdgcn_mma --filter '*Fp16*' # Run with gtest filter
ck-test test_amdgcn_mma -- --gtest_filter=*Fp16* # Explicit gtest options
ck-test --no-build test_amdgcn_mma # Run without rebuilding
ck-test --list # List available tests
ck-test --smoke # Run all smoke tests
ck-test --regression # Run all regression tests
ck-test --all # Run all tests
EOF
}
# Parse arguments
test_name=""
reconfigure=false
no_build=false
test_options=()
list_tests=false
run_smoke=false
run_regression=false
run_all=false
gtest_filter=""
gtest_options=()
parsing_gtest=false
while [[ $# -gt 0 ]]; do
if [ "$parsing_gtest" = true ]; then
gtest_options+=("$1")
shift
continue
fi
case $1 in
-h|--help)
show_help
exit 0
;;
--name)
CONTAINER_NAME="$2"
--build-dir)
require_arg "$1" "${2:-}"
BUILD_DIR="$2"
shift 2
;;
--reconfigure)
reconfigure=true
shift
;;
--no-build)
no_build=true
shift
;;
--gtest_*|--help)
test_options+=("$1")
--list)
list_tests=true
shift
;;
--smoke)
run_smoke=true
shift
;;
--regression)
run_regression=true
shift
;;
--all)
run_all=true
shift
;;
--filter)
require_arg "$1" "${2:-}"
gtest_filter="$2"
shift 2
;;
--)
parsing_gtest=true
shift
;;
--gtest_*)
gtest_options+=("$1")
shift
;;
*)
if [ -z "$test_name" ]; then
test_name="$1"
else
test_options+=("$1")
gtest_options+=("$1")
fi
shift
;;
esac
done
# Validate test name
# Add filter to gtest options if specified
if [ -n "$gtest_filter" ]; then
gtest_options+=("--gtest_filter=${gtest_filter}")
fi
# Validate mutual exclusivity of test suite options
suite_count=0
[ "$run_smoke" = true ] && suite_count=$((suite_count + 1))
[ "$run_regression" = true ] && suite_count=$((suite_count + 1))
[ "$run_all" = true ] && suite_count=$((suite_count + 1))
if [ "$suite_count" -gt 1 ]; then
error "Options --smoke, --regression, and --all are mutually exclusive"
exit 1
fi
# Check build is configured
if ! is_build_configured "${BUILD_DIR}"; then
error "Build not configured. Run 'ck-configure' first"
exit 1
fi
# Handle --list
if [ "$list_tests" = true ]; then
info "Available tests:"
if [ -d "${BUILD_DIR}/bin" ]; then
ls -1 "${BUILD_DIR}/bin/" 2>/dev/null | grep -E '^test_' | sort || echo " (No test binaries found)"
else
echo " (No bin directory found)"
fi
echo ""
echo "CTest labels:"
cd "${BUILD_DIR}"
ctest -N 2>/dev/null | head -20 || echo " (Run 'ctest -N' for full list)"
exit 0
fi
# Handle CTest-based test suites
if [ "$run_smoke" = true ] || [ "$run_regression" = true ] || [ "$run_all" = true ]; then
cd "${BUILD_DIR}"
ctest_cmd=(ctest --output-on-failure)
if [ "$run_smoke" = true ]; then
ctest_cmd+=(-L SMOKE_TEST)
info "Running smoke tests..."
elif [ "$run_regression" = true ]; then
ctest_cmd+=(-L REGRESSION_TEST)
info "Running regression tests..."
else
info "Running all tests..."
fi
exit_code=0
"${ctest_cmd[@]}" || exit_code=$?
echo ""
if [ $exit_code -eq 0 ]; then
info "Tests completed successfully"
else
error "Tests failed with exit code: ${exit_code}"
fi
exit $exit_code
fi
# Validate test name for individual test runs
if [ -z "$test_name" ]; then
echo "Error: test_name required"
error "test_name required (or use --smoke/--regression/--all for test suites)"
echo ""
show_help
exit 1
fi
# Ensure container is running
if ! container_is_running "${CONTAINER_NAME}"; then
echo "Container '${CONTAINER_NAME}' not running. Starting..."
"${SCRIPT_DIR}/ck-start" "${CONTAINER_NAME}"
echo ""
fi
# Configure CMake if needed or requested
if [ "$reconfigure" = true ] || ! docker exec "${CONTAINER_NAME}" test -f /workspace/build/build.ninja 2>/dev/null; then
echo "Detecting GPU target..."
GPU_TARGET_DETECTED=$(detect_gpu_target "${CONTAINER_NAME}")
if [ "$reconfigure" = true ]; then
echo "Reconfiguring CMake from scratch for GPU target: ${GPU_TARGET_DETECTED}"
else
echo "Configuring build with CMake for GPU target: ${GPU_TARGET_DETECTED}"
fi
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace || exit 1
rm -rf /workspace/build
mkdir /workspace/build
cd /workspace/build || exit 1
cmake .. -GNinja \
-DGPU_TARGETS=${GPU_TARGET_DETECTED} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ \
-DBUILD_TESTING=ON 2>&1 | tail -30
"
echo ""
fi
# Build test if needed (unless --no-build is specified)
if [ "$no_build" = false ]; then
if ! docker exec "${CONTAINER_NAME}" test -f "/workspace/build/bin/${test_name}" 2>/dev/null; then
echo "Building ${test_name}..."
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace/build || exit 1
ninja ${test_name} 2>&1
"
echo ""
else
echo "Test executable found, rebuilding to ensure latest version..."
docker exec "${CONTAINER_NAME}" bash -c "
cd /workspace/build || exit 1
ninja ${test_name} 2>&1
"
echo ""
fi
info "Building ${test_name}..."
"${SCRIPT_DIR}/ck-build" --build-dir "${BUILD_DIR}" "${test_name}"
echo ""
fi
# Verify test executable exists
test_binary="${BUILD_DIR}/bin/${test_name}"
if [ ! -f "$test_binary" ]; then
error "Test executable not found: ${test_binary}"
echo "Run 'ck-build ${test_name}' first"
exit 1
fi
# Run test
echo "Running: ${test_name} ${test_options[*]}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Running: ${test_name} ${gtest_options[*]}"
echo "---"
# Build the command with proper quoting
cmd="cd /workspace/build && ./bin/${test_name}"
for opt in "${test_options[@]}"; do
cmd="${cmd} $(printf '%q' "$opt")"
done
docker exec "${CONTAINER_NAME}" bash -c "${cmd}"
cd "${BUILD_DIR}"
exit_code=0
"./bin/${test_name}" "${gtest_options[@]}" || exit_code=$?
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "---"
if [ $exit_code -eq 0 ]; then
echo "Test completed successfully"
info "Test completed successfully"
else
echo "Test failed with exit code: ${exit_code}"
error "Test failed with exit code: ${exit_code}"
fi
exit $exit_code


@@ -74,14 +74,14 @@ container_is_running() {
detect_gpu_target() {
local container="$1"
# Allow override via GPU_TARGET environment variable
if [ -n "${GPU_TARGET:-}" ]; then
echo "${GPU_TARGET}"
# Allow override via CK_GPU_TARGET environment variable
if [ -n "${CK_GPU_TARGET:-}" ]; then
echo "${CK_GPU_TARGET}"
return 0
fi
docker exec "${container}" bash -c "
rocminfo 2>/dev/null | grep -oP 'gfx[0-9a-z]+' | head -1 || echo 'gfx950'
rocminfo 2>/dev/null | grep -oE 'gfx[0-9a-z]+' | head -1 || echo 'gfx950'
" | tr -d '\r\n'
}
@@ -95,3 +95,87 @@ ensure_container_running() {
"${script_dir}/ck-docker" start "${container}"
fi
}
# ============================================================================
# Native (non-Docker) utilities
# ============================================================================
# Output utilities
info() { echo "[info] $*"; }
warn() { echo "[warn] $*" >&2; }
error() { echo "[error] $*" >&2; }
# Require argument for option (validates $2 exists and is not another flag)
require_arg() {
local option="$1"
local value="$2"
if [ -z "$value" ] || [[ "$value" == -* ]]; then
error "Option $option requires an argument"
exit 1
fi
}
# Native GPU detection (no Docker required)
detect_gpu_native() {
# Allow override via CK_GPU_TARGET environment variable
if [ -n "${CK_GPU_TARGET:-}" ]; then
echo "${CK_GPU_TARGET}"
return 0
fi
# Try rocminfo if available
if command -v rocminfo &>/dev/null; then
local gpu
gpu=$(rocminfo 2>/dev/null | grep -oE 'gfx[0-9a-z]+' | head -1)
if [ -n "$gpu" ]; then
echo "$gpu"
return 0
fi
fi
# Fallback
echo "gfx950"
}
# Get build directory (respects CK_BUILD_DIR env var)
get_build_dir() {
local project_root="${1:-$(get_project_root "$(dirname "${BASH_SOURCE[0]}")")}"
echo "${CK_BUILD_DIR:-${project_root}/build}"
}
# Check if build is configured (build.ninja exists)
is_build_configured() {
local build_dir="${1:-$(get_build_dir)}"
[ -f "${build_dir}/build.ninja" ]
}
# Find project root from any subdirectory (walks up to find .git)
find_project_root() {
local dir="${1:-$(pwd)}"
while [ "$dir" != "/" ]; do
if [ -d "$dir/.git" ]; then
echo "$dir"
return 0
fi
dir=$(dirname "$dir")
done
return 1
}
# List available CMake presets
list_cmake_presets() {
local project_root="${1:-$(find_project_root)}"
local presets_file="${project_root}/CMakePresets.json"
if [ ! -f "$presets_file" ]; then
return 1
fi
# Extract non-hidden preset names
if command -v jq &>/dev/null; then
jq -r '.configurePresets[] | select(.hidden != true) | .name' "$presets_file" 2>/dev/null
else
# Fallback: sed-based extraction (more portable than grep -P)
sed -n 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$presets_file" | grep -v '^use-'
fi
}