Vidyasagar Ananthan 92c67a824f [DOCS] Documentation Addition (Readme updates) (#2495)
Co-authored-by: AviralGoelAMD <aviral.goel@amd.com>
Co-authored-by: ThomasNing <thomas.ning@amd.com>
2025-10-16 03:10:57 -07:00


Complex Tensor Contraction with Bilinear Operations

This example demonstrates a complex-valued tensor contraction combined with bilinear operations. The tensors have real and imaginary components, and the operation performs both the contraction and the bilinear transformation on them. Operations of this kind appear in quantum computing, signal processing, and scientific computing.

Mathematical Formulation

The operation combines complex tensor contraction with bilinear operations on complex-valued data.

Given complex tensors with real and imaginary components:

  • Complex tensor $A = A_{real} + i \cdot A_{imag}$
  • Complex tensor $B = B_{real} + i \cdot B_{imag}$
  • Auxiliary complex tensors $D, E, \ldots$

The operation proceeds in two steps:

  1. Complex Tensor Contraction: Perform tensor contraction using Einstein summation on complex tensors. $C_{temp} = \text{einsum}(\text{pattern}, A, B)$

    For complex multiplication: $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$

  2. Bilinear Operations: Apply bilinear transformations involving the contraction result and auxiliary tensors. $F = \text{BilinearOp}(C_{temp}, D, E, \ldots)$

The bilinear operations can include various combinations such as:

  • $F = C_{temp} \odot D + E$ (elementwise multiply and add)
  • $F = \alpha \cdot C_{temp} + \beta \cdot (D \odot E)$ (scaled combinations)
  • Longer multi-term bilinear expressions
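To make the two steps concrete, here is a minimal host-side reference sketch of the scaled form listed above, $F = \alpha \cdot C_{temp} + \beta \cdot (D \odot E)$. It assumes the contraction has already been flattened to a 2D matrix product with dense row-major storage. The function name `contract_bilinear_ref` and its parameters are hypothetical and are not part of the Composable Kernel API; the sketch is only meant as something to check results against.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical reference helper (not a CK function).
// Step 1: C_temp[m, n] = sum_k A[m, k] * B[k, n]   (complex contraction)
// Step 2: F[m, n] = alpha * C_temp[m, n] + beta * (D[m, n] * E[m, n])
// All tensors are dense, row-major, and flattened to 2D (an assumption).
std::vector<cfloat> contract_bilinear_ref(const std::vector<cfloat>& A,
                                          const std::vector<cfloat>& B,
                                          const std::vector<cfloat>& D,
                                          const std::vector<cfloat>& E,
                                          std::size_t M,
                                          std::size_t N,
                                          std::size_t K,
                                          cfloat alpha,
                                          cfloat beta)
{
    assert(A.size() == M * K && B.size() == K * N);
    assert(D.size() == M * N && E.size() == M * N);

    std::vector<cfloat> F(M * N);
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            // std::complex applies (a + bi)(c + di) = (ac - bd) + (ad + bc)i
            cfloat c_temp{0.0f, 0.0f};
            for(std::size_t k = 0; k < K; ++k)
            {
                c_temp += A[m * K + k] * B[k * N + n];
            }
            // Bilinear step: scaled contraction plus scaled elementwise product.
            F[m * N + n] = alpha * c_temp + beta * (D[m * N + n] * E[m * N + n]);
        }
    }
    return F;
}
```

Calling the helper with M = 1, N = 1, K = 2 is enough to check the arithmetic by hand against the complex multiplication rule given above.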

Algorithmic Strategy: Complex-Arithmetic GEMM with Bilinear Epilogue

The implementation handles complex arithmetic throughout the computation pipeline.

  1. Complex Tensor-to-GEMM Mapping:

    • Real/Imaginary Separation: Complex tensors are logically separated into real and imaginary components
    • Complex GEMM: Four real GEMM operations represent one complex GEMM:
      • $C_{real} = A_{real} \times B_{real} - A_{imag} \times B_{imag}$
      • $C_{imag} = A_{real} \times B_{imag} + A_{imag} \times B_{real}$
  2. Multi-Component Computation: Within each thread block:

    • Parallel Real/Imaginary Processing: Simultaneously compute real and imaginary components
    • Complex Accumulation: Maintain separate accumulators for real and imaginary parts
    • Register Management: Carefully orchestrate register usage for multiple complex components
  3. Complex Bilinear Epilogue:

    • Load Complex Auxiliary Tensors: Read real and imaginary components of auxiliary tensors
    • Complex Bilinear Operations: Apply the specified bilinear transformations using complex arithmetic
    • Complex Result Storage: Store final complex result with proper real/imaginary organization
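The three stages above can be sketched on the host as a scalar loop. This is not the kernel's code: `PlanarMatrix` and `complex_gemm_bilinear` are hypothetical names, the real and imaginary parts are assumed to be stored in separate arrays, and the epilogue is assumed to be $F = \alpha \cdot C + \beta \cdot D$ with real-valued $\alpha$ and $\beta$ for brevity. The sketch only shows how the four real products feed two separate accumulators before the epilogue runs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical planar (split real/imaginary) row-major matrix; not a CK type.
struct PlanarMatrix
{
    std::size_t rows;
    std::size_t cols;
    std::vector<float> re;
    std::vector<float> im;

    PlanarMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), re(r * c, 0.0f), im(r * c, 0.0f)
    {
    }
};

// Computes F = alpha * (A x B) + beta * D using only real arithmetic.
PlanarMatrix complex_gemm_bilinear(const PlanarMatrix& A,
                                   const PlanarMatrix& B,
                                   const PlanarMatrix& D,
                                   float alpha,
                                   float beta)
{
    assert(A.cols == B.rows);
    assert(D.rows == A.rows && D.cols == B.cols);

    PlanarMatrix F(A.rows, B.cols);
    for(std::size_t m = 0; m < A.rows; ++m)
    {
        for(std::size_t n = 0; n < B.cols; ++n)
        {
            // Separate accumulators for the real and imaginary parts.
            float acc_re = 0.0f;
            float acc_im = 0.0f;
            for(std::size_t k = 0; k < A.cols; ++k)
            {
                const float a_re = A.re[m * A.cols + k];
                const float a_im = A.im[m * A.cols + k];
                const float b_re = B.re[k * B.cols + n];
                const float b_im = B.im[k * B.cols + n];
                // Four real products, matching the four real GEMMs:
                acc_re += a_re * b_re - a_im * b_im; // C_real
                acc_im += a_re * b_im + a_im * b_re; // C_imag
            }
            // Epilogue: read the auxiliary tensor and store both components.
            const std::size_t idx = m * B.cols + n;
            F.re[idx] = alpha * acc_re + beta * D.re[idx];
            F.im[idx] = alpha * acc_im + beta * D.im[idx];
        }
    }
    return F;
}
```

Because the epilogue runs while the accumulators are still live, the intermediate contraction result is never stored. A fused GPU kernel would keep those accumulators in registers for the same reason.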

Source Code Organization

Build and Run

Prerequisites

Ensure the Composable Kernel library is built and installed.

cd /path/to/composable_kernel/build
make -j install

Build the Example

cd /path/to/composable_kernel/example/66_complex_contraction_bilinear
mkdir build && cd build

cmake \
  -DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
  -DCMAKE_PREFIX_PATH="/opt/rocm;${CK_INSTALL_PATH}" \
  ..

make -j

Run the Example

#arg1: verification (0=no, 1=yes)
#arg2: initialization (0=no init, 1=integer value, 2=decimal value)
#arg3: time kernel (0=no, 1=yes)
./bin/example_contraction_bilinear_xdl_fp32 1 1 1

Applications

Complex tensor operations with bilinear transformations are essential in several advanced domains:

  • Quantum Computing: Quantum circuit simulations require complex tensor contractions for state evolution and gate operations
  • Signal Processing: Digital signal processing with complex-valued signals, such as in communications and radar systems
  • Fourier Analysis: FFT-related computations that naturally involve complex arithmetic and tensor operations
  • Quantum Chemistry: Electronic structure calculations often involve complex-valued wavefunctions and operators
  • Machine Learning: Some advanced neural network architectures use complex-valued weights and activations
  • Scientific Computing: Simulations involving wave equations, electromagnetic fields, or quantum mechanical systems

Complex Arithmetic Considerations

Working with complex numbers introduces several computational challenges:

  • Memory Layout: Efficient storage of real and imaginary components (interleaved vs. separate arrays)
  • Arithmetic Complexity: Complex multiplication requires 4 real multiplications and 2 real additions
  • Numerical Precision: Maintaining accuracy across multiple complex operations
  • Performance Trade-offs: Balancing between computational complexity and memory bandwidth
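As an illustration of the memory layout point, the sketch below converts between the two layouts mentioned above. An array of `std::complex<float>` is an interleaved layout (real, imaginary, real, imaginary, ...), while the planar form keeps two separate float arrays. The helper names `to_planar` and `to_interleaved` are hypothetical and only serve this example; which layout this example's kernel expects is not stated here.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical helper: split an interleaved buffer into planar real/imag arrays.
std::pair<std::vector<float>, std::vector<float>>
to_planar(const std::vector<cfloat>& interleaved)
{
    std::vector<float> re(interleaved.size());
    std::vector<float> im(interleaved.size());
    for(std::size_t i = 0; i < interleaved.size(); ++i)
    {
        re[i] = interleaved[i].real();
        im[i] = interleaved[i].imag();
    }
    return {std::move(re), std::move(im)};
}

// Hypothetical helper: merge planar real/imag arrays into an interleaved buffer.
std::vector<cfloat> to_interleaved(const std::vector<float>& re,
                                   const std::vector<float>& im)
{
    assert(re.size() == im.size());
    std::vector<cfloat> out(re.size());
    for(std::size_t i = 0; i < re.size(); ++i)
    {
        out[i] = cfloat(re[i], im[i]);
    }
    return out;
}
```

Interleaved storage keeps both components of an element adjacent, which suits element-wise complex arithmetic. Planar storage lets each of the four real products read contiguous real-only or imaginary-only data.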

Performance Characteristics

Complex operations have unique performance profiles:

  • Computational Intensity: ~4× the real arithmetic operations of a real-valued contraction of the same shape, since one complex GEMM maps to four real GEMMs
  • Memory Bandwidth: 2× the memory requirements for storing complex values
  • Register Pressure: Higher register usage due to separate real/imaginary components
  • Instruction Complexity: Longer instruction sequences per element, since each complex multiply-accumulate expands into several real multiplies and adds

This kernel shows that a complex-valued contraction and its bilinear operations can be handled efficiently in one place, keeping the benefits of deep fusion for complex-valued computations.