Improved formatting, clarity, and content of several documents. (#64)

Andrew Kerr
2019-11-20 10:42:15 -08:00
committed by GitHub
parent f4d9c8f755
commit 8aca98f9a7
8 changed files with 118 additions and 65 deletions

````diff
@@ -90,14 +90,15 @@ is able to unroll the loop bodies, map array elements to registers, and construc
 All loops expected to be unrolled should be annotated with `CUTLASS_PRAGMA_UNROLL` to explicitly direct the compiler
 to unroll them.
-```
+```c++
 int const kN = 8;
-Array<float, kN> x; // Array we would like to store in registers
+Array<float, kN> x; // Array we would like to store in registers
-CUTLASS_PRAGMA_UNROLL // Directs the CUDA compiler to unroll this loop.
-for (int idx = 0; idx < kN; ++idx) { // Loop has constant number of iterations
+CUTLASS_PRAGMA_UNROLL // Directs the CUDA compiler to unroll this loop.
+for (int idx = 0; idx < kN; ++idx) { // Loop has constant number of iterations.
-x[i] = float(idx); // Indirect access by induction variable results in direct register access
+x[i] = float(idx); // Indirect access by induction variable results in
+// direct register access.
 }
 ```
````
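
The following is a minimal host-side sketch of the same pattern, written so that it compiles without CUTLASS. `SKETCH_PRAGMA_UNROLL` and `fill_with_index` are hypothetical names introduced only for this illustration, and `std::array` stands in for `Array<float, kN>`; the macro is assumed to play the role of `CUTLASS_PRAGMA_UNROLL`, not to reproduce its definition. The loop body indexes with `idx`, the induction variable, because the snippet's `x[i]` refers to a name that is never declared.

```c++
#include <array>

// Hypothetical stand-in for CUTLASS_PRAGMA_UNROLL (an assumption, not the
// CUTLASS definition). It requests unrolling where the compiler is known to
// accept `#pragma unroll`, and expands to nothing elsewhere.
#if defined(__CUDACC__) || defined(__clang__)
  #define SKETCH_PRAGMA_UNROLL _Pragma("unroll")
#else
  #define SKETCH_PRAGMA_UNROLL
#endif

// Fills a fixed-size array with its own indices.
// kN is a compile-time constant, so the trip count is known to the compiler.
template <int kN>
std::array<float, kN> fill_with_index() {
  std::array<float, kN> x{};             // Array we would like kept in registers

  SKETCH_PRAGMA_UNROLL                   // Ask the compiler to unroll this loop
  for (int idx = 0; idx < kN; ++idx) {   // Constant number of iterations
    x[idx] = float(idx);                 // Index by the induction variable
  }
  return x;
}
```

Calling `fill_with_index<8>()` returns an array whose element at position `idx` equals `float(idx)`. Once the loop is unrolled, every index is a compile-time constant, which is what lets a compiler map each element to its own register; whether it actually does so remains the compiler's decision.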