mirror of
https://github.com/amd/blis.git
synced 2026-04-19 23:28:52 +00:00
Further updates to KernelsHowTo.md, BLISTypedAPI.md.
Details:
- Added missing level-1v operations to BLISTypedAPI.md (e.g. axpbyv, xpbyv).
- Updated broken links in KernelsHowTo.md that were caused by misnamed anchors.
- Other minor changes.
@@ -185,7 +185,7 @@ Notes for interpreting function descriptions:
|
||||
## Operation index
|
||||
|
||||
* **[Level-1v](BLISTypedAPI.md#level-1v-operations)**: Operations on vectors:
|
||||
* [addv](BLISTypedAPI.md#addv), [amaxv](BLISTypedAPI.md#amaxv), [axpyv](BLISTypedAPI.md#axpyv), [copyv](BLISTypedAPI.md#copyv), [dotv](BLISTypedAPI.md#dotv), [dotxv](BLISTypedAPI.md#dotxv), [invertv](BLISTypedAPI.md#invertv), [scal2v](BLISTypedAPI.md#scal2v), [scalv](BLISTypedAPI.md#scalv), [setv](BLISTypedAPI.md#setv), [subv](BLISTypedAPI.md#subv), [swapv](BLISTypedAPI.md#swapv)
|
||||
* [addv](BLISTypedAPI.md#addv), [amaxv](BLISTypedAPI.md#amaxv), [axpyv](BLISTypedAPI.md#axpyv), [axpbyv](BLISTypedAPI.md#axpbyv), [copyv](BLISTypedAPI.md#copyv), [dotv](BLISTypedAPI.md#dotv), [dotxv](BLISTypedAPI.md#dotxv), [invertv](BLISTypedAPI.md#invertv), [scal2v](BLISTypedAPI.md#scal2v), [scalv](BLISTypedAPI.md#scalv), [setv](BLISTypedAPI.md#setv), [subv](BLISTypedAPI.md#subv), [swapv](BLISTypedAPI.md#swapv), [xpbyv](BLISTypedAPI.md#xpbyv)
|
||||
* **[Level-1d](BLISTypedAPI.md#level-1d-operations)**: Element-wise operations on matrix diagonals:
|
||||
* [addd](BLISTypedAPI.md#addd), [axpyd](BLISTypedAPI.md#axpyd), [copyd](BLISTypedAPI.md#copyd), [invertd](BLISTypedAPI.md#invertd), [scald](BLISTypedAPI.md#scald), [scal2d](BLISTypedAPI.md#scal2d), [setd](BLISTypedAPI.md#setd), [setid](BLISTypedAPI.md#setid), [subd](BLISTypedAPI.md#subd)
|
||||
* **[Level-1m](BLISTypedAPI.md#level-1m-operations)**: Element-wise operations on matrices:
|
||||
@@ -214,7 +214,7 @@ Level-1v operations perform various level-1 BLAS-like operations on vectors (hen
|
||||
void bli_?addv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
);
|
||||
@@ -223,8 +223,7 @@ Perform
|
||||
```
|
||||
y := y + conjx(x)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_.
|
||||
where `x` and `y` are vectors of length _n_.
|
||||
|
||||
---
|
||||
|
||||
@@ -237,7 +236,9 @@ void bli_?amaxv
|
||||
dim_t* index
|
||||
);
|
||||
```
|
||||
Find the element of vector `x` which contains the maximum absolute value. The index of the element found is stored to `index`.
|
||||
Given a vector of length _n_, return the zero-based index `index` of the element of vector `x` that contains the largest absolute value (or, in the complex domain, the largest complex modulus).
|
||||
|
||||
If `NaN` is encountered, it is treated as if it were a valid value that was smaller than any other value in the vector. If more than one element contains the same maximum value, the index of the latter element is returned via `index`.
|
||||
|
||||
**Note:** This function attempts to mimic the algorithm for finding the element with the maximum absolute value in the netlib BLAS routines `i?amax()`.
|
||||
|
||||
@@ -248,7 +249,7 @@ Find the element of vector `x` which contains the maximum absolute value. The in
|
||||
void bli_?axpyv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
@@ -258,8 +259,28 @@ Perform
|
||||
```
|
||||
y := y + alpha * conjx(x)
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_, and `alpha` is a scalar.
|
||||
|
||||
where `y` and `x` are vectors of length _m_.
|
||||
---
|
||||
|
||||
#### axpbyv
|
||||
```
|
||||
void bli_?axpbyv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* beta,
|
||||
ctype* y, inc_t incy,
|
||||
cntx_t* cntx
|
||||
)
|
||||
```
|
||||
Perform
|
||||
```
|
||||
y := beta * y + alpha * conjx(x)
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_, and `alpha` and `beta` are scalars.
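As an illustration of the semantics only, the following is a minimal sketch of `axpbyv` for real double-precision data. The name `ref_daxpbyv` and the use of `ptrdiff_t` in place of `dim_t`/`inc_t` are assumptions made for this example and are not part of the BLIS API; `conjx` is omitted because conjugation has no effect on real data, and the `cntx` argument is dropped.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a BLIS function):
   y := beta * y + alpha * x for real doubles.
   ptrdiff_t stands in for dim_t and inc_t. */
void ref_daxpbyv(ptrdiff_t n,
                 const double *alpha, const double *x, ptrdiff_t incx,
                 const double *beta, double *y, ptrdiff_t incy)
{
    for (ptrdiff_t i = 0; i < n; ++i) {
        double *yi = y + i * incy;          /* i-th logical element of y */
        *yi = (*beta) * (*yi) + (*alpha) * x[i * incx];
    }
}
```

With `incy = 2`, only every other element of the underlying array is read and updated, which is how a strided vector is addressed throughout this API.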
|
||||
|
||||
---
|
||||
|
||||
@@ -268,7 +289,7 @@ where `y` and `x` are vectors of length _m_.
|
||||
void bli_?copyv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
);
|
||||
@@ -277,8 +298,7 @@ Perform
|
||||
```
|
||||
y := conjx(x)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_.
|
||||
where `x` and `y` are vectors of length _n_.
|
||||
|
||||
---
|
||||
|
||||
@@ -288,7 +308,7 @@ void bli_?dotv
|
||||
(
|
||||
conj_t conjx,
|
||||
conj_t conjy,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy,
|
||||
ctype* rho
|
||||
@@ -298,8 +318,7 @@ Perform
|
||||
```
|
||||
rho := conjx(x)^T * conjy(y)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_ and `rho` is a scalar.
|
||||
where `x` and `y` are vectors of length _n_, and `rho` is a scalar.
|
||||
|
||||
---
|
||||
|
||||
@@ -309,7 +328,7 @@ void bli_?dotxv
|
||||
(
|
||||
conj_t conjx,
|
||||
conj_t conjy,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy,
|
||||
@@ -321,8 +340,7 @@ Perform
|
||||
```
|
||||
rho := beta * rho + alpha * conjx(x)^T * conjy(y)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_ and `rho` is a scalar.
|
||||
where `x` and `y` are vectors of length _n_, and `alpha`, `beta`, and `rho` are scalars.
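The role of `beta` as an accumulation factor on `rho` can be sketched as follows for real double-precision data. The name `ref_ddotxv` and the `ptrdiff_t` index type are hypothetical stand-ins chosen for this example, and `conjx`/`conjy` are omitted because they are no-ops in the real domain.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a BLIS function):
   rho := beta * rho + alpha * x^T y for real doubles. */
void ref_ddotxv(ptrdiff_t n,
                const double *alpha,
                const double *x, ptrdiff_t incx,
                const double *y, ptrdiff_t incy,
                const double *beta,
                double *rho)
{
    double dot = 0.0;
    for (ptrdiff_t i = 0; i < n; ++i)
        dot += x[i * incx] * y[i * incy];   /* plain dot product */

    /* Scale the previous value of rho, then add the scaled dot product. */
    *rho = (*beta) * (*rho) + (*alpha) * dot;
}
```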
|
||||
|
||||
---
|
||||
|
||||
@@ -330,11 +348,11 @@ where `y` and `x` are vectors of length _m_ and `rho` is a scalar.
|
||||
```c
|
||||
void bli_?invertv
|
||||
(
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx
|
||||
);
|
||||
```
|
||||
Invert all elements of an _m_-length vector `x`.
|
||||
Invert all elements of an _n_-length vector `x`.
|
||||
|
||||
---
|
||||
|
||||
@@ -343,7 +361,7 @@ Invert all elements of an _m_-length vector `x`.
|
||||
void bli_?scalv
|
||||
(
|
||||
conj_t conjalpha,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx
|
||||
);
|
||||
@@ -352,8 +370,7 @@ Perform
|
||||
```
|
||||
x := conjalpha(alpha) * x
|
||||
```
|
||||
|
||||
where `x` is a vector of length _m_.
|
||||
where `x` is a vector of length _n_, and `alpha` is a scalar.
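To show what `conjalpha` does, here is a minimal sketch for double-precision complex data. The struct `ref_dcomplex`, the function `ref_zscalv`, and the `bool` flag standing in for `conj_t` are all assumptions made for this example rather than BLIS definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical complex type for illustration only. */
typedef struct { double real; double imag; } ref_dcomplex;

/* Hypothetical reference routine (not a BLIS function):
   x := conjalpha(alpha) * x, where conjalpha optionally conjugates alpha. */
void ref_zscalv(bool conjalpha, ptrdiff_t n,
                const ref_dcomplex *alpha,
                ref_dcomplex *x, ptrdiff_t incx)
{
    const double ar = alpha->real;
    const double ai = conjalpha ? -alpha->imag : alpha->imag;

    for (ptrdiff_t i = 0; i < n; ++i) {
        ref_dcomplex *xi = x + i * incx;
        const double xr = xi->real, xim = xi->imag;
        xi->real = ar * xr - ai * xim;      /* real part of alpha * x_i */
        xi->imag = ar * xim + ai * xr;      /* imaginary part */
    }
}
```

Only the scalar is conjugated; the elements of `x` are used as stored.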
|
||||
|
||||
---
|
||||
|
||||
@@ -362,7 +379,7 @@ where `x` is a vector of length _m_.
|
||||
void bli_?scal2v
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
@@ -372,8 +389,7 @@ Perform
|
||||
```
|
||||
y := alpha * conjx(x)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_.
|
||||
where `x` and `y` are vectors of length _n_, and `alpha` is a scalar.
|
||||
|
||||
---
|
||||
|
||||
@@ -382,12 +398,12 @@ where `y` and `x` are vectors of length _m_.
|
||||
void bli_?setv
|
||||
(
|
||||
conj_t conjalpha,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* alpha,
|
||||
ctype* x, inc_t incx
|
||||
);
|
||||
```
|
||||
Set all elements of an _m_-length vector `x` to `conjalpha(alpha)`.
|
||||
Set all elements of an _n_-length vector `x` to scalar `conjalpha(alpha)`.
|
||||
|
||||
---
|
||||
|
||||
@@ -396,7 +412,7 @@ Set all elements of an _m_-length vector `x` to `conjalpha(alpha)`.
|
||||
void bli_?subv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
);
|
||||
@@ -405,8 +421,7 @@ Perform
|
||||
```
|
||||
y := y - conjx(x)
|
||||
```
|
||||
|
||||
where `y` and `x` are vectors of length _m_.
|
||||
where `x` and `y` are vectors of length _n_.
|
||||
|
||||
---
|
||||
|
||||
@@ -414,12 +429,32 @@ where `y` and `x` are vectors of length _m_.
|
||||
```c
|
||||
void bli_?swapv
|
||||
(
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy
|
||||
);
|
||||
```
|
||||
Swap corresponding elements of two _m_-length vectors `x` and `y`.
|
||||
Swap corresponding elements of two _n_-length vectors `x` and `y`.
|
||||
|
||||
---
|
||||
|
||||
#### xpbyv
|
||||
```
|
||||
void bli_?xpbyv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* beta,
|
||||
ctype* y, inc_t incy,
|
||||
cntx_t* cntx
|
||||
)
|
||||
```
|
||||
Perform
|
||||
```
|
||||
y := beta * y + conjx(x)
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_, and `beta` is a scalar.
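A minimal sketch of `xpbyv` in the complex domain, where `conjx` is not a no-op, might look like the following. The struct `ref_dcomplex`, the function `ref_zxpbyv`, and the `bool` flag used in place of `conj_t` are hypothetical names introduced for this example, and the `cntx` argument is dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical complex type for illustration only. */
typedef struct { double real; double imag; } ref_dcomplex;

/* Hypothetical reference routine (not a BLIS function):
   y := beta * y + conjx(x) for double-precision complex data. */
void ref_zxpbyv(bool conjx, ptrdiff_t n,
                const ref_dcomplex *x, ptrdiff_t incx,
                const ref_dcomplex *beta,
                ref_dcomplex *y, ptrdiff_t incy)
{
    for (ptrdiff_t i = 0; i < n; ++i) {
        const double xr = x[i * incx].real;
        const double xi = conjx ? -x[i * incx].imag : x[i * incx].imag;

        ref_dcomplex *yi = y + i * incy;
        const double yr = yi->real, yim = yi->imag;

        /* Complex product beta * y_i, then add the (conjugated) x_i. */
        yi->real = beta->real * yr - beta->imag * yim + xr;
        yi->imag = beta->real * yim + beta->imag * yr + xi;
    }
}
```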
|
||||
|
||||
---
|
||||
|
||||
@@ -1557,11 +1592,11 @@ Print an _m x n_ matrix `a` to standard output. This function call is equivalent
|
||||
```c
|
||||
void bli_?randv
|
||||
(
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx
|
||||
);
|
||||
```
|
||||
Set the elements of a vector `x` of length _m_ to random values on the interval `[-1,1)`.
|
||||
Set the elements of a vector `x` of length _n_ to random values on the interval `[-1,1)`.
|
||||
|
||||
**Note:** For complex datatypes, the real and imaginary components of each element are randomized individually and independently of one another.
|
||||
|
||||
@@ -1588,13 +1623,13 @@ Set the elements of an _m x n_ matrix `A` to random values on the interval `[-1,
|
||||
```c
|
||||
void bli_?sumsqv
|
||||
(
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
rtype* scale,
|
||||
rtype* sumsq
|
||||
);
|
||||
```
|
||||
Compute the sum of the squares of the elements in a vector `x` of length _m_. The result is computed in scaled form, and in such a way that it may be used repeatedly to accumulate the sum of the squares of several vectors.
|
||||
Compute the sum of the squares of the elements in a vector `x` of length _n_. The result is computed in scaled form, and in such a way that it may be used repeatedly to accumulate the sum of the squares of several vectors.
|
||||
|
||||
The function computes scale\_new and sumsq\_new such that
|
||||
```
|
||||
|
||||
@@ -605,6 +605,8 @@ Note that these implementations are coded in C99 and lack several kinds of optim
|
||||
|
||||
### Level-1f kernels
|
||||
|
||||
---
|
||||
|
||||
#### axpy2v kernel
|
||||
```
|
||||
void bli_?axpy2v_<suffix>
|
||||
@@ -626,7 +628,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x`, `y`, and `z` are vectors of length _n_ stored with strides `incx`, `incy`, and `incz`, respectively. This kernel is typically implemented as the fusion of two `axpyv` operations on different input vectors `x` and `y` and with different scalars `alphax` and `alphay` to update the same output vector `z`.
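The fusion can be sketched as a single loop that reads and writes each element of `z` once instead of twice. This example assumes the operation is `z := z + alphax * x + alphay * y` in the real domain (so the conjugation parameters are dropped); the name `ref_daxpy2v` and the `ptrdiff_t` index type are hypothetical and not part of BLIS.

```c
#include <stddef.h>

/* Hypothetical fused reference loop (not a BLIS kernel):
   z := z + alphax * x + alphay * y for real doubles. */
void ref_daxpy2v(ptrdiff_t n,
                 const double *alphax, const double *alphay,
                 const double *x, ptrdiff_t incx,
                 const double *y, ptrdiff_t incy,
                 double *z, ptrdiff_t incz)
{
    for (ptrdiff_t i = 0; i < n; ++i) {
        /* Both updates are applied while z_i is loaded once. */
        z[i * incz] += (*alphax) * x[i * incx]
                     + (*alphay) * y[i * incy];
    }
}
```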
|
||||
|
||||
#### dotaxpyv
|
||||
---
|
||||
|
||||
#### dotaxpyv kernel
|
||||
```
|
||||
void bli_?dotaxpyv_<suffix>
|
||||
(
|
||||
@@ -649,7 +653,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x`, `y`, and `z` are vectors of length _n_ stored with strides `incx`, `incy`, and `incz`, respectively, and `rho` is a scalar. This kernel is typically implemented as a `dotv` operation fused with an `axpyv` operation.
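A rough sketch of the fusion, assuming the operation is `rho := x^T y` followed by `z := z + alpha * x` in the real domain, is shown below. The name `ref_ddotaxpyv` and the `ptrdiff_t` index type are assumptions for this example; the point is that each element of `x` is loaded once and used by both operations.

```c
#include <stddef.h>

/* Hypothetical fused reference loop (not a BLIS kernel):
   rho := x^T y  and  z := z + alpha * x, in one pass over x. */
void ref_ddotaxpyv(ptrdiff_t n,
                   const double *alpha,
                   const double *x, ptrdiff_t incx,
                   const double *y, ptrdiff_t incy,
                   double *rho,
                   double *z, ptrdiff_t incz)
{
    double dot = 0.0;
    for (ptrdiff_t i = 0; i < n; ++i) {
        const double xi = x[i * incx];      /* loaded once, used twice */
        dot += xi * y[i * incy];            /* dotv part */
        z[i * incz] += (*alpha) * xi;       /* axpyv part */
    }
    *rho = dot;
}
```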
|
||||
|
||||
#### axpyf
|
||||
---
|
||||
|
||||
#### axpyf kernel
|
||||
```
|
||||
void bli_?axpyf_<suffix>
|
||||
(
|
||||
@@ -670,7 +676,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `a` is an _m_ x _b_ matrix, `x` is a vector of length _b_, and `y` is a vector of length _m_. Vectors `x` and `y` are stored with strides `incx` and `incy`, respectively. Matrix `a` is stored with row stride `inca` and column stride `lda`, though `inca` is most often (in practice) unit. This kernel is typically implemented as a fused series of _b_ `axpyv` operations updating the same vector `y` (with the elements of `x` serving as the scalars and the columns of `a` serving as the vectors to be scaled).
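The "series of _b_ `axpyv` operations" can be written out as a plain double loop. This sketch assumes the operation is `y := y + alpha * a * x` in the real domain; the name `ref_daxpyf` and the `ptrdiff_t` index type are hypothetical. It is deliberately unfused: an optimized kernel would typically keep a block of `y` in registers across all _b_ columns rather than revisiting `y` once per column.

```c
#include <stddef.h>

/* Hypothetical unfused reference (not a BLIS kernel):
   y := y + alpha * a * x, where a is m x b with row stride inca
   and column stride lda. */
void ref_daxpyf(ptrdiff_t m, ptrdiff_t b,
                const double *alpha,
                const double *a, ptrdiff_t inca, ptrdiff_t lda,
                const double *x, ptrdiff_t incx,
                double *y, ptrdiff_t incy)
{
    for (ptrdiff_t j = 0; j < b; ++j) {
        const double s = (*alpha) * x[j * incx];  /* scalar for column j */
        const double *aj = a + j * lda;           /* start of column j */
        for (ptrdiff_t i = 0; i < m; ++i)
            y[i * incy] += s * aj[i * inca];      /* one axpyv */
    }
}
```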
|
||||
|
||||
#### dotxf
|
||||
---
|
||||
|
||||
#### dotxf kernel
|
||||
```
|
||||
void bli_?dotxf_<suffix>
|
||||
(
|
||||
@@ -694,7 +702,9 @@ where `a` is an _m_ x _b_ matrix, where `w` is a vector of length _m_, `y` is a
|
||||
Vectors `x` and `y` are stored with strides `incx` and `incy`, respectively. Matrix `a` is stored with row stride `inca` and column stride `lda`, though `inca` is most often (in practice) unit.
|
||||
This kernel is typically implemented as a series of _b_ `dotxv` operations with the same right-hand operand vector `x` (contracted with the rows of `a^T` and accumulating to the corresponding elements of vector `y`).
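Written as an unfused loop, and assuming the operation is `y := beta * y + alpha * a^T * x` in the real domain, the series of `dotxv` operations might look like the sketch below. The name `ref_ddotxf` and the `ptrdiff_t` index type are hypothetical and not part of BLIS.

```c
#include <stddef.h>

/* Hypothetical unfused reference (not a BLIS kernel):
   y := beta * y + alpha * a^T * x, where a is m x b with row stride inca
   and column stride lda, x has length m, and y has length b. */
void ref_ddotxf(ptrdiff_t m, ptrdiff_t b,
                const double *alpha,
                const double *a, ptrdiff_t inca, ptrdiff_t lda,
                const double *x, ptrdiff_t incx,
                const double *beta,
                double *y, ptrdiff_t incy)
{
    for (ptrdiff_t j = 0; j < b; ++j) {
        const double *aj = a + j * lda;     /* column j of a = row j of a^T */
        double dot = 0.0;
        for (ptrdiff_t i = 0; i < m; ++i)
            dot += aj[i * inca] * x[i * incx];

        /* One dotxv: accumulate into the j-th element of y. */
        y[j * incy] = (*beta) * y[j * incy] + (*alpha) * dot;
    }
}
```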
|
||||
|
||||
#### dotxaxpyf
|
||||
---
|
||||
|
||||
#### dotxaxpyf kernel
|
||||
```
|
||||
void bli_?dotxaxpyf_<suffix>
|
||||
(
|
||||
@@ -723,11 +733,15 @@ where `a` is an _m_ x _b_ matrix, `w` and `z` are vectors of length _m_, `x` and
|
||||
Vectors `w`, `z`, `x` and `y` are stored with strides `incw`, `incz`, `incx`, and `incy`, respectively. Matrix `a` is stored with row stride `inca` and column stride `lda`, though `inca` is most often (in practice) unit.
|
||||
This kernel is typically implemented as a series of _b_ `dotxv` operations with the same right-hand operand vector `w` fused with a series of _b_ `axpyv` operations updating the same vector `z`.
|
||||
|
||||
---
|
||||
|
||||
|
||||
|
||||
### Level-1v kernels
|
||||
|
||||
#### addv
|
||||
---
|
||||
|
||||
#### addv kernel
|
||||
```
|
||||
void bli_?addv_<suffix>
|
||||
(
|
||||
@@ -744,21 +758,25 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively.
|
||||
|
||||
#### amaxv
|
||||
---
|
||||
|
||||
#### amaxv kernel
|
||||
```
|
||||
void bli_?amaxv_<suffix>
|
||||
(
|
||||
dim_t n,
|
||||
ctype* restrict x, inc_t incx,
|
||||
dim_t* restrict i,
|
||||
dim_t* restrict index,
|
||||
cntx_t* restrict cntx
|
||||
)
|
||||
```
|
||||
Given a vector of length _n_, this kernel returns the zero-based index `i` of the element of vector `x` that contains the largest absolute value (or, in the complex domain, complex modulus).
|
||||
If `NaN` is encountered, it is treated as if it were a valid value that was smaller than any other value in the vector.
|
||||
If more than one element contains the same maximum value, the index of the latter element is returned via `i`.
|
||||
Given a vector of length _n_, this kernel returns the zero-based index `index` of the element of vector `x` that contains the largest absolute value (or, in the complex domain, the largest complex modulus).
|
||||
|
||||
#### axpyv
|
||||
If `NaN` is encountered, it is treated as if it were a valid value that was smaller than any other value in the vector. If more than one element contains the same maximum value, the index of the latter element is returned via `index`.
|
||||
|
||||
---
|
||||
|
||||
#### axpyv kernel
|
||||
```
|
||||
void bli_?axpyv_<suffix>
|
||||
(
|
||||
@@ -776,7 +794,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `alpha` is a scalar.
|
||||
|
||||
#### axpbyv
|
||||
---
|
||||
|
||||
#### axpbyv kernel
|
||||
```
|
||||
void bli_?axpbyv_<suffix>
|
||||
(
|
||||
@@ -795,7 +815,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `alpha` and `beta` are scalars.
|
||||
|
||||
#### copyv
|
||||
---
|
||||
|
||||
#### copyv kernel
|
||||
```
|
||||
void bli_?copyv_<suffix>
|
||||
(
|
||||
@@ -812,7 +834,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively.
|
||||
|
||||
#### dotv
|
||||
---
|
||||
|
||||
#### dotv kernel
|
||||
```
|
||||
void bli_?dotv_<suffix>
|
||||
(
|
||||
@@ -831,7 +855,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `rho` is a scalar.
|
||||
|
||||
#### dotxv
|
||||
---
|
||||
|
||||
#### dotxv kernel
|
||||
```
|
||||
void bli_?dotxv_<suffix>
|
||||
(
|
||||
@@ -852,7 +878,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `alpha`, `beta`, and `rho` are scalars.
|
||||
|
||||
#### invertv
|
||||
---
|
||||
|
||||
#### invertv kernel
|
||||
```
|
||||
void bli_?invertv_<suffix>
|
||||
(
|
||||
@@ -861,13 +889,11 @@ void bli_?invertv_<suffix>
|
||||
cntx_t* restrict cntx
|
||||
)
|
||||
```
|
||||
This kernel performs the following operation:
|
||||
```
|
||||
x := inv(x)
|
||||
```
|
||||
where inv() denotes element-wise inversion.
|
||||
This kernel inverts all elements of an _n_-length vector `x`.
|
||||
|
||||
#### scalv
|
||||
---
|
||||
|
||||
#### scalv kernel
|
||||
```
|
||||
void bli_?scalv_<suffix>
|
||||
(
|
||||
@@ -884,7 +910,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` is a vector of length _n_ stored with stride `incx` and `alpha` is a scalar.
|
||||
|
||||
#### scal2v
|
||||
---
|
||||
|
||||
#### scal2v kernel
|
||||
```
|
||||
void bli_?scal2v_<suffix>
|
||||
(
|
||||
@@ -902,7 +930,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `alpha` is a scalar.
|
||||
|
||||
#### setv
|
||||
---
|
||||
|
||||
#### setv kernel
|
||||
```
|
||||
void bli_?setv_<suffix>
|
||||
(
|
||||
@@ -919,7 +949,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` is a vector of length _n_ stored with stride `incx` and `alpha` is a scalar. Note that here, the `:=` operator represents a broadcast of `conjalpha(alpha)` to every element in `x`.
|
||||
|
||||
#### subv
|
||||
---
|
||||
|
||||
#### subv kernel
|
||||
```
|
||||
void bli_?subv_<suffix>
|
||||
(
|
||||
@@ -936,7 +968,9 @@ This kernel performs the following operation:
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_.
|
||||
|
||||
#### swapv
|
||||
---
|
||||
|
||||
#### swapv kernel
|
||||
```
|
||||
void bli_?swapv_<suffix>
|
||||
(
|
||||
@@ -946,15 +980,11 @@ void bli_?swapv_<suffix>
|
||||
cntx_t* restrict cntx
|
||||
)
|
||||
```
|
||||
This kernel performs the following operation:
|
||||
```
|
||||
t := x
|
||||
x := y
|
||||
y := t
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_ stored with strides `incx` and `incy`, respectively, and `t` represents a temporary vector of length _n_ for illustrative purposes only. (No additional memory is allocated as part of this operation.)
|
||||
This kernel swaps corresponding elements of two _n_-length vectors `x` and `y` stored with strides `incx` and `incy`, respectively.
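One way to picture the element-wise swap is the sketch below, which needs only a scalar temporary rather than a temporary vector. The name `ref_dswapv` and the `ptrdiff_t` index type are assumptions made for this example; an actual kernel would also take a `cntx` argument and would usually be vectorized.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a BLIS kernel):
   swap corresponding elements of strided vectors x and y. */
void ref_dswapv(ptrdiff_t n,
                double *x, ptrdiff_t incx,
                double *y, ptrdiff_t incy)
{
    for (ptrdiff_t i = 0; i < n; ++i) {
        const double t = x[i * incx];       /* scalar temporary only */
        x[i * incx] = y[i * incy];
        y[i * incy] = t;
    }
}
```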
|
||||
|
||||
#### xpbyv
|
||||
---
|
||||
|
||||
#### xpbyv kernel
|
||||
```
|
||||
void bli_?xpbyv_<suffix>
|
||||
(
|
||||
|
||||