mirror of
https://github.com/amd/blis.git
synced 2026-04-19 23:28:52 +00:00
Merge commit 'b683d01b' into amd-main
* commit 'b683d01b':
  Use extra #undef when including ba/ex API headers.
  Minor preprocessor/header cleanup.
  Fixed typo in cpp guard in bli_util_ft.h.
  Defined eqsc, eqv, eqm to test object equality.
  Defined setijv, getijv to set/get vector elements.
  Minor API breakage in bli_pack API.
  Add err_t* "return" parameter to malloc functions.
  Always stay initialized after BLAS compat calls.
  Renamed membrk files/vars/functions to pba.
  Switch allocator mutexes to static initialization.

AMD-Internal: [CPUPL-2698]
Change-Id: Ied2ca8619f144d4b8a7123ac45a1be0dda3875df
@@ -53,7 +53,7 @@ This index provides a quick way to jump directly to the description for each ope
|
||||
* **[Level-3](BLISObjectAPI.md#level-3-operations)**: Operations with matrices that are multiplication-like:
|
||||
* [gemm](BLISObjectAPI.md#gemm), [hemm](BLISObjectAPI.md#hemm), [herk](BLISObjectAPI.md#herk), [her2k](BLISObjectAPI.md#her2k), [symm](BLISObjectAPI.md#symm), [syrk](BLISObjectAPI.md#syrk), [syr2k](BLISObjectAPI.md#syr2k), [trmm](BLISObjectAPI.md#trmm), [trmm3](BLISObjectAPI.md#trmm3), [trsm](BLISObjectAPI.md#trsm)
|
||||
* **[Utility](BLISObjectAPI.md#Utility-operations)**: Miscellaneous operations on matrices and vectors:
|
||||
* [asumv](BLISObjectAPI.md#asumv), [norm1v](BLISObjectAPI.md#norm1v), [normfv](BLISObjectAPI.md#normfv), [normiv](BLISObjectAPI.md#normiv), [norm1m](BLISObjectAPI.md#norm1m), [normfm](BLISObjectAPI.md#normfm), [normim](BLISObjectAPI.md#normim), [mkherm](BLISObjectAPI.md#mkherm), [mksymm](BLISObjectAPI.md#mksymm), [mktrim](BLISObjectAPI.md#mktrim), [fprintv](BLISObjectAPI.md#fprintv), [fprintm](BLISObjectAPI.md#fprintm),[printv](BLISObjectAPI.md#printv), [printm](BLISObjectAPI.md#printm), [randv](BLISObjectAPI.md#randv), [randm](BLISObjectAPI.md#randm), [sumsqv](BLISObjectAPI.md#sumsqv), [getijm](BLISObjectAPI.md#getijm), [setijm](BLISObjectAPI.md#setijm)
|
||||
* [asumv](BLISObjectAPI.md#asumv), [norm1v](BLISObjectAPI.md#norm1v), [normfv](BLISObjectAPI.md#normfv), [normiv](BLISObjectAPI.md#normiv), [norm1m](BLISObjectAPI.md#norm1m), [normfm](BLISObjectAPI.md#normfm), [normim](BLISObjectAPI.md#normim), [mkherm](BLISObjectAPI.md#mkherm), [mksymm](BLISObjectAPI.md#mksymm), [mktrim](BLISObjectAPI.md#mktrim), [fprintv](BLISObjectAPI.md#fprintv), [fprintm](BLISObjectAPI.md#fprintm),[printv](BLISObjectAPI.md#printv), [printm](BLISObjectAPI.md#printm), [randv](BLISObjectAPI.md#randv), [randm](BLISObjectAPI.md#randm), [sumsqv](BLISObjectAPI.md#sumsqv), [getsc](BLISObjectAPI.md#getsc), [getijv](BLISObjectAPI.md#getijv), [getijm](BLISObjectAPI.md#getijm), [setsc](BLISObjectAPI.md#setsc), [setijv](BLISObjectAPI.md#setijv), [setijm](BLISObjectAPI.md#setijm), [eqsc](BLISObjectAPI.md#eqsc), [eqv](BLISObjectAPI.md#eqv), [eqm](BLISObjectAPI.md#eqm)
|
||||
|
||||
|
||||
|
||||
@@ -790,6 +790,8 @@ Perform
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_.
|
||||
|
||||
Observed object properties: `conj?(x)`.
|
||||
|
||||
---
|
||||
|
||||
#### dotv
|
||||
@@ -807,6 +809,8 @@ Perform
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_, and `rho` is a scalar.
|
||||
|
||||
Observed object properties: `conj?(x)`, `conj?(y)`.
|
||||
|
||||
---
|
||||
|
||||
#### dotxv
|
||||
@@ -826,6 +830,8 @@ Perform
|
||||
```
|
||||
where `x` and `y` are vectors of length _n_, and `alpha`, `beta`, and `rho` are scalars.
|
||||
|
||||
Observed object properties: `conj?(alpha)`, `conj?(beta)`, `conj?(x)`, `conj?(y)`.
|
||||
|
||||
---
|
||||
|
||||
#### invertv
|
||||
@@ -2125,6 +2131,34 @@ where, on entry, `scale` and `sumsq` contain `scale_old` and `sumsq_old`, respec
|
||||
|
||||
---
|
||||
|
||||
#### getsc
|
||||
```c
|
||||
void bli_getsc
|
||||
(
|
||||
obj_t* chi,
|
||||
double* zeta_r,
|
||||
double* zeta_i
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values from the scalar object `chi` to `zeta_r` and `zeta_i`. If `chi` is stored as a real type, then `zeta_i` is set to zero. (If `chi` is stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
|
||||
---
|
||||
|
||||
#### getijv
|
||||
```c
|
||||
err_t bli_getijv
|
||||
(
|
||||
dim_t i,
|
||||
obj_t* x,
|
||||
double* ar,
|
||||
double* ai
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values at the `i`th element of vector object `x` to `ar` and `ai`. If elements of `x` are stored as real types, then only `ar` is overwritten and `ai` is left unchanged. (If `x` contains elements stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
If the element offset `i` is beyond the vector dimension of `x` or less than zero, the function returns `BLIS_FAILURE` without taking any action. Similarly, if `x` is a global scalar constant such as `BLIS_ONE`, the function returns `BLIS_FAILURE`.
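
The bounds-checking and "leave `ai` unchanged" behavior described above can be sketched without BLIS at all. The block below is only an illustrative model: `vec_view` and `vec_get` are hypothetical names, not the real `obj_t` or `bli_getijv`, and a `bool` return stands in for `BLIS_SUCCESS`/`BLIS_FAILURE`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector object; NOT the real BLIS obj_t. */
typedef struct {
    const double *data;       /* reals, or interleaved (re, im) pairs */
    ptrdiff_t     n;          /* vector length */
    ptrdiff_t     inc;        /* stride between elements, in elements */
    bool          is_complex; /* true if data holds (re, im) pairs */
} vec_view;

/* Returns true on success; returns false and takes no action when
   the offset i is negative or beyond the vector length. */
static bool vec_get(ptrdiff_t i, const vec_view *x, double *ar, double *ai)
{
    if (i < 0 || i >= x->n)
        return false;

    if (x->is_complex) {
        const double *p = x->data + 2 * i * x->inc;
        *ar = p[0];
        *ai = p[1];
    } else {
        *ar = x->data[i * x->inc]; /* real storage: ai is left unchanged */
    }
    return true;
}
```

Because a failed call writes nothing, a caller in this model can pre-load `ar` and `ai` with sentinel values and still trust them after an out-of-range request.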
|
||||
|
||||
---
|
||||
|
||||
#### getijm
|
||||
```c
|
||||
err_t bli_getijm
|
||||
@@ -2136,8 +2170,38 @@ err_t bli_getijm
|
||||
double* ai
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values at the (`i`,`j`) element of object `b` to `ar` and `ai`. f elements of `b` are stored as real types, then only `ar` is overwritten and `ai` is left unchanged. (If `b` contains elements stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
If either the row offset `i` is beyond the _m_ dimension of `b`, or column offset `j` is beyond the _n_ dimension of `b`, the function does not perform any copy and returns `BLIS_FAILURE`. Similarly, if `b` is a global scalar constant such as `BLIS_ONE`, `BLIS_FAILURE` is returned.
|
||||
Copy the real and imaginary values at the (`i`,`j`) element of object `b` to `ar` and `ai`. If elements of `b` are stored as real types, then only `ar` is overwritten and `ai` is left unchanged. (If `b` contains elements stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
If either the row offset `i` is beyond the _m_ dimension of `b` or less than zero, or column offset `j` is beyond the _n_ dimension of `b` or less than zero, the function returns `BLIS_FAILURE` without taking any action. Similarly, if `b` is a global scalar constant such as `BLIS_ONE`, the function returns `BLIS_FAILURE`.
|
||||
|
||||
---
|
||||
|
||||
#### setsc
|
||||
```c
|
||||
void bli_setsc
|
||||
(
|
||||
double zeta_r,
double zeta_i,
|
||||
obj_t* chi
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `zeta_r` and `zeta_i` to the scalar object `chi`. If `chi` is stored as a real type, then `zeta_i` is ignored. (If `chi` is stored in single precision, the contents are typecast/demoted during the copy.)
|
||||
|
||||
---
|
||||
|
||||
#### setijv
|
||||
```c
|
||||
err_t bli_setijv
|
||||
(
|
||||
double ar,
|
||||
double ai,
|
||||
dim_t i,
|
||||
obj_t* x
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `ar` and `ai` to the `i`th element of vector object `x`. If elements of `x` are stored as real types, then only `ar` is copied and `ai` is ignored. (If `x` contains elements stored in single precision, the corresponding elements are typecast/demoted during the copy.)
|
||||
If the element offset `i` is beyond the vector dimension of `x` or less than zero, the function returns `BLIS_FAILURE` without taking any action. Similarly, if `x` is a global scalar constant such as `BLIS_ONE`, the function returns `BLIS_FAILURE`.
|
||||
|
||||
---
|
||||
|
||||
#### setijm
|
||||
```c
|
||||
@@ -2151,7 +2215,59 @@ err_t bli_setijm
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `ar` and `ai` to the (`i`,`j`) element of object `b`. If elements of `b` are stored as real types, then only `ar` is copied and `ai` is ignored. (If `b` contains elements stored in single precision, the corresponding elements are typecast/demoted during the copy.)
|
||||
If either the row offset `i` is beyond the _m_ dimension of `b`, or column offset `j` is beyond the _n_ dimension of `b`, the function does not perform any copy and returns `BLIS_FAILURE`. Similarly, if `b` is a global scalar constant such as `BLIS_ONE`, `BLIS_FAILURE` is returned.
|
||||
If either the row offset `i` is beyond the _m_ dimension of `b` or less than zero, or column offset `j` is beyond the _n_ dimension of `b` or less than zero, the function returns `BLIS_FAILURE` without taking any action. Similarly, if `b` is a global scalar constant such as `BLIS_ONE`, the function returns `BLIS_FAILURE`.
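
As a rough model of the set-side semantics (bounds check on both offsets, demotion to single precision, imaginary part ignored for real storage), consider the sketch below. `smat_view` and `smat_set` are hypothetical names chosen for illustration; they are not part of BLIS, and the `bool` return merely stands in for `BLIS_SUCCESS`/`BLIS_FAILURE`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical real, single-precision matrix view; NOT the BLIS obj_t. */
typedef struct {
    float    *data;
    ptrdiff_t m, n;   /* matrix dimensions */
    ptrdiff_t rs, cs; /* row and column strides, in elements */
} smat_view;

/* Stores ar at element (i,j), demoting double to float. ai is ignored
   because storage is real. Returns false, without writing anything,
   when either offset is negative or beyond its dimension. */
static bool smat_set(double ar, double ai,
                     ptrdiff_t i, ptrdiff_t j, smat_view *b)
{
    (void)ai;
    if (i < 0 || i >= b->m || j < 0 || j >= b->n)
        return false;

    b->data[i * b->rs + j * b->cs] = (float)ar;
    return true;
}
```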
|
||||
|
||||
---
|
||||
|
||||
#### eqsc
|
||||
```c
|
||||
void bli_eqsc
|
||||
(
|
||||
obj_t* chi,
obj_t* psi,
|
||||
bool* is_eq
|
||||
);
|
||||
```
|
||||
Perform an element-wise comparison between scalars `chi` and `psi` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
If exactly one of `conj(chi)` or `conj(psi)` (but not both) indicates a conjugation, then one of the scalars will be implicitly conjugated for purposes of the comparison.
|
||||
|
||||
Observed object properties: `conj?(chi)`, `conj?(psi)`.
|
||||
|
||||
---
|
||||
|
||||
#### eqv
|
||||
```c
|
||||
void bli_eqv
|
||||
(
|
||||
obj_t* x,
obj_t* y,
|
||||
bool* is_eq
|
||||
);
|
||||
```
|
||||
Perform an element-wise comparison between vectors `x` and `y` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
If exactly one of `conj(x)` or `conj(y)` (but not both) indicates a conjugation, then one of the vectors will be implicitly conjugated for purposes of the comparison.
|
||||
|
||||
Observed object properties: `conj?(x)`, `conj?(y)`.
|
||||
|
||||
---
|
||||
|
||||
#### eqm
|
||||
```c
|
||||
void bli_eqm
|
||||
(
|
||||
obj_t* a,
obj_t* b,
|
||||
bool* is_eq
|
||||
);
|
||||
```
|
||||
Perform an element-wise comparison between matrices `A` and `B` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
Here, `A` is stored as a dense matrix, or lower- or upper-triangular/trapezoidal matrix with arbitrary diagonal offset and unit or non-unit diagonal.
|
||||
If `diag(A)` indicates a unit diagonal, the diagonals of both matrices will be ignored for purposes of the comparison.
|
||||
If `uplo(A)` indicates lower or upper storage, only that part of both matrices `A` and `B` will be referenced.
|
||||
If exactly one of `trans(A)` or `trans(B)` (but not both) indicates a transposition, then one of the matrices will be transposed for purposes of the comparison.
|
||||
Similarly, if exactly one of `trans(A)` or `trans(B)` (but not both) indicates a conjugation, then one of the matrices will be implicitly conjugated for purposes of the comparison.
|
||||
|
||||
Observed object properties: `diagoff(A)`, `diag(A)`, `uplo(A)`, `trans?(A)`, `trans?(B)`.
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -48,7 +48,7 @@ This index provides a quick way to jump directly to the description for each ope
|
||||
* **[Level-3](BLISTypedAPI.md#level-3-operations)**: Operations with matrices that are multiplication-like:
|
||||
* [gemm](BLISTypedAPI.md#gemm), [hemm](BLISTypedAPI.md#hemm), [herk](BLISTypedAPI.md#herk), [her2k](BLISTypedAPI.md#her2k), [symm](BLISTypedAPI.md#symm), [syrk](BLISTypedAPI.md#syrk), [syr2k](BLISTypedAPI.md#syr2k), [trmm](BLISTypedAPI.md#trmm), [trmm3](BLISTypedAPI.md#trmm3), [trsm](BLISTypedAPI.md#trsm)
|
||||
* **[Utility](BLISTypedAPI.md#Utility-operations)**: Miscellaneous operations on matrices and vectors:
|
||||
* [asumv](BLISTypedAPI.md#asumv), [norm1v](BLISTypedAPI.md#norm1v), [normfv](BLISTypedAPI.md#normfv), [normiv](BLISTypedAPI.md#normiv), [norm1m](BLISTypedAPI.md#norm1m), [normfm](BLISTypedAPI.md#normfm), [normim](BLISTypedAPI.md#normim), [mkherm](BLISTypedAPI.md#mkherm), [mksymm](BLISTypedAPI.md#mksymm), [mktrim](BLISTypedAPI.md#mktrim), [fprintv](BLISTypedAPI.md#fprintv), [fprintm](BLISTypedAPI.md#fprintm),[printv](BLISTypedAPI.md#printv), [printm](BLISTypedAPI.md#printm), [randv](BLISTypedAPI.md#randv), [randm](BLISTypedAPI.md#randm), [sumsqv](BLISTypedAPI.md#sumsqv)
|
||||
* [asumv](BLISTypedAPI.md#asumv), [norm1v](BLISTypedAPI.md#norm1v), [normfv](BLISTypedAPI.md#normfv), [normiv](BLISTypedAPI.md#normiv), [norm1m](BLISTypedAPI.md#norm1m), [normfm](BLISTypedAPI.md#normfm), [normim](BLISTypedAPI.md#normim), [mkherm](BLISTypedAPI.md#mkherm), [mksymm](BLISTypedAPI.md#mksymm), [mktrim](BLISTypedAPI.md#mktrim), [fprintv](BLISTypedAPI.md#fprintv), [fprintm](BLISTypedAPI.md#fprintm),[printv](BLISTypedAPI.md#printv), [printm](BLISTypedAPI.md#printm), [randv](BLISTypedAPI.md#randv), [randm](BLISTypedAPI.md#randm), [sumsqv](BLISTypedAPI.md#sumsqv), [getsc](BLISTypedAPI.md#getsc), [getijv](BLISTypedAPI.md#getijv), [getijm](BLISTypedAPI.md#getijm), [setsc](BLISTypedAPI.md#setsc), [setijv](BLISTypedAPI.md#setijv), [setijm](BLISTypedAPI.md#setijm), [eqsc](BLISTypedAPI.md#eqsc), [eqv](BLISTypedAPI.md#eqv), [eqm](BLISTypedAPI.md#eqm)
|
||||
|
||||
|
||||
|
||||
@@ -1695,6 +1695,149 @@ where, on entry, `scale` and `sumsq` contain `scale_old` and `sumsq_old`, respec
|
||||
|
||||
---
|
||||
|
||||
#### getsc
|
||||
```c
|
||||
void bli_?getsc
|
||||
(
|
||||
ctype* chi,
|
||||
double* zeta_r,
|
||||
double* zeta_i
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values from the scalar `chi` to `zeta_r` and `zeta_i`. For real domain invocations, `zeta_i` is set to zero. (If `chi` is stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
|
||||
---
|
||||
|
||||
#### getijv
|
||||
```c
|
||||
err_t bli_?getijv
|
||||
(
|
||||
dim_t i,
|
||||
ctype* x, inc_t incx,
|
||||
double* ar,
|
||||
double* ai
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values at the `i`th element of vector `x` to `ar` and `ai`. For real domain invocations, only `ar` is overwritten and `ai` is left unchanged. (If `x` contains elements stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
Note that the object-based analogue of [getijv](BLISObjectAPI.md#getijv) does bounds checking of the vector element offset `i` against the vector length while the typed functions specified above do not (since the vector length is not given).
|
||||
|
||||
---
|
||||
|
||||
#### getijm
|
||||
```c
|
||||
err_t bli_?getijm
|
||||
(
|
||||
dim_t i,
|
||||
dim_t j,
|
||||
ctype* b, inc_t rs_b, inc_t cs_b,
|
||||
double* ar,
|
||||
double* ai
|
||||
)
|
||||
```
|
||||
Copy the real and imaginary values at the (`i`,`j`) element of matrix `b` to `ar` and `ai`. For real domain invocations, only `ar` is overwritten and `ai` is left unchanged. (If `b` contains elements stored in single precision, the corresponding elements are typecast/promoted during the copy.)
|
||||
Note that the object-based analogue of [getijm](BLISObjectAPI.md#getijm) does bounds checking of the matrix element offsets (`i`,`j`) against the matrix dimensions while the typed functions specified above do not (since the matrix dimensions are not given).
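
Since the typed interface receives only a pointer and two strides, the element lookup reduces to address arithmetic. The following is a minimal sketch of that idea for the real double-precision case; `get_ij_d` is a hypothetical helper written for illustration, not the BLIS implementation of `bli_dgetijm`.

```c
#include <stddef.h>

/* Hypothetical real-domain analogue of a typed "getijm": reads element
   (i,j) of a matrix stored with row stride rs_b and column stride cs_b. */
static void get_ij_d(ptrdiff_t i, ptrdiff_t j,
                     const double *b, ptrdiff_t rs_b, ptrdiff_t cs_b,
                     double *ar, double *ai)
{
    (void)ai;                     /* real domain: ai is left unchanged */
    *ar = b[i * rs_b + j * cs_b]; /* no bounds check: m and n are unknown */
}
```

With the same six-element buffer, `rs_b = 1, cs_b = 2` treats the data as a column-stored 2 x 3 matrix, while `rs_b = 3, cs_b = 1` treats it as row-stored. The caller is responsible for keeping `i` and `j` in range.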
|
||||
|
||||
---
|
||||
|
||||
#### setsc
|
||||
```c
|
||||
void bli_?setsc
|
||||
(
|
||||
double zeta_r,
double zeta_i,
|
||||
ctype* chi
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `zeta_r` and `zeta_i` to the scalar `chi`. For real domain invocations, `zeta_i` is ignored. (If `chi` is stored in single precision, the contents are typecast/demoted during the copy.)
|
||||
|
||||
---
|
||||
|
||||
#### setijv
|
||||
```c
|
||||
err_t bli_?setijv
|
||||
(
|
||||
double ar,
|
||||
double ai,
|
||||
dim_t i,
|
||||
ctype* x, inc_t incx
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `ar` and `ai` to the `i`th element of vector `x`. For real domain invocations, only `ar` is copied and `ai` is ignored. (If `x` contains elements stored in single precision, the corresponding elements are typecast/demoted during the copy.)
|
||||
Note that the object-based analogue of [setijv](BLISObjectAPI.md#setijv) does bounds checking of the vector element offset `i` against the vector length while the typed functions specified above do not (since the vector length is not given).
|
||||
|
||||
---
|
||||
|
||||
#### setijm
|
||||
```c
|
||||
err_t bli_?setijm
|
||||
(
|
||||
double ar,
|
||||
double ai,
|
||||
dim_t i,
|
||||
dim_t j,
|
||||
ctype* b, inc_t rs_b, inc_t cs_b
|
||||
);
|
||||
```
|
||||
Copy real and imaginary values `ar` and `ai` to the (`i`,`j`) element of matrix `b`. For real domain invocations, only `ar` is copied and `ai` is ignored. (If `b` contains elements stored in single precision, the corresponding elements are typecast/demoted during the copy.)
|
||||
Note that the object-based analogue of [setijm](BLISObjectAPI.md#setijm) does bounds checking of the matrix element offsets (`i`,`j`) against the matrix dimensions while the typed functions specified above do not (since the matrix dimensions are not given).
|
||||
|
||||
---
|
||||
|
||||
#### eqsc
|
||||
```c
|
||||
void bli_?eqsc
|
||||
(
|
||||
conj_t conjchi,
|
||||
ctype* chi,
|
||||
ctype* psi,
|
||||
bool* is_eq
|
||||
);
|
||||
```
|
||||
Perform an element-wise comparison between scalars `chi` and `psi` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
If `conjchi` indicates a conjugation, `chi` will be implicitly conjugated for purposes of the comparison.
|
||||
|
||||
---
|
||||
|
||||
#### eqv
|
||||
```c
|
||||
void bli_?eqv
|
||||
(
|
||||
conj_t conjx,
|
||||
dim_t n,
|
||||
ctype* x, inc_t incx,
|
||||
ctype* y, inc_t incy,
|
||||
bool* is_eq
|
||||
);
|
||||
```
|
||||
Perform an element-wise comparison between length _n_ vectors `x` and `y` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
If `conjx` indicates a conjugation, `x` will be implicitly conjugated for purposes of the comparison.
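
To make the implicit conjugation concrete, here is a small sketch of an element-wise equality test over strided complex vectors. It is an illustration only: `zpair` and `eqv_z` are hypothetical names, a plain `bool` replaces `conj_t`, and the result is returned rather than written through `is_eq`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical complex element type (real and imaginary parts). */
typedef struct { double re, im; } zpair;

/* Returns true if every element of x (conjugated first when conjx is
   true) equals the corresponding element of y. x itself is not modified. */
static bool eqv_z(bool conjx, size_t n,
                  const zpair *x, ptrdiff_t incx,
                  const zpair *y, ptrdiff_t incy)
{
    for (size_t k = 0; k < n; ++k) {
        const zpair  xk = x[(ptrdiff_t)k * incx];
        const zpair  yk = y[(ptrdiff_t)k * incy];
        const double xi = conjx ? -xk.im : xk.im; /* implicit conjugation */

        if (xk.re != yk.re || xi != yk.im)
            return false;
    }
    return true; /* also the result for n == 0 in this sketch */
}
```

The comparison here is exact (`!=` on doubles), which matches the "element-wise comparison" wording above. A tolerance-based comparison would be a different operation.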
|
||||
|
||||
---
|
||||
|
||||
#### eqm
|
||||
```c
|
||||
void bli_?eqm
|
||||
(
|
||||
doff_t diagoffa,
|
||||
diag_t diaga,
|
||||
uplo_t uploa,
|
||||
trans_t transa,
|
||||
dim_t m,
|
||||
dim_t n,
|
||||
ctype* a, inc_t rs_a, inc_t cs_a,
|
||||
ctype* b, inc_t rs_b, inc_t cs_b,
|
||||
bool* is_eq
|
||||
)
|
||||
```
|
||||
Perform an element-wise comparison between matrices `A` and `B` and store the boolean result in the `bool` pointed to by `is_eq`.
|
||||
Here, `B` is an _m x n_ matrix, and `A` is stored as a dense matrix, or lower- or upper-triangular/trapezoidal matrix with arbitrary diagonal offset and unit or non-unit diagonal.
|
||||
If `diaga` indicates a unit diagonal, the diagonals of both matrices will be ignored for purposes of the comparison.
|
||||
If `uploa` indicates lower or upper storage, only that part of matrix `A` will be referenced in the comparison.
|
||||
If `transa` indicates a conjugation and/or transposition, then `A` will be conjugated and/or transposed for purposes of the comparison.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
## Level-3 microkernels
|
||||
|
||||
|
||||
11
docs/FAQ.md
@@ -17,6 +17,7 @@ project, as well as those we think a new user or developer might ask. If you do
|
||||
* [What is a macrokernel?](FAQ.md#what-is-a-macrokernel)
|
||||
* [What is a context?](FAQ.md#what-is-a-context)
|
||||
* [I am used to thinking in terms of column-major/row-major storage and leading dimensions. What is a "row stride" / "column stride"?](FAQ.md#im-used-to-thinking-in-terms-of-column-majorrow-major-storage-and-leading-dimensions-what-is-a-row-stride--column-stride)
|
||||
* [Why does BLIS have vector (level-1v) and matrix (level-1m) variations of most level-1 operations?](FAQ.md#why-does-blis-have-vector-level-1v-and-matrix-level-1m-variations-of-most-level-1-operations)
|
||||
* [What does it mean when a matrix with general stride is column-tilted or row-tilted?](FAQ.md#what-does-it-mean-when-a-matrix-with-general-stride-is-column-tilted-or-row-tilted)
|
||||
* [I am not really interested in all of these newfangled features in BLIS. Can I just use BLIS as a BLAS library?](FAQ.md#im-not-really-interested-in-all-of-these-newfangled-features-in-blis-can-i-just-use-blis-as-a-blas-library)
|
||||
* [What about CBLAS?](FAQ.md#what-about-cblas)
|
||||
@@ -117,6 +118,16 @@ In generalized storage, we have a row stride and a column stride. The row stride
|
||||
|
||||
BLIS also supports situations where both the row stride and column stride are non-unit. We call this situation "general stride".
|
||||
|
||||
### Why does BLIS have vector (level-1v) and matrix (level-1m) variations of most level-1 operations?
|
||||
|
||||
At first glance, it might appear that an element-wise operation such as `copym` or `axpym` would be sufficiently general purpose to cover the cases where the operands are vectors. After all, an *m x 1* matrix can be viewed as a vector of length *m* and vice versa. But in BLIS, operations on vectors are treated slightly differently from operations on matrices.
|
||||
|
||||
If an application wishes to perform an element-wise operation on two objects, and the application calls a level-1m operation, the dimensions of those objects must be conformal, or "match up" (after any transposition implied by the object properties). This includes situations where one of the dimensions is unit.
|
||||
|
||||
However, if an application instead decides to perform an element-wise operation on two objects, and the application calls a level-1v operation, the dimension constraints are slightly relaxed. In this scenario, BLIS only checks that the vector *lengths* are equal. This allows for the vectors to have different orientations (row vs column) while still being considered conformal. So, you could perform a `copyv` operation to copy from an *m x 1* vector to a *1 x m* vector. A `copym` operation on such objects would not be allowed (unless it was executed with the source object containing an implicit transposition).
|
||||
|
||||
Another way to think about level-1v operations is that they will work with any two matrix objects in situations where (a) the corresponding level-1m operation *would have* worked if the input had been transposed, and (b) all operands happen to be vectors (i.e., have one unit dimension).
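
The difference between the two conformality rules can be expressed as a pair of predicates. This is a sketch under the assumptions stated in the answer above, with hypothetical function names; the dimensions passed in are taken to be those that remain after any implied transposition has been applied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Level-1m style check: both dimensions must match exactly. */
static bool conformal_1m(size_t m_a, size_t n_a, size_t m_b, size_t n_b)
{
    return m_a == m_b && n_a == n_b;
}

/* An object is a vector if at least one of its dimensions is unit. */
static bool is_vector(size_t m, size_t n)
{
    return m == 1 || n == 1;
}

/* The vector length is the dimension that is not the unit one. */
static size_t vector_length(size_t m, size_t n)
{
    return (m == 1) ? n : m;
}

/* Level-1v style check: both operands are vectors of equal length;
   row versus column orientation is not compared. */
static bool conformal_1v(size_t m_a, size_t n_a, size_t m_b, size_t n_b)
{
    return is_vector(m_a, n_a) && is_vector(m_b, n_b) &&
           vector_length(m_a, n_a) == vector_length(m_b, n_b);
}
```

For a 5 x 1 source and a 1 x 5 destination, `conformal_1m` is false while `conformal_1v` is true, which mirrors the `copyv` versus `copym` example in the answer.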
|
||||
|
||||
### What does it mean when a matrix with general stride is column-tilted or row-tilted?
|
||||
|
||||
When a matrix is stored with general stride, both the row stride and column stride (let's call them `rs` and `cs`) are non-unit. When `rs` < `cs`, we call the general stride matrix "column-tilted" because it is "closer" to being column-stored (than row-stored). Similarly, when `rs` > `cs`, the matrix is "row-tilted" because it is closer to being row-stored.
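
As an illustration, the tilt terminology maps onto a simple comparison of the two strides. The sketch below uses hypothetical names and assumes positive strides. The case `rs == cs` is not covered by the definition above, so it is reported separately rather than guessed.

```c
#include <stddef.h>

typedef enum {
    STORAGE_COLUMN,        /* rs == 1: column storage */
    STORAGE_ROW,           /* cs == 1: row storage */
    STORAGE_COLUMN_TILTED, /* general stride, rs < cs */
    STORAGE_ROW_TILTED,    /* general stride, rs > cs */
    STORAGE_UNDETERMINED   /* general stride, rs == cs (not defined above) */
} storage_kind;

/* Classifies a matrix by its row stride rs and column stride cs.
   When both strides are 1, this sketch arbitrarily reports column storage. */
static storage_kind classify_storage(ptrdiff_t rs, ptrdiff_t cs)
{
    if (rs == 1) return STORAGE_COLUMN;
    if (cs == 1) return STORAGE_ROW;
    if (rs < cs) return STORAGE_COLUMN_TILTED;
    if (rs > cs) return STORAGE_ROW_TILTED;
    return STORAGE_UNDETERMINED;
}
```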
|
||||
|
||||