Added "Known issues" section to Multithreading.md.

Details:
- Added known issues section to Multithreading.md.
- Trivial changes to MixedDatatypes.md, Sandboxes.md.
Field G. Van Zee
2018-10-19 17:42:40 -05:00
parent 49d3f9fcbb
commit c9be5889fb
3 changed files with 14 additions and 4 deletions

@@ -6,7 +6,7 @@
* **[Computation precision](MixedDatatypes.md#computation-precision)**
* **[Computation domain](MixedDatatypes.md#computation-domain)**
* **[Performing gemm with mixed datatypes](MixedDatatypes.md#performing-gemm-with-mixed-datatypes)**
* **[Known Issues](MixedDatatypes.md#known-issues)**
* **[Known issues](MixedDatatypes.md#known-issues)**
* **[Conclusion](MixedDatatypes.md#conclusion)**
## Introduction
@@ -180,7 +180,7 @@ of initializing an matrix object with arbitrary values, please review the
example code found in the `examples/oapi` directory of the BLIS source
distribution.
## Known Issues
## Known issues
While BLIS implements 128 mixed-datatype combinations of `gemm`, there may be
odd behavior in the current implementation that does not conform to the reader's

@@ -15,6 +15,8 @@
* [The automatic way](Multithreading.md#locally-at-runtime-the-automatic-way)
* [The manual way](Multithreading.md#locally-at-runtime-the-manual-way)
* [Using the expert interface](Multithreading.md#locally-at-runtime-using-the-expert-interface)
* **[Known issues](Multithreading.md#known-issues)**
* **[Conclusion](Multithreading.md#conclusion)**
# Introduction
@@ -220,6 +222,14 @@ Note that `rntm_t` objects may be reused over and over again once they are initi
Also, you may pass in `NULL` for the `rntm_t*` parameter of an expert interface. This causes the current global settings to be used.
# Known issues
* **Internal transposition and manual parallelism.** BLIS supports both row- and column-stored matrices (and tensor-like general storage). However, the `gemm` microkernel typically prefers to read and write microtiles of matrix C either by rows or by columns. If the storage of the user-provided matrix C does not match the microkernel's preference, BLIS logically transposes the entire operation so that by the time the microkernel sees matrix C, it will appear to be stored according to its storage preference. If the caller is employing the automatic style of parallelism, whereby only the total number of threads is specified, this transposition happens *before* the total number of threads is factored into the various loop-specific ways of parallelism and everything works as expected. However, if the caller employs the manual style of parallelism, the transposition must (by definition) happen *after* the thread factorization is done since, in this situation, the caller has taken responsibility for providing that factorization explicitly.
This situation could lead to unexpectedly low multithreaded performance. Suppose the user calls `gemm` on a problem with a large m dimension and small k and n dimensions, and explicitly requests parallelism only in the IC loop, but also suppose that the storage of C does not match the microkernel's preference. After BLIS transposes the operation internally, the *effective* m dimension will no longer be large; instead, it will be small (because the original m and n dimensions will have been swapped). The multithreaded implementation will then proceed to parallelize this small m dimension, which leaves each thread with very little work.
There are currently no good *and* easy solutions to this problem. Eventually, though, we plan to add support for two microkernels per datatype per configuration--one for use with matrices C that are row-stored, and one for those that are column-stored. This will obviate the logic within BLIS that sometimes induces the operation transposition, and the problem will go away.
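As a rough illustration of the scenario above, the following C sketch models how the internal transposition changes the amount of work each thread receives when parallelism is requested only in the IC loop. The names `eff_dims_t`, `effective_dims`, and `rows_per_ic_thread` are hypothetical and not part of the BLIS API. The model is deliberately simplified: it ignores cache and register blocksizes and assumes the effective m dimension is divided evenly among the IC ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model type (not a BLIS type): the m and n dimensions
   as the microkernel would see them after any internal transposition. */
typedef struct
{
    size_t m;
    size_t n;
} eff_dims_t;

/* If the storage of C does not match the microkernel preference, the
   operation is logically transposed, which swaps m and n. */
static eff_dims_t effective_dims( size_t m, size_t n, bool c_matches_pref )
{
    eff_dims_t d;

    d.m = c_matches_pref ? m : n;
    d.n = c_matches_pref ? n : m;

    return d;
}

/* Simplified assumption: the effective m dimension is split evenly
   (rounding up) among ic_ways threads in the IC loop. */
static size_t rows_per_ic_thread( eff_dims_t d, size_t ic_ways )
{
    assert( ic_ways > 0 );

    return ( d.m + ic_ways - 1 ) / ic_ways;
}
```

Under this model, with m = 8000, n = 16, and 8 ways of IC parallelism, each thread receives 1000 rows when C matches the microkernel preference, but only 2 rows once the operation has been transposed.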
# Conclusion
Please send us feedback if you have any concerns or questions, or [open an issue](http://github.com/flame/blis/issues) if you observe any reproducible behavior that you think is erroneous. (You are welcome to use the issue feature to start any non-trivial dialogue; we don't restrict them only to bug reports!)

@@ -4,7 +4,7 @@
* **[Enabling a sandbox](Sandboxes.md#enabling-a-sandbox)**
* **[Sandbox rules](Sandboxes.md#sandbox-rules)**
* **[Caveats](Sandboxes.md#caveats)**
* **[Known Issues](Sandboxes.md#known-issues)**
* **[Known issues](Sandboxes.md#known-issues)**
* **[Conclusion](Sandboxes.md#conclusion)**
@@ -183,7 +183,7 @@ guidance from BLIS developers by opening a
Notwithstanding these limitations, hopefully you still find BLIS sandboxes
useful!
## Known Issues
## Known issues
* **Mixed datatype support.** Unless you *really* know what you are doing, you
should probably disable mixed datatype support when using a sandbox. (Mixed