mirror of
https://github.com/amd/blis.git
synced 2026-04-20 15:48:50 +00:00
Added support for systemless build (no pthreads).
Details:
- Added a configure option, --[enable|disable]-system, which determines whether the modest operating system dependencies in BLIS are included. The most notable example of this on Linux and BSD/OSX is the use of POSIX threads to ensure thread safety for when application-level threads call BLIS. When --disable-system is given, the bli_pthreads implementation is dummied out entirely, allowing the calling code within BLIS to remain unchanged. Why would anyone want to build BLIS like this? The motivating example was submitted via #454 in which a user wanted to build BLIS for a simulator such as gem5 where thread safety may not be a concern (and where the operating system is largely absent anyway). Thanks to Stepan Nassyr for suggesting this feature.
- Another, more minor side effect of the --disable-system option is that the implementation of bli_clock() unconditionally returns 0.0 instead of the time elapsed since some fixed point in the past. The reasoning for this is that if the operating system is truly minimal, the system function call upon which bli_clock() would normally be implemented (e.g. clock_gettime()) may not be available.
- Refactored preprocess-guarded code in bli_pthread.c and bli_pthread.h to remove redundancies.
- Removed old comments and commented #include of "bli_pthread_wrap.h" from bli_system.h.
- Documented bli_clock() and bli_clock_min_diff() in BLISObjectAPI.md and BLISTypedAPI.md, with a note that both are non-functional when BLIS is configured with --disable-system.
@@ -31,6 +31,7 @@
* [Specific configuration](BLISObjectAPI.md#specific-configuration)
* [General configuration](BLISObjectAPI.md#general-configuration)
* [Kernel information](BLISObjectAPI.md#kernel-information)
* [Clock functions](BLISObjectAPI.md#clock-functions)
* **[Example code](BLISObjectAPI.md#example-code)**
@@ -2235,6 +2236,54 @@ Possible microkernel types (ie: the return values for `bli_info_get_*_ukr_impl_s
* `BLIS_OPTIMIZED_UKERNEL` (`"optimzd"`): This value is returned when the queried microkernel is provided by an implementation that is neither reference nor virtual, and thus we assume the kernel author would deem it to be "optimized". Such a microkernel may not be optimal in the literal sense of the word, but nonetheless is _intended_ to be optimized, at least relative to the reference microkernels.
* `BLIS_NOTAPPLIC_UKERNEL` (`"notappl"`): This value is usually returned when performing a `gemmtrsm` or `trsm` microkernel type query for any `method` value that is not `BLIS_NAT` (ie: native). That is, induced methods cannot be (purely) used on `trsm`-based microkernels because these microkernels perform more of a triangular inversion, which is not matrix multiplication.
## Clock functions

---

#### clock
```c
double bli_clock
     (
       void
     );
```
Return the amount of time, in seconds, that has elapsed since some fixed time in the past. The return values of `bli_clock()` typically feature nanosecond precision, though this is not guaranteed.

**Note:** On Linux, `bli_clock()` is implemented in terms of `clock_gettime()` using the `clockid_t` value of `CLOCK_MONOTONIC`. On OS X, `bli_clock()` is implemented in terms of `mach_absolute_time()`. On Windows, `bli_clock()` is implemented in terms of `QueryPerformanceCounter()` and `QueryPerformanceFrequency()`. Please see [frame/base/bli_clock.c](https://github.com/flame/blis/blob/master/frame/base/bli_clock.c) for more details.

**Note:** This function returns meaningless values when BLIS is configured with `--disable-system`; in that configuration it unconditionally returns `0.0`.

---
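As a rough illustration of the platform note for `bli_clock()` above, the following sketch shows one way a seconds-valued monotonic timer could be built on `clock_gettime()` with `CLOCK_MONOTONIC`. It is not the BLIS implementation: the function name `sketch_clock()` and the macro `SKETCH_DISABLE_SYSTEM` are hypothetical stand-ins, the latter mimicking the `--disable-system` behavior of returning `0.0`.

```c
// Request POSIX declarations (clock_gettime) if no feature level is set yet.
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif

#include <time.h>

// Hypothetical stand-in for a bli_clock()-style timer; not BLIS's own code.
double sketch_clock( void )
{
#ifdef SKETCH_DISABLE_SYSTEM
    // Assumption: mimic a systemless build, where no OS timer is available.
    return 0.0;
#else
    struct timespec ts;

    // CLOCK_MONOTONIC does not jump backwards, unlike wall-clock time.
    if ( clock_gettime( CLOCK_MONOTONIC, &ts ) != 0 ) return 0.0;

    // Combine whole seconds and nanoseconds into one double, in seconds.
    return ( double )ts.tv_sec + ( double )ts.tv_nsec * 1.0e-9;
#endif
}
```

Subtracting two readings of such a timer gives an elapsed time in seconds, which is the quantity the timing loop for `bli_clock_min_diff()` works with.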
#### clock_min_diff
```c
double bli_clock_min_diff
     (
       double time_prev_min,
       double time_start
     );
```
This function computes an intermediate value, `time_diff`, equal to `bli_clock() - time_start`, and then tentatively prepares to return the minimum of `time_diff` and `time_prev_min`. If that minimum value is extremely small (close to zero), the function returns `time_prev_min` instead.

This function is meant to be used in conjunction with `bli_clock()` for performance timing within applications--specifically in loops where only the fastest timing is of interest. For example:
```c
double t_save = DBL_MAX;
for( i = 0; i < 3; ++i )
{
    double t = bli_clock();
    bli_gemm( ... );
    t_save = bli_clock_min_diff( t_save, t );
}
double gflops = ( 2.0 * m * k * n ) / ( t_save * 1.0e9 );
```
This code calls `bli_gemm()` three times and computes the performance, in GFLOPS, of the fastest of the three executions.

---
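The selection logic that the `bli_clock_min_diff()` description above outlines can be sketched as a pure function. This is only an illustration under stated assumptions: `sketch_min_diff()` is a hypothetical name, the current time is passed in as `time_now` (rather than read from `bli_clock()`) so the logic can be exercised in isolation, and the one-nanosecond cutoff `SKETCH_TINY` is an assumed value for "extremely small", not something this documentation specifies.

```c
#include <float.h> // DBL_MAX, the usual seed for the running minimum

// Assumed cutoff below which a timing is treated as noise (hypothetical).
#define SKETCH_TINY 1.0e-9

// Return the smaller of the previous minimum and the newly elapsed time,
// unless that result is non-positive or tiny, in which case the previous
// minimum is kept.
double sketch_min_diff( double time_prev_min, double time_start, double time_now )
{
    double time_diff = time_now - time_start;
    double time_min  = ( time_diff < time_prev_min ) ? time_diff : time_prev_min;

    if ( time_min < SKETCH_TINY ) return time_prev_min;

    return time_min;
}
```

Seeding the running minimum with `DBL_MAX` means the first usable timing always replaces it, which is why the documented loop starts from that value.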
# Example code
BLIS provides lots of example code in the [examples/oapi](https://github.com/flame/blis/tree/master/examples/oapi) directory of the BLIS source distribution. The example code in this directory is set up like a tutorial, and so we recommend starting from the beginning. Topics include creating and managing objects, printing vectors and matrices, setting and querying object properties, and calling a representative subset of the computational level-1v, -1m, -2, -3, and utility operations documented above. Please read the `README` contained within the `examples/oapi` directory for further details.
@@ -26,6 +26,7 @@
* [Specific configuration](BLISTypedAPI.md#specific-configuration)
* [General configuration](BLISTypedAPI.md#general-configuration)
* [Kernel information](BLISTypedAPI.md#kernel-information)
* [Clock functions](BLISTypedAPI.md#clock-functions)
* **[Example code](BLISTypedAPI.md#example-code)**
@@ -1902,6 +1903,54 @@ char* bli_info_get_trmm3_impl_string( num_t dt );
char* bli_info_get_trsm_impl_string( num_t dt );
```
## Clock functions

---

#### clock
```c
double bli_clock
     (
       void
     );
```
Return the amount of time, in seconds, that has elapsed since some fixed time in the past. The return values of `bli_clock()` typically feature nanosecond precision, though this is not guaranteed.

**Note:** On Linux, `bli_clock()` is implemented in terms of `clock_gettime()` using the `clockid_t` value of `CLOCK_MONOTONIC`. On OS X, `bli_clock()` is implemented in terms of `mach_absolute_time()`. On Windows, `bli_clock()` is implemented in terms of `QueryPerformanceCounter()` and `QueryPerformanceFrequency()`. Please see [frame/base/bli_clock.c](https://github.com/flame/blis/blob/master/frame/base/bli_clock.c) for more details.

**Note:** This function returns meaningless values when BLIS is configured with `--disable-system`; in that configuration it unconditionally returns `0.0`.

---
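One practical consequence of the `--disable-system` note above can be sketched with stand-ins. Assuming a clock that always returns `0.0` and min-diff logic that discards near-zero timings (as `bli_clock_min_diff()` is described as doing), a best-of-N timing loop never updates its running minimum. The names `stub_clock()`, `stub_min_diff()`, `stub_best_time()`, and `stub_time_is_valid()` are hypothetical, and the `1.0e-9` cutoff is an assumption rather than a documented value.

```c
#include <float.h>

// Hypothetical stand-in for bli_clock() in a --disable-system build.
static double stub_clock( void ) { return 0.0; }

// Hypothetical stand-in for the documented min-diff behavior.
static double stub_min_diff( double time_prev_min, double time_start )
{
    double time_diff = stub_clock() - time_start;
    double time_min  = ( time_diff < time_prev_min ) ? time_diff : time_prev_min;

    // Assumed cutoff: treat anything under a nanosecond as unusable.
    return ( time_min < 1.0e-9 ) ? time_prev_min : time_min;
}

// Run a best-of-n_repeats timing loop around an empty workload.
double stub_best_time( int n_repeats )
{
    double t_save = DBL_MAX;

    for ( int i = 0; i < n_repeats; ++i )
    {
        double t = stub_clock();
        // The timed operation would go here.
        t_save = stub_min_diff( t_save, t );
    }

    return t_save;
}

// Report whether a best time was actually recorded.
int stub_time_is_valid( double t_save ) { return t_save < DBL_MAX; }
```

Under these assumptions `stub_best_time( 3 )` returns `DBL_MAX`, so a check like `stub_time_is_valid()` before computing GFLOPS avoids reporting a misleading rate.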
#### clock_min_diff
```c
double bli_clock_min_diff
     (
       double time_prev_min,
       double time_start
     );
```
This function computes an intermediate value, `time_diff`, equal to `bli_clock() - time_start`, and then tentatively prepares to return the minimum of `time_diff` and `time_prev_min`. If that minimum value is extremely small (close to zero), the function returns `time_prev_min` instead.

This function is meant to be used in conjunction with `bli_clock()` for performance timing within applications--specifically in loops where only the fastest timing is of interest. For example:
```c
double t_save = DBL_MAX;
for( i = 0; i < 3; ++i )
{
    double t = bli_clock();
    bli_gemm( ... );
    t_save = bli_clock_min_diff( t_save, t );
}
double gflops = ( 2.0 * m * k * n ) / ( t_save * 1.0e9 );
```
This code calls `bli_gemm()` three times and computes the performance, in GFLOPS, of the fastest of the three executions.

---
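The GFLOPS expression in the example above counts `2 * m * k * n` floating-point operations for a matrix multiplication and divides by the best time in seconds, scaled by `1.0e9`. A minimal sketch of that arithmetic follows, using a hypothetical helper `sketch_gemm_gflops()`; the guard against non-positive times is an added assumption, not part of the documented example.

```c
// Hypothetical helper: GFLOPS for an m x k by k x n matrix multiplication
// that took `seconds` to run. Returns 0.0 if the time is not positive.
double sketch_gemm_gflops( double m, double k, double n, double seconds )
{
    if ( seconds <= 0.0 ) return 0.0;

    // One multiply and one add per (i, j, p) triple gives 2*m*k*n flops.
    double flops = 2.0 * m * k * n;

    return flops / ( seconds * 1.0e9 );
}
```

For instance, with made-up values `m = k = n = 1000` and a best time of 0.5 seconds, the operation count is `2.0e9` and the result is 4 GFLOPS.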
# Example code
BLIS provides lots of example code in the [examples/tapi](https://github.com/flame/blis/tree/master/examples/tapi) directory of the BLIS source distribution. The example code in this directory is set up like a tutorial, and so we recommend starting from the beginning. Topics include printing vectors and matrices and calling a representative subset of the computational level-1v, -1m, -2, -3, and utility operations documented above. Please read the `README` contained within the `examples/tapi` directory for further details.