Files
blis/frame/include
Sireesha Sanga 22af681a11 Runtime Thread Control Feature Update
Details:
1.  Runtime Thread Control Feature is enhanced to create a provision
    for the application to allocate a different number of threads to
    BLIS from the number of threads application is using for itself.

2.  In the previous implementation, if application sets BLIS_NUM_THREADS
    with a valid value, BLIS internally calls omp_set_num_threads() API with
    same value. Due to this, application could not differentiate between
    the number of threads used in BLIS library and the application.

3.  With the current solution, if Application wants to allocate
    different number of threads for BLIS API and application, Application
    can choose either BLIS_NUM_THREADS environment variable or
    bli_thread_set_num_threads(nt) API for BLIS,
    and OpenMP APIs or environment variables for itself, respectively.

4.  If BLIS_NUM_THREADS is set with a valid value, same value
    will be used in the subsequent parallel regions unless
    bli_thread_set_num_threads() API is used by the Application
    to modify the desired number of threads during BLIS API execution.

5.  Once BLIS_NUM_THREADS environment variable or
    bli_thread_set_num_threads(nt) API is used by the application,
    BLIS module would always give precedence to these values. BLIS API would
    not consider the values set using OpenMP API omp_set_num_threads(nt) API
    or OMP_NUM_THREADS environment variable.

6.  If BLIS_NUM_THREADS is not set, then if Application is multithreaded and
    issued omp_set_num_threads(nt) with desired number of threads,
    omp_get_max_threads() API will fetch the number of threads set earlier.

7.  If BLIS_NUM_THREADS is not set, omp_set_num_threads(nt) is not called
    by the application, but only OMP_NUM_THREADS is set,
    omp_get_max_threads() API will fetch the value of OMP_NUM_THREADS.

8.  If both environment variables are not set, or if they are set with
    invalid values, and omp_set_num_threads(nt) is not issued by
    application, omp_get_max_threads() API will return the number of the
    cores in the current context.

9.  BLIS will initialize rntm->num_threads with the same value.
    However if omp_set_nested is false - BLIS APIs called from parallel
    threads will run in sequential. But if nested parallelism is enabled
    Then each application will launch MT BLIS.

10. Order of precedence used for number of threads:
      0. value set using bli_thread_set_num_threads(nt) by the application
      1. valid value set for BLIS_NUM_THREADS environment variable
      2. omp_set_num_threads(nt) issued by the application
      3. valid value set for OMP_NUM_THREADS environment variable
      4. Number of cores

11. If nt is not a valid value for omp_set_num_threads(nt) API,
    number of threads would be set to 1.
    omp_get_max_threads() API will return 1.

12. OMP_NUM_THREADS env. variable is applicable only when OpenMP is enabled.

AMD-Internal: [CPUPL-2342]
Change-Id: I2041ac1d824f0b57a23a2a69abd6017c800f21b6
2022-08-19 05:43:01 -04:00
..
2021-04-27 11:09:48 +05:30
2021-04-27 11:09:48 +05:30
2022-04-01 13:55:30 +05:30