mirror of
https://github.com/amd/blis.git
synced 2026-06-05 20:23:58 +00:00
Tweaked/added notes to docs/Multithreading.md.
Details:
- Added language to docs/Multithreading.md cautioning the reader about the nuances of setting multithreading parameters via the manual and automatic ways simultaneously, and also about how these parameters behave when multithreading is disabled at configure-time. These changes are an attempt to address the issues that arose in issue #362. Thanks to Jérémie du Boisberranger for his feedback on this topic.
- CREDITS file update.
149
CREDITS
@@ -5,88 +5,89 @@ Acknowledgements
 
 The BLIS framework was primarily authored by
 
-Field Van Zee @fgvanzee (The University of Texas at Austin)
+Field Van Zee @fgvanzee (The University of Texas at Austin)
 
 but many others have contributed code and feedback, including
 
-Sameer Agarwal @sandwichmaker (Google)
-Murtaza Ali (Texas Instruments)
-Sajid Ali @s-sajid-ali (Northwestern University)
-Erling Andersen @erling-d-andersen
-Alex Arslan @ararslan
-Vernon Austel (IBM, T.J. Watson Research Center)
-Matthew Brett @matthew-brett (University of Birmingham)
-Jed Brown @jedbrown (Argonne National Laboratory)
-Robin Christ @robinchrist
-Kay Dewhurst @jkd2016 (Max Planck Institute, Halle, Germany)
-Jeff Diamond (Oracle)
-Johannes Dieterich @iotamudelta
-Krzysztof Drewniak @krzysz00
-Marat Dukhan @Maratyszcza (Google)
-Victor Eijkhout @VictorEijkhout (Texas Advanced Computing Center)
-Evgeny Epifanovsky @epifanovsky (Q-Chem)
-Isuru Fernando @isuruf
-Roman Gareev @gareevroman
-Richard Goldschmidt @SuperFluffy
+Sameer Agarwal @sandwichmaker (Google)
+Murtaza Ali (Texas Instruments)
+Sajid Ali @s-sajid-ali (Northwestern University)
+Erling Andersen @erling-d-andersen
+Alex Arslan @ararslan
+Vernon Austel (IBM, T.J. Watson Research Center)
+Matthew Brett @matthew-brett (University of Birmingham)
+Jed Brown @jedbrown (Argonne National Laboratory)
+Robin Christ @robinchrist
+Kay Dewhurst @jkd2016 (Max Planck Institute, Halle, Germany)
+Jeff Diamond (Oracle)
+Johannes Dieterich @iotamudelta
+Krzysztof Drewniak @krzysz00
+Marat Dukhan @Maratyszcza (Google)
+Victor Eijkhout @VictorEijkhout (Texas Advanced Computing Center)
+Evgeny Epifanovsky @epifanovsky (Q-Chem)
+Isuru Fernando @isuruf
+Roman Gareev @gareevroman
+Richard Goldschmidt @SuperFluffy
 Chris Goodyer
-John Gunnels @jagunnels (IBM, T.J. Watson Research Center)
-Ali Emre Gülcü @Lephar
-Jeff Hammond @jeffhammond (Intel)
-Jacob Gorm Hansen @jacobgorm
-Jean-Michel Hautbois @jhautbois
-Ian Henriksen @insertinterestingnamehere (The University of Texas at Austin)
-Minh Quan Ho @hominhquan
-Matthew Honnibal @honnibal
-Stefan Husmann @stefanhusmann
-Francisco Igual @figual (Universidad Complutense de Madrid)
-Tony Kelman @tkelman
-Lee Killough @leekillough (Cray)
-Mike Kistler @mkistler (IBM, Austin Research Laboratory)
-Michael Lehn @michael-lehn
-@ShmuelLevine
-Dave Love @loveshack
-Tze Meng Low (The University of Texas at Austin)
-Ye Luo @ye-luo (Argonne National Laboratory)
-Ricardo Magana @magania (Hewlett Packard Enterprise)
-Bryan Marker @bamarker (The University of Texas at Austin)
-Devin Matthews @devinamatthews (The University of Texas at Austin)
-Stefanos Mavros @smavros
-Nisanth Padinharepatt (AMD)
-Devangi Parikh @dnparikh (The University of Texas at Austin)
-Elmar Peise @elmar-peise (RWTH-Aachen)
-Clément Pernet @ClementPernet
+John Gunnels @jagunnels (IBM, T.J. Watson Research Center)
+Ali Emre Gülcü @Lephar
+Jeff Hammond @jeffhammond (Intel)
+Jacob Gorm Hansen @jacobgorm
+Jérémie du Boisberranger @jeremiedbb
+Jean-Michel Hautbois @jhautbois
+Ian Henriksen @insertinterestingnamehere (The University of Texas at Austin)
+Minh Quan Ho @hominhquan
+Matthew Honnibal @honnibal
+Stefan Husmann @stefanhusmann
+Francisco Igual @figual (Universidad Complutense de Madrid)
+Tony Kelman @tkelman
+Lee Killough @leekillough (Cray)
+Mike Kistler @mkistler (IBM, Austin Research Laboratory)
+Michael Lehn @michael-lehn
+@ShmuelLevine
+Dave Love @loveshack
+Tze Meng Low (The University of Texas at Austin)
+Ye Luo @ye-luo (Argonne National Laboratory)
+Ricardo Magana @magania (Hewlett Packard Enterprise)
+Bryan Marker @bamarker (The University of Texas at Austin)
+Devin Matthews @devinamatthews (The University of Texas at Austin)
+Stefanos Mavros @smavros
+Nisanth Padinharepatt (AMD)
+Devangi Parikh @dnparikh (The University of Texas at Austin)
+Elmar Peise @elmar-peise (RWTH-Aachen)
+Clément Pernet @ClementPernet
 Ilya Polkovnichenko
-Jack Poulson @poulson (Stanford)
-Mathieu Poumeyrol @kali
-Christos Psarras @ChrisPsa (RWTH-Aachen)
-@qnerd
-Michael Rader @mrader1248
-Pradeep Rao @pradeeptrgit (AMD)
+Jack Poulson @poulson (Stanford)
+Mathieu Poumeyrol @kali
+Christos Psarras @ChrisPsa (RWTH-Aachen)
+@qnerd
+Michael Rader @mrader1248
+Pradeep Rao @pradeeptrgit (AMD)
 Aleksei Rechinskii
-Karl Rupp @karlrupp
-Martin Schatz (The University of Texas at Austin)
-Nico Schlömer @nschloe
+Karl Rupp @karlrupp
+Martin Schatz (The University of Texas at Austin)
+Nico Schlömer @nschloe
 Rene Sitt
-Tony Skjellum @tonyskjellum (The University of Tennessee at Chattanooga)
-Mikhail Smelyanskiy (Intel, Parallel Computing Lab)
-Nathaniel Smith @njsmith
-Shaden Smith @ShadenSmith
-Tyler Smith @tlrmchlsmth (The University of Texas at Austin)
-Paul Springer @springer13 (RWTH-Aachen)
-Adam J. Stewart @adamjstewart (University of Illinois at Urbana-Champaign)
+Tony Skjellum @tonyskjellum (The University of Tennessee at Chattanooga)
+Mikhail Smelyanskiy (Intel, Parallel Computing Lab)
+Nathaniel Smith @njsmith
+Shaden Smith @ShadenSmith
+Tyler Smith @tlrmchlsmth (The University of Texas at Austin)
+Paul Springer @springer13 (RWTH-Aachen)
+Adam J. Stewart @adamjstewart (University of Illinois at Urbana-Champaign)
 Vladimir Sukarev
-Santanu Thangaraj (AMD)
-Nicholai Tukanov @nicholaiTukanov (The University of Texas at Austin)
-Rhys Ulerich @RhysU (The University of Texas at Austin)
-Robert van de Geijn @rvdg (The University of Texas at Austin)
-Kiran Varaganti @kvaragan (AMD)
-Natalia Vassilieva (Hewlett Packard Enterprise)
-Zhang Xianyi @xianyi (Chinese Academy of Sciences)
-Benda Xu @heroxbd
-Costas Yamin @cosstas
-Chenhan Yu @ChenhanYu (The University of Texas at Austin)
-Roman Yurchak @rth (Symerio)
-M. Zhou @cdluminate
+Santanu Thangaraj (AMD)
+Nicholai Tukanov @nicholaiTukanov (The University of Texas at Austin)
+Rhys Ulerich @RhysU (The University of Texas at Austin)
+Robert van de Geijn @rvdg (The University of Texas at Austin)
+Kiran Varaganti @kvaragan (AMD)
+Natalia Vassilieva (Hewlett Packard Enterprise)
+Zhang Xianyi @xianyi (Chinese Academy of Sciences)
+Benda Xu @heroxbd
+Costas Yamin @cosstas
+Chenhan Yu @ChenhanYu (The University of Texas at Austin)
+Roman Yurchak @rth (Symerio)
+M. Zhou @cdluminate
 
 BLIS's development was partially funded by grants from industry
 partners, including
@@ -107,7 +107,11 @@ This pattern--automatic or manual--holds regardless of which of the three method
 
 Regardless of which method is employed, and which specific way within each method, after setting the number of threads, the application may call the desired level-3 operation (via either the [typed API](docs/BLISTypedAPI.md) or the [object API](docs/BLISObjectAPI.md)) and the operation will execute in a multithreaded manner. (When calling BLIS via the BLAS API, only the first two (global) methods are available.)
 
-**Note**: Please be aware of what happens if you try to specify both the automatic and manual ways, as it could otherwise confuse new users. Regardless of which broad method is used, **if multithreading is specified via both the automatic and manual ways, the manual way will always take precedence.** Also, specifying parallelism for even *one* loop counts as specifying the manual way (in which case the ways of parallelism for the remaining loops will be assumed to be 1).
+**Note**: Please be aware of what happens if you try to specify both the automatic and manual ways, as it could otherwise confuse new users. Here are the important points:
+* Regardless of which broad method is used, **if multithreading is specified via both the automatic and manual ways, the values set via the manual way will always take precedence.**
+* Specifying parallelism for even *one* loop counts as specifying the manual way (in which case the ways of parallelism for the remaining loops will be assumed to be 1).
+* If you have specified multithreading via *both* the automatic and manual ways, BLIS will **not** complain if the values are inconsistent with one another. (For example, you may request 8 total threads be used while also specifying 4 ways of parallelism within each of two matrix multiplication loops, for a total of 16 ways.) Furthermore, you will be able to query these inconsistent values via the runtime API both before and after multithreading executes.
+* If multithreading is disabled, you **may still** specify multithreading values via either the manual or automatic ways. However, BLIS will silently ignore **all** of these values. A BLIS library that is built with multithreading disabled at configure-time will always run sequentially (from the perspective of a single application thread).
 
 ## Globally via environment variables
 
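The inconsistent request described in the third bullet of the new note can be reproduced with environment variables. The sketch below is only an illustration: the variable names `BLIS_NUM_THREADS`, `BLIS_JC_NT`, and `BLIS_IC_NT` do not appear in this hunk and are assumed from the BLIS multithreading documentation, and the choice of the JC and IC loops as the "two matrix multiplication loops" is arbitrary.

```shell
# Automatic way: request 8 threads in total.
export BLIS_NUM_THREADS=8

# Manual way: 4 ways of parallelism in each of two loops (JC and IC here,
# chosen arbitrarily). Loops left unset are assumed to use 1 way.
export BLIS_JC_NT=4
export BLIS_IC_NT=4

# The two requests disagree: the manual values multiply out to 16 ways.
manual_total=$(( BLIS_JC_NT * BLIS_IC_NT ))
echo "automatic request: ${BLIS_NUM_THREADS} threads"
echo "manual request:    ${manual_total} ways"
```

According to the note, BLIS would not complain about this mismatch, and the manual values would take precedence. If the library was configured with multithreading disabled, all three variables would be silently ignored.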