Mirror of https://github.com/amd/blis.git, synced 2026-04-19 23:28:52 +00:00
Fixed Data Race in Native code-path (#251)
```c
_Pragma( "omp parallel num_threads(n_threads)" )
{
// ... thread work ...
// *** MISSING BARRIER HERE! ***
// Free the current thread's thrinfo_t structure.
bli_l3_thrinfo_free( rntm_p, thread ); // Line 183
}
// Check the array_t back into the small block allocator...
bli_sba_checkin_array( array ); // Line 200
```
```text
Dangerous execution timeline:

Thread 0 (chief):
    completes func()
    calls bli_l3_cntl_free()
    calls bli_l3_thrinfo_free() → frees gl_comm ✓
    exits OpenMP parallel region
    calls bli_sba_checkin_array(array) → frees array ✗

Threads 1, 2, 3 (still executing):
    still in func() or bli_l3_cntl_free()
    trying to access freed gl_comm → CRASH!
    trying to access freed array pools → CRASH!
```
This is the same issue that PR #702 fixed in other files. The function needs a barrier before any thread frees its thrinfo_t structure and leaves the parallel region, to ensure that:
1. **All threads complete their work** before any cleanup starts
2. **Global communicator isn't freed** while other threads are using it
3. **Array pools aren't freed** while other threads are accessing them
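The barrier-before-free pattern described above can be sketched in isolation. This is only an illustration, not BLIS code: `demo_comm_t` and `demo_run_group` are hypothetical names, and `#pragma omp barrier` stands in for `bli_thread_barrier( thread )`. The sketch assumes compilation with OpenMP enabled (for example `-fopenmp`); without it the pragmas are ignored and the code runs on a single thread.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for an object shared by one thread group
   (plays the role of gl_comm in the description above). */
typedef struct
{
    int  n_slots;
    int* slots;   /* one result slot per thread */
} demo_comm_t;

/* Returns 1 if the chief saw every peer's result before freeing, else 0. */
static int demo_run_group( int n_threads )
{
    assert( n_threads >= 1 );

    int ok = 0;
    demo_comm_t* comm = malloc( sizeof( *comm ) );
    if ( comm == NULL ) return 0;
    comm->n_slots = n_threads;
    comm->slots   = calloc( ( size_t )n_threads, sizeof( int ) );
    if ( comm->slots == NULL ) { free( comm ); return 0; }

    #pragma omp parallel num_threads(n_threads) shared(comm, ok)
    {
        int tid = 0, team = 1;
#ifdef _OPENMP
        tid  = omp_get_thread_num();
        team = omp_get_num_threads();
#endif
        /* Thread work: every thread touches the shared object. */
        comm->slots[ tid ] = tid + 1;

        /* Wait until all peers are done with comm. Without this barrier
           the chief could free comm while a peer is still writing to it. */
        #pragma omp barrier

        if ( tid == 0 )
        {
            /* Only the chief releases the shared object. */
            int all_done = 1;
            for ( int i = 0; i < team; ++i )
                if ( comm->slots[ i ] != i + 1 ) all_done = 0;
            ok = all_done;
            free( comm->slots );
            free( comm );
        }
    }
    return ok;
}
```

Calling `demo_run_group( 4 )` should return 1 regardless of how many threads the runtime actually grants, because the chief only inspects and frees the shared object after every thread in the team has passed the barrier.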
```diff
@@ -179,6 +179,13 @@ void bli_l3_thread_decorator
 #ifdef PRINT_THRINFO
     threads[tid] = thread;
 #else
+
+    // NOTE: The barrier here is very important as it prevents memory being
+    // released by the chief of some thread sub-group before its peers are done
+    // using it. See PR #702 for more info [1].
+    // [1] https://github.com/flame/blis/pull/702
+    bli_thread_barrier( thread );
+
     // Free the current thread's thrinfo_t structure.
     bli_l3_thrinfo_free( rntm_p, thread );
 #endif
```