Bugfix for parallel BLAS1 and BLAS2 routines. Threading information
was not being set correctly when initializing local rntm from global.
Also ensure th_rntm is initialized along with global_rntm by updating
it in bli_thread_init(), called by bli_init_once()
AMD-Internal: [CPUPL-2433]
Change-Id: Iba658f87ae13fe16a57ca1fc279e149b7fa294cf