Details:
- Fixed a bug in bli_acquire_mpart_t2b() and bli_acquire_mpart_l2r()
that cause incorrect partitioning when SUBPART0 was requested. This
bug was introduced in 46d3d09d49. Thanks to Bryan for isolating
this bug.
- Removed dupl kernels from kernels/x86_64/3 directory.
- Uncommented beta == 0 optimizaition code in
kernels/x86_64/3/bli_gemm_opt_d4x4.c.