Description:
Implemented sigmoid, tanh as fused post-ops in
aocl_gemm_bf16bf16f32o<f32|bf16) API's
Sigmoid(x) = 1/1+e^(-x)
Tanh(x) = (1-e^(-2x))/(1+e^(2x))
Updated bench_lpgemm to recognize sigmod, tanh
as options for post-ops from bench_input and verified.
AMD-Internal: [SWLCSG-3178]
Change-Id: I78a3ba4a67ab63f9d671fbe315f977b016a0d969