* add client example for elementwise_normalization
* run clang-format on elementwise_layernorm2d.cpp
* changed some naming to make it more understandable
* renamed input to ab_input
* fixed bug in threadwise_x_store
* add elementwise operation to reference