Fix#126
- Put `std::shared_ptr<SmDevice2DeviceSemaphore>` into the `SmChannel`
- add a `DeviceHandle` struct in `SmChannel`
- add `DeviceHandle` template
Users need to write code like this to use channel in device side:
```
using DeviceHandle = mscclpp::DeviceHandle<T>;
__device__ DeviceHandle<mscclpp::SimpleProxyChannel> channel;
__device__ DeviceHandle<mscclpp::SmChannel> smChannel;
```
To cover a channel to deviceHandle, need to call this function:
`mscclpp::deviceHandle(SimpleProxyChannel or SmChannel)`
---------
Co-authored-by: Changho Hwang <changhohwang@microsoft.com>