Azure’s Cognitive Services are a quick and easy way to add machine learning to many different types of applications. Available as REST APIs, they can be quickly hooked into your code, either directly or through simple asynchronous calls from their own dedicated SDKs and libraries. It doesn’t matter what language or platform you’re building on, as long as your code can make HTTP calls and parse JSON documents.
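To make the "HTTP calls and JSON" point concrete, here is a minimal Python sketch that uses only the standard library. It assumes the Text Analytics sentiment operation at the path `/text/analytics/v3.0/sentiment`, authenticated with the `Ocp-Apim-Subscription-Key` header. The endpoint, the key, and the helper names (`build_sentiment_request`, `parse_sentiments`, `analyze`) are illustrative placeholders rather than anything prescribed by Microsoft, so check the path and payload shape against the API version you are actually using.

```python
import json
import urllib.request


def build_sentiment_request(endpoint, key, texts, language="en"):
    """Build (but do not send) a POST request for a sentiment call.

    The URL path and header name are assumptions based on the
    Text Analytics v3.0 REST API; adjust them for your service.
    """
    url = endpoint.rstrip("/") + "/text/analytics/v3.0/sentiment"
    body = {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(texts, start=1)
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


def parse_sentiments(response_text):
    """Map each document id in the JSON response to its sentiment label."""
    payload = json.loads(response_text)
    return {doc["id"]: doc["sentiment"] for doc in payload.get("documents", [])}


def analyze(endpoint, key, texts):
    """Send the request and return {document id: sentiment}."""
    request = build_sentiment_request(endpoint, key, texts)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_sentiments(response.read().decode("utf-8"))
```

Calling `analyze("https://<your-resource>.cognitiveservices.azure.com", "<your-key>", ["Great service!"])` would send one document and return a dictionary keyed by document id. Building the request and parsing the response are kept separate from the network call so each can be checked without a live subscription.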
Not all applications have the luxury of a low-latency connection to Azure. That’s why Microsoft is rolling out an increasing number of its Cognitive Services as containers, for use on appropriate hardware that may only have intermittent connectivity. That often requires using systems with a relatively high-end GPU, as the neural nets that underlie the ML inferencing models require a lot of compute. Even so, with devices like Intel’s NUC9 hardware with an Nvidia Tesla-series GPU, those systems can be very small indeed.