Parallelizing across multiple CPUs/GPUs to speed up deep learning inference at the edge

AWS customers often choose to run machine learning (ML) inference at the edge to minimize latency. In many of these situations, ML predictions must be run on a large number of inputs independently. For example, an object detection model might be run on each frame of a video. In these cases, parallelizing the work across multiple CPUs and GPUs can significantly speed up overall inference.
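A minimal sketch of this pattern in Python: since each frame is independent, the per-frame work can be fanned out across a pool of worker processes (one per CPU core, or one per GPU). The `detect_objects` function here is a hypothetical stand-in for a real model call, not part of the original post.

```python
# Sketch: parallelize independent per-frame inference across processes.
# detect_objects() is a hypothetical placeholder for real model inference.
from concurrent.futures import ProcessPoolExecutor
from typing import List

def detect_objects(frame_id: int) -> List[str]:
    # Placeholder for real per-frame work: decode the frame,
    # run the detection model, and return the detected labels.
    return [f"object-in-frame-{frame_id}"]

def run_parallel(frame_ids: List[int], workers: int = 4) -> List[List[str]]:
    # Frames are independent, so they can be processed concurrently;
    # executor.map still returns results in input order.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(detect_objects, frame_ids))

if __name__ == "__main__":
    detections = run_parallel(list(range(8)), workers=4)
    for frame_id, labels in enumerate(detections):
        print(frame_id, labels)
```

In practice, the number of workers would be tuned to the available CPU cores or GPUs on the edge device.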
Shared by AWS Machine Learning, August 21, 2019