Friday, July 15, 2022

AWS Inferentia

AWS's vision is to make deep learning pervasive for everyday developers and to democratize access to cutting-edge infrastructure through a low-cost, pay-as-you-go usage model. AWS Inferentia is Amazon's first custom silicon designed to accelerate deep learning workloads and is part of a long-term strategy to deliver on this vision. AWS Inferentia is designed to provide high-performance inference in the cloud, to drive down the total cost of inference, and to make it easy for developers to integrate machine learning into their business applications.

The AWS Neuron software development kit (SDK) consists of a compiler, runtime, and profiling tools that help optimize the performance of workloads for AWS Inferentia. Developers can take complex neural network models that have been built and trained on popular frameworks such as TensorFlow, PyTorch, and MXNet and deploy them on AWS Inferentia-based Amazon EC2 Inf1 instances. You can continue to use the same ML frameworks you use today and migrate your models onto Inf1 with minimal code changes and without being tied to vendor-specific solutions.
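
To give a feel for what "minimal code changes" can look like, here is a rough sketch of compiling an already-trained PyTorch model for Inf1. It assumes the torch-neuron package from the Neuron SDK is installed, which is normally only the case on an Inf1 instance or a Neuron build environment. The helper names neuron_available and compile_for_inf1 are made up for this post and are not part of the SDK; the one SDK call used is torch.neuron.trace.

```python
import importlib.util


def neuron_available() -> bool:
    """Return True if the torch-neuron package can be imported here.

    Hypothetical helper for this post; it only checks that the package
    is installed, not that an Inferentia chip is attached.
    """
    return importlib.util.find_spec("torch_neuron") is not None


def compile_for_inf1(model, example_input):
    """Compile a trained PyTorch model for Inferentia (hypothetical helper).

    Imports are done lazily so this file still loads on machines
    without PyTorch or the Neuron SDK.
    """
    import torch
    import torch_neuron  # noqa: F401  (importing it registers torch.neuron)

    model.eval()  # inference mode: freezes dropout and batch-norm behavior
    # Ahead-of-time compile the model for the Neuron runtime, using the
    # example input to fix the input shape.
    return torch.neuron.trace(model, example_inputs=[example_input])
```

On a machine where neuron_available() returns True, you would call compile_for_inf1(model, example_input), save the result with its save() method, and load it on the Inf1 instance with torch.jit.load, since the compiled model is a TorchScript module. The rest of the inference code stays ordinary PyTorch.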
