At the re:Invent conference, AWS not only announced the SageMaker Studio machine learning integrated development environment, but also introduced the Inferentia chip. The chip, first announced last year, accelerates machine learning inference. With Inferentia, customers can run previously trained models with noticeably higher speed and at lower cost.
(Image source: AWS)
Andy Jassy, CEO of AWS, pointed out that while many businesses invest heavily in custom chips for model training, inference is still commonly run on conventional CPUs, where custom chips are significantly more efficient.
Inferentia enables AWS to deliver lower latency, triple the throughput, and up to 40% lower cost per inference than comparable G4 instances on EC2.
(Screenshot via AWS)
The new Inf1 instances deliver up to 2,000 TOPS, integrate with TensorFlow, PyTorch, and MXNet, and support the ONNX model format, which allows models to be migrated between frameworks.
Inf1 instances are currently available only through the EC2 compute service; AWS says it will soon add support for the SageMaker machine learning service and its container services.