Deploy SageMaker AI inference endpoints with reserved GPU capacity using training plans

Deploying large language models (LLMs) for inference requires reliable GPU capacity, especially during critical evaluation periods, limited-duration production testing, or burst workloads. Capacity constraints can delay deployments and impact application performance. Customers can use Amazon SageMaker AI training plans to reserve compute capacity for specified time periods. Originally designed…

Shared by AWS Machine Learning March 28, 2026

Introducing Amazon Polly Bidirectional Streaming: Real-time speech synthesis for conversational AI

Building natural conversational experiences requires speech synthesis that keeps pace with real-time interactions. Today, we’re excited to announce the new Bidirectional Streaming API for Amazon Polly, which enables real-time text-to-speech (TTS) synthesis in which you can send text and receive audio simultaneously. This new API is built for conversational…

Shared by AWS Machine Learning March 28, 2026