Managed Tiered KV Cache and Intelligent Routing for Amazon SageMaker HyperPod

Modern AI applications demand fast, cost-effective responses from large language models, especially when handling long documents or extended conversations. However, LLM inference can become prohibitively slow and expensive as context length increases, with latency growing steeply and costs mounting with each interaction. LLM inference requires recalculating attention mechanisms for...

Shared by AWS Machine Learning November 27, 2025

How Myriad Genetics achieved fast, accurate, and cost-efficient document processing using the AWS open-source Generative AI Intelligent Document Processing Accelerator

This post was written with Martyna Shallenberg and Brode Mccrady from Myriad Genetics. Healthcare organizations face challenges in processing and managing high volumes of complex medical documentation while maintaining quality in patient care. These organizations need solutions to process documents effectively to meet growing demands. Myriad Genetics, a provider of...

Shared by AWS Machine Learning November 27, 2025