Managed Tiered KV Cache and Intelligent Routing for Amazon SageMaker HyperPod

Modern AI applications demand fast, cost-effective responses from large language models, especially when handling long documents or extended conversations. However, LLM inference can become prohibitively slow and expensive as context length increases, with latency growing exponentially and costs mounting with each interaction. LLM inference requires recalculating attention mechanisms for

Shared by AWS Machine Learning November 27, 2025

How Myriad Genetics achieved fast, accurate, and cost-efficient document processing using the AWS open-source Generative AI Intelligent Document Processing Accelerator

This post was written with Martyna Shallenberg and Brode Mccrady from Myriad Genetics. Healthcare organizations face challenges in processing and managing high volumes of complex medical documentation while maintaining quality in patient care. These organizations need solutions to process documents effectively to meet growing demands. Myriad Genetics, a provider of

Shared by AWS Machine Learning November 27, 2025

Amazon SageMaker AI introduces EAGLE based adaptive speculative decoding to accelerate generative AI inference

Generative AI models continue to expand in scale and capability, increasing the demand for faster and more efficient inference. Applications need low latency and consistent performance without compromising output quality. Amazon SageMaker AI introduces new enhancements to its inference optimization toolkit that bring EAGLE based adaptive speculative decoding to

Shared by AWS Machine Learning November 26, 2025

Enhanced performance for Amazon Bedrock Custom Model Import

You can now achieve significant performance improvements when using Amazon Bedrock Custom Model Import, with reduced end-to-end latency, faster time-to-first-token, and improved throughput through advanced PyTorch compilation and CUDA graph optimizations. With Amazon Bedrock Custom Model Import, you can bring your own foundation models to Amazon Bedrock for

Shared by AWS Machine Learning November 26, 2025

Beyond the technology: Workforce changes for AI

Workplaces are increasingly integrating AI tools into daily operations, with AI assistants supporting teams, predictive analytics informing strategies, and automation streamlining workflows. AI has moved from experimental technology to standard business practice, changing how work gets done. Organizations need to understand what AI can do and how it affects

Shared by AWS Machine Learning November 26, 2025