Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI

As large language models (LLMs) grow in size and complexity, maximizing inference throughput while minimizing latency remains a critical challenge for enterprise production deployments. Speculative decoding is one effective strategy for addressing this: a lightweight draft model proposes several future tokens, which the target LLM then verifies in a single forward pass.
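
To make the draft-then-verify idea concrete, here is a minimal sketch of greedy speculative decoding in Python. It is a toy illustration only, not P-EAGLE and not a SageMaker AI API: the names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical, the "models" are plain functions over integer tokens, and the verification loop stands in for what a real system would do in one batched forward pass of the target model.

```python
from typing import Callable, List

# Assumption for illustration: a "model" is a function that maps a token
# prefix to its single greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model, prefix: List[int], k: int) -> List[int]:
    """Draft k tokens, keep those the target agrees with, then add one target token."""
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # Verify phase: a real system scores all k positions in one batched
    # forward pass of the target; this toy version checks them in a loop.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target(ctx)
        if tok != expected:
            # First disagreement: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every draft token was accepted, so the target contributes a bonus token.
    accepted.append(target(ctx))
    return accepted


def generate(target: Model, draft: Model, prompt: List[int],
             max_new_tokens: int, k: int = 4) -> List[int]:
    """Repeat speculative steps until max_new_tokens tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[:len(prompt) + max_new_tokens]


def toy_target(ctx: List[int]) -> int:
    # Hypothetical target model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    # Hypothetical draft model: agrees with the target except after token 4.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(generate(toy_target, toy_draft, prompt=[3], max_new_tokens=8))
```

With greedy verification, the output matches what the target model alone would generate; the draft model only changes how many target passes are needed, since each step yields between 1 and k + 1 tokens.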

Shared by AWS Machine Learning June 16, 2026

Introducing Gemma 4 models on Amazon Bedrock

Today, we are announcing the availability of the Gemma 4 family on Amazon Bedrock. Built by Google DeepMind and released under the Apache 2.0 license, Gemma 4 is a family of open-weight models designed with a focus on intelligence-per-parameter across a broad range of deployment scenarios. The family includes

Shared by AWS Machine Learning June 15, 2026

Optimize blueprint extraction accuracy in Amazon Bedrock Data Automation

Extracting structured data from unstructured documents such as invoices, contracts, tax forms, and enrollment applications is a common automation goal for organizations, but achieving high extraction accuracy remains a key challenge. Accuracy degrades when documents diverge from expected templates, formats vary across vendors, or scan quality is poor. With Amazon

Shared by AWS Machine Learning June 12, 2026