Announcing the new cluster creation experience for Amazon SageMaker HyperPod

Today, Amazon SageMaker HyperPod is announcing a new one-click, validated cluster creation experience that accelerates setup and prevents common misconfigurations, so you can launch your distributed training and inference clusters complete with Slurm or Amazon Elastic Kubernetes Service (Amazon EKS) orchestration, Amazon Virtual Private Cloud (Amazon VPC) networking, high-performance

Shared by AWS Machine Learning September 3, 2025

Document intelligence evolved: Building and evaluating KIE solutions that scale

Intelligent document processing (IDP) refers to the automated extraction, classification, and processing of data from various document formats, both structured and unstructured. Within the IDP landscape, key information extraction (KIE) serves as a fundamental component, enabling systems to identify and extract critical data points from documents with minimal human intervention.

Shared by AWS Machine Learning September 3, 2025

Deploy Amazon Bedrock Knowledge Bases using Terraform for RAG-based generative AI applications

Retrieval Augmented Generation (RAG) is a powerful approach for building generative AI applications by providing foundation models (FMs) access to additional, relevant data. This approach improves response accuracy and transparency while avoiding the potential cost and complexity of FM training or fine-tuning. Many customers use Amazon Bedrock Knowledge Bases

Shared by AWS Machine Learning September 3, 2025

Natural language-based database analytics with Amazon Nova

In this post, we explore how natural language database analytics can revolutionize the way organizations interact with their structured data through the power of large language model (LLM) agents. Natural language interfaces to databases have long been a goal in data management. Agents enhance database analytics by breaking down

Shared by AWS Machine Learning September 3, 2025

Build a serverless Amazon Bedrock batch job orchestration workflow using AWS Step Functions

As organizations increasingly adopt foundation models (FMs) for their artificial intelligence and machine learning (AI/ML) workloads, managing large-scale inference operations efficiently becomes crucial. Amazon Bedrock supports two general types of large-scale inference patterns: real-time inference and batch inference for use cases that involve processing massive datasets where immediate results

Shared by AWS Machine Learning September 3, 2025

Train and deploy models on Amazon SageMaker HyperPod using the new HyperPod CLI and SDK

Training and deploying large AI models requires advanced distributed computing capabilities, but managing these distributed systems shouldn’t be complex for data scientists and machine learning (ML) practitioners. The newly released command line interface (CLI) and software development kit (SDK) for Amazon SageMaker HyperPod simplify how you can use the

Shared by AWS Machine Learning September 3, 2025

Introducing auto scaling on Amazon SageMaker HyperPod

Today, we’re excited to announce that Amazon SageMaker HyperPod now supports managed node automatic scaling with Karpenter, so you can efficiently scale your SageMaker HyperPod clusters to meet your inference and training demands. Real-time inference workloads require automatic scaling to address unpredictable traffic patterns and maintain service level agreements

Shared by AWS Machine Learning August 30, 2025

Set up custom domain names for Amazon Bedrock AgentCore Runtime agents

When deploying AI agents to Amazon Bedrock AgentCore Runtime (currently in preview), customers often want to use custom domain names to create a professional and seamless experience. By default, AgentCore Runtime agents use endpoints like https://bedrock-agentcore.{region}.amazonaws.com/runtimes/{EncodedAgentARN}/invocations. In this post, we discuss how to transform these endpoints into user-friendly custom

Shared by AWS Machine Learning August 30, 2025

Detect Amazon Bedrock misconfigurations with Datadog Cloud Security

This post was co-written with Nick Frichette and Vijay George from Datadog. As organizations increasingly adopt Amazon Bedrock for generative AI applications, protecting against misconfigurations that could lead to data leaks or unauthorized model access becomes critical. The AWS Generative AI Adoption Index, which surveyed 3,739 senior IT decision-makers

Shared by AWS Machine Learning August 30, 2025