Building cost-effective RAG applications with Amazon Bedrock Knowledge Bases and Amazon S3 Vectors

Vector embeddings have become essential for modern Retrieval Augmented Generation (RAG) applications, but organizations face significant cost challenges as they scale. As knowledge bases grow and require more granular embeddings, many vector databases that rely on high-performance storage such as SSDs or in-memory solutions become prohibitively expensive. This cost

Shared by AWS Machine Learning July 18, 2025

Evaluating generative AI models with Amazon Nova LLM-as-a-Judge on Amazon SageMaker AI

Evaluating the performance of large language models (LLMs) goes beyond statistical metrics like perplexity or bilingual evaluation understudy (BLEU) scores. For most real-world generative AI scenarios, it’s crucial to understand whether a model is producing better outputs than a baseline or an earlier iteration. This is especially important for

Shared by AWS Machine Learning July 18, 2025

Manage multi-tenant Amazon Bedrock costs using application inference profiles

Successful generative AI software as a service (SaaS) systems require a balance between service scalability and cost management. This becomes critical when building a multi-tenant generative AI service designed to serve a large, diverse customer base while maintaining rigorous cost controls and comprehensive usage monitoring. Traditional cost management approaches

Shared by AWS Machine Learning July 18, 2025

Deploy a full stack voice AI agent with Amazon Nova Sonic

AI-powered speech solutions are transforming contact centers by enabling natural conversations between customers and AI agents, shortening wait times, and dramatically reducing operational costs, all without sacrificing the human-like interaction customers expect. With the recent launch of Amazon Nova Sonic in Amazon Bedrock, you can now build sophisticated conversational AI

Shared by AWS Machine Learning July 18, 2025

Build real-time travel recommendations using AI agents on Amazon Bedrock

Generative AI is transforming how businesses deliver personalized experiences across industries, including travel and hospitality. Travel agents are enhancing their services by offering personalized holiday packages, carefully curated for customers’ unique preferences, including accessibility needs, dietary restrictions, and activity interests. Meeting these expectations requires a solution that combines comprehensive

Shared by AWS Machine Learning July 18, 2025

Building enterprise-scale RAG applications with Amazon S3 Vectors and DeepSeek R1 on Amazon SageMaker AI

Organizations are adopting large language models (LLMs), such as DeepSeek R1, to transform business processes, enhance customer experiences, and drive innovation at unprecedented speed. However, standalone LLMs have key limitations such as hallucinations, outdated knowledge, and no access to proprietary data. Retrieval Augmented Generation (RAG) addresses these gaps by

Shared by AWS Machine Learning July 17, 2025

Implementing on-demand deployment with customized Amazon Nova models on Amazon Bedrock

Amazon Bedrock offers model customization capabilities for customers to tailor versions of foundation models (FMs) to their specific needs through features such as fine-tuning and distillation. Today, we’re announcing the launch of on-demand deployment for customized models ready to be deployed on Amazon Bedrock. On-demand deployment for customized models

Shared by AWS Machine Learning July 17, 2025