Achieve up to ~2x higher throughput while reducing costs by ~50% for generative AI inference on Amazon SageMaker with the new inference optimization toolkit – Part 1

Today, Amazon SageMaker announced a new inference optimization toolkit that helps you reduce the time it takes to optimize generative artificial intelligence (AI) models from months to hours, to achieve best-in-class performance for your use case. With this new capability, you can choose from a menu of optimization techniques,

Shared by AWS Machine Learning July 10, 2024

Achieve up to ~2x higher throughput while reducing costs by up to ~50% for generative AI inference on Amazon SageMaker with the new inference optimization toolkit – Part 2

As generative artificial intelligence (AI) inference becomes increasingly critical for businesses, customers are seeking ways to scale their generative AI operations or integrate generative AI models into existing workflows. Model optimization has emerged as a crucial step, allowing organizations to balance cost-effectiveness and responsiveness, improving productivity. However, price-performance requirements

Shared by AWS Machine Learning July 10, 2024

Empowering everyone with GenAI to rapidly build, customize, and deploy apps securely: Highlights from the AWS New York Summit

Imagine this: all employees relying on generative artificial intelligence (AI) to get their work done faster, every task becoming less mundane and more innovative, and every application providing a more useful, personal, and engaging experience. To realize this future, organizations need more than a single, powerful large language model (LLM)

Shared by AWS Machine Learning July 10, 2024

Streamline generative AI development in Amazon Bedrock with Prompt Management and Prompt Flows (preview)

Today, we’re excited to introduce two powerful new features for Amazon Bedrock: Prompt Management and Prompt Flows, in public preview. These features are designed to accelerate the development, testing, and deployment of generative artificial intelligence (AI) applications, enabling developers and business users to create more efficient and effective solutions

Shared by AWS Machine Learning July 10, 2024

The Weather Company enhances MLOps with Amazon SageMaker, AWS CloudFormation, and Amazon CloudWatch

This blog post is co-written with Qaish Kanchwala from The Weather Company. As industries begin adopting processes dependent on machine learning (ML) technologies, it is critical to establish machine learning operations (MLOps) that scale to support growth and utilization of this technology. MLOps practitioners have many options to establish

Shared by AWS Machine Learning July 8, 2024