Introducing Fast Model Loader in SageMaker Inference: Accelerate autoscaling for your Large Language Models (LLMs) – part 1

The generative AI landscape has been rapidly evolving, with large language models (LLMs) at the forefront of this transformation. These models have grown exponentially in size and complexity, with some now containing hundreds of billions of parameters and requiring hundreds of gigabytes of memory. As LLMs continue to expand, …

Read More
Shared by AWS Machine Learning December 3, 2024

AWS DeepRacer: How to master physical racing?

As developers gear up for re:Invent 2024, they again face the unique challenges of physical racing. What are the obstacles? Let's have a look. In this blog post, I will look at what makes physical AWS DeepRacer racing—a real car on a real track—different from racing in the virtual …

Read More
Shared by AWS Machine Learning December 2, 2024

Easily deploy and manage hundreds of LoRA adapters with SageMaker efficient multi-adapter inference

The new efficient multi-adapter inference feature of Amazon SageMaker unlocks exciting possibilities for customers using fine-tuned models. This capability integrates with SageMaker inference components to allow you to deploy and manage hundreds of fine-tuned Low-Rank Adaptation (LoRA) adapters through SageMaker APIs. Multi-adapter inference handles the registration of fine-tuned adapters …

Read More
Shared by AWS Machine Learning November 30, 2024

Create a generative AI assistant with Slack and Amazon Bedrock

Seamless integration of customer experience, collaboration tools, and relevant data is the foundation for delivering knowledge-based productivity gains. In this post, we show you how to integrate the popular Slack messaging service with AWS generative AI services to build a natural language assistant where business users can ask questions …

Read More
Shared by AWS Machine Learning November 28, 2024