Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs

Multi-model endpoints (MMEs) are a powerful feature of Amazon SageMaker designed to simplify the deployment and operation of machine learning (ML) models. With MMEs, you can host multiple models on a single serving container and host all the models behind a single endpoint. The SageMaker platform automatically manages the

Shared by AWS Machine Learning September 6, 2023

Fine-tune Llama 2 for text generation on Amazon SageMaker JumpStart

Today, we are excited to announce the capability to fine-tune Llama 2 models by Meta using Amazon SageMaker JumpStart. The Llama 2 family of large language models (LLMs) is a collection of pre-trained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Fine-tuned

Shared by AWS Machine Learning September 6, 2023

Intelligently search Adobe Experience Manager content using Amazon Kendra

Amazon Kendra is an intelligent search service powered by machine learning (ML). With Amazon Kendra, you can easily aggregate content from a variety of content repositories into an index that lets you quickly search all your enterprise data and find the most accurate answer. Adobe Experience Manager (AEM) is

Shared by AWS Machine Learning September 6, 2023

Build a secure enterprise application with Generative AI and RAG using Amazon SageMaker JumpStart

Generative AI is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. It’s powered by large language models (LLMs) that are pre-trained on vast amounts of data and commonly referred to as foundation models (FMs). With the advent of these

Shared by AWS Machine Learning September 6, 2023

Optimize deployment cost of Amazon SageMaker JumpStart foundation models with Amazon SageMaker asynchronous endpoints

The success of generative AI applications across a wide range of industries has attracted the attention and interest of companies worldwide who are looking to reproduce and surpass the achievements of competitors or solve new and exciting use cases. These customers are looking into foundation models, such as TII

Shared by AWS Machine Learning September 5, 2023

How Carrier predicts HVAC faults using AWS Glue and Amazon SageMaker

In their own words, “In 1902, Willis Carrier solved one of mankind’s most elusive challenges of controlling the indoor environment through modern air conditioning. Today, Carrier products create comfortable environments, safeguard the global food supply, and enable safe transport of vital medical supplies under exacting conditions.” At Carrier, the

Shared by AWS Machine Learning September 5, 2023

Build a generative AI-based content moderation solution on Amazon SageMaker JumpStart

Content moderation plays a pivotal role in maintaining online safety and upholding the values and standards of websites and social media platforms. Its significance is underscored by the protection it provides users from exposure to inappropriate content, safeguarding their well-being in digital spaces. For example, in the advertising industry,

Shared by AWS Machine Learning September 5, 2023

FMOps/LLMOps: Operationalize generative AI and differences with MLOps

Nowadays, most of our customers are excited about large language models (LLMs) and are thinking about how generative AI could transform their business. However, bringing such solutions and models into business-as-usual operations is not an easy task. In this post, we discuss how to operationalize generative AI applications using

Shared by AWS Machine Learning September 1, 2023

Elevating the generative AI experience: Introducing streaming support in Amazon SageMaker hosting

We’re excited to announce the availability of response streaming through Amazon SageMaker real-time inference. Now you can continuously stream inference responses back to the client when using SageMaker real-time inference to help you build interactive experiences for generative AI applications such as chatbots, virtual assistants, and music generators. With

Shared by AWS Machine Learning September 1, 2023

Deploy self-service question answering with the QnABot on AWS solution powered by Amazon Lex with Amazon Kendra and large language models

Powered by Amazon Lex, the QnABot on AWS solution is an open-source, multi-channel, multi-language conversational chatbot. QnABot allows you to quickly deploy self-service conversational AI into your contact center, websites, and social media channels, reducing costs, shortening hold times, and improving customer experience and brand sentiment. Customers now want

Shared by AWS Machine Learning August 31, 2023