Benchmark and optimize endpoint deployment in Amazon SageMaker JumpStart

When deploying a large language model (LLM), machine learning (ML) practitioners typically care about two measurements for model serving performance: latency, defined by the time it takes to generate a single token, and throughput, defined by the number of tokens generated per second. Although a single request to the

Shared by AWS Machine Learning January 29, 2024

Architect defense-in-depth security for generative AI applications using the OWASP Top 10 for LLMs

Generative artificial intelligence (AI) applications built around large language models (LLMs) have demonstrated the potential to create and accelerate economic value for businesses. Examples of applications include conversational search, customer support agent assistance, customer support analytics, self-service virtual assistants, chatbots, rich media generation, content moderation, coding companions to accelerate

Shared by AWS Machine Learning January 27, 2024

Deploy a Microsoft Teams gateway for Amazon Q, your business expert

Amazon Q is a new generative AI-powered application that helps users get work done. Amazon Q can become your tailored business expert and let you discover content, brainstorm ideas, or create summaries using your company’s data safely and securely. You can use Amazon Q to have conversations, solve problems,

Shared by AWS Machine Learning January 25, 2024

Build enterprise-ready generative AI solutions with Cohere foundation models in Amazon Bedrock and Weaviate vector database on AWS Marketplace

Generative AI solutions have the potential to transform businesses by boosting productivity and improving customer experiences, and using large language models (LLMs) with these solutions has become increasingly popular. Building proofs of concept is relatively straightforward because cutting-edge foundation models are available from specialized providers through a simple API

Shared by AWS Machine Learning January 24, 2024

Build a vaccination verification solution using the Queries feature in Amazon Textract

Amazon Textract is a machine learning (ML) service that enables automatic extraction of text, handwriting, and data from scanned documents, surpassing traditional optical character recognition (OCR). It can identify, understand, and extract data from tables and forms with remarkable accuracy. Presently, several companies rely on manual extraction methods or

Shared by AWS Machine Learning January 22, 2024

Reduce inference time for BERT models using neural architecture search and SageMaker Automated Model Tuning

In this post, we demonstrate how to use neural architecture search (NAS) based structural pruning to compress a fine-tuned BERT model to improve model performance and reduce inference times. Pre-trained language models (PLMs) are undergoing rapid commercial and enterprise adoption in the areas of productivity tools, customer service, search

Shared by AWS Machine Learning January 19, 2024

Use mobility data to derive insights using Amazon SageMaker geospatial capabilities

Geospatial data is data about specific locations on the earth’s surface. It can represent a geographical area as a whole or it can represent an event associated with a geographical area. Analysis of geospatial data is sought after in a few industries. It involves understanding where the data exists

Shared by AWS Machine Learning January 18, 2024

Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium

Today, we’re excited to announce the availability of Llama 2 inference and fine-tuning support on AWS Trainium and AWS Inferentia instances in Amazon SageMaker JumpStart. Using AWS Trainium and Inferentia based instances, through SageMaker, can help users lower fine-tuning costs by up to 50%, and lower deployment costs by

Shared by AWS Machine Learning January 18, 2024

Host the Whisper Model on Amazon SageMaker: exploring inference options

OpenAI Whisper is an advanced automatic speech recognition (ASR) model with an MIT license. ASR technology finds utility in transcription services, voice assistants, and enhancing accessibility for individuals with hearing impairments. This state-of-the-art model is trained on a vast and diverse dataset of multilingual and multitask supervised data collected

Shared by AWS Machine Learning January 17, 2024

Ball position tracking in the cloud with the PGA TOUR

The PGA TOUR continues to enhance the golf experience with real-time data that brings fans closer to the game. To deliver even richer experiences, they are pursuing the development of a next-generation ball position tracking system that automatically tracks the position of the ball on the green. The TOUR

Shared by AWS Machine Learning January 12, 2024