Get started with the open-source Amazon SageMaker Distribution

Favorite Data scientists need a consistent and reproducible environment for machine learning (ML) and data science workloads that enables managing dependencies and is secure. AWS Deep Learning Containers already provides pre-built Docker images for training and serving models in common frameworks such as TensorFlow, PyTorch, and MXNet. To improve this

Read More
Shared by AWS Machine Learning June 9, 2023

Technology Innovation Institute trains the state-of-the-art Falcon LLM 40B foundation model on Amazon SageMaker

Favorite This blog post is co-written with Dr. Ebtesam Almazrouei, Executive Director–Acting Chief AI Researcher of the AI-Cross Center Unit and Project Lead for LLM Projects at TII. United Arab Emirate’s (UAE) Technology Innovation Institute (TII), the applied research pillar of Abu Dhabi’s Advanced Technology Research Council, has launched Falcon

Read More
Shared by AWS Machine Learning June 8, 2023

Announcing enhanced table extractions with Amazon Textract

Favorite Amazon Textract is a machine learning (ML) service that automatically extracts text, handwriting, and data from any document or image. Amazon Textract has a Tables feature within the AnalyzeDocument API that offers the ability to automatically extract tabular structures from any document. In this post, we discuss the improvements

Read More
Shared by AWS Machine Learning June 8, 2023

Accelerate PyTorch with DeepSpeed to train large language models with Intel Habana Gaudi-based DL1 EC2 instances

Favorite Training large language models (LLMs) with billions of parameters can be challenging. In addition to designing the model architecture, researchers need to set up state-of-the-art training techniques for distributed training like mixed precision support, gradient accumulation, and checkpointing. With large models, the training setup is even more challenging because

Read More
Shared by AWS Machine Learning June 8, 2023

Build machine learning-ready datasets from the Amazon SageMaker offline Feature Store using the Amazon SageMaker Python SDK

Favorite Amazon SageMaker Feature Store is a purpose-built service to store and retrieve feature data for use by machine learning (ML) models. Feature Store provides an online store capable of low-latency, high-throughput reads and writes, and an offline store that provides bulk access to all historical record data. Feature Store

Read More
Shared by AWS Machine Learning June 7, 2023

Arrange your transcripts into paragraphs with Amazon Transcribe

Favorite Amazon Transcribe is a speech recognition service that generates transcripts from video and audio files in multiple supported languages and accents. It comes with a rich set of features, including automatic language identification, multi-channel and multi-speaker support, custom vocabularies, and transcript redaction. Amazon Transcribe supports two modes of operation:

Read More
Shared by AWS Machine Learning June 7, 2023