Creating Amazon SageMaker Studio domains and user profiles using AWS CloudFormation
Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). It provides a single, web-based visual interface where you can perform all ML development steps required to build, train, tune, debug, deploy, and monitor models. In this post, we demonstrate how you can create a SageMaker Studio domain and user profile using AWS CloudFormation. AWS CloudFormation gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycle by treating infrastructure as code.
Because AWS CloudFormation isn’t natively integrated with SageMaker Studio at the time of this writing, we use AWS CloudFormation to provision two AWS Lambda functions and then invoke these functions to create, delete, and update the Studio domain and user profile. In the rest of this post, we walk through the Lambda function to create the Studio domain (the code for creating a Studio user profile works similarly) and then the CloudFormation template. All the code is available in the GitHub repo.
Lambda function for creating, deleting, and updating a Studio domain
In the Lambda function, the
lambda_handler calls one of the three functions,
handle_create, handle_update, and
handle_delete, to create, update, and delete the Studio domain, respectively. Because we invoke this function using an AWS CloudFormation custom resource, the custom resource request type is sent in the
RequestType field from AWS CloudFormation.
RequestType determines which function to call inside the
lambda_handler function. For example, when AWS CloudFormation detects any changes in the
custom::StudioDomain section of our CloudFormation template, the
RequestType is set to Update by AWS CloudFormation, and the
handle_update function is called. The following is the
The three functions for creating, updating, and deleting the domain work similarly. For this post, we walk through the code responsible for creating a domain. When invoking the Lambda function through an AWS CloudFormation custom resource, we pass key parameters that help define our Studio domain via the custom resource
Properties. We extract these parameters from the AWS CloudFormation event source in the Lambda function. In the
handle_create function, parameters are read in from the event and passed on to the
create_studio_domain function. See the following code for
We use a boto3 SageMaker client to create Studio domains. For this post, we set the domain name, the VPC and subnet that Studio uses, and the SageMaker execution role for the Studio domain. After the
create_domain API is made, we check the creation status every 5 seconds. When creation is complete, we return the Amazon Resource Name (ARN) and the URL of the created domain. The amount of time that Lambda allows a function to run before stopping it is 3 seconds by default. Therefore, make sure that the timeout limit of your Lambda function is set appropriately. We set the timeout limit to 900 seconds. The following is the
create_studio_domain code (the functions for deleting and updating domains are also implemented using boto3 and constructed in a similar fashion):
Finally, we zip the Python script, save it as
domain_function.zip, and upload it to Amazon Simple Storage Service (Amazon S3).
The Lambda function used for creating a user profile is constructed similarly. For more information, see the
UserProfile_function.py script in the GitHub repo.
In the CloudFormation template, we create an execution role for Lambda, an execution role for SageMaker Studio, and the Lambda function using the code explained in the previous section. We invoke this function by specifying it as the target of a customer resource. For more information about invoking a Lambda function with AWS CloudFormation, see Using AWS Lambda with AWS CloudFormation.
Lambda execution role
This role gives our Lambda function the permission to create an Amazon CloudWatch Logs stream and write logs to CloudWatch. Because we create, delete, and update Studio domains in our function, we also grant this role the permission to do so. See the following code:
SageMaker execution role
The following SageMaker execution role is attached to Studio (for demonstration purposes, we grant this role SageMakerFullAccess):
AWS::Lambda::Function resource creates a Lambda function. To create a function, we need a deployment package and an execution role. The deployment package contains our function code (
function.zip). The execution role, which is the
LambdaExecutionRole created from previous step, grants the function permission to create a Lambda function. We also added the
CfnResponseLayer to our function’s execution environment.
CfnResponseLayer enables the function to interact with an AWS CloudFormation custom resource. It contains a send method to send responses from Lambda to AWS CloudFormation. See the following code:
Invoking the Lambda function using an AWS CloudFormation custom resource
Custom resources provide a way for you to write custom provisioning logic in a CloudFormation template and have AWS CloudFormation run it during a stack operation, such as when you create, update, or delete a stack. For more information, see Custom resources. We get the Lambda function’s ARN created from previous step and pass it to AWS CloudFormation as our service token. This allows AWS CloudFormation to invoke the Lambda function. We pass parameters required for creating, updating, and deleting our domain under
Properties. See the following code:
In the same fashion, we invoke the Lambda function for creating a user profile:
In this post, we walked through the steps of creating, deleting, and updating SageMaker Studio domains using AWS CloudFormation and Lambda. The sample files are available in the GitHub repo. For information about creating Studio domain inside a VPC, see Securing Amazon SageMaker Studio connectivity using a private VPC. For more information about SageMaker Studio, see Get Started with Amazon SageMaker Studio.
About the Authors
Qingwei Li is a Machine Learning Specialist at Amazon Web Services. He received his Ph.D. in Operations Research after he broke his advisor’s research grant account and failed to deliver the Nobel Prize he promised. Currently he helps customers in the financial service and insurance industry build machine learning solutions on AWS. In his spare time, he likes reading and teaching.
Joseph Jegan is a Cloud Application Architect at Amazon Web Services. He helps AWS customers use AWS services to design scalable and secure applications. He has over 20 years of software development experience prior to AWS, working on developing e-commerce platform for large retail customers. He is based out of New York metro and enjoys learning emerging cloud native technologies.
David Ping is a Principal Machine Learning Solutions Architect and Sr. Manager of AI/ML Solutions Architecture at Amazon Web Services. He helps enterprise customers build and operate machine learning solutions on AWS. In his spare time, David enjoys hiking and reading the latest machine learning articles.