reCAPTCHA WAF Session Token

Deploying Your Machine Learning Model to Production in the Cloud

Image by Editor


AWS, or Amazon Web Services, is a cloud computing service used in many businesses for storage, analytics, applications, deployment services, and many others. It’s a platform utilizes several services to support business in a serverless way with pay-as-you-go schemes.

Machine learning modeling activity is also one of the activities that AWS supports. With several services, modeling activities can be supported, such as developing the model to making it into production. AWS has shown versatility, which is essential for any business that needs scalability and speed.

This article will discuss deploying a machine learning model in the AWS cloud into production. How could we do that? Let’s explore further.



Before you start this tutorial, you need to create an AWS account, as we would need them to access all the AWS services. I assume that the reader would use the free tier to follow this article.  Additionally, I assume the reader already knows how to use Python programming language and has basic knowledge of machine learning.  Also, we will focus on the model deployment part and will not concentrate on other aspects of data science activity, such as data preprocessing and model evaluation.

With that in mind, we will start our journey of deploying your machine learning model in the AWS Cloud services.



In this tutorial, we will develop a machine-learning model to predict churn from the given data. The training dataset is acquired from Kaggle, which you can download here.

After we have acquired the dataset, we would create an S3 bucket to store the dataset. Search the S3 in the AWS services and make the bucket.


Deploying Your ML Model to Production in the Cloud
Image by Author


In this article, I named the bucket “telecom-churn-dataset” and located in Singapore. You can change them if you want, but let’s go with this one for now.

After you have finished creating the bucket and uploading the data into your bucket, we will go to the AWS SageMaker service. In this service, we will use the Studio as our working environment. If you have never used the Studio, let’s create a domain and user before proceeding further.

First, choose the Domains within the Amazon SageMaker Admin configurations.


Deploying Your ML Model to Production in the Cloud
Image by Author


In the Domains, you would see a lot of buttons to select. In this screen, select the Create domain button. 


Deploying Your ML Model to Production in the Cloud
Image by Author


Choose the quick setup if you want to speed up the creation process. After it’s finished, you should see a new domain created in the dashboard. Select the new domain you just created and then click the Add user button.


Deploying Your ML Model to Production in the Cloud
Image by Author


Next, you should name the user profile according to your preferences. For the execution role, you can leave it on default for now, as it’s the one that was created during the Domain creation process.


Deploying Your ML Model to Production in the Cloud
Image by Author


Just click next until the canvas setting. In this section, I turn off several settings that we don’t need, such as Time Series Forecasting. 

After everything is set, go to the studio selection and select the Open studio button with the user name you just created.


Deploying Your ML Model to Production in the Cloud
Image by Author


Inside the Studio, navigate to the sidebar that looks like a folder icon and create a new notebook there. We can let them by default, like the image below.


Deploying Your ML Model to Production in the Cloud
Image by Author


With the new notebook, we would work to create a churn prediction model and deploy the model into API inferences that we can use in production.

First, let’s import the necessary package and read the churn data.

<code>import boto3
import pandas as pd
import sagemaker

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

df = pd.read_csv('s3://telecom-churn-dataset/telecom_churn.csv')</code>


Deploying Your ML Model to Production in the Cloud
Image by Author


Next, we would split the data above into training data and testing data with the following code.

<code>from sklearn.model_selection import train_test_split

train, test = train_test_split(df, test_size = 0.3, random_state = 42)</code>


We set the test data to be 30% of the original data. With our data split, we would upload them back into the S3 bucket.


train.to_csv(f's3://{bucket}/telecom_churn_train.csv', index = False)
test.to_csv(f's3://{bucket}/telecom_churn_test.csv', index = False)</code>


You can see the data inside your S3 bucket, which currently consists of three different datasets.


Deploying Your ML Model to Production in the Cloud
Image by Author


With our dataset ready, we would now develop a churn prediction model and deploy them. In the AWS, we often use a script training method for machine learning training. That’s why we would develop a script before starting the training.

For the next step, we need to create an additional Python file, which I called, in the same folder.


Deploying Your ML Model to Production in the Cloud
Image by Author


Inside this file, we would set our model development process to create the churn model. For this tutorial, I would adopt some code from Ram Vegiraju.

First, we would import all the necessary packages for developing the model.

<code>import argparse
import os
import io
import boto3
import json
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import joblib</code>


Next, we would use the parser method to control the variable that we can input into our training process. The overall code that we would put in our script to train our model is in the code below.

<code>if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    parser.add_argument('--estimator', type=int, default=10)
    parser.add_argument('--sm-model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
    parser.add_argument('--model_dir', type=str)
    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
    args, _ = parser.parse_known_args()
    estimator = args.estimator
    model_dir = args.model_dir
    sm_model_dir = args.sm_model_dir
    training_dir = args.train

    s3_client = boto3.client('s3')

    obj = s3_client.get_object(Bucket=bucket, Key='telecom_churn_train.csv')
    train_data = pd.read_csv(io.BytesIO(obj['Body'].read()))
    obj = s3_client.get_object(Bucket=bucket, Key='telecom_churn_test.csv')
    test_data = pd.read_csv(io.BytesIO(obj['Body'].read()))
    X_train = train_data.drop('Churn', axis =1)
    X_test = test_data.drop('Churn', axis =1)
    y_train = train_data['Churn']
    y_test = test_data['Churn']
    rfc = RandomForestClassifier(n_estimators=estimator), y_train)
    y_pred = rfc.predict(X_test)
    print('Accuracy Score: ',accuracy_score(y_test, y_pred))
    joblib.dump(rfc, os.path.join(args.sm_model_dir, "rfc_model.joblib"))</code>


Lastly,  we need to put four different functions that SageMaker requires to make inferences: model_fn, input_fn, output_fn, and predict_fn.

<code>#Deserialized model to load them

def model_fn(model_dir):
    model = joblib.load(os.path.join(model_dir, "rfc_model.joblib"))
    return model
#The request input of the application
def input_fn(request_body, request_content_type):
    if request_content_type == 'application/json':
        request_body = json.loads(request_body)
        inp_var = request_body['Input']
        return inp_var
        raise ValueError("This model only supports application/json input")
#The prediction functions
def predict_fn(input_data, model):
    return model.predict(input_data)

#The output function
def output_fn(prediction, content_type):
    res = int(prediction[0])
    resJSON = {'Output': res}
    return resJSON</code>


With our script ready, we would run the training process. In the next step, we would pass the script we created above into the SKLearn estimator. This estimator is a Sagemaker object that would handle the entire training process, and we would only need to pass all the parameters similar to the code below.

<code>from sagemaker.sklearn import SKLearn

sklearn_estimator = SKLearn(entry_point="", 
                              'estimator': 15})</code>


If the training is successful, you will end up with the following report.


Deploying Your ML Model to Production in the Cloud
Image by Author


If you want to check the Docker image for the SKLearn training and your model artifact location, you can access them using the following code.

<code>model_artifact = sklearn_estimator.model_data
image_uri = sklearn_estimator.image_uri

print(f'The model artifact is saved at: {model_artifact}')
print(f'The image URI is: {image_uri}')</code>


With the model in place, we would then deploy the model into an API endpoint that we can use for prediction. To do that, we can use the following code.

<code>import time

churn_endpoint_name="churn-rf-model-"+time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())



If the deployment is successful, the model endpoint is created, and you can access it to create a prediction. You can also see the endpoint in the Sagemaker dashboard.


Deploying Your ML Model to Production in the Cloud
Image by Author


You can now make predictions with this endpoint. To do that, you can test the endpoint with the following code.

<code>client = boto3.client('sagemaker-runtime')
content_type = "application/json"

#replace with your intended input data
request_body = {"Input": [[128,1,1,2.70,1,265.1,110,89.0, 9.87,10.0]]}

#replace with your endpoint name
endpoint_name = "churn-rf-model-2023-09-24-12-29-04" 
#Data serialization
data = json.loads(json.dumps(request_body))
payload = json.dumps(data)

#Invoke the endpoint
response = client.invoke_endpoint(
result = json.loads(response['Body'].read().decode())['Output']


Congratulation. You have now successfully deployed your model in the AWS Cloud. After you have finished the testing process, don’t forget to clean up the endpoint. You can use the following code to do that.

<code>from sagemaker import Session

sagemaker_session = Session()


Don’t forget to shut down the instance you use and clean up the S3 storage if you don’t need it anymore.

For further reading, you can read more about the SKLearn estimator and Batch Transform inferences if you prefer to not have an endpoint model.



AWS Cloud platform is a multi-purpose platform that many companies use to support their business. One of the services often used is for data analytic purposes, especially model production. In this article, we learn to use AWS SageMaker and how to deploy the model into the endpoint.
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and Data tips via social media and writing media.

Leave a Reply

Your email address will not be published. Required fields are marked *

WP Twitter Auto Publish Powered By :