DEPLOY MLFLOW MODEL TO A BATCH ENDPOINT

Ksama Arora

(I) DEPLOY MLFLOW MODEL TO A BATCH ENDPOINT*

Note: When you deploy Mlflow models to Batch endpoint, a Scoring Script and Environment will be automatically generated.

REGISTER AN MLFLOW MODEL:
To avoid needing scoring script and environment, Mlflow needs to be registered in Azure ML workspace before you can deploy to batch endpoint.

from azure.ai.ml.entities import Model # Model class used
from azure.ai.ml.constants import AssetTypes

# We are taking model files from local path. The files are stored in local folder called model.
model_name = 'mlflow-model' # model type as mlflow_model
model = ml_client.models.create_or_update(
    Model(name=model_name, path='./model', type=AssetTypes.MLFLOW_MODEL)
)

DEPLOY AN MLFLOW MODEL TO A BATCH ENDPOINT

Note: Advantage of using compute cluster is you can run scoring script on separate instances in parallel

To deploy Mlflow model to a batch endpoint, use code:

from azure.ai.ml.entities import BatchDeployment, BatchRetrySettings
from azure.ai.ml.constants import BatchDeploymentOutputAction

deployment = BatchDeployment( # use BatchDeployment class
    name="forecast-mlflow",
    description="A sales forecaster",
    endpoint_name=endpoint.name,
    model=model,
    compute="aml-cluster",
    instance_count=2,
    max_concurrency_per_instance=2,
    mini_batch_size=2,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)
ml_client.batch_deployments.begin_create_or_update(deployment)

IMP NOTE:
Configuration Setting for batch deployment:
mini_batch_size: number of records processed each time the run script is invoked
instance_count: number of compute nodes the batch job will use
output_action: determine how output should be treated, e.g. append_row or summary_only
scoring_script: used to configure path of scoring_script
max_concurrency_per_instance: Maximum number of parallel scoring script runs per compute node.
output_file_name: File to which predictions will be appended, if you choose append_row for output_action. A batch endpoint can be configured to take as input of a file and produce as output file.

Note: For the pipeline to run scoring script on multiple nodes and collate results, use apend_row to append each prediction to one output file

Two ways to add deployment to batch endpoint: