When compared to data scientists, traditional software developers have it easy. The tooling is pretty much there, the patterns are pretty much there too, it’s easy[^1] to decide whether or not an idea is feasible, and the list goes on.
With machine learning, however, things aren’t so clear-cut. We’re still deciding on the best ways to track and version everything, what patterns to use when developing (pro tip: SOLID is a solid choice), the right ways to deploy and monitor our models, etc. It’s a rapidly evolving field.
For example, one thing I really enjoy when doing traditional software development is the very straightforward way to do automatic deployments. You simply set up an Azure or GitHub pipeline, link it to your cloud of choice, push to `main`, and there you go, the build’s spinning away and away and away until everything’s compiled, transpiled, minified, whatever, until it’s up and running on that shiny cluster you’ve got high up in the azure sky.
This is a story about doing something similar for machine learning models. After reading it, you’ll be able to create a continuous deployment pipeline for Azure ML pipelines using Azure DevOps. Every time somebody checks in anything, your pipeline will be updated to contain the latest changes, without you having to do anything except a `git push`. It’ll be awesome.
The story begins with a dream.
A Dream of Pipelines
You’ve just finished reading a cool article on online versus offline scoring, and are firmly in the “offline” camp. You’re eager to create your own data processing and model training pipeline, heck, maybe you’ve created one already. You’ve scheduled it to run regularly, maybe weekly, maybe daily, maybe more often. You’ve manually deployed it to prod, and all is well with the world.
You sit down and make yourself a foam-free latte, and all continues to go well for a while, exactly up to the point when somebody comes to you and asks for a change.
Change is good, of course, and you’re all up for change, change being the essence of existence and all that. In your case though, change is no fun, no fun indeed. Change means you’ll have to track down the pipeline in the Azure ML workspace, disable it, disable its schedule too, and then run the scripts to create the new, updated pipeline in all its glory. All of this pretty much manually, of course.
Even if it’s no big deal the first, second, maybe third time this happens, the hundredth time might be a bit of an annoyance. You figure it may be worth automating some of it away.
CD for Azure ML Pipelines
Before you begin, it’s worth considering exactly what to automate. A simple guideline is to look at the manual steps you used to perform, and turn them into a script.
Something like this:
schedule = find_existing_schedule(schedule_name, workspace)
disable_existing_pipeline(schedule, workspace)
disable_existing_schedule(schedule)
create_new_pipeline(pipeline_name, schedule_name, experiment_name, compute_name, workspace)
It starts by getting a reference to the previous schedule, and uses that reference to disable both the pipeline and the schedule itself. Once that’s done, it creates a new version of the pipeline. Sadly, we don’t currently have a way to update an Azure ML pipeline, so we need to first disable it and then create it again in order to get any updates[^2].
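As that footnote mentions, you could avoid the disable-and-recreate dance by publishing new versions behind a versioned pipeline endpoint. It’s overkill for our scenario, but for the curious, here’s a minimal sketch of how that might look; the endpoint name and the helper function are purely illustrative, nothing else in this article relies on them:

```python
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline, PipelineEndpoint


def publish_behind_endpoint(pipeline: Pipeline, workspace: Workspace,
                            endpoint_name: str = 'my-pipeline-endpoint'):
    # Publish the new pipeline version first
    published = pipeline.publish(name=f'{endpoint_name}-pipeline')

    try:
        # If the endpoint already exists, make the new version its default
        endpoint = PipelineEndpoint.get(workspace, name=endpoint_name)
        endpoint.add_default(published)
    except Exception:
        # First deployment: wrap the published pipeline in a brand new endpoint
        endpoint = PipelineEndpoint.publish(workspace, name=endpoint_name,
                                            pipeline=published,
                                            description='Versioned pipeline endpoint')
    return endpoint
```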
You might also wonder why we need to find the pipeline’s schedule first, and only then disable both of them. This is because the current version of the AML SDK doesn’t support finding a pipeline by name (or by experiment), so we need to rely on its schedule in order to get a reference to our pipeline object.
The methods might look like the ones below:
from azureml.core import Workspace
from azureml.pipeline.core.schedule import Schedule


def find_existing_schedule(schedule_name: str, workspace: Workspace):
    print('Checking existing schedules')
    schedules = Schedule.list(workspace)
    print(f'Found {len(schedules)} schedules in the workspace')

    for schedule in schedules:
        if schedule.name == schedule_name:
            print(f'Found schedule {schedule_name}')
            return schedule
Note that we need to pass a reference to the Azure ML workspace we’re working against, which can be easily obtained using the reliable `from_config` method: `ws = Workspace.from_config()`. This depends on the `config.json` workspace config file being available in the current directory.
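If you’ve never seen that file, it holds the subscription id, resource group, and workspace name, and you can download it from your workspace’s page in the Azure portal. Here’s a small sketch of both ways of getting hold of a workspace reference; the explicit values in the commented-out variant are placeholders, not real ones:

```python
from azureml.core import Workspace

# Preferred: let the SDK read config.json from the current (or a parent) directory
ws = Workspace.from_config()

# Alternative: point the SDK at the workspace explicitly (placeholder values below)
# ws = Workspace.get(name='my-aml-workspace',
#                    subscription_id='00000000-0000-0000-0000-000000000000',
#                    resource_group='my-resource-group')
```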
The `disable_*` methods are quite straightforward:
from azureml.pipeline.core import PublishedPipeline


def disable_existing_pipeline(schedule: Schedule, workspace: Workspace):
    print('Disabling existing pipeline')
    PublishedPipeline.get(workspace, schedule.pipeline_id).disable()


def disable_existing_schedule(schedule: Schedule):
    print('Disabling existing schedule')
    schedule.disable()
The method creating a new pipeline is a bit more complex though, so it’s best to split it into several smaller ones:
def create_new_pipeline(pipeline_name: str, schedule_name: str, experiment_name: str, compute_name: str, workspace: Workspace):
    compute_target = get_or_create_compute(compute_name, workspace)
    pipeline = create_pipeline_structure(compute_target, workspace)
    create_time_based_schedule(pipeline, pipeline_name, schedule_name, experiment_name, workspace)
All pipelines need to run on some compute, so we’ll make sure to either retrieve an existing one, or create it if it doesn’t exist.
from azureml.core.compute import ComputeTarget, AmlCompute


def get_or_create_compute(compute_name, workspace: Workspace):
    print('Acquiring a compute resource')

    if compute_name in workspace.compute_targets:
        compute_target = workspace.compute_targets[compute_name]
        if compute_target and type(compute_target) is AmlCompute:
            print(f'Using existing compute: {compute_name}')
    else:
        print(f'Creating new compute: {compute_name}')
        provisioning_config = AmlCompute.provisioning_configuration(
            vm_size='Standard_DS11_v2',
            min_nodes=0, max_nodes=2,
            idle_seconds_before_scaledown=900
        )
        compute_target = ComputeTarget.create(workspace, compute_name, provisioning_config)
        compute_target.wait_for_completion(show_output=True)

    return compute_target
Now let’s define the pipeline structure. We’re going to define the simplest pipeline in the world by the way, no inputs, no outputs, just one step running a simple script. Even though you can pretty much do anything in `script.py`, it’s best to just have it `print('Hello world')` for now.
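For reference, the `script.py` used here lives in a `./script` folder next to the pipeline code (as you’ll see in the step definition below), and it really can be this small:

```python
# script/script.py -- placeholder step logic, to be replaced with actual
# data processing or training code later on
print('Hello world')
```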
If you’re interested in seeing more complex pipeline setups, you’ll find them in these articles on deploying models with AML pipelines and passing data between AML pipeline steps.
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep


def create_pipeline_structure(compute_target: ComputeTarget, workspace: Workspace):
    print('Creating the pipeline structure')

    step = PythonScriptStep(
        name='Main',
        script_name='script.py',
        arguments=[],
        outputs=[],
        compute_target=compute_target,
        source_directory='./script',
        allow_reuse=False,
    )

    pipeline = Pipeline(workspace=workspace, steps=[step])
    pipeline.validate()

    return pipeline
Finally, the schedule. For this example I’ve settled on a simple schedule that runs every 45 minutes; there are several more options and examples in the schedule recurrence docs.
from azureml.pipeline.core.schedule import ScheduleRecurrence


def create_time_based_schedule(pipeline: Pipeline, pipeline_name, schedule_name, experiment_name, workspace: Workspace):
    print('Publishing pipeline and creating a time based schedule')

    published_pipeline = pipeline.publish(pipeline_name)

    recurrence = ScheduleRecurrence(frequency='Minute', interval=45)
    Schedule.create(workspace,
                    name=schedule_name,
                    pipeline_id=published_pipeline.id,
                    experiment_name=experiment_name,
                    recurrence=recurrence)
Cool, so now you have a script that can update your pipeline every time you run it. This means that the next time you need to make any changes you’ll just have to run this and wait for the pipeline to be updated.
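In case you’re wondering how the pieces fit together, here’s a sketch of what the full `update_pipeline.py` (the script we’ll call from the build pipeline later on) might look like when stitched from the functions above; the names are placeholders, and the guard for the very first deployment, when there’s no schedule to disable yet, is my own addition:

```python
from azureml.core import Workspace

# Placeholder names -- adjust them to match your own setup
PIPELINE_NAME = 'my-pipeline'
SCHEDULE_NAME = 'my-pipeline-schedule'
EXPERIMENT_NAME = 'my-pipeline-experiment'
COMPUTE_NAME = 'cpu-cluster'

if __name__ == '__main__':
    workspace = Workspace.from_config()

    schedule = find_existing_schedule(SCHEDULE_NAME, workspace)
    if schedule:
        # An older version is out there -- retire it before publishing the new one
        disable_existing_pipeline(schedule, workspace)
        disable_existing_schedule(schedule)
    else:
        print('No existing schedule found, assuming this is the first deployment')

    create_new_pipeline(PIPELINE_NAME, SCHEDULE_NAME, EXPERIMENT_NAME,
                        COMPUTE_NAME, workspace)
```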
But you do need to remember to run the script, so how about we automate that, too? 🤔
Azure Pipelines to the Rescue
And how about we use Azure Pipelines to do the automation?
I’m going to make some assumptions here, the biggest one being that you’re hosting your project on Azure DevOps, and that you know your way around it if only just a little. Maybe you’ve even taken Azure Pipelines for a spin or two. If you haven’t done any of that yet, now would be a good time to do so.
Still with me? Good. I’ll show you how to create an Azure pipeline that runs the updater script every time somebody pushes code to the project repo.
Before we continue, let’s review the things we need in order to run our pipeline-creating script:
- The `azureml-sdk` package installed in the active environment
- A workspace configuration file (`config.json`) that tells the SDK how to communicate with your Azure Machine Learning workspace
- Access to your AML workspace, so that the script can actually make the necessary changes
Now, getting these things locally is pretty straightforward. You create a conda environment, run `pip install azureml-sdk==1.33` to install the SDK, download `config.json` into your script’s directory, and use the Azure CLI to do a quick `az login`. Once that’s done, you can run the script as often as you’d like.
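A quick way to double-check that the SDK landed in the right environment, and which version you got, is to print its version string:

```python
import azureml.core

# Should print something like 1.33.0 if the install went well
print(azureml.core.VERSION)
```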
Doing this in a cloud pipeline is a bit different though.
We’ll start by defining an empty pipeline that runs whenever somebody pushes code to your repo. Just create an `azure-pipelines.yml` file in the root of your repository and fill it with the code below.
It configures the pipeline to only run when code is pushed to the main branch, while making sure the pipeline runs on a Linux agent.
trigger:
- main

pool:
  vmImage: ubuntu-latest
Let’s make sure the `azureml-sdk` package is installed on your build machine. You don’t really need to use conda for this since you don’t need to worry about keeping the machine clean: every time the pipeline runs, it runs on a brand new VM. This means that running pip in a Bash task is more than enough for our needs[^3].
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: |
      echo Installing AML SDK
      pip install azureml-sdk==1.33
Making the workspace configuration available is a bit trickier. A simple way to do it is to store it as a secure file and download it at build time, so our script can access it at runtime.
- task: DownloadSecureFile@1
  name: config_json
  inputs:
    secureFile: 'config.json'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: |
      echo Copying $(config_json.secureFilePath) to $(Build.SourcesDirectory)
      cp $(config_json.secureFilePath) $(Build.SourcesDirectory)
All that’s left now is making sure we’re authorized against your subscription. There are two ways to do this, one being the right way and the other being the simple way. I’ll show you the simple way, but keep in mind that the right way is documented here.
We’ll be using the very useful Azure CLI task, which allows us to run our script against an Azure subscription and also helps with setting up access using Azure Resource Manager. In order to do this, you’ll need to create a service connection for your subscription/resource group, as documented here.
- task: AzureCLI@2
  inputs:
    azureSubscription: '<your subscription>'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      echo Updating pipeline
      python update_pipeline.py
With this latest bit, your pipeline is now complete[^4].
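One small detail worth knowing: because the script runs inside the Azure CLI task, the SDK can reuse that task’s `az login` session instead of trying to pop up an interactive login. If the default authentication flow ever acts up, here’s a hedged sketch of wiring it up explicitly (not something the script above strictly needs):

```python
from azureml.core import Workspace
from azureml.core.authentication import AzureCliAuthentication

# Reuse the Azure CLI session set up by the AzureCLI@2 task,
# avoiding interactive login prompts that would hang the build
cli_auth = AzureCliAuthentication()
workspace = Workspace.from_config(auth=cli_auth)
```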
You should now have a Python script able to update your Azure ML pipelines, and an Azure pipeline able to run that script every time something changes. If you’ve followed along, then congratulations!
That being said, I hope you’ve found this article useful, and I definitely hope that you’ll use it to automate the deployment of your own pipelines. Life’s too short to deploy stuff manually, y’know.
If you want me to let you know as soon as I write more articles on Azure ML (and stuff in general), then make sure to subscribe below. I usually write a new article each month.
Following me on Twitter works too 😋.
> I've written a short guide on doing CI/CD with Azure ML pipelines, detailing how to:
>
> - 🐍 write a simple Azure ML pipeline
> - 🗓 schedule it to run hourly
> - 🦾 write a script that automatically updates it
> - 🚀 run script every time code is pushed to main https://t.co/4YAZJkND23
>
> — Vlad Iliescu (@vladiliescu) September 1, 2021
[^1]: Relatively
[^2]: You could use a versioned pipeline endpoint to group all updates, but that’s a bit overkill for our scenario
[^3]: That being said, using a `requirements.txt` or a conda yaml file will come in handy for more complex environments
[^4]: It rhymes, so it must be true