Posts List

Speaking to Girls who Code

Last week I had the pleasure of giving an open AI workshop for the awesome Girls who Code Iasi community. They had approached me earlier this year about such a workshop, and I suggested teaching the audience about Azure Automated Machine Learning, since this is a topic near and dear to me - I think it’s a great way for beginners to start dabbling with machine learning, and for experts to have something up and running in no time.

Apart from this, I wanted to have a tangible goal for this workshop, so I designed it so that the audience would gradually learn to use and understand Automated ML while competing in an open Kaggle competition, the Titanic competition. This may sound a bit familiar 😋.

Long story short, I quite enjoyed the experience, and so did the audience - over the course of three hours I guided them through how Automated ML can be used, how it works, and what's under the hood. I underlined its elegant use of open-source components wherever possible (i.e. scikit-learn pipelines, scalers, algorithms), and taught them how to extend and improve the automatically trained models to suit their scenarios. We discussed automated ML versus hyperparameter optimization, the best approach to evaluating a model, and how to make sure the models we train perform well when presented with unknown data.
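If you're curious what kicking off such a run looks like in code, here's a minimal sketch using the Azure ML Python SDK (the v1-style API). The experiment name, dataset path, and label column are placeholders I chose for illustration, so adjust them to your own workspace and data:

```python
# Minimal Automated ML run with the Azure ML Python SDK (v1-style API).
# Assumes an existing workspace config and a tabular training dataset;
# the experiment name, dataset path, and 'Survived' label are placeholders.
from azureml.core import Workspace, Experiment
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()                      # reads the local workspace config file
train_data = Dataset.Tabular.from_delimited_files(
    path=(ws.get_default_datastore(), "titanic/train.csv"))

automl_config = AutoMLConfig(
    task="classification",                        # predict a binary label
    primary_metric="accuracy",                    # metric used to rank candidate models
    training_data=train_data,
    label_column_name="Survived",
    n_cross_validations=5,                        # evaluate each candidate with 5-fold CV
    iterations=30)                                # number of pipelines AutoML will try

run = Experiment(ws, "titanic-automl").submit(automl_config, show_output=True)
best_run, fitted_model = run.get_output()         # best scikit-learn pipeline found
```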

The cherry on top was using the insights gained from running model explainability over the automatically trained models in order to generate new features that could be used to train better, faster, and stronger models.
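We leaned on the explainability output of the Automated ML run for this, but the same idea can be illustrated with plain scikit-learn - the sketch below swaps in permutation importance as a stand-in technique, with `fitted_model`, `X_valid`, and `y_valid` assumed to come from an earlier training step:

```python
# Stand-in for the Automated ML explainability step: use permutation importance
# to see which features the model actually relies on, then act on that insight.
import pandas as pd
from sklearn.inspection import permutation_importance

# 'fitted_model', 'X_valid', and 'y_valid' are assumed from an earlier step
result = permutation_importance(fitted_model, X_valid, y_valid,
                                n_repeats=10, random_state=42)
importances = pd.Series(result.importances_mean, index=X_valid.columns)
print(importances.sort_values(ascending=False).head(10))

# If, say, SibSp and Parch both rank highly, combining them into a new
# 'FamilySize' feature is exactly the kind of insight-driven feature to try next
X_valid["FamilySize"] = X_valid["SibSp"] + X_valid["Parch"] + 1
```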

In the end we reached the top 4% on the public leaderboard (well, technically, it was 3.36% - 538 out of 16,001 teams 🤓). All of this in just three hours, without doing much more than understanding and acting on the insights gained from the automatically generated models. The audience was great as always: energetic, eager to learn, and engaged.

All I can say is that it was a lovely experience, one that I hope to be part of again in the future. If you’d like to try out our code and see for yourself, take a look at this GitHub repo.

Machine Learning in Azure: Service versus Studio

This is a more detailed version of my Boy meets Girl talk, created specially for Microsoft Ignite | The Tour Amsterdam 2019. Whereas Boy meets Girl was mostly focused on how to deploy a trained model using either Azure ML Service or ML Studio, here I wanted to create a more in-depth comparison of the two tools. This is what led me to the concept of having multiple rounds, with the audience voting for their favourite tool (truth be told, I think I just wanted another go at delivering something similar to my TypeScript versus CoffeeScript talk 🤓).

Once the concept was clear, I spent a significant amount of time just polishing the examples and making sure they’re as exhaustive as possible. And then, of course, another significant amount of time was spent just cutting out things because they didn’t fit with the rest of the story 🙄. All worth it of course, since it allowed me to also have a meaningful conversation with the audience (which was really really really active and involved), answering questions and going into more detail if necessary, instead of just rushing to go through all of the slides.

Recording

Resources

The resources used during the talk are available on GitHub; below is a quick rundown of what you'll find there:

  • First things first, I used the training dataset from Kaggle’s Petfinder competition, available here; you will need it in order to run the code.
  • A sample configuration file is available in aml_config - all you need to do is fill in your own subscription/workspace details here
  • Code for Round 1 - Look and Feel is available here, including the training script and the Jupyter notebook used for integrating with Machine Learning Service
  • Code for Round 2 - Analysing and Preparing Data is here, just a simple notebook with some very light data analysis
  • Code for Round 3 - Training and Evaluating Models is here, again just a simple training script and the corresponding Jupyter notebook
  • Last but not least, the code for Round 4 - Deploying and Consuming Models is here, where we also have the score.py and conda_dependencies.yml files needed to build the Docker image, as well as the input.json file used for invoking the scoring web service (this follows the standard structure expected by Azure ML Studio, which is why the code in score.py looks the way it does; a rough sketch of such a scoring script is included right after this list)
  • The Machine Learning Studio experiments are available in the Azure AI Gallery: Round 1, Round 2, and Round 3. Since Round 4 was all about deploying the experiment as a web service, you can reuse the Round 3 experiment
  • The slides are available on Speaker Deck
  • I’m also linking to two tutorials, one for Machine Learning Studio and the other for Machine Learning Service, in case you want to learn more.
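For context, this is roughly the shape of a scoring script for Machine Learning Service: init() loads the model once when the container starts, and run() handles each request. The actual score.py in the repo may differ - the model name and the Studio-style Inputs/Values envelope below are assumptions based on the input.json mentioned above:

```python
# Rough shape of a score.py for Azure ML Service (SDK v1).
# The registered model name and the Studio-style request envelope are assumptions.
import json
import joblib
import pandas as pd
from azureml.core.model import Model

def init():
    global model
    # Model.get_model_path resolves the registered model inside the Docker image
    model = joblib.load(Model.get_model_path("petfinder-model"))  # placeholder name

def run(raw_data):
    try:
        # Studio-style envelope: {"Inputs": {"input1": {"ColumnNames": [...], "Values": [[...]]}}}
        payload = json.loads(raw_data)["Inputs"]["input1"]
        df = pd.DataFrame(payload["Values"], columns=payload["ColumnNames"])
        predictions = model.predict(df)
        return {"Results": {"output1": predictions.tolist()}}
    except Exception as exc:
        return {"error": str(exc)}
```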

Getting Started with Machine Learning Using Azure Machine Learning Studio and Kaggle Competitions

Long title, I know 🤫. It used to be shorter, as some earlier versions of this talk were called ‘Predicting Survivability on the Titanic’, but this time I wanted to experiment a bit and make it really easy for the audience to decide whether or not this would be interesting for them. And so they did.

You see, they wanted to learn more about machine learning. And, the way I see it, the two tools I talked about - Azure Machine Learning Studio and Kaggle Competitions - can help you get started with ML, while also making it fun to do so.

So we proceeded to actually compete live in the Titanic: Machine Learning from Disaster starter competition. We downloaded the passengers dataset and trained a very simple (and overly optimistic) model, which crashed and burned when pitted against the other participants in the competition 🤭. We then learned from our mistakes, gradually fixed the issues with the dataset, created new features, and improved the model, in the end achieving a top 20% score (which, I know, could have been better, but hey, we only had one hour to achieve all of this 😉).
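The live session did all of this with Studio's drag-and-drop modules, but for readers who prefer code, a rough scikit-learn equivalent of the same loop could look something like this (the column names come from the Kaggle dataset, everything else is purely illustrative):

```python
# Illustrative scikit-learn equivalent of the Studio workflow from the talk:
# load the Kaggle training data, do some light feature work, train a model,
# and check the score with cross-validation instead of trusting the training fit.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

train = pd.read_csv("train.csv")                      # Kaggle's Titanic training set

# Minimal cleaning / feature engineering, mirroring the steps from the talk
train["Age"] = train["Age"].fillna(train["Age"].median())
train["Sex"] = train["Sex"].map({"male": 0, "female": 1})
train["FamilySize"] = train["SibSp"] + train["Parch"] + 1

features = ["Pclass", "Sex", "Age", "Fare", "FamilySize"]
X, y = train[features], train["Survived"]

model = RandomForestClassifier(n_estimators=200, random_state=42)
scores = cross_val_score(model, X, y, cv=5)           # honest estimate, not training accuracy
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```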

Apart from achieving this, I must say I absolutely loved interacting with the audience, answering their questions and discussing the various approaches to parsing data, doing feature engineering, picking the right algorithm, and evaluating a model. It was awesome, and I’m very grateful for that 😁.

Recording

Resources

The resources used during the talk are available in the Azure AI Gallery - the Basic Experiment, Feature Engineering, and Binning, whereas the slides are on Speaker Deck.

Boy meets Girl: A Machine Learning Deployment Story

This was a fun talk to write :). Ever since I saw Azure ML Service being announced, I knew I wanted to compare it with ML Studio, a tool with which I had a bit more experience. And so I did.

Since 45 minutes is nowhere near enough to compare the two tools (lesson re-learned the hard way while designing Service versus Studio), I decided to only compare their deployment capabilities, given an already trained model.
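To give a flavour of the ML Service side of that comparison, here's a minimal deployment sketch using the Python SDK (v1 API), going from an already trained model to a web service on Azure Container Instances; the file paths, model name, and service name are placeholders of my own choosing:

```python
# Minimal sketch: register an already trained model and deploy it as a
# web service on Azure Container Instances (Azure ML SDK v1 API).
from azureml.core import Workspace
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

model = Model.register(workspace=ws,
                       model_path="outputs/model.pkl",     # the already trained model
                       model_name="titanic-model")         # placeholder name

inference_config = InferenceConfig(runtime="python",
                                   entry_script="score.py",
                                   conda_file="conda_dependencies.yml")

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(ws, "boy-meets-girl-svc", [model],
                       inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)                                  # endpoint to POST input.json to
```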

Resources

The resources used during the talk are available on GitHub.