AI Adventures in Azure: Choosing VM Size

The main purpose of a VM is to run scripts faster than is possible locally on a laptop or desktop, by outsourcing the computation to a more powerful remote computer. Azure offers an overwhelming number of VM sizes, each optimised for a particular purpose, so getting the best performance for a specific application depends on choosing the right one. I started with no clue which VM would be right for me. I’m using the VM to apply scikit-learn algorithms to large images obtained from drones and satellites, which is memory hungry but “embarrassingly parallel” (meaning it is easy to split the computation into independent chunks and distribute them across several cores).
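To make "embarrassingly parallel" concrete, here is a minimal sketch of the idea using only the Python standard library. It is not my actual processing script: the image is a plain nested list, and classify_rows is a hypothetical stand-in for whatever per-pixel scikit-learn prediction would really be applied. The names split_rows, classify_rows and classify_image are made up for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def split_rows(n_rows, n_chunks):
    """Return (start, stop) row ranges covering n_rows in nearly equal pieces."""
    base, extra = divmod(n_rows, n_chunks)
    ranges, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional row each.
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty chunks when n_chunks > n_rows
            ranges.append((start, stop))
        start = stop
    return ranges


def classify_rows(rows):
    """Hypothetical stand-in for a classifier: label a pixel 1 if bright, else 0."""
    return [[1 if px > 127 else 0 for px in row] for row in rows]


def classify_image(image, n_workers):
    """Classify each chunk of rows in a separate process, then stitch the results."""
    chunks = [image[a:b] for a, b in split_rows(len(image), n_workers)]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves chunk order, so rows come back in image order.
        results = pool.map(classify_rows, chunks)
        return [row for chunk in results for row in chunk]


if __name__ == "__main__":
    # Small synthetic "image" so the sketch runs anywhere.
    image = [[(r * c) % 256 for c in range(64)] for r in range(64)]
    labels = classify_image(image, n_workers=4)
    print(len(labels), "rows classified")
```

Because no chunk needs anything from any other chunk, the only coordination is splitting at the start and stitching at the end, which is why adding cores looks so attractive for this kind of job.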

I started off by prioritising access to lots of cores, thinking that distributing the work widely would be the best way to accelerate my code, so I initially opted for an NC24 VM, which has 24 cores. However, the NC24 was not noticeably faster than running the code locally on my laptop, which has 8 cores. Since the NC24 is relatively expensive, and benchmark tests showed no noticeable speed-up from 8 to 24 cores, I switched to a more affordable NC6. This did not slow down the script at all relative to the NC24, suggesting that the number of cores is not what limits the speed of my script. To be sure, I briefly allocated a 64-core VM and ran the benchmark script again. There was clearly no need to pay extra for more cores, so the NC6 became my main VM for a while.
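For anyone wanting to repeat this kind of comparison, a rough sketch of a timing harness is below. It is an illustration rather than my real benchmark script: workload is a hypothetical CPU-bound placeholder, and time_run and speedups are names invented for this example. The idea is simply to record wall-clock time at several worker counts and express each as a speed-up over the smallest count.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def time_run(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - t0)
    return best


def speedups(timings):
    """Given {n_workers: seconds}, return speed-up relative to the fewest workers."""
    baseline = timings[min(timings)]
    return {n: baseline / t for n, t in timings.items()}


def busy(n):
    """Hypothetical CPU-bound task standing in for one chunk of image processing."""
    return sum(i * i for i in range(n))


def workload(n_workers, n_tasks=16, size=200_000):
    """Run a fixed amount of work spread over n_workers processes."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(busy, [size] * n_tasks))


if __name__ == "__main__":
    timings = {n: time_run(workload, n) for n in (1, 2, 4, 8)}
    for n, s in speedups(timings).items():
        print(f"{n} workers: {timings[n]:.2f} s, speed-up x{s:.2f}")
```

If the speed-up flattens out well before the core count does, as it did for me between 8 and 24 cores, then something other than the number of cores is the limiting factor.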

However, the experiments with the NC series VMs also showed that there was no real benefit to paying for a VM over running locally on my laptop, at least in terms of benchmark script completion time, so I explored some compute-optimised options instead. The F16s-v2 worked nicely and was cheap compared to the NC series, but it suffered from memory overload when running the larger benchmark scripts. This led me to switch to a memory-optimised E20s-v3 VM (20 vCPUs, 160 GB RAM, 32,000 max IOPS). This VM outperforms my laptop and the other VM sizes I’ve tested for my particular image processing application.
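A quick back-of-envelope calculation can flag memory overload before a VM is even allocated. The sketch below is an assumption-laden estimate, not a measurement: the image dimensions, band count, number of working copies and the 32 GB figure for a smaller VM are all hypothetical, and only the 160 GB figure comes from the E20s-v3 mentioned above. The function names array_gib and fits_in_ram are made up for illustration.

```python
def array_gib(rows, cols, bands, bytes_per_value=8):
    """Approximate size in GiB of a dense rows x cols x bands array.

    bytes_per_value=8 assumes 64-bit floats.
    """
    return rows * cols * bands * bytes_per_value / 2**30


def fits_in_ram(needed_gib, ram_gib, copies=3):
    """Rough check: assume the pipeline holds `copies` arrays of this size at once.

    `copies` is a guess covering the input, a reshaped feature table and the output.
    """
    return needed_gib * copies <= ram_gib


if __name__ == "__main__":
    # Hypothetical 20,000 x 20,000 pixel image with 5 bands.
    size = array_gib(20_000, 20_000, 5)
    print(f"one copy: {size:.1f} GiB")
    for ram in (32, 160):  # 32 is an assumed smaller VM; 160 is the E20s-v3
        print(f"{ram} GB RAM: fits = {fits_in_ram(size, ram)}")
```

The point is only that large multi-band images held as floating-point arrays grow quickly, so RAM can matter more than core count for this sort of work.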

So far, I am very happy with the performance of the E20s-v3 VM and will stick with it for a while, although I am interested in the newly announced Lsv2 series.

It was extremely useful to have a fixed benchmark script and image for comparing the VMs. On an image-by-image basis the acceleration has only a minor impact, but it will become more important as I start to scale up to automated processing of large numbers of images.

All the VMs were running an Ubuntu 16.04 LTS Data Science machine image and the benchmarking used an identical Python script run using PyCharm.
