Having introduced the set up and configuration of a new virtual machine and the ways to interact with it, I will now show some ways to use it to start programming in Python. This post will assume that the VM is allocated and that the user is accessing the VM using a remote desktop client.
1. Using the terminal
I am running an Ubuntu virtual machine, so the command line interface is referred to as the terminal. The language used to make commands is (usually) “bash”. Since the package manager Anaconda is already installed on the data science VM, it is very easy to start building environments and running Python code in the terminal. Here is an example where I’m creating a new environment called “AzurePythonEnv” that includes some popular packages:
>> conda create -n AzurePythonEnv python=3.6 numpy matplotlib scikit-learn pandas
Now this environment can be activated any time via the terminal:
>> source activate AzurePythonEnv
Now, with the environment activated, python code can be typed directly into the terminal, or scripts can be written as text files (e.g. using the pre-installed text editors Atom or Vim) and called from the terminal:
>> python /data/home/tothepoles/Desktop/script.txt
2. Using an IDE
The data science VM includes several IDEs that can be used for developing Python Code. My preferred option at the moment in PyCharm, but Visual Studio Code is also excellent and I can envisage using this as my primary IDE later on. IDEs are available under Applications > Development in the desktop toolbar or accessible via the command line. IDEs for other languages are also pre-installed on the Linux DSVM including R-Studio. Simply open the preferred IDE and start programming. In PyCharm the bottom frame in the default view can be toggled between the terminal and the python console. This means new packages can be installed into your environment and new environments created and removed from within the IDE, along with all the other functions associated with the command line. The basic workflow for programming in the IDE is to start a new project, link it to your chosen development environment, write scripts in the editor window then run them (optionally running them in the console so that variables and datasets remain accessible after the script has finished running).
3. Using Jupyter Notebooks
Jupyter notebooks are applications that allow active code to be run in a web browser, and the outputs displayed interactively within the same window. They are a great way to make code accessible to other users. The code is written nearly indentically to a normal python script except that it is divided into individual executable cells. Jupyter notebooks can be run in the cloud using Azure notebooks, making it easy to access Azure data storage, configure custom environments, deploy scripts and present it as an accessible resource hosted in the cloud. I will be writing more about this later as I develop my own APIs on Azure. For now, the Azure Notebook documentation is here. On the DSVM JupyterLab and Jupyter Notebooks are preinstalled and accessed simply by typing the command
>> jupyter notebook