Azure, Microsoft’s ecosystem for cloud services, offers a wide range of services including standard virtual machines, managed databases, and interactive notebooks, similar to its competitors Amazon Web Services and Google Cloud Platform. In this article, you’ll get a comprehensive overview of how to get started and which resources to turn to for your data and computing needs.
To get started, head over to Azure’s website and click to sign up. New accounts receive a $200 credit for the first 30 days to help get you up and running.
Once you finish signup, you’ll be brought to Azure Portal, which has one of two default main screens: your services dashboard, or the home screen.
The dashboard page displays various metrics for your current services, including usage, reliability, and billing, while the home screen lets you create new resources or navigate to other pages, including the dashboard and billing.
The homepage view.
The dashboard view.
You can customize the default view of the portal by clicking the settings gear in the upper-right corner.
Back at the Portal home page, you can then go about creating a new resource by clicking Create a resource, or you can use the shortcut g+n. From there, you’ll see an extensive list of options:
With that, here are some resources to key in on when you’re working with data.
While Azure initially offered only Windows for the cloud (before 2014, Microsoft Azure was known as Windows Azure), today there is a wide range of OS options, including many flavors of Linux.
As the name implies, Azure Virtual Machines give you the capabilities of a full-fledged computer, which you can connect to via SSH and the command line. Although less common, you can also use Remote Desktop to connect to a virtual machine if you want a GUI, though this will consume additional resources. From there, you can do anything from running scripts to adding software via the package manager, allowing you to build a web server with Apache or nginx, or a database server with SQL. While you can install and manage software manually on a VM using sudo apt-get (Ubuntu) or similar OS-specific commands, Azure also offers a myriad of VM images that are preconfigured for various scenarios, streamlining installation and maintenance. There is an image for SQL Server, and others specialized for the needs of data scientists.
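As a rough illustration of the manual route, here is a minimal Python sketch that assembles the SSH command for installing nginx on an Ubuntu VM. The username `azureuser`, the IP address `203.0.113.10` (a documentation-only address), the key path, and the helper name `ssh_command` are all placeholders assumed for this example; substitute the values shown for your own VM in the portal.

```python
import shlex


def ssh_command(user, host, remote_command=None, key_path=None):
    """Return the argument list for an ssh call (nothing is executed here)."""
    argv = ["ssh"]
    if key_path:
        argv += ["-i", key_path]  # private key used to authenticate
    argv.append(f"{user}@{host}")
    if remote_command:
        argv.append(remote_command)  # command to run on the VM after login
    return argv


# Hypothetical values: replace with your VM's admin user, public IP, and key.
install_nginx = "sudo apt-get update && sudo apt-get install -y nginx"
cmd = ssh_command(
    "azureuser",
    "203.0.113.10",
    remote_command=install_nginx,
    key_path="/home/me/.ssh/azure_vm_key",
)

# Show the equivalent shell command, quoted safely for copy and paste.
print(shlex.join(cmd))
```

To actually run it, you could pass `cmd` to `subprocess.run(cmd, check=True)` from a machine that has an SSH client and the matching private key.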
While there is a preconfigured image for SQL Server via an Azure VM, a more streamlined database approach is Azure’s SQL Databases. Installing SQL Server on an Azure VM gives you more control over the installation details and administration, but it also requires additional maintenance, such as managing updates and patches. Azure SQL Databases, by contrast, provide a fully managed interface where you can simply spin up a database and not worry about further backend administration.
For further comparison between SQL Server on an Azure VM and Azure’s SQL DB, see this video via edX.
If you do choose a fully managed service, it’s only a few minutes of clicking through the GUI before you can begin querying and importing data into the database. For a full description, see Azure’s own quickstart guide, which will have you up and running in no time.
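Once the database exists, querying it from code mostly comes down to a connection string. The sketch below is one way to do that from Python, assuming the third-party `pyodbc` package and Microsoft’s ODBC Driver 18 for SQL Server are installed; the server, database, and credential values are placeholders, and the helper names are made up for this example.

```python
def azure_sql_connection_string(server, database, user, password,
                                driver="ODBC Driver 18 for SQL Server"):
    """Build an ODBC connection string for an Azure SQL Database.

    `server` is the short server name; Azure SQL servers are reached at
    <server>.database.windows.net on port 1433.
    Note: values containing ';' or '}' would need extra escaping.
    """
    return (
        f"Driver={{{driver}}};"
        f"Server=tcp:{server}.database.windows.net,1433;"
        f"Database={database};Uid={user};Pwd={password};"
        "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
    )


def fetch_rows(conn_str, query):
    """Run a query and return all rows (assumes pyodbc is installed)."""
    import pyodbc  # imported lazily so the helper above works without it

    conn = pyodbc.connect(conn_str)
    try:
        return conn.cursor().execute(query).fetchall()
    finally:
        conn.close()


# Placeholder values: use the server, database, and login you created.
conn_str = azure_sql_connection_string("myserver", "mydb", "admin_user", "s3cret")
```

With real credentials, `fetch_rows(conn_str, "SELECT TOP 5 name FROM sys.tables")` would list a few table names. In practice you would read the password from an environment variable or a secrets store rather than hard-coding it.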
Another popular offering for data science is the Machine Learning Studio. Heading over to ml.azure.com will bring you to Azure Machine Learning Studio. To start, you’ll have to pick the default directory (a workspace) and subscription that you want to attach these services to.
Azure workspaces are a network of resources, serving to organize and monitor interconnected services.
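For those coming from AWS, the comparison to a Virtual Private Cloud is only a loose one: a workspace groups and tracks related resources rather than defining a network (Azure’s direct counterpart to a VPC is the Virtual Network). Once you’ve selected a workspace and subscription, you’ll find yourself at the home screen, where you can start building pipelines, exploring data, and deploying models.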
On the Machine Learning Studio home page, you’ll see options to code with interactive Python notebooks, as well as code-free GUIs for training and deploying models via Automated ML and ML Designer. Please note that these code-free tools (Automated ML and ML Designer) are only available for enterprise accounts. Once you create a new notebook, you’ll need to attach a compute node before you can run code. After that, you can code interactively, exploring your data and training models using familiar open source Python tools such as pandas, scikit-learn, and TensorFlow. For more information, see Azure’s own preliminary tutorial on running notebooks via Machine Learning Studio.
If you don’t need the interactivity of Machine Learning Studio, but merely need to run scripts without worrying about setting up and maintaining a full virtual machine, Azure’s Function App is a good fit. As with AWS Lambda, a Function App lets you upload custom scripts and execute them on set schedules, or define event triggers that cause a script to run. For example, you could set a script to run when new documents are added to a database, or in response to an HTTP request.
For more details, see Microsoft’s overview of Functions, and their tutorial on creating a new function.
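To make the two trigger styles concrete, here is a minimal sketch of a Python function app with one HTTP trigger and one schedule trigger. It assumes the third-party `azure-functions` package and its decorator-based Python programming model; the route name, function names, and greeting logic are invented for illustration. The import is guarded so that the plain logic can still be run locally without the package.

```python
import logging

try:
    import azure.functions as func  # third-party package: azure-functions
except ImportError:
    func = None  # lets the plain logic below run locally without the package


def build_greeting(name=None):
    """Plain Python logic, kept separate so it can be tested without Azure."""
    return f"Hello, {name or 'world'}!"


if func is not None:
    app = func.FunctionApp()

    # Event trigger: runs whenever an HTTP request reaches the "hello" route.
    @app.route(route="hello", auth_level=func.AuthLevel.ANONYMOUS)
    def hello(req: func.HttpRequest) -> func.HttpResponse:
        return func.HttpResponse(build_greeting(req.params.get("name")))

    # Schedule trigger: a six-field cron expression, here the top of every hour.
    @app.schedule(schedule="0 0 * * * *", arg_name="timer")
    def hourly_job(timer: func.TimerRequest) -> None:
        logging.info("Scheduled run: %s", build_greeting("scheduler"))
```

Keeping the work in an ordinary function such as `build_greeting` means the same code can be exercised in a notebook or unit test before it is deployed behind a trigger.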
Power BI provides a streamlined framework for building dashboards, similar to Tableau. Once you’ve set up your backend infrastructure to ingest, process, store, and explore data, you can have a dashboard put together in minutes, letting you monitor data and analytics in real time. You can also embed these dashboards into websites. If you’re ready to dive in, see the Power BI get started documentation.
In addition to creating resources from the portal, Azure also provides software development kits that let you programmatically add, remove, and modify services. There is a wide range of supported languages, so whether you prefer to code in Python, JavaScript, C++, Java, Ruby, or Go, there should be a binding to fit your needs. You can learn more about Azure SDKs here.
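As one example of what this looks like in Python, the sketch below creates a resource group, the container that other Azure resources live in. It assumes the third-party `azure-identity` and `azure-mgmt-resource` packages are installed and that you are already signed in (for instance through the Azure CLI); the helper names, group name, region, and tags are placeholders chosen for this example.

```python
def resource_group_parameters(location, tags=None):
    """Build the request body for creating a resource group."""
    params = {"location": location}
    if tags:
        params["tags"] = dict(tags)  # copy so the caller's dict is untouched
    return params


def create_resource_group(subscription_id, name, location, tags=None):
    """Create (or update) a resource group and return the SDK's result.

    Assumes azure-identity and azure-mgmt-resource are installed and that
    credentials are available to DefaultAzureCredential.
    """
    # Imported lazily so the helper above can be used without the SDK.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.resource import ResourceManagementClient

    client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)
    return client.resource_groups.create_or_update(
        name, resource_group_parameters(location, tags)
    )


# Placeholder values for illustration only.
params = resource_group_parameters("eastus", {"project": "demo"})
print(params)
```

Calling `create_resource_group("<your-subscription-id>", "demo-rg", "eastus")` would then perform the same action as clicking through the portal’s resource group form.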
With such a wide range of offerings, navigating Azure’s complex ecosystem can be a daunting task. While there are many additional offerings, servers, databases, interactive notebooks, drag-and-drop modeling, and streamlined dashboards are all valuable resources for building and scaling your cloud infrastructure.
Chisel provides end-to-end solutions for your data and analytics programs. Looking to get started migrating to the cloud? Contact us today and find the experts you need to get started.