
MS Azure for Research

Duke University Libraries and Microsoft Research invite you to spend a day and a half learning about the cloud computing environment that Microsoft Research uses on a daily basis. The event is intended for researchers and others who need to use the cloud to scale out their computing environments. We'll learn how to use R and Machine Learning, create massive Linux Hadoop clusters, run High Performance Computing workloads, and analyze high-velocity data streaming from Internet of Things devices.

Azure for Research is an intensive class built by cloud experts at Microsoft Research. The class is designed to familiarize researchers and data scientists with the services Azure offers to aid them in their research, especially with regard to high-performance computing, big-data analysis, and analyzing data streaming from Internet-of-Things (IoT) devices. Instruction is hands-on, with students spending most of the day working through proctored labs. We invite you to bring your laptop (Windows, Linux, or Mac) with a modern browser (Chrome, Firefox, Internet Explorer, or Microsoft Edge). Each student will receive an Azure Pass with a $500 Azure credit that can be used well after class is over.



The class is open to faculty, grad students, and anyone interested in big data and cloud-based computing.



All you need is an open mind and a desire to learn; no experience with Azure or cloud computing is required.



Agenda - Thursday, April 7th - 9:00 am - 5:00 pm


Module 1: Introduction to Microsoft Azure

This introductory session provides a broad overview of Azure and the services it offers. It also discusses cloud computing in general and ways in which the cloud can be an asset to researchers. At the conclusion, students activate their Azure Passes and explore the Azure Portal, which is the primary tool used to manage Azure resources.


Module 2: Azure Machine Learning

Azure Machine Learning is a powerful tool for performing predictive analytics on large volumes of semi-structured data. In this module, students use the interactive Azure Machine Learning Studio to build, train, and score a model. Then they put the model to work performing predictive analytics.


Module 3: Azure Storage

Azure Storage is a set of services for storing data in the cloud. Of particular interest to researchers is Azure Blob Storage, which serves as a source of input and output for Azure data services. In this session, students learn how to move data in and out of blob storage as a precursor to working with Stream Analytics and other services that use it.


Module 4: Azure Stream Analytics and the Internet of Things

Azure Stream Analytics is a service that enables researchers to query and analyze high-velocity data streaming from IoT devices and other sources in real time. In this module, students combine Azure Stream Analytics with Azure Event Hubs to perform real-time analytics on data emanating from simulated ATMs.


Module 5: Big-Data Analytics with Apache Spark for Azure HDInsight

Some of the most commonly used tools for analyzing big data include Hadoop, Spark, and Zeppelin. In this session, students learn about Azure HDInsight (Azure’s implementation of Hadoop), deploy a Linux Spark cluster, and gain first-hand experience using Zeppelin and Jupyter to analyze data on the cluster.


Module 6: High-Performance Computing

Big problems require big solutions. One of the benefits of cloud computing is that with a few button clicks, you can bring the power of massive parallel processing to bear on projects that require it. In this module, students deploy a SLURM cluster of Linux servers and use it to perform parallel processing on image data.


Agenda - Friday, April 8th - 9:00 am - 12:00 pm

We will hold a few deep-dive breakout sessions on Friday. The exact topics are still being determined.


Thursday, April 7, 2016
9:00am - 5:00pm
Bostock 127 (The Edge Workshop Room)
West Campus
Registration has closed.

Event Organizer

Hannah Rozear