New Windows Data Science VM Refresh (15-NOV-2016)

  • General discussion

    • A new refresh of the Windows Data Science Virtual Machine Image has been released to the Azure marketplace. You will see the version as 1.6.1 on the marketplace listing page of the VM on Azure.

      The following are the highlights in this release:

      1. [BREAKING CHANGE] Jupyter Notebook Server is disabled by default. A new script (c:\dsvm\tools\setup\JupyterSetPasswordAndStart.cmd) has been created to help you set the Jupyter password and start the Jupyter server. This is a one-time process. There is a shortcut on the desktop that you can double-click to run the script. You may have to wait a few seconds after setting the password before accessing Jupyter from your browser.

      2. The following major software is new on the VM:

      • Microsoft Cognitive Toolkit 2.0 (also known as CNTK). A Python interface (Python 3.4) is now available for building deep neural networks. It comes with many samples, notebooks, and hands-on labs.
      • Weka 3 - Open source visual data mining software written in Java.
      • Apache Drill - An open source, schema-free SQL query engine for Hadoop, NoSQL, and cloud storage. It supports ODBC and JDBC interfaces, so you can query NoSQL stores and files from standard BI tools like Power BI, Excel, and Tableau.
      • Evince PDF reader
      • Side-by-side install of Microsoft R Open 3.3.1 (an enhanced open source R distribution based on, and 100% compatible with, R-3.3.1). This is in addition to the default R instance, which is Microsoft R Server Developer Edition (which in turn is based on R-3.2.2). You can switch to Microsoft R Open 3.3.1 if you need to run R programs that depend on the latest R version.
      • OpenJDK 1.8.0_102 - An open source implementation of the Java Standard Edition platform.
      • A new conda environment for Python 3.4, in addition to the existing Python 2.7 and Python 3.5 environments.
      • R libraries: RSQLServer library to access SQL Server databases through the dplyr interface; dplyrXdf library to work with Microsoft R Xdf files using the dplyr paradigm. 
      • Mozilla Firefox browser. Jupyter notebook software is better tested and supported on Firefox. If you see issues running notebooks in the default IE browser, please retry with Firefox.
      • Visual Studio Code - An open source code editor supporting several popular programming languages and extensions. It is particularly useful for editing and previewing Markdown, and as a lightweight client for working with Git repositories.
      • Libraries to access objects stored in Azure Data Lake Store (WebHdfs) from Microsoft R Server using the R functions RxHdfsFileSystem, RxTextData.
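
      As an illustration of the Apache Drill item above: besides ODBC and JDBC, Drill also exposes a REST endpoint for queries, which is convenient from Python. The following is only a minimal sketch. It assumes Drill has already been started on the VM and that its web interface listens on Drill's default port 8047; neither is stated in this announcement. The helper names build_drill_request and run_drill_query are made up for this example.

```python
import json
import urllib.request

# Assumption: Drill is running locally with its REST interface on the default port 8047.
DRILL_QUERY_URL = "http://localhost:8047/query.json"


def build_drill_request(sql, url=DRILL_QUERY_URL):
    """Build a POST request for Drill's REST query endpoint (hypothetical helper)."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql, url=DRILL_QUERY_URL, timeout=30):
    """Send the query to Drill and return the result rows as a list of dicts."""
    request = build_drill_request(sql, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # Drill returns a JSON object; the result rows are under the "rows" key.
    return body.get("rows", [])
```

      With Drill running, a call such as run_drill_query("SELECT * FROM cp.`employee.json` LIMIT 5") should return rows from the sample file that ships on Drill's classpath.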

      We have a few more new samples. They can be found on the Jupyter notebooks home (c:\dsvm\notebooks on the VM instance) when you log in to the Jupyter server on the new Data Science VM. Non-notebook samples are in c:\dsvm\samples. You can also download detailed labs for Cortana Intelligence Advanced Analytics and Microsoft R Server from shortcuts on the desktop or from c:\dsvm\samples\Labs.
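
      Since the Jupyter server can take a few seconds to come up after you set the password (item 1 above), a small script can check that it is answering before you open the notebooks in a browser. This is a rough sketch, not part of the VM tooling. The URL is an assumption (replace it with the address your Jupyter server actually uses), and the function names are made up for this example.

```python
import ssl
import time
import urllib.error
import urllib.request

# Assumption: placeholder address; use the URL your VM's Jupyter server reports.
JUPYTER_URL = "https://localhost:9999"


def jupyter_responds(url, timeout=5):
    """Return True if a server answers HTTP(S) requests at url."""
    # The server may use a self-signed certificate, so certificate checks
    # are skipped here. Do this only for a local readiness probe.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means the server is up and answering.
        return True
    except (urllib.error.URLError, OSError):
        return False


def wait_for_server(probe, attempts=10, delay=2.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

      For example, wait_for_server(lambda: jupyter_responds(JUPYTER_URL)) returns True once the server answers, or False if it never does within the allowed attempts.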

    Let us know your feedback or questions on this release. 


    Thursday, November 17, 2016 12:49 AM