Installation of sparkmagic on Jupyter to access a remote HDInsight cluster

  • Question

  • I am using a Windows 2016 DSVM and trying to install the sparkmagic extension to access a Spark-flavored HDInsight cluster.

    I followed https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-jupyter-notebook-install-locally without success. After restarting Jupyter, no new kernels show up.

    I installed sparkmagic using pip in both Anaconda installations (Python 2.7 and 3.5); both show:

    (C:\Anaconda) C:\Windows\system32>pip show sparkmagic
    Name: sparkmagic
    Version: 0.11.2
    Summary: SparkMagic: Spark execution via Livy
    Home-page: https://github.com/jupyter-incubator/sparkmagic/sparkmagic
    Author: Jupyter Development Team
    Author-email: jupyter@googlegroups.org
    License: BSD 3-clause
    Location: c:\anaconda\lib\site-packages
    Requires: tornado, requests, notebook, hdijupyterutils, mock, ipywidgets, autovizwidget, ipython, numpy, nose, pandas, ipykernel
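
    For completeness, listing Jupyter's registered kernels (via the standard jupyter kernelspec list command) likewise shows none of the sparkmagic kernels:

    (C:\Anaconda) C:\Windows\system32>jupyter kernelspec list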

    Any ideas?

    Wednesday, October 4, 2017 3:03 PM

Answers

  • Sorry about this issue. We have not had a chance to validate sparkmagic on the Windows DSVM and hence do not support it there yet. We do have it on our Ubuntu Data Science VM. Here is some info that may help you do something similar on the Windows 2016 DSVM.

    The kernel definitions are located in the c:\programdata\jupyter\kernels directory on the Windows 2016 DSVM.

    In c:\programdata\jupyter\kernels, create a directory for the kernel definition of the remote HDInsight Spark kernel, and create a kernel.json in that directory. Below is what we have on the Ubuntu Data Science VM; you can use it with some tweaks:

    {"argv":["/anaconda/envs/py35/bin/python","-m","sparkmagic.kernels.pyspark3kernel.pyspark3kernel", "-f", "{connection_file}"],
    "display_name": "Python 3 Spark - HDInsight"
    }

    You have to change the Python executable's path to "/anaconda/envs/py35/python" (there is no bin directory in the Windows path to Python).
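
    For example, a tweaked kernel.json on the Windows DSVM might look like the sketch below. The c:\anaconda\python path is an assumption based on the pip output above, so point it at whichever environment actually has sparkmagic installed; note that backslashes must be doubled inside JSON strings:

    {
      "argv": ["c:\\anaconda\\python", "-m", "sparkmagic.kernels.pyspark3kernel.pyspark3kernel", "-f", "{connection_file}"],
      "display_name": "Python 3 Spark - HDInsight"
    }

    After saving the file and restarting Jupyter, the new kernel should appear under that display name, and jupyter kernelspec list can confirm it was registered.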

    Then you can follow the directions in the sparkmagic article to point it at your cluster's Livy endpoint: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-jupyter-notebook-install-locally#configure-spark-magic-to-connect-to-hdinsight-spark-cluster
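
    On Windows that configuration lives at %USERPROFILE%\.sparkmagic\config.json. A minimal sketch, with CLUSTERNAME and the credentials as placeholders for your HDInsight cluster's values:

    {
      "kernel_python_credentials": {
        "username": "admin",
        "password": "YOUR_CLUSTER_PASSWORD",
        "url": "https://CLUSTERNAME.azurehdinsight.net/livy"
      }
    }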

    Hope this helps. We will add this to our backlog. 

    Please share if you are able to get it to work. 

    Wednesday, October 4, 2017 9:38 PM

All replies

  • Is there any news on the roadmap? Will sparkmagic be supported on the Windows DSVM?

    Wednesday, July 25, 2018 5:48 PM