Installation of sparkmagic on Jupyter to access a remote HDInsight cluster

Question
-
Using a Windows 2016 DSVM and trying to install the sparkmagic extension to access a Spark HDInsight cluster.
Followed https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-jupyter-notebook-install-locally without success. After restarting Jupyter, no new kernels show up.
I installed sparkmagic using pip under both Anaconda installations (Python 2.7 and 3.5); both show:
(C:\Anaconda) C:\Windows\system32>pip show sparkmagic
Name: sparkmagic
Version: 0.11.2
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD 3-clause
Location: c:\anaconda\lib\site-packages
Requires: tornado, requests, notebook, hdijupyterutils, mock, ipywidgets, autovizwidget, ipython, numpy, nose, pandas, ipykernel
Any ideas?
Wednesday, October 4, 2017 3:03 PM
Answers
-
Sorry about this issue. We have not had a chance to validate sparkmagic on the Windows DSVM, so we do not support it yet. We do have it on our Ubuntu Data Science VM. Here is some information that may help you do something similar on the Windows 2016 DSVM.
The kernel definitions live in the c:\programdata\jupyter\kernels directory on the Windows 2016 DSVM.
In c:\programdata\jupyter\kernels, create a directory for the kernel definition of the remote HDInsight Spark kernel, and create a kernel.json file in that directory. Here is what we have on the Ubuntu Data Science VM; you can use it with some tweaks:
{"argv":["/anaconda/envs/py35/bin/python","-m","sparkmagic.kernels.pyspark3kernel.pyspark3kernel", "-f", "{connection_file}"],
"display_name": "Python 3 Spark - HDInsight"
}
You will have to change the Python executable's path, e.g. to "/anaconda/envs/py35/python", since there is no bin directory in the Windows path to Python; a Windows-adapted sketch follows below.
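As a rough illustration only (the exact path is an assumption and depends on where Anaconda and the py35 environment live on your VM), a Windows-adapted kernel.json might look like this; note that backslashes must be escaped in JSON:

{
  "argv": ["c:\\Anaconda\\envs\\py35\\python", "-m", "sparkmagic.kernels.pyspark3kernel.pyspark3kernel", "-f", "{connection_file}"],
  "display_name": "Python 3 Spark - HDInsight"
}

After saving the file, running jupyter kernelspec list should show the new kernel, and it should appear in the notebook kernel picker after a Jupyter restart.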
Then you can follow the directions in the sparkmagic article: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-jupyter-notebook-install-locally#configure-spark-magic-to-connect-to-hdinsight-spark-cluster
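For reference, the configuration step in that article comes down to placing a config.json in the .sparkmagic folder under your home directory, pointing the kernels at the cluster's Livy endpoint. A minimal sketch with a placeholder cluster name and credentials (in practice, copy the full example_config.json from the sparkmagic repo and edit it; the per-kernel section names there are authoritative):

{
  "kernel_python_credentials": {
    "username": "admin",
    "password": "<cluster login password>",
    "url": "https://<clustername>.azurehdinsight.net/livy"
  }
}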
Hope this helps. We will add this to our backlog.
Please share if you are able to get it to work.
- Marked as answer by Gopi Kumar (MSFT) Thursday, October 19, 2017 8:46 PM
Wednesday, October 4, 2017 9:38 PM
All replies
-
Is there any news on the roadmap? Will sparkmagic be supported in the DSVM for Windows?
Wednesday, July 25, 2018 5:48 PM