spark-user mailing list archives

From "Arif,Mubaraka" <arif.mubar...@heb.com>
Subject Help with Jupyter Notebook Setup on CDH using Anaconda
Date Sat, 03 Sep 2016 19:10:15 GMT
<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style id="owaParaStyle">P {
	MARGIN-BOTTOM: 0px; MARGIN-TOP: 0px
}
</style>
</head>
<body fPStyle="1" ocsi="0">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">
<p><font size="3" face="Verdana">On the on-premise <font color="#ff0000"><strong>Cloudera
Hadoop 5.7.2</strong></font> cluster I have installed the Anaconda package and am trying
to
<font color="#ff0000"><strong>set up Jupyter notebook </strong></font>to
work with Spark 1.6.
</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font face="Verdana"><font size="3">I ran into problems when trying
to use the package
<font color="#ff0000"><strong>com.databricks:spark-csv_2.10:1.4.0</strong></font>
for
<font color="#ff0000"><strong>reading a CSV file and inferring its schema
using PySpark</strong></font>.</font></font></p>
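<p><font size="3" face="Verdana">What I am trying to do in the notebook looks roughly like the sketch below. It assumes an existing SQLContext (called sql_context here) and that the spark-csv package has loaded; the helper names spark_csv_options and read_csv and the file path are illustrative only.</font></p>
<pre>
```python
def spark_csv_options(header=True, infer_schema=True):
    """Build the option dict for the spark-csv data source (string values)."""
    return {
        "header": "true" if header else "false",
        "inferSchema": "true" if infer_schema else "false",
    }


def read_csv(sql_context, path):
    """Load a CSV file as a DataFrame through com.databricks.spark.csv.

    sql_context is an existing pyspark.sql.SQLContext; path is a placeholder.
    """
    reader = sql_context.read.format("com.databricks.spark.csv")
    return reader.options(**spark_csv_options()).load(path)


# Hypothetical usage inside the notebook:
# df = read_csv(sqlContext, "/path/to/file.csv")
# df.printSchema()
```
</pre>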
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">I have installed the <font color="#ff0000"><strong>jar
file spark-csv_2.10-1.4.0.jar</strong></font>
in <font color="#ff0000"><strong>/var/opt/teradata/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/jar</strong>
</font>and the <font color="#ff0000"><strong>configuration</strong></font>
is set as follows:</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font face="Verdana">export PYSPARK_DRIVER_PYTHON=/var/opt/teradata/cloudera/parcels/Anaconda-4.0.0/bin/jupyter<br>
export PYSPARK_DRIVER_PYTHON_OPTS=&quot;notebook --NotebookApp.open_browser=False --NotebookApp.ip='*'
--NotebookApp.port=8083&quot;<br>
export PYSPARK_PYTHON=/var/opt/teradata/cloudera/parcels/Anaconda-4.0.0/bin/python</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">When I run pyspark from the command line with
the --packages option, like this:</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font face="Verdana"><font color="#ff0000" size="3"><strong>$ pyspark
--packages com.databricks:spark-csv_2.10:1.4.0
</strong></font></font></p>
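<p><font size="3" face="Verdana">For reference, the same package coordinate can also be handed to PySpark through the PYSPARK_SUBMIT_ARGS environment variable when the SparkContext is created from inside a Python process such as a notebook kernel. The sketch below only builds and sets that variable; build_submit_args is a hypothetical helper, and whether this avoids the error on CDH 5.7.2 is untested.</font></p>
<pre>
```python
import os

# Package coordinate taken from the pyspark command in this message.
PACKAGE = "com.databricks:spark-csv_2.10:1.4.0"


def build_submit_args(packages):
    """Return a PYSPARK_SUBMIT_ARGS value that requests the given packages."""
    # "pyspark-shell" is the default value PySpark uses for this variable,
    # so it is kept as the final token.
    return "--packages {} pyspark-shell".format(",".join(packages))


# This must run before the first SparkContext is created in the process.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args([PACKAGE])
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```
</pre>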
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">It throws an error and fails to recognize the
added dependency.</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">Any ideas on how to resolve this error would be much
appreciated.
</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">Also, if you have experience installing
and running Jupyter notebook with Anaconda and Spark, please share it.</font></p>
<p><font size="3" face="Verdana"></font>&nbsp;</p>
<p><font size="3" face="Verdana">Thanks,</font></p>
<p><font size="3" face="Verdana">Muby</font></p>
</div>
</body>
</html>
