spark-user mailing list archives

From Brian Hamlin <>
Subject loading data into a trivial single cluster
Date Sat, 14 Sep 2013 16:42:24 GMT
Hi All -

I used the Cloudera .debs to install a trivial Hadoop on a single host;
      log/hadoop-hdfs/hadoop-hdfs-namenode.log shows version 2.0.0- 

however I am not able to load data into the cluster, for example in pySpark:

tW = sc.textFile( "http://my.domain/www_shared/a_report.csv" )

   says "No file system for scheme http"
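
One possible workaround, sketched under assumptions: `sc.textFile` hands the path to Hadoop's filesystem layer, which has no handler registered for `http` here, so the file can be fetched to local disk first and then passed as an explicit `file://` URI. The helper names `fetch_to_local`, `as_file_uri`, and `load_http_csv` are hypothetical, and `urllib.request` is the Python 3 module (Python 2 would use `urllib2`). On a single host the local copy is visible to the worker; on a multi-node cluster it would need to be on every node or in HDFS.

```python
import os
import shutil
import tempfile
from pathlib import Path
from urllib.request import urlopen  # Python 3; Python 2 would use urllib2


def fetch_to_local(url, dest_dir=None):
    """Download `url` into `dest_dir` (a fresh temp dir by default).

    Returns the absolute path of the downloaded file.
    """
    dest_dir = dest_dir or tempfile.mkdtemp()
    # Derive a file name from the URL, ignoring any query string.
    name = os.path.basename(url.split("?")[0]) or "download.dat"
    local_path = os.path.join(dest_dir, name)
    with urlopen(url) as resp, open(local_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return os.path.abspath(local_path)


def as_file_uri(local_path):
    """Turn a local path into an explicit file:// URI."""
    return Path(local_path).resolve().as_uri()


def load_http_csv(sc, url):
    """Hypothetical helper: fetch `url`, then read the local copy.

    `sc` is assumed to be an existing SparkContext.
    """
    return sc.textFile(as_file_uri(fetch_to_local(url)))
```

In the pySpark shell this would be called as `tW = load_http_csv(sc, "http://my.domain/www_shared/a_report.csv")`.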

tW = sc.textFile( '/home/dbb/a_text_file.csv' )

   says "TypeError: unsupported operand type(s) for -: 'unicode' and 'float'"

that is for both a single column float field CSV with no header,
and a multi-column CSV with header
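
The traceback is not shown, so the cause of this error is a guess. One plausible reading is that `sc.textFile` yields each line as a string, and a later numeric operation then mixes strings with floats. A minimal sketch of converting lines to floats before doing arithmetic follows; `parse_floats`, `is_header`, and `numeric_rows` are hypothetical names, and `lines` is assumed to be the RDD returned by `sc.textFile`.

```python
def parse_floats(line, sep=","):
    """Split one CSV line and convert every non-empty field to float."""
    return [float(field) for field in line.split(sep) if field.strip()]


def is_header(line, sep=","):
    """Treat a line as a header if any of its fields is not a number."""
    try:
        parse_floats(line, sep)
        return False
    except ValueError:
        return True


def numeric_rows(lines):
    """Drop blank and header lines from an RDD of strings, parse the rest.

    `lines` is assumed to support .filter() and .map(), as an RDD does.
    """
    return (lines
            .filter(lambda l: bool(l.strip()) and not is_header(l))
            .map(parse_floats))
```

For the single-column file, `numeric_rows(tW).map(lambda row: row[0])` would then give an RDD of floats rather than strings.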

(clearly I can load CSV and psycopg2 in Python generally)
but I do not understand the next steps to read the data into HDFS..

ps- I see the GUI port number is 4040 now, and the IPYTHON=1 flag works fine

   thanks -Brian
