spark-user mailing list archives

From Bryan <bryan.jeff...@gmail.com>
Subject RE: Problems with Local Checkpoints
Date Mon, 14 Sep 2015 10:58:13 GMT
Akhil,

This looks like the issue. I'll update my PATH to include the (soon to be added) winutils.exe
and associated DLLs.

Thank you,

Bryan

-----Original Message-----
From: "Akhil Das" <akhil@sigmoidanalytics.com>
Sent: 9/14/2015 6:46 AM
To: "Bryan Jeffrey" <bryan.jeffrey@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Problems with Local Checkpoints

You need to set your HADOOP_HOME and make sure the winutils.exe is available in the PATH.


Here's a discussion around the same issue http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path
Also this JIRA https://issues.apache.org/jira/browse/SPARK-2356
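
As an alternative to the environment variable, the same location can be supplied to
Hadoop via the hadoop.home.dir JVM system property before the SparkContext is created.
A minimal sketch, assuming winutils.exe has been placed under C:\hadoop\bin (the path
is a placeholder, not taken from this thread):

// Workaround sketch: point Hadoop's Windows shell utilities at a local install.
// org.apache.hadoop.util.Shell checks this property before the HADOOP_HOME env var.
// "C:\\hadoop" is an assumed path; winutils.exe must live in its bin subdirectory.
System.setProperty("hadoop.home.dir", "C:\\hadoop")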


Thanks
Best Regards


On Wed, Sep 9, 2015 at 11:30 PM, Bryan Jeffrey <bryan.jeffrey@gmail.com> wrote:

Hello.


I have some basic code that counts numbers using updateStateByKey.  I set up a streaming
context with checkpointing as follows:


def createStreamingContext(masterName: String, checkpointDirectory: String, timeWindow: Int): StreamingContext = {
  val sparkConf = new SparkConf().setAppName("Program")
  val ssc = new StreamingContext(sparkConf, Seconds(timeWindow))
  ssc.checkpoint(checkpointDirectory)
  ssc
}
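
For completeness, a factory function like this is typically passed to
StreamingContext.getOrCreate so the context can be rebuilt from the checkpoint
directory after a driver restart. A sketch only; the master name, window size, and
checkpoint path below are illustrative, not taken from my actual job:

// Rebuild the context from checkpoint data if present, else create it fresh.
val checkpointDir = "C:/Temp/sparkcheckpoint"  // assumed path for illustration
val ssc = StreamingContext.getOrCreate(
  checkpointDir,
  () => createStreamingContext("local[2]", checkpointDir, 10))
ssc.start()
ssc.awaitTermination()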

This runs fine on my distributed (Linux) cluster, writing checkpoints to local disk. However,
when I run on my Windows desktop I am seeing a number of checkpoint errors:


15/09/09 13:57:06 INFO CheckpointWriter: Saving checkpoint for time 1441821426000 ms to file
'file:/C:/Temp/sparkcheckpoint/checkpoint-1441821426000'
Exception in thread "pool-14-thread-4" java.lang.NullPointerException
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
 at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
 at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)
 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
 at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:772)
 at org.apache.spark.streaming.CheckpointWriter$CheckpointWriteHandler.run(Checkpoint.scala:181)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)


JAVA_HOME is set correctly, and the streaming job itself runs correctly despite these
errors. It does not appear to be a permissions issue (I've run this as Administrator).
Directories and files are being created in C:\Temp, although all of the files appear
to be empty.


Does anyone have an idea of what is causing these errors?  Has anyone seen something similar?


Regards,


Bryan Jeffrey