spark-user mailing list archives

From KhajaAsmath Mohammed <mdkhajaasm...@gmail.com>
Subject Re: Spark Session error with 30s
Date Tue, 13 Apr 2021 23:44:04 GMT
I was able to resolve this by changing hdfs-site.xml, as I mentioned in my initial thread.
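(For anyone who doesn't have the initial thread handy: judging from the DfsClientConf frame in the stack trace below, the offending key is most likely a DFS client timeout whose newer stock default is "30s". The exact key is an assumption here - dfs.client.datanode-restart.timeout is the usual suspect - but the shape of the hdfs-site.xml override looks like this:

'''
<property>
  <!-- Assumed key: its newer default of "30s" cannot be parsed by the
       older Configuration.getLong() call shown in the stack trace. -->
  <name>dfs.client.datanode-restart.timeout</name>
  <value>30</value>
</property>
'''

In short, replace the unit-suffixed value with a plain number that the older Hadoop client code can parse.)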

Thanks,
Asmath

> On Apr 12, 2021, at 8:35 PM, Peng Lei <peng.8lei@gmail.com> wrote:
> 
> 
> Hi KhajaAsmath Mohammed
>   Please check the configuration of "spark.speculation.interval" - try passing it a plain "30" instead of "30s".
>   
>  '''
>   override def start(): Unit = {
>     backend.start()
> 
>     if (!isLocal && conf.get(SPECULATION_ENABLED)) {
>       logInfo("Starting speculative execution thread")
>       speculationScheduler.scheduleWithFixedDelay(
>         () => Utils.tryOrStopSparkContext(sc) { checkSpeculatableTasks() },
>         SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
>     }
>   }
>  '''
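For context (this example is not from the original thread), the speculation properties read by the code above are normally set on the SparkConf before the context is created; the values here are purely illustrative:

'''
import org.apache.spark.SparkConf

// Illustrative only: spark.speculation.interval is a Spark time property,
// so Spark itself accepts unit-suffixed strings such as "100ms" or "30s".
val conf = new SparkConf()
  .setAppName("speculation-example")
  .set("spark.speculation", "true")
  .set("spark.speculation.interval", "100ms")
'''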
>   
> 
> On Tue, Apr 13, 2021 at 3:30 AM, Sean Owen <srowen@gmail.com> wrote:
>> Something is passing this invalid 30s value, yes. Hard to say which property it is. I'd check whether your cluster config sets anything to the value 30s - whatever is reading this property is not expecting it.
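If it helps to narrow it down, one way to see which keys carry a unit-suffixed value is to scan the resolved Hadoop Configuration from the driver (a rough diagnostic sketch, not from the original thread):

'''
import scala.collection.JavaConverters._
import org.apache.hadoop.conf.Configuration

// List every key whose value looks like "30s" - digits followed by a
// unit suffix - which a plain Configuration.getLong() cannot parse.
def suffixedValues(conf: Configuration): Seq[(String, String)] =
  conf.iterator().asScala
    .map(e => (e.getKey, e.getValue))
    .filter { case (_, v) => v.matches("""\d+[a-zA-Z]+""") }
    .toSeq

// e.g. suffixedValues(spark.sparkContext.hadoopConfiguration).foreach(println)
'''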
>> 
>>> On Mon, Apr 12, 2021, 2:25 PM KhajaAsmath Mohammed <mdkhajaasmath@gmail.com> wrote:
>>> Hi Sean,
>>> 
>>> Do you think there is anything that could cause this with the DFS client?
>>> 
>>> java.lang.NumberFormatException: For input string: "30s"
>>>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>>>         at java.lang.Long.parseLong(Long.java:589)
>>>         at java.lang.Long.parseLong(Long.java:631)
>>>         at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1429)
>>>         at org.apache.hadoop.hdfs.client.impl.DfsClientConf.<init>(DfsClientConf.java:247)
>>>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:301)
>>>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:285)
>>>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:160)
>>>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2859)
>>>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>>>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2896)
>>>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2878)
>>>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:392)
>>>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:184)
>>>         at org.apache.spark.deploy.yarn.Client$$anonfun$8.apply(Client.scala:137)
>>>         at org.apache.spark.deploy.yarn.Client$$anonfun$8.apply(Client.scala:137)
>>>         at scala.Option.getOrElse(Option.scala:121)
>>>         at org.apache.spark.deploy.yarn.Client.<init>(Client.scala:137)
>>>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
>>>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:183)
>>>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:501)
>>>         at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
>>>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:936)
>>>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession
>>> 
>>> Thanks,
>>> Asmath
>>> 
>>>> On Mon, Apr 12, 2021 at 2:20 PM KhajaAsmath Mohammed <mdkhajaasmath@gmail.com> wrote:
>>>> I am using the Spark HBase connector provided by Hortonworks. I was able to run without issues in my local environment but have this issue in EMR.
>>>> 
>>>> Thanks,
>>>> Asmath
>>>> 
>>>>> On Apr 12, 2021, at 2:15 PM, Sean Owen <srowen@gmail.com> wrote:
>>>>> 
>>>>> Somewhere you're passing a property that expects a number but giving it "30s". Is it a time property somewhere that really just wants milliseconds or something? Most time properties (all?) in Spark should accept that type of input anyway. It really depends on which property has the problem and what is setting it.
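The distinction here can be shown directly; a small spark-shell illustration (not part of the original thread) of why Spark's time properties tolerate "30s" while the raw Configuration.getLong path in the stack trace does not:

'''
import org.apache.spark.network.util.JavaUtils

// Spark's config machinery parses unit-suffixed durations itself:
val ms = JavaUtils.timeStringAsMs("30s")   // 30000

// Hadoop's Configuration.getLong() falls through to Long.parseLong,
// which has no notion of units - exactly the exception in the trace:
// java.lang.Long.parseLong("30s")  => NumberFormatException: For input string: "30s"
'''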
>>>>> 
>>>>>> On Mon, Apr 12, 2021 at 1:56 PM KhajaAsmath Mohammed <mdkhajaasmath@gmail.com>
wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I am getting a weird error when running a Spark job in an EMR cluster. The same program runs on my local machine. Is there anything I need to do to resolve this?
>>>>>> 
>>>>>> 21/04/12 18:48:45 ERROR SparkContext: Error initializing SparkContext.
>>>>>> java.lang.NumberFormatException: For input string: "30s"
>>>>>> 
>>>>>> I tried the solution mentioned in the link below, but it didn't work for me.
>>>>>> 
>>>>>> https://hadooptutorials.info/2020/10/11/part-5-using-spark-as-execution-engine-for-hive-2/
>>>>>> 
>>>>>> Thanks,
>>>>>> Asmath
