spark-user mailing list archives

From ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
Subject Re: OOM for HiveFromSpark example
Date Thu, 26 Mar 2015 08:26:39 GMT
The Hive command is:

LOAD DATA LOCAL INPATH
'/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'
INTO TABLE src_spark

1. It uses LOCAL INPATH. If I push the file to HDFS, how will it work?

2. I can't use sc.addFile, because I want to run Hive (Spark SQL) queries.
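
For question 1, a minimal sketch, assuming kv1.txt has first been copied to HDFS (for example with `hdfs dfs -put`). Hive's LOAD DATA without the LOCAL keyword resolves the path on the cluster filesystem instead of the local disk of the machine running the query, and moves the file into the table's directory. The helper name `load_data_statement` and the HDFS path are hypothetical; the resulting string would be passed to the HiveContext's `sql` method in the same way as the original statement.

```python
def load_data_statement(path, table, local=False, overwrite=False):
    """Build a HiveQL LOAD DATA statement.

    Hypothetical helper for illustration; not part of Spark or Hive.
    """
    parts = ["LOAD DATA"]
    if local:
        # LOCAL: the path is read from the local filesystem of the
        # process that runs the query.
        parts.append("LOCAL")
    parts.append("INPATH '{}'".format(path))
    if overwrite:
        parts.append("OVERWRITE")
    parts.append("INTO TABLE {}".format(table))
    return " ".join(parts)


# Assumed HDFS location; replace with wherever kv1.txt was uploaded.
hdfs_stmt = load_data_statement("hdfs:///user/dvasthimal/kv1.txt", "src_spark")
print(hdfs_stmt)

# With an existing HiveContext (named hive_context here), this would run as:
# hive_context.sql(hdfs_stmt)
```

Because the statement only names an HDFS URI, it does not depend on which node's local disk the driver happens to see, which is the part that fails in yarn-cluster mode in the traces below.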

On Thu, Mar 26, 2015 at 1:41 PM, Akhil Das <akhil@sigmoidanalytics.com>
wrote:

> Now it's clear that the workers do not have the file kv1.txt in their
> local filesystem. You can try putting it in HDFS and using the URI of that
> file, or try adding the file with sc.addFile.
>
> Thanks
> Best Regards
>
> On Thu, Mar 26, 2015 at 1:38 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
> wrote:
>
>> Does not work
>>
>> 15/03/26 01:07:05 INFO HiveMetaStore.audit: ugi=dvasthimal
>> ip=unknown-ip-addr cmd=get_table : db=default tbl=src_spark
>> 15/03/26 01:07:06 ERROR ql.Driver: FAILED: SemanticException Line 1:23
>> Invalid path
>> ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'':
>> No files matching path
>> file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path
>> ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'':
>> No files matching path
>> file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>> at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:142)
>> at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:233)
>> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
>>
>>
>>
>> Does the input file need to be passed to the executors via --jars?
>>
>> On Thu, Mar 26, 2015 at 12:15 PM, Akhil Das <akhil@sigmoidanalytics.com>
>> wrote:
>>
>>> Try to give the complete path to the file kv1.txt.
>>> On 26 Mar 2015 11:48, "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepujain@gmail.com> wrote:
>>>
>>>> I am now seeing this error.
>>>>
>>>> 15/03/25 19:44:03 ERROR yarn.ApplicationMaster: User class threw
>>>> exception: FAILED: SemanticException Line 1:23 Invalid path
>>>> ''examples/src/main/resources/kv1.txt'': No files matching path
>>>> file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>
>>>> org.apache.spark.sql.execution.QueryExecutionException: FAILED:
>>>> SemanticException Line 1:23 Invalid path
>>>> ''examples/src/main/resources/kv1.txt'': No files matching path
>>>> file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>
>>>> at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:312)
>>>>
>>>> at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:280)
>>>>
>>>>
>>>>
>>>>
>>>> -sh-4.1$ pwd
>>>>
>>>> /home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4
>>>>
>>>> -sh-4.1$ ls examples/src/main/resources/kv1.txt
>>>>
>>>> examples/src/main/resources/kv1.txt
>>>>
>>>> -sh-4.1$
>>>>
>>>>
>>>>
>>>> On Thu, Mar 26, 2015 at 8:08 AM, Zhan Zhang <zzhang@hortonworks.com>
>>>> wrote:
>>>>
>>>>>  You can do it in $SPARK_HOME/conf/spark-defaults.conf:
>>>>>
>>>>>  spark.driver.extraJavaOptions -XX:MaxPermSize=512m
>>>>>
>>>>>  Thanks.
>>>>>
>>>>>  Zhan Zhang
>>>>>
>>>>>
>>>>>  On Mar 25, 2015, at 7:25 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
>>>>> wrote:
>>>>>
>>>>>  Where and how do I pass this or any other JVM argument?
>>>>> -XX:MaxPermSize=512m
>>>>>
>>>>> On Wed, Mar 25, 2015 at 11:36 PM, Zhan Zhang <zzhang@hortonworks.com>
>>>>> wrote:
>>>>>
>>>>>> I solved this by increasing the PermGen memory size in the driver:
>>>>>>
>>>>>>  -XX:MaxPermSize=512m
>>>>>>
>>>>>>  Thanks.
>>>>>>
>>>>>>  Zhan Zhang
>>>>>>
>>>>>>  On Mar 25, 2015, at 10:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>  I am facing the same issue and have posted a new thread. Please respond.
>>>>>>
>>>>>> On Wed, Jan 14, 2015 at 4:38 AM, Zhan Zhang <zzhang@hortonworks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Folks,
>>>>>>>
>>>>>>> I am trying to run a Hive context in yarn-cluster mode, but I hit an
>>>>>>> error. Does anybody know what causes the issue?
>>>>>>>
>>>>>>> I use the following command to build the distribution:
>>>>>>>
>>>>>>>  ./make-distribution.sh -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4
>>>>>>>
>>>>>>> 15/01/13 17:59:42 INFO cluster.YarnClusterScheduler:
>>>>>>> YarnClusterScheduler.postStartHook done
>>>>>>> 15/01/13 17:59:42 INFO storage.BlockManagerMasterActor: Registering
>>>>>>> block manager cn122-10.l42scl.hortonworks.com:56157 with 1589.8 MB
>>>>>>> RAM, BlockManagerId(2, cn122-10.l42scl.hortonworks.com, 56157)
>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parsing command: CREATE
>>>>>>> TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parse Completed
>>>>>>> 15/01/13 17:59:44 INFO metastore.HiveMetaStore: 0: Opening raw store
>>>>>>> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>>>> 15/01/13 17:59:44 INFO metastore.ObjectStore: ObjectStore,
>>>>>>> initialize called
>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property
>>>>>>> datanucleus.cache.level2 unknown - will be ignored
>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property
>>>>>>> hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but
>>>>>>> not present in CLASSPATH (or one of dependencies)
>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but
>>>>>>> not present in CLASSPATH (or one of dependencies)
>>>>>>> 15/01/13 17:59:52 INFO metastore.ObjectStore: Setting MetaStore
>>>>>>> object pin classes with
>>>>>>> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>>>>> 15/01/13 17:59:52 INFO metastore.MetaStoreDirectSql: MySQL check
>>>>>>> failed, assuming we are not on mysql: Lexical error at line 1, column 5.
>>>>>>> Encountered: "@" (64), after : "".
>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class
>>>>>>> "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as
>>>>>>> "embedded-only" so does not have its own datastore table.
>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class
>>>>>>> "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as
>>>>>>> "embedded-only" so does not have its own datastore table.
>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class
>>>>>>> "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as
>>>>>>> "embedded-only" so does not have its own datastore table.
>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class
>>>>>>> "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as
>>>>>>> "embedded-only" so does not have its own datastore table.
>>>>>>> 15/01/13 18:00:00 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>>>> 15/01/13 18:00:00 WARN metastore.ObjectStore: Version information
>>>>>>> not found in metastore. hive.metastore.schema.verification is not enabled
>>>>>>> so recording the schema version 0.13.1aa
>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added admin role in
>>>>>>> metastore
>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added public role in
>>>>>>> metastore
>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: No user is added in
>>>>>>> admin role, since config is empty
>>>>>>> 15/01/13 18:00:01 INFO session.SessionState: No Tez session required
>>>>>>> at this point. hive.execution.engine=mr.
>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=Driver.run
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:02 INFO ql.Driver: Concurrency mode is disabled, not
>>>>>>> creating a lock manager
>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=compile
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=parse
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parsing command: CREATE
>>>>>>> TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parse Completed
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=parse
>>>>>>> start=1421190003030 end=1421190003031 duration=1
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG
>>>>>>> method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Starting Semantic
>>>>>>> Analysis
>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Creating table src
>>>>>>> position=27
>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_table :
>>>>>>> db=default tbl=src
>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang
>>>>>>> ip=unknown-ip-addr      cmd=get_table : db=default tbl=src
>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_database:
>>>>>>> default
>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang
>>>>>>> ip=unknown-ip-addr      cmd=get_database: default
>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Semantic Analysis Completed
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG
>>>>>>> method=semanticAnalyze start=1421190003031 end=1421190003406 duration=375
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Returning Hive schema:
>>>>>>> Schema(fieldSchemas:null, properties:null)
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=compile
>>>>>>> start=1421190002998 end=1421190003416 duration=418
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG
>>>>>>> method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Starting command: CREATE TABLE IF
>>>>>>> NOT EXISTS src (key INT, value STRING)
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit
>>>>>>> start=1421190002995 end=1421190003421 duration=426
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=runTasks
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG
>>>>>>> method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> 15/01/13 18:00:03 INFO exec.DDLTask: Default to LazySimpleSerDe for
>>>>>>> table src
>>>>>>> 15/01/13 18:00:05 INFO log.PerfLogger: </PERFLOG
>>>>>>> method=Driver.execute start=1421190003416 end=1421190005498 duration=2082
>>>>>>> from=org.apache.hadoop.hive.ql.Driver>
>>>>>>> Exception in thread "Driver"
>>>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>>>> UncaughtExceptionHandler in thread "Driver"


-- 
Deepak
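
On the PermGen fix quoted above, a minimal sketch of passing the same option on the command line instead of in spark-defaults.conf. It assembles a spark-submit invocation that sets the `spark.driver.extraJavaOptions` property through `--conf`. The helper name `spark_submit_args` and the jar file name are hypothetical placeholders, `yarn-cluster` is the Spark 1.x master value used in this thread, and `-XX:MaxPermSize` only has an effect on JVMs that still have a permanent generation (Java 7 and earlier).

```python
def spark_submit_args(app_jar, main_class, driver_java_options):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    return [
        "spark-submit",
        "--master", "yarn-cluster",
        "--class", main_class,
        # Same property that the quoted reply sets in spark-defaults.conf.
        "--conf", "spark.driver.extraJavaOptions={}".format(driver_java_options),
        app_jar,
    ]


args = spark_submit_args(
    "spark-examples.jar",  # placeholder; use the examples jar from your build
    "org.apache.spark.examples.sql.hive.HiveFromSpark",
    "-XX:MaxPermSize=512m",
)
print(" ".join(args))
```

The list form can be handed to `subprocess.call` unchanged, which avoids shell quoting issues if the JVM options contain spaces.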
