spark-user mailing list archives

From ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
Subject Re: Hive table not found from Spark SQL
Date Thu, 26 Mar 2015 15:28:44 GMT
Hello Michael,
Thanks for your time.

1. "show tables" from the Spark program returns nothing.
2. What entries are you talking about? (I am actually new to Hive as well)

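A minimal sketch of the usual fix for this symptom on Spark 1.x, assuming a build with Hive support and hive-site.xml on the classpath (the object and app names are illustrative): a plain SQLContext only sees temp tables registered in the program, while a HiveContext reads the Hive metastore.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// HiveContext lives in the spark-hive module, not spark-core.
import org.apache.spark.sql.hive.HiveContext

object ShowHiveTables {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ShowHiveTables"))
    // Unlike a plain SQLContext, HiveContext connects to the Hive metastore
    // configured in hive-site.xml, so tables created from the Hive CLI
    // (such as dw_bid) become visible to "show tables" and to queries.
    val hc = new HiveContext(sc)
    hc.sql("show tables").collect().foreach(println)
    hc.sql("describe dw_bid").collect().foreach(println)
  }
}
```

Note this only helps if the hive-site.xml on the classpath actually points at the shared metastore; if it is not picked up, Spark falls back to a local Derby metastore, which would also explain an empty "show tables".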

On Thu, Mar 26, 2015 at 8:35 PM, Michael Armbrust <michael@databricks.com>
wrote:

> What does "show tables" return?  You can also run "SET <optionName>" to
> make sure that entries from your hive-site are being read correctly.
>
> On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
> wrote:
>
>> I have a table dw_bid that was created in Hive and has nothing to do with
>> Spark.  I have data in Avro that I want to join with the dw_bid table; this
>> join needs to be done using Spark SQL.  However, for some reason Spark says
>> the dw_bid table does not exist. How do I tell Spark that dw_bid is a table
>> created in Hive, and read it?
>>
>>
>> Query that is run from Spark SQL
>> ==============================
>>  insert overwrite table sojsuccessevents2_spark
>>  select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>    shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>    exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>    isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity,
>>    b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>    sellerStdLevel,cssSellerLevel,a.experimentChannel
>>  from sojsuccessevents1 a
>>  join dw_bid b on a.itemId = b.item_id and a.transactionId = b.transaction_id
>>  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'
>>    AND b.bid_type_code IN (1,9) AND b.bdr_id > 0
>>    AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>
>>
>> If I create sojsuccessevents2_spark from the Hive command line and run the
>> above command from the Spark SQL program, then I get the error
>> "sojsuccessevents2_spark table not found".
>>
>> Hence I dropped the table in Hive and ran the create table
>> sojsuccessevents2_spark statement from Spark SQL before running the above
>> command, and it works until it hits the next roadblock: "dw_bid table not
>> found".
>>
>> This makes me believe that Spark for some reason is not able to
>> read/understand tables created outside Spark. I did copy
>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>
>> Please suggest.
>>
>>
>> Logs
>> ———
>> 15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal
>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>> 15/03/26 03:50:40 ERROR metadata.Hive:
>> NoSuchObjectException(message:default.dw_bid table not found)
>> at
>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>
>>
>>
>> 15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw
>> exception: no such table List(dw_bid); line 1 pos 843
>> org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line
>> 1 pos 843
>> at
>> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
>> at
>> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>>
>>
>>
>> Regards,
>> Deepak
>>
>>
>> On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
>> wrote:
>>
>>> I have this query
>>>
>>>  insert overwrite table sojsuccessevents2_spark
>>>  select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId,
>>>    shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId,
>>>    exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort,
>>>    isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity,
>>>    b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId,
>>>    sellerStdLevel,cssSellerLevel,a.experimentChannel
>>>  from sojsuccessevents1 a
>>>  join dw_bid b on a.itemId = b.item_id and a.transactionId = b.transaction_id
>>>  where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'
>>>    AND b.bid_type_code IN (1,9) AND b.bdr_id > 0
>>>    AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')
>>>
>>>
>>> If I create sojsuccessevents2_spark from the Hive command line and run the
>>> above command from the Spark SQL program, then I get the error
>>> "sojsuccessevents2_spark table not found".
>>>
>>> Hence I dropped the table in Hive and ran the create table
>>> sojsuccessevents2_spark statement from Spark SQL before running the above
>>> command, and it works until it hits the next roadblock: "dw_bid table not
>>> found".
>>>
>>> This makes me believe that Spark for some reason is not able to
>>> read/understand tables created outside Spark. I did copy
>>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>>
>>> Please suggest.
>>>
>>> Regards,
>>> Deepak
>>>
>>>
>>> On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepujain@gmail.com>
>>> wrote:
>>>
>>>> I have a Hive table named dw_bid; when I run hive from the command prompt
>>>> and run "describe dw_bid", it works.
>>>>
>>>> I want to join an Avro file (table) in HDFS with this Hive dw_bid table,
>>>> and I refer to it as dw_bid from the Spark SQL program; however, I see
>>>>
>>>> 15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal
>>>> ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>> 15/03/26 00:31:01 ERROR metadata.Hive:
>>>> NoSuchObjectException(message:default.dw_bid table not found)
>>>> at
>>>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>
>>>>
>>>> Code:
>>>>
>>>>     val successDetail_S1 = sqlContext.avroFile(input)
>>>>     successDetail_S1.registerTempTable("sojsuccessevents1")
>>>>     val countS1 = sqlContext.sql(
>>>>       "select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId," +
>>>>       " shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId," +
>>>>       " exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort," +
>>>>       " isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity," +
>>>>       " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId," +
>>>>       " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
>>>>       " from sojsuccessevents1 a join dw_bid b" +
>>>>       " on a.itemId = b.item_id and a.transactionId = b.transaction_id" +
>>>>       " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')")
>>>>     println("countS1.first:" + countS1.first)
>>>>
>>>>
>>>>
>>>> Any suggestions on how to refer to a Hive table from Spark SQL?
>>>> --
>>>>
>>>> Deepak
>>>>
>>>>
>>>
>>>
>>> --
>>> Deepak
>>>
>>>
>>
>>
>> --
>> Deepak
>>
>>
>


-- 
Deepak

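Putting the pieces of this thread together, the join can be sketched end to end as follows, assuming Spark 1.x with the spark-avro package and Hive support; the input path and the selected columns are illustrative, not taken from the original program.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
// spark-avro (com.databricks:spark-avro) adds avroFile to the context.
import com.databricks.spark.avro._

object JoinAvroWithHive {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("JoinAvroWithHive"))
    // HiveContext resolves dw_bid against the Hive metastore.
    val hc = new HiveContext(sc)

    // Register the Avro data as a temp table alongside the Hive table.
    val events = hc.avroFile("hdfs:///data/sojsuccessevents")  // illustrative path
    events.registerTempTable("sojsuccessevents1")

    // A single query can now join the temp table with the Hive-managed table.
    val joined = hc.sql(
      "select a.itemId, b.transaction_id " +
      "from sojsuccessevents1 a join dw_bid b on a.itemId = b.item_id")
    println("first row: " + joined.first)
  }
}
```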