spark-user mailing list archives

From Lakshmi Nivedita <klnived...@gmail.com>
Subject Re: [Spark SQL] does pyspark udf support spark.sql inside def
Date Thu, 01 Oct 2020 04:43:45 GMT
Sure, will do that. I am using Impala in PySpark to retrieve the data.

Table A schema:
date1  Bigint
date2  Bigint
ctry   String

Sample data for table A:
date1        date2        ctry
22-12-2012   06-01-2013   IN


Table B schema:
holidate  Bigint
holiday   String  (0 means holiday, 1 means working)
country   String

Sample data for table B:
holidate     holiday   country
25-12-2012   0         IN
01-01-2013   0         IN

Thanks
Nivedita




On Thu, Oct 1, 2020 at 9:25 AM Amit Joshi <mailtojoshiamit@gmail.com> wrote:

> Can you pls post the schema of both the tables.
>
> On Wednesday, September 30, 2020, Lakshmi Nivedita <klnivedita@gmail.com>
> wrote:
>
>> Thank you for the clarification. I would like to know how I can proceed
>> for this kind of scenario in PySpark.
>>
>> I have a scenario: subtracting the number of holidays from the total
>> number of days between two dates, using PySpark DataFrames.
>>
>> I have the dates date1 and date2 in one table and the holidays in
>> another table. Roughly (pseudocode):
>>
>> df1 = select date1, date2, ctry,
>>        datediff(date2, date1) - df2.holidays as totalnumberofdays
>>        from A;
>>
>> df2 = select count(holidate) as holidays
>>       from B
>>       where holidate >= A.date1
>>         and holidate <= A.date2
>>         and country = A.ctry
>>
>> Except country, no other column is a unique key.
>>
>>
>>
>>
>> On Wed, Sep 30, 2020 at 6:05 PM Sean Owen <srowen@gmail.com> wrote:
>>
>>> No, you can't use the SparkSession from within a function executed by
>>> Spark tasks.
>>>
>>> On Wed, Sep 30, 2020 at 7:29 AM Lakshmi Nivedita <klnivedita@gmail.com>
>>> wrote:
>>>
>>>> Here is a spark udf structure as an example
>>>>
>>>> def sample_fn(x):
>>>>     return spark.sql("select count(Id) from sample where Id = {}".format(x))
>>>>
>>>>
>>>> spark.udf.register("sample_fn", sample_fn)
>>>>
>>>> spark.sql("select id, sample_fn(id) from example")
>>>>
>>>> Thanks in advance for the help
>>>> --
>>>> k.Lakshmi Nivedita
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
