spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Amit Joshi <mailtojoshia...@gmail.com>
Subject Re: [Spark SQL] does pyspark udf support spark.sql inside def
Date Thu, 01 Oct 2020 03:54:59 GMT
Can you pls post the schema of both the tables.

On Wednesday, September 30, 2020, Lakshmi Nivedita <klnivedita@gmail.com>
wrote:

> Thank you for the clarification.I would like to how can I  proceed for
> this kind of scenario in pyspark
>
> I have a scenario subtracting the total number of days with the number of
> holidays in pyspark by using dataframes
>
> I have a table with dates  date1  date2 in one table and number of
> holidays in another table
> df1 = select date1,date2 ,ctry ,unixtimestamp(date2-date1)
> totalnumberofdays  - df2.holidays  from table A;
>
> df2 = select count(holiays)
> from table B
> where holidate >= 'date1'(table A)
> and holidate < = date2(table A)
> and country = A.ctry(table A)
>
> Except country no other column is not a unique key
>
>
>
>
> On Wed, Sep 30, 2020 at 6:05 PM Sean Owen <srowen@gmail.com> wrote:
>
>> No, you can't use the SparkSession from within a function executed by
>> Spark tasks.
>>
>> On Wed, Sep 30, 2020 at 7:29 AM Lakshmi Nivedita <klnivedita@gmail.com>
>> wrote:
>>
>>> Here is a spark udf structure as an example
>>>
>>> Def sampl_fn(x):
>>>            Spark.sql(“select count(Id) from sample Where Id = x ”)
>>>
>>>
>>> Spark.udf.register(“sample_fn”, sample_fn)
>>>
>>> Spark.sql(“select id, sampl_fn(Id) from example”)
>>>
>>> Advance Thanks for the help
>>> --
>>> k.Lakshmi Nivedita
>>>
>>>
>>>
>>>
>>
>>

Mime
View raw message