spark-dev mailing list archives

From Moein Hosseini <moein...@gmail.com>
Subject Re: Feature request: split dataset based on condition
Date Sun, 03 Feb 2019 06:19:55 GMT
I don't see it as a method that applies filtering multiple times; instead, I see
it as a semi-action rather than just a transformation. Suppose we had
something like map-partition that accepts multiple lambdas, each of which
collects the rows for its own dataset (or something like it). Is that possible?
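
To make the idea concrete, here is a minimal sketch in plain Python rather than Spark. It assumes the rows fit in an ordinary iterable; the helper name `split_by_conditions` and the bucket names are hypothetical and are not part of any Spark API. It only shows the "one iteration, several lambdas" shape being asked about.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def split_by_conditions(
    rows: Iterable[T],
    conditions: Dict[str, Callable[[T], bool]],
) -> Dict[str, List[T]]:
    """Hypothetical helper: route each row to every bucket whose predicate accepts it.

    The input is traversed exactly once, so it also works on a one-shot iterator.
    """
    buckets: Dict[str, List[T]] = {name: [] for name in conditions}
    for row in rows:
        for name, predicate in conditions.items():
            if predicate(row):
                buckets[name].append(row)
    return buckets


# Illustrative data only: a one-shot iterator that can be consumed a single time.
rows = iter([1, 2, 3, 4, 5, 6])
result = split_by_conditions(
    rows,
    {
        "even": lambda x: x % 2 == 0,
        "large": lambda x: x > 4,
    },
)
```

Note that every bucket is materialized together in this sketch, which is the same constraint raised in the quoted reply below: in a lazily evaluated engine, recomputing one output of such an operation would mean recomputing all of them.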

On Sat, Feb 2, 2019 at 5:59 PM Sean Owen <srowen@gmail.com> wrote:

> I think the problem is that you can't produce multiple Datasets from one
> source in one operation - consider that reproducing one of them would mean
> reproducing all of them. You can write a method that would do the filtering
> multiple times, but it wouldn't be faster. What do you have in mind that's
> different?
>
> On Sat, Feb 2, 2019 at 12:19 AM Moein Hosseini <moein7tl@gmail.com> wrote:
>
>> I've seen many applications that need to split a dataset into multiple
>> datasets based on some conditions. As there is no method to do it in one
>> place, developers call the *filter* method multiple times. I think it
>> would be useful to have a method that splits a dataset based on conditions
>> in one iteration, something like the *partition* method of Scala (of course,
>> Scala's partition just splits a list into two lists, but something more
>> general would be more useful).
>> If you think it would be helpful, I can create a Jira issue and work on it
>> to send a PR.
>>
>> Best Regards
>> Moein
>>
>> --
>>
>> Moein Hosseini
>> Data Engineer
>> mobile: +98 912 468 1859 <+98+912+468+1859>
>> site: www.moein.xyz
>> email: moein7tl@gmail.com
>> [image: linkedin] <https://www.linkedin.com/in/moeinhm>
>> [image: twitter] <https://twitter.com/moein7tl>
>>
>>

