spark-user mailing list archives

From amit tewari <amittewar...@gmail.com>
Subject Re: Programatically create RDDs based on input
Date Mon, 02 Nov 2015 10:17:05 GMT
Thanks Natu, Ayan.

I was able to create an array of DataFrames (Spark 1.3+).

DataFrame[] dfs = new DataFrame[uniqueFileIds.length];
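
This compiles because DataFrame in the Spark 1.x Java API is not a generic type, whereas new JavaRDD<String>[3] is rejected: Java does not allow creating an array of a parameterized type. If the elements need to stay generic, a List or a Map avoids the restriction. The sketch below is only an illustration under stated assumptions: it uses a plain List<String> as a stand-in for JavaRDD<String> so that it runs without Spark, and the class PerFileCollections and its methods load, loadAll and loadByPath are hypothetical names, not Spark APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; nothing in this sketch is a Spark API.
class PerFileCollections {

    // Stand-in for jsc.textFile(path): returns a List<String> instead of a
    // JavaRDD<String> so the sketch runs without Spark on the classpath.
    static List<String> load(String path) {
        List<String> lines = new ArrayList<>();
        lines.add("contents of " + path);
        return lines;
    }

    // One dataset per input path, in input order. A List with a generic
    // element type is legal where "new JavaRDD<String>[n]" is not.
    static List<List<String>> loadAll(String[] paths) {
        List<List<String>> datasets = new ArrayList<>(paths.length);
        for (String path : paths) {
            datasets.add(load(path));
        }
        return datasets;
    }

    // The HashMap variant suggested in the thread, keyed by path.
    // LinkedHashMap is used so iteration follows insertion order.
    static Map<String, List<String>> loadByPath(String[] paths) {
        Map<String, List<String>> byPath = new LinkedHashMap<>();
        for (String path : paths) {
            byPath.put(path, load(path));
        }
        return byPath;
    }

    public static void main(String[] args) {
        String[] paths = {"/file1.txt", "/file2.txt"};
        System.out.println(loadAll(paths).size());
        System.out.println(loadByPath(paths).keySet());
    }
}
```

With Spark on the classpath, the loop body would presumably call jsc.textFile(path) and the result type would become List<JavaRDD<String>>; the structure of the loop stays the same.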

Thanks
Amit

On Sun, Nov 1, 2015 at 10:58 AM, Natu Lauchande <nlauchande@gmail.com>
wrote:

> Hi Amit,
>
> I don't see any default constructor in the JavaRDD docs:
> https://spark.apache.org/docs/latest/api/java/org/apache/spark/api/java/JavaRDD.html
>
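> Have you tried the following?
>
> import java.util.ArrayList;
> import java.util.List;
>
> // A List avoids the generic-array restriction and provides add().
> List<JavaRDD<String>> jRDD = new ArrayList<>();
>
> jRDD.add(jsc.textFile("/file1.txt"));
> jRDD.add(jsc.textFile("/file2.txt"));
> // ..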
>
> Natu
>
>
> On Sat, Oct 31, 2015 at 11:18 PM, ayan guha <guha.ayan@gmail.com> wrote:
>
>> My Java knowledge is limited, but you could try a HashMap and put the
>> RDDs in it?
>>
>> On Sun, Nov 1, 2015 at 4:34 AM, amit tewari <amittewari.5@gmail.com>
>> wrote:
>>
>>> Thanks Ayan, that's similar to what I am looking for, but trying the
>>> same in Java gives a compile error:
>>>
>>> JavaRDD<String> jRDD[] = new JavaRDD<String>[3];
>>>
>>> //Error: Cannot create a generic array of JavaRDD<String>
>>>
>>> Thanks
>>> Amit
>>>
>>>
>>>
>>> On Sat, Oct 31, 2015 at 5:46 PM, ayan guha <guha.ayan@gmail.com> wrote:
>>>
>>>> Corrected a typo...
>>>>
>>>> # In Driver
>>>> fileList = ["/file1.txt", "/file2.txt"]
>>>> rdds = []
>>>> for f in fileList:
>>>>     rdd = jsc.textFile(f)
>>>>     rdds.append(rdd)
>>>>
>>>>
>>>> On Sat, Oct 31, 2015 at 11:14 PM, ayan guha <guha.ayan@gmail.com>
>>>> wrote:
>>>>
>>>>> Yes, this can be done. A quick Python equivalent:
>>>>>
>>>>> # In Driver
>>>>> fileList=["/file1.txt","/file2.txt"]
>>>>> rdd = []
>>>>> for f in fileList:
>>>>>          rdd = jsc.textFile(f)
>>>>>          rdds.append(rdd)
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Oct 31, 2015 at 11:09 PM, amit tewari <amittewari.5@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> I need to be able to create RDDs programmatically inside my
>>>>>> program (e.g. based on a variable number of input files).
>>>>>>
>>>>>> Can this be done?
>>>>>>
>>>>>> I need this as I want to run the following statement inside an
>>>>>> iteration:
>>>>>>
>>>>>> JavaRDD<String> rdd1 = jsc.textFile("/file1.txt");
>>>>>>
>>>>>> Thanks
>>>>>> Amit
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Ayan Guha
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Ayan Guha
>>>>
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Ayan Guha
>>
>
>
