spark-user mailing list archives

From Abhishek Anand <abhis.anan...@gmail.com>
Subject Re: Concatenate the columns in dataframe to create new columns using Java
Date Mon, 18 Jul 2016 12:14:46 GMT
Hi Nihed,

Thanks for the reply.

I am looking for something like this:

DataFrame training = orgdf.withColumn("I1",
functions.concat(orgdf.col("C0"),orgdf.col("C1")));


Here I have to name the C0 and C1 columns explicitly. I am looking to write a
generic function that concatenates whichever columns are given as input.

For example, if I have

String str = "C0,C1,C2";

then it should work as

DataFrame training = orgdf.withColumn("I1",
functions.concat(orgdf.col("C0"),orgdf.col("C1"),orgdf.col("C2")));
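
A minimal sketch of the kind of generic helper described above, written so that it runs without Spark on the classpath. The names ConcatSpec, parseColumnNames and concatColumns are hypothetical and not part of any Spark API; the column lookup and the concatenation step are passed in as functions, and the demo in main uses plain strings as stand-ins for Spark columns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class ConcatSpec {

    /** Splits a spec such as "C0, C1,C2" into trimmed, non-empty column names. */
    static List<String> parseColumnNames(String spec) {
        List<String> names = new ArrayList<>();
        for (String part : spec.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /**
     * Resolves each name in the spec with `lookup`, then combines the
     * resolved values with `concat`. With Spark (not imported here), C is
     * assumed to be org.apache.spark.sql.Column.
     */
    static <C> C concatColumns(String spec,
                               Function<String, C> lookup,
                               Function<List<C>, C> concat) {
        List<C> cols = new ArrayList<>();
        for (String name : parseColumnNames(spec)) {
            cols.add(lookup.apply(name));
        }
        return concat.apply(cols);
    }

    public static void main(String[] args) {
        // Stand-in demo: strings instead of Spark columns.
        String result = concatColumns("C0, C1,C2",
                name -> "col(" + name + ")",
                cols -> "concat(" + String.join(", ", cols) + ")");
        System.out.println(result); // prints concat(col(C0), col(C1), col(C2))
    }
}
```

Assuming the Spark 1.6 Java API used in this thread, where functions.concat accepts a Column array through its varargs parameter, the call would look roughly like
orgdf.withColumn("I1", ConcatSpec.concatColumns(str, orgdf::col, cols -> functions.concat(cols.toArray(new Column[0]))));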



Thanks,
Abhi

On Mon, Jul 18, 2016 at 4:39 PM, nihed mbarek <nihedmm@gmail.com> wrote:

> Hi,
>
>
> I just wrote this code to help you. Is this what you need?
>
>
>         SparkConf conf = new SparkConf().setAppName("hello").setMaster("local");
>         JavaSparkContext sc = new JavaSparkContext(conf);
>         SQLContext sqlContext = new SQLContext(sc);
>         List<Person> persons = new ArrayList<>();
>         persons.add(new Person("nihed", "mbarek", "nihed.com"));
>         persons.add(new Person("mark", "zuckerberg", "facebook.com"));
>
>         DataFrame df = sqlContext.createDataFrame(persons, Person.class);
>
>         df.show();
>         final String[] columns = df.columns();
>         Column[] selectColumns = new Column[columns.length + 1];
>         for (int i = 0; i < columns.length; i++) {
>             selectColumns[i]=df.col(columns[i]);
>         }
>
>
>         selectColumns[columns.length] = functions.concat(df.col("firstname"),
>                 df.col("lastname"));
>
>         df.select(selectColumns).show();
>       -------------------
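> public static class Person {
>         private String firstname;
>         private String lastname;
>         private String address;
>
>         public Person(String firstname, String lastname, String address) {
>             this.firstname = firstname;
>             this.lastname = lastname;
>             this.address = address;
>         }
>         // Public getters (omitted) are needed for createDataFrame(persons, Person.class).
> }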
>
>
>
> Regards,
>
> On Mon, Jul 18, 2016 at 12:45 PM, Abhishek Anand <abhis.anan007@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have a dataframe with columns C0, C1, C2 and so on.
>>
>> I need to create interaction variables to be taken as input for my
>> program.
>>
>> For example:
>>
>> I need to create I1 as the concatenation of C0, C3 and C5.
>>
>> Similarly, I2 = concat(C4, C5)
>>
>> and so on.
>>
>>
>> How can I do this in Java, concatenating whichever columns the user
>> gives as input?
>>
>> Thanks,
>> Abhi
>>
>
>
>
> --
>
> M'BAREK Med Nihed,
> Fedora Ambassador, TUNISIA, Northern Africa
> http://www.nihed.com
>
> <http://tn.linkedin.com/in/nihed>
>
>
