flink-user mailing list archives

From Alberto Ramón <a.ramonporto...@gmail.com>
Subject Re: readCsvFile
Date Sun, 09 Oct 2016 21:14:47 GMT
I think the field delimiter is OK.
(I attached the CSV.)

val text4 = env.readCsvFile [Tuple1[String]]("file://data.csv"
  ,fieldDelimiter = ","
  ,includedFields = Array(2))
val counts4 = text3
  .map { (_, 1) }
  .groupBy(0)
  .sum(1)
counts4.print()

The result is:
[image: Inline image 1]

Can you see any bug in my code for reading only the first column?
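
For comparison, here is a minimal sketch of how a first-column-only read might look. It assumes the Flink DataSet Scala API is on the classpath, that includedFields takes zero-based indices (so 0 selects the first column), and it unwraps the Tuple1 in the map as suggested in the quoted reply below. The object name, the helper name and the three-column sample file are hypothetical, made up only for this illustration.

```scala
import java.nio.file.Files
import org.apache.flink.api.scala._

object FirstColumnCount {
  // Reads only the first CSV column and counts how often each value occurs.
  def countFirstColumn(path: String): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val firstColumn = env.readCsvFile[Tuple1[String]](
      path,
      fieldDelimiter = ",",
      includedFields = Array(0)) // assumption: zero-based, 0 = first column
    firstColumn
      .map { t => (t._1, 1) } // unwrap the Tuple1 before pairing with a count
      .groupBy(0)
      .sum(1)
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical three-column sample standing in for the attached data.csv.
    val tmp = Files.createTempFile("data", ".csv")
    Files.write(tmp, "a,b,c\naa,bb,cc".getBytes("UTF-8"))
    countFirstColumn(tmp.toUri.toString).foreach(println)
  }
}
```

If the zero-based assumption holds, Array(2) in my code above would select the third column rather than the first.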


2016-10-07 21:50 GMT+02:00 Fabian Hueske <fhueske@gmail.com>:

> I would check that the field delimiter is correctly set.
>
> With the correct delimiter your code would give
>
> ((a),1)
> ((aa),1)
>
> because the single field is wrapped in a Tuple1.
> You have to unwrap it in the map function: .map { (_._1, 1) }
>
> 2016-10-07 18:08 GMT+02:00 Alberto Ramón <a.ramonportoles@gmail.com>:
>
>> Hmm
>>
>> Your solution compiles without errors, but includedFields isn't working:
>> [image: Inline image 1]
>>
>> The output is incorrect:
>> [image: Inline image 2]
>>
>> The correct result should contain only the first column:
>> (a,1)
>> (aa,1)
>>
>> 2016-10-06 21:37 GMT+02:00 Fabian Hueske <fhueske@gmail.com>:
>>
>>> Hi Alberto,
>>>
>>> if you want to read a single column you have to wrap it in a Tuple1:
>>>
>>> val text4 = env.readCsvFile[Tuple1[String]]("file:data.csv" ,includedFields = Array(1))
>>>
>>> Best, Fabian
>>>
>>> 2016-10-06 20:59 GMT+02:00 Alberto Ramón <a.ramonportoles@gmail.com>:
>>>
>>>> I'm learning readCsvFile
>>>> (I discovered that if the file ends with "\n", you get a null exception.)
>>>>
>>>> *If I try to read only one column:*
>>>>
>>>> val text4 = env.readCsvFile[String]("file:data.csv" ,includedFields = Array(1))
>>>>
>>>> The error is: The type String has to be a tuple or pojo type. [null]
>>>>
>>>>
>>>>
>>>>
>>>> *If I specify more than one column (the 1st and 2nd in this case):*
>>>>
>>>> val text4 = env.readCsvFile [(String,String)]("data.csv"
>>>>   ,fieldDelimiter = ","
>>>>   ,includedFields = Array(0,1))
>>>>
>>>> it reads all columns from the CSV (3 in my example).
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
