sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jarek Cecho" <jar...@apache.org>
Subject Re: Review Request 28139: Support Time and List Type in CSV IDF
Date Wed, 19 Nov 2014 17:22:17 GMT


On Nov. 19, 2014, 1 a.m., Veena Basavaraj wrote:
> > Jarcec
> 
> Veena Basavaraj wrote:
>     please see the latest revision., it uses JSON. It is pretty universal so I rather
use JSON.
> 
> Jarek Cecho wrote:
>     Why JSON for intermediate data format? I would much rather use something that is
used by pg_dump and mysqldump (or Hive) rather then JSON. IDF is just intermediate format,
we should be as effective here as possible and we should not be doing conversions from format
to format.
> 
> Veena Basavaraj wrote:
>     it so happens that they use the same too, I would prefer you look at the patch on
why we are using an encoding such as JSON to represent arrays and maps, imagine a case of
[[a,b] [z,q] [w,e]] - will be just represented as JSONstring, so when parsing it back to actual
arays or arrays object, it is no brainer and we use the JSON parse method, the code also says
the CSV IDF to ignore any commans within the JSON string

As long as pg_dump and Hive are using the same format I'm fine. But I don't want to introduce
different format then those two without a good reason.


- Jarek


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28139/#review62081
-----------------------------------------------------------


On Nov. 19, 2014, 5:08 p.m., Veena Basavaraj wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28139/
> -----------------------------------------------------------
> 
> (Updated Nov. 19, 2014, 5:08 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-1749
>     https://issues.apache.org/jira/browse/SQOOP-1749
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> see JIRA
> 
> Supporting Nested arrays is going to be dody logic
> 
> some of the assumptions here in parsing need more docs. Esp use of special characters
to parse nested arrays.
> 
> Welcome feedback if I can improve this more.
> 
> I am sure mores tests can be added.
> 
> 
> NOTE: WS will be addressed once the patch is reviewed for functionality and this needs
a rebase since I have other patches pending review that overlap.
> 
> 
> Diffs
> -----
> 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/CSVIntermediateDataFormat.java
39a01c1b5b290903a6c93dbb8427af525515e2b2 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormat.java
5ef6fc64ec695fbdf2f9b6acec1eb0235fd675be 
>   connector/connector-sdk/src/main/java/org/apache/sqoop/connector/idf/IntermediateDataFormatError.java
4d41679d0b4b9e04d05eef4dd5280f201bbe58e2 
>   connector/connector-sdk/src/test/java/org/apache/sqoop/connector/idf/TestCSVIntermediateDataFormat.java
fcf6c3c15c6558cf21a0d82c78a51903d8fb371c 
> 
> Diff: https://reviews.apache.org/r/28139/diff/
> 
> 
> Testing
> -------
> 
> yes unit tests added. 
> 
> 
> File Attachments
> ----------------
> 
> SQOOP-1749-v2.patch
>   https://reviews.apache.org/media/uploaded/files/2014/11/19/cfdfc0ed-57c9-40d3-959b-f482f1600c03__SQOOP-1749-v2.patch
> 
> 
> Thanks,
> 
> Veena Basavaraj
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message