sqoop-user mailing list archives

From Manikandan R <maniraj...@gmail.com>
Subject Re: Merge failed - timestamp column with null values
Date Wed, 20 May 2015 17:52:33 GMT
Hello Swati,

Thanks for your reply.

I am not using --class-name in my Sqoop command.

Here is my Sqoop action in Oozie:

        <action name="sqoop-saved-job">
                <sqoop xmlns="uri:oozie:sqoop-action:0.2">
                        <job-tracker>${jobTracker}</job-tracker>
                        <name-node>${nameNode}</name-node>
                        <job-xml>/tmp/sqoop-site.xml</job-xml>
                        <arg>job</arg>
                        <arg>--create</arg>
                        <arg>${dbName}-${tableName}-sync-job</arg>
                        <arg>--</arg>
                        <arg>import</arg>
                        <arg>--connect</arg>
                        <arg>jdbc:mysql://${dbHost}/${dbName}</arg>
                        <arg>--username</arg>
                        <arg>root</arg>
                        <arg>--password-file</arg>
                        <arg>/tmp/.password</arg>
                        <arg>--table</arg>
                        <arg>${tableName}</arg>
                        <arg>--incremental</arg>
                        <arg>${incrementalMode}</arg>
                        <arg>--merge-key</arg>
                        <arg>${mergeKey}</arg>
                        <arg>--check-column</arg>
                        <arg>${checkColumn}</arg>
                        <arg>--last-value</arg>
                        <arg>${lastValue}</arg>
                        <arg>--target-dir</arg>
                        <arg>/data/${dbName}/${stgPrefix}_${tableName}</arg>
                        <arg>--fields-terminated-by</arg>
                        <arg>\001</arg>
                        <arg>--null-string</arg>
                        <arg>\\N</arg>
                        <arg>--null-non-string</arg>
                        <arg>\\N</arg>
                        <arg>${directOption}</arg>
                </sqoop>

                <ok to="sqoop-run-or-saved-job-check" />
                <error to="sqoop-run-or-saved-job-check" />
        </action>

And here is the exception:

 Error: java.lang.RuntimeException: Can't parse input data: '\N'
  at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
  at dim_scd_table.parse(dim_scd_table.java:391)
  at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
  at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
  Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
  at java.sql.Timestamp.valueOf(Timestamp.java:202)
  at dim_scd_table.__loadFromFields(dim_scd_table.java:455)

The table name is dim_scd_table. It has an scd_end_date column of the
"timestamp" datatype. When this column has a NULL value, I get the above
exception.
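
For reference, the root cause in the trace can be reproduced outside Sqoop. The following is a minimal sketch, not the Sqoop-generated class: NullAwareTimestamp, parse, failsWithoutNullCheck, and nullToken are hypothetical names chosen for illustration. It assumes the null marker in the imported file is the two-character string \N, as in the exception message.

```java
import java.sql.Timestamp;

// Hypothetical illustration only; this is not code generated by Sqoop.
class NullAwareTimestamp {

    // Returns null when the field is empty or equals the configured null
    // token; otherwise delegates to java.sql.Timestamp.valueOf.
    static Timestamp parse(String field, String nullToken) {
        if (field == null || field.isEmpty() || field.equals(nullToken)) {
            return null;
        }
        return Timestamp.valueOf(field);
    }

    // Shows what happens when the null marker reaches Timestamp.valueOf
    // without any check: it rejects the input with IllegalArgumentException.
    static boolean failsWithoutNullCheck(String field) {
        try {
            Timestamp.valueOf(field);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String nullToken = "\\N"; // two characters: backslash followed by N

        System.out.println("unchecked parse fails: "
                + failsWithoutNullCheck(nullToken));
        System.out.println("null-aware parse of marker: "
                + parse(nullToken, nullToken));
        System.out.println("null-aware parse of value: "
                + parse("2015-05-20 17:52:33", nullToken));
    }
}
```

The sketch only shows that a parser has to know which marker stands for NULL before it calls Timestamp.valueOf. Whether the class used by the merge step is generated with such a check for \N is the open question here.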

Please let me know how to resolve this.

Thanks,
Mani

On Wed, May 20, 2015 at 10:49 PM, Swati Ambulkar -X (sambulka - PERSISTENT
SYSTEMS INC at Cisco) <sambulka@cisco.com> wrote:

>  Can you paste your sqoop command please?
>
>
>
> Are you generating your class with the --class-name option?
>
>
>
> Once you do that, you should see code for handling the timestamp column
> similar to that listed below. Here, if it encounters \N, __cur_str.length()
> will not be 0, so it goes through the else branch. You can check whether
> this is what is failing for you.
>
>
>
>     __cur_str = __it.next();
>     if (__cur_str.equals("null") || __cur_str.length() == 0) {
>       this.STARTTIME = null;
>     } else {
>       this.STARTTIME = java.sql.Timestamp.valueOf(__cur_str);
>     }
>
>     __cur_str = __it.next();
>     if (__cur_str.equals("null") || __cur_str.length() == 0) {
>       this.ENDTIME = null;
>     } else {
>       this.ENDTIME = java.sql.Timestamp.valueOf(__cur_str);
>     }
>
>
>
> You can direct Sqoop to use an empty string ("") for the --null-string and
> --null-non-string options.
>
>         options.add("--null-string");
>
>         options.add("");
>
>
>
>         options.add("--null-non-string");
>
>         options.add("");
>
>
>
> This would put an empty string in the imported row, and then the
> abovementioned check should set the timestamp column value to null.
>
>
>
> Thanks,
>
> Swati
>
>
>
> *From:* Manikandan R [mailto:manirajv06@gmail.com]
> *Sent:* Wednesday, May 20, 2015 12:02 AM
> *To:* user@sqoop.apache.org
> *Subject:* Merge failed - timestamp column with null values
>
>
>
> Hello Everyone,
>
>
>
> I am trying to push incremental updates from MySQL to HDFS using the sqoop
> import command with the --merge-key option and the incremental mode set to
> "lastmodified".
>
>
>
> My table has some timestamp columns. I don't see any problems as long as
> the timestamp columns have values. The problem arises only when they have
> NULL values. I copied the exception below from my logs. Also, non-timestamp
> columns with NULL values cause no issues.
>
>
>
> Error: java.lang.RuntimeException: Can't parse input data: '\N'
>
>   at dim_scd_table.__loadFromFields(dim_scd_table.java:473)
>
>   at dim_scd_table.parse(dim_scd_table.java:391)
>
>   at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:53)
>
>   at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>
>   at java.security.AccessController.doPrivileged(Native Method)
>
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>
>   Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
>
>   at java.sql.Timestamp.valueOf(Timestamp.java:202)
>
>   at dim_scd_table.__loadFromFields(dim_scd_table.java:455)
>
>   ... 11 more
>
>
>
> Kindly let me know how to resolve this.
>
>
>
> Thanks,
>
> Mani
>
