sqoop-user mailing list archives

From redshift-etl-user <redshift....@gmail.com>
Subject Re: HSQLDB issue with null strings
Date Tue, 19 Nov 2013 00:48:57 GMT
Hi Jarek,

Sure! Note that I'm running Sqoop through "Sqoop.runTool". Responses inline.

On Sun, Nov 17, 2013 at 5:47 PM, Jarek Jarcec Cecho <jarcec@apache.org> wrote:

> Hi sir,
> would you mind sharing with us more details about your use case? Sqoop and
> HSQLDB versions,


sqoop-1.4.4-hadoop100.jar
hsqldb-1.8.0.10.jar


> command that you're using,


import --connect jdbc:h2:mem:play-test-985978706 --username sa --password
sa --verbose --query SELECT id,string FROM test WHERE $CONDITIONS  ORDER BY
id LIMIT 200 -m 1 --target-dir
/var/folders/mm/m69802p900d9pqwxw3l85wd80000gn/T/1384821266308-0/data
--fields-terminated-by , --escaped-by \ --enclosed-by "
--null-non-string  *--null-string
asdf* --outdir
/var/folders/mm/m69802p900d9pqwxw3l85wd80000gn/T/1384821266308-0/classes
--class-name avyhOWkUKUQHvkr --driver org.hsqldb.jdbcDriver --split-by id
--verbose


> log generated by Sqoop with parameter --verbose.


16:34:26.380 [warn] [pool-3-thread-1]
org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the
environment. Cannot check for additional configuration.
16:34:26.433 [warn] [pool-3-thread-1]
org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the
command-line is insecure. Consider using -P instead.
16:34:26.439 [warn] [pool-3-thread-1] org.apache.sqoop.ConnFactory
- $SQOOP_CONF_DIR has not been set in the environment. Cannot check for
additional configuration.
16:34:26.456 [warn] [pool-3-thread-1] org.apache.sqoop.ConnFactory
- Parameter --driver is set to an explicit driver however appropriate
connection manager is not being set (via --connection-manager). Sqoop is
going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please
specify explicitly which connection manager should be used next time.
Note:
/tmp/sqoop-romming/compile/8541deacc9cf2714256c59d89dd9bf0a/avyhOWkUKUQHvkr.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2013-11-18 16:34:27.319 java[48163:13c07] Unable to load realm info from
SCDynamicStore
16:34:27.397 [warn] [pool-3-thread-1]
org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able
to find all job dependencies.
16:34:27.423 [warn] [pool-3-thread-1]
o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable


> Perhaps even a simplified data that will cause this behaviour?

CREATE TABLE IF NOT EXISTS test (id INTEGER, string TEXT, PRIMARY KEY (id));
INSERT INTO test (id, string) VALUES (1, 'test');
INSERT INTO test (id) VALUES (2);
INSERT INTO test (id, string) VALUES (3, 'test');

This produces a file containing:
1,test
2,
3,test

When it really ought to be:
1,test
2,asdf
3,test
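
For reference, here is a minimal sketch of the null handling I would expect. It is written in Python against an in-memory SQLite database rather than Sqoop or HSQLDB, purely to keep it self-contained; `export_rows` is a hypothetical helper, not a Sqoop API, and the default of "null" is taken from the behavior described in the original message below.

```python
import csv
import io
import sqlite3


def export_rows(null_string="null"):
    """Dump the test table as CSV, writing null_string for NULL string values.

    Illustration only: this mimics what --null-string is expected to do,
    it does not call Sqoop or HSQLDB.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS test (id INTEGER, string TEXT, PRIMARY KEY (id));
        INSERT INTO test (id, string) VALUES (1, 'test');
        INSERT INTO test (id) VALUES (2);
        INSERT INTO test (id, string) VALUES (3, 'test');
        """
    )
    rows = conn.execute(
        "SELECT id, string FROM test ORDER BY id LIMIT 200"
    ).fetchall()
    conn.close()

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row_id, text in rows:
        # Substitute the placeholder only for the NULL string column.
        writer.writerow([row_id, null_string if text is None else text])
    return out.getvalue()


if __name__ == "__main__":
    print(export_rows("asdf"), end="")
```

Called with "asdf", the sketch writes "2,asdf" for the second row; called with no argument, it writes "2,null".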



> Jarcec
>
> On Fri, Nov 15, 2013 at 01:29:36PM -0800, redshift-etl-user wrote:
> > Hi,
> >
> > I'm using the "--null-string" option to control the value of null string
> > columns for imports. I've tested this with MySQL and Postgres and it seems
> > to work fine. However, when I try with HSQLDB, it seems to ignore this
> > option and just return an empty string for nulls. In fact, when the
> > "--null-string" option isn't present it's supposed to return the string
> > "null" according to the spec, and it returns an empty string in this case
> > as well.
> >
> > Could someone else confirm this behavior? Seems like a bug.
> >
> > Thanks!
>
