sqoop-dev mailing list archives

From "Gwen Shapira (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1869) Sqoop2: Expand schema matching to support two schemaless connectors
Date Wed, 10 Dec 2014 15:26:12 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241237#comment-14241237 ]

Gwen Shapira commented on SQOOP-1869:
-------------------------------------

I didn't get this to work yet, but the plan is to add a new class, ByteArraySchema, with a single
Binary column.

The current logic in Matcher is:
```
// Current behavior: fail when neither side provides a schema.
if (fromSchema.isEmpty() && toSchema.isEmpty()) {
  throw new SqoopException(MatcherError.MATCHER_0000, "Neither a FROM or TO schemas been provided.");
}
```

I'm planning to change it to:
```
// Planned: fall back to the single-Binary-column schema on both sides.
if (fromSchema.isEmpty() && toSchema.isEmpty()) {
  this.fromSchema = ByteArraySchema.getInstance();
  this.toSchema = ByteArraySchema.getInstance();
}
```
This keeps the current behavior where, if only one direction has a schema, that schema is used,
and it requires no changes to connectors.
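
To make the intended fallback concrete, here is a minimal, self-contained sketch of the resolution logic. It is not the actual Sqoop code: SketchSchema, ByteArraySchemaSketch, and MatcherSketch are hypothetical stand-ins, and the assumption that a lone schema gets applied to both sides is my reading of "this schema will be used".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Sqoop's Schema class; it models only what the sketch needs.
class SketchSchema {
    private final String name;
    private final List<String> columnTypes = new ArrayList<>();

    SketchSchema(String name) { this.name = name; }

    SketchSchema addColumn(String type) { columnTypes.add(type); return this; }

    boolean isEmpty() { return columnTypes.isEmpty(); }

    String getName() { return name; }

    List<String> getColumnTypes() { return Collections.unmodifiableList(columnTypes); }
}

// Hypothetical version of the proposed ByteArraySchema: a shared schema with one Binary column.
final class ByteArraySchemaSketch {
    private static final SketchSchema INSTANCE =
        new SketchSchema("ByteArraySchema").addColumn("BINARY");

    private ByteArraySchemaSketch() { }

    static SketchSchema getInstance() { return INSTANCE; }
}

// Hypothetical matcher showing only how the two schemas are resolved.
class MatcherSketch {
    final SketchSchema fromSchema;
    final SketchSchema toSchema;

    MatcherSketch(SketchSchema from, SketchSchema to) {
        if (from.isEmpty() && to.isEmpty()) {
            // Proposed change: both sides fall back to the byte-array schema instead of throwing.
            this.fromSchema = ByteArraySchemaSketch.getInstance();
            this.toSchema = ByteArraySchemaSketch.getInstance();
        } else if (from.isEmpty()) {
            // Assumed existing behavior: the schema that exists is used for both sides.
            this.fromSchema = to;
            this.toSchema = to;
        } else if (to.isEmpty()) {
            this.fromSchema = from;
            this.toSchema = from;
        } else {
            // Both sides have a schema; column matching would happen elsewhere.
            this.fromSchema = from;
            this.toSchema = to;
        }
    }
}
```

With this shape, an HDFS->Kafka style job where neither connector supplies a schema would resolve to the same single-Binary-column schema on both sides, and connectors that already supply a schema would see no change.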

As I said, it's not working yet, so the plan may change.

> Sqoop2: Expand schema matching to support two schemaless connectors
> -------------------------------------------------------------------
>
>                 Key: SQOOP-1869
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1869
>             Project: Sqoop
>          Issue Type: Improvement
>            Reporter: Gwen Shapira
>            Assignee: Gwen Shapira
>
> Currently the schema matcher errors out if both the FROM and TO schemas are empty. This
> prevents us from supporting HDFS->Kafka.
> I suggest changing the code to support the following:
> 1. An empty schema will contain a single byte[] field holding whatever the connector writes
> into it.
> 2. As happens now, if one connector has no schema and the other has one, the schema that
> exists will be used to parse the data.
> 3. If both schemas are empty, the TO connector will get a byte[] and presumably know
> what to do with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
