falcon-dev mailing list archives

From "Ajay Yadava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-36) Ability to ingest data from databases
Date Tue, 04 Aug 2015 04:01:05 GMT

    [ https://issues.apache.org/jira/browse/FALCON-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653027#comment-14653027 ]

Ajay Yadava commented on FALCON-36:

[~me.venkatr] I understand your point. We can allow for various functionality by having
different types of datasources instead of top level entities. To give an analogy,
Oozie has several different types of actions, and each action has a lot of different capabilities
and parameters. However, they are all still actions. Similarly, we can have different types
of datasources and accommodate various parameters but we shouldn't have them as top level

Another example of such behaviour is feed lifecycles. By the rationale of a common denominator
of capabilities, import and retention are completely different, but we look at them as just
another lifecycle of the feed.

The common denominator is not in capabilities but in what they represent - they are all sources
of data and you can import data from them. This seemingly pedantic difference is very important
IMHO because it simplifies a lot of things. It's easy to build a new feature and make it available
to all the data sources. It will otherwise be very confusing to have streaming feeds and Kafka

> Ability to ingest data from databases
> -------------------------------------
>                 Key: FALCON-36
>                 URL: https://issues.apache.org/jira/browse/FALCON-36
>             Project: Falcon
>          Issue Type: Improvement
>          Components: acquisition
>    Affects Versions: 0.3
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkat Ramachandran
>         Attachments: FALCON-36.patch, FALCON-36.patch.2, FALCON-36.rebase.patch, FALCON-36.review.patch,
>                      Falcon Data Ingestion - Proposal.docx, falcon-36.xsd.patch.1
> Attempt to address data import from RDBMS into Hadoop and export of data from Hadoop
> into RDBMS. The plan is to use Sqoop 1.x to materialize data motion from/to RDBMS to/from
> HDFS. Hive will not be integrated in the first pass until Falcon has a first class integration
> with HCatalog.

This message was sent by Atlassian JIRA
