falcon-dev mailing list archives

From "Andrew O'Brien" <obrien.and...@gmail.com>
Subject MySQL datasource not importing
Date Fri, 11 Mar 2016 15:44:35 GMT
Hi everyone,

I tried out Falcon 0.6 a while ago and it didn't quite suit my needs. But
when I saw the new datasource functionality, I decided to give it another
go. Things looked fairly promising, but I'm not able to actually get it to
kick off the import from MySQL.

For reference, I'm running HDP Sandbox 2.3 with Falcon 0.9 built from the
last release. I have the movielens dataset loaded into the MySQL database
running inside the sandbox.

I started with my cluster definition:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cluster name="Sandbox" description="Sandbox running on my local machine"
         colo="local" xmlns="uri:falcon:cluster:0.1">
    <interfaces>
        <interface type="readonly" endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://sandbox.hortonworks.com:8020" version="2.2.0"/>
        <interface type="execute" endpoint="sandbox.hortonworks.com:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://sandbox.hortonworks.com:61616?daemon=true" version="5.1.6"/>
    </interfaces>
    <locations>
        <location name="staging" path="/user/falcon/staging"/>
        <location name="temp" path="/user/falcon/temp"/>
        <location name="working" path="/user/falcon/working"/>
    </locations>
    <ACL owner="falcon" group="users" permission="0x755"/>
</cluster>

And got it to accept this datasource:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<datasource name="movielens-sandbox-mysql" colo="sandbox"
            description="Movielens on sandbox" type="mysql"
            xmlns="uri:falcon:datasource:0.1">
    <interfaces>
        <interface type="readonly" endpoint="jdbc:mysql://localhost:3306/movielens"/>
        <credential type="password-text">
            <userName>root</userName>
            <passwordText></passwordText>
        </credential>
    </interfaces>
    <driver>
        <clazz>com.mysql.jdbc.Driver</clazz>
        <jar>/user/oozie/share/lib/sqoop/mysql-connector-java.jar</jar>
    </driver>
    <ACL owner="falcon" group="users" permission="0755"/>
</datasource>

(I confirmed the JDBC URL by using it with `sqoop eval`.)
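
For anyone who wants to repeat that check, below is a minimal Python sketch that builds an equivalent `sqoop eval` command line from the datasource values above. It only prints the command; it does not run Sqoop. The helper name `build_sqoop_eval_argv` and the sample query are assumptions made for illustration, not necessarily the command that was originally run. Leaving out `--password` when the password is empty is also an assumption.

```python
import shlex


def build_sqoop_eval_argv(jdbc_url, username, password="", driver=None,
                          query="SELECT 1"):
    """Return the argv list for a `sqoop eval` connectivity check."""
    argv = ["sqoop", "eval", "--connect", jdbc_url, "--username", username]
    if password:
        # A password on the command line is only reasonable on a sandbox.
        argv += ["--password", password]
    if driver:
        argv += ["--driver", driver]
    argv += ["--query", query]
    return argv


# Endpoint, user, and driver are copied from the datasource definition;
# the query itself is a hypothetical example.
argv = build_sqoop_eval_argv(
    "jdbc:mysql://localhost:3306/movielens",
    "root",
    driver="com.mysql.jdbc.Driver",
    query="SELECT COUNT(*) FROM genres",
)
print(shlex.join(argv))
```

The printed line can be pasted into a shell on the sandbox. A result set coming back would suggest the endpoint, credentials, and driver class are usable from that host.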

And then declared this feed:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="movielens-genres" description="Movielens genres"
      xmlns="uri:falcon:feed:0.1">
    <frequency>minutes(5)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name="Sandbox">
            <validity start="2012-07-20T03:00Z" end="2099-07-16T00:00Z"/>
            <retention limit="months(3)" action="delete"/>
            <import>
                <source name="movielens-sandbox-mysql" tableName="genres">
                    <extract type="full">
                        <mergepolicy>snapshot</mergepolicy>
                    </extract>
                </source>
            </import>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/users/falcon/movielens/genres"/>
    </locations>
    <ACL owner="falcon" group="users" permission="0755"/>
    <schema location="/user/falcon/schemas/genre.avsc" provider="avro"/>
</feed>

I scheduled it following the instructions here:
http://falcon.apache.org/site/0.9/ImportExport.html
Since then, I've tried to rerun it with
`falcon entity -touch -type feed -name movielens-genres`.

This should be enough to cause a file to appear in
/user/falcon/movielens/genres, right?
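
One quick way to rule out a mismatch between the feed's declared data location and the directory being inspected is to read the path straight out of the feed XML. This is a minimal sketch, assuming a trimmed copy of the feed definition is available as a string. The function name `data_location` and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the feed definition's xmlns attribute.
FEED_NS = {"f": "uri:falcon:feed:0.1"}


def data_location(feed_xml):
    """Return the path of the feed's <location type="data">, or None."""
    root = ET.fromstring(feed_xml)
    for loc in root.findall("f:locations/f:location", FEED_NS):
        if loc.get("type") == "data":
            return loc.get("path")
    return None


# Trimmed copy of the feed definition; only the parts the check needs.
FEED_XML = """<feed name="movielens-genres" xmlns="uri:falcon:feed:0.1">
    <locations>
        <location type="data" path="/users/falcon/movielens/genres"/>
    </locations>
</feed>"""

expected = "/user/falcon/movielens/genres"  # the directory being inspected
declared = data_location(FEED_XML)
print("declared:", declared)
print("expected:", expected)
print("match:", declared == expected)
```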

I see jobs in the Oozie console and I see applications in the YARN web UI.
I've searched the log output for that path and for any errors or warnings,
but found nothing. I turned on the MySQL general log and didn't see any
queries hitting the `movielens.genres` table. Anything else I can try?

Thanks,
Andrew
