drill-user mailing list archives

From Paul Rogers <par0...@yahoo.com.INVALID>
Subject Re: EMC ECS Configuration with Apache Drill
Date Thu, 22 Aug 2019 00:15:24 GMT
Hi Prabu & Ted,

Ted is right: the next step in tracking this down is debugging. As large projects go, Drill
is actually easier to debug than most. (Hats off to the team for achieving this valuable

1. Fork, clone and build Drill: [1]
2. Import the project into your IDE (both Eclipse and IntelliJ work). We used to have info
on this, but I can't find it now. [2] gives an overview. I think you can just import
drill/pom.xml as a Maven project (in Eclipse).
3. Find the test TestCsvWithHeaders.java [3]. Run it to verify things work.
4. Create an ad-hoc test in this same package. You really just need a setup and a test method:

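  @BeforeClass
  public static void setup() throws Exception {
    // Start the embedded cluster here.
  }

  @Test
  public void adHocTest() throws IOException {
    String sql = "SELECT * FROM ...";
    RowSet actual = client.queryBuilder().sql(sql).rowSet();
    // Print the row set here to inspect the results.
  }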

The setup method starts your cluster. The test just runs a query and will print the results.
Put your SQL here. Works best if the file is small.

You'll need to configure your data source; the test does not hit Zookeeper where we store
the definitions you set in the Drill web UI. My tests tend to do the setup in code, but this
gets pretty messy.

Anyone know how to do the storage plugin setup in some file so it works for a unit test? Maybe
edit bootstrap-storage-plugins.json [4] for a quick & dirty solution?

Once you get past this, run the test. It will fail and print a big nasty stack dump. If you
look carefully (ignore the first few stacks, they are on the client), you should see a stack
trace on the server (which is running in the same process) where Drill is trying to open your
file. You can set a breakpoint here and start poking around to see what's what.
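
As a rough aid for the stack-reading step above, here is a minimal sketch in plain Java that
filters a pasted stack dump down to the frames from one package. The class name
StackDumpFilter, the method framesIn, and the com.example sample frames are all hypothetical.
For a real Drill run you might try a prefix such as org.apache.drill.exec.store or
org.apache.hadoop.fs.s3a; that is an assumption about where the file-open code lives, not
something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

class StackDumpFilter {

  // Returns the trimmed "at ..." lines whose class name starts with the prefix.
  static List<String> framesIn(String dump, String prefix) {
    List<String> hits = new ArrayList<>();
    for (String line : dump.split("\\R")) {
      String trimmed = line.trim();
      if (trimmed.startsWith("at " + prefix)) {
        hits.add(trimmed);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Made-up dump for illustration only; this is not real Drill output.
    String dump = String.join("\n",
        "com.example.client.QueryFailed: could not open file",
        "\tat com.example.client.Session.run(Session.java:42)",
        "\tat com.example.server.store.FileReader.open(FileReader.java:17)",
        "\tat com.example.server.exec.Fragment.start(Fragment.java:88)");
    // Keep only the frames from the (hypothetical) server-side storage package.
    for (String frame : framesIn(dump, "com.example.server.store.")) {
      System.out.println(frame);
    }
  }
}
```

The first frame this returns for the server-side package is a reasonable candidate for the
breakpoint. Matching on text rather than on exception objects means it works on whatever the
test prints to the console.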

Quite a bit to get right, so feel free to ask here (or on dev) to get help. Note also that
there is detailed info in the "Learning Apache Drill" book for setting up your development
environment.

- Paul

[1] http://drill.apache.org/docs/compiling-drill-from-source/

[2] https://github.com/apache/drill/tree/master/docs/dev

[3] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithHeaders.java

[4] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/bootstrap-storage-plugins.json

    On Wednesday, August 21, 2019, 02:10:17 PM PDT, Ted Dunning <ted.dunning@gmail.com> wrote:

Yes. You can debug the code. It is a large codebase so that can be a bit of
a trick to get started.

I think that one of the most stable approaches is to build a test case that
accesses the data you want (this doesn't have to become a public test case,
it just makes debugging easier by being very repeatable).

I am not up to speed on how to do this, however.

Is there somebody else on the list who could advise on this?

On Wed, Aug 21, 2019 at 1:08 PM Prabu Mohan <prabu.oracle@gmail.com> wrote:

> Thanks Ted.
> This is getting complex now, I thought that I might be missing something
> simple while configuring drill, but this seems to be far beyond that.
> I'm not sure whether I can get a proxy and also just in case if any other
> issues occur as well, is there a way I can debug the code to understand
> what values are being passed ?
> On Tue, Aug 20, 2019 at 12:22 AM Ted Dunning <ted.dunning@gmail.com> wrote:
> > On Mon, Aug 19, 2019 at 11:33 AM Prabu Mohan <prabu.oracle@gmail.com> wrote:
> >
> > > but i am able to connect to ECS via python using boto3 libraries without
> > > any issues, I am able to write files to the bucket and read them back ..
> > >
> > > not sure why i am facing issues with drill though with the same credentials
> > >
> >
> >
> > The key here is your assumption that the same credentials are being passed
> > through Drill to AWS and that there isn't some other consideration that
> > keeps S3 from believing whatever credentials it is getting.
> >
> > That assumption has to be attacked by figuring out experiments that can
> > prove or disprove aspects of it. For instance, if you can get a proxy in
> > the middle of the connection, you should be able to see *exactly* what is
> > on the wire. Likewise if you can get better logging out of Drill.
> >