drill-user mailing list archives

From Jim Bates <jba...@maprtech.com>
Subject Re: How do I make json files less painful
Date Fri, 20 Mar 2015 00:43:47 GMT
Yes, it was a memory thing. I was running on a sandbox: first the query
is killed, then 20 or 30 sec later the kernel is still out of memory,
can't seem to kill anything, and then it stops the cpu. I'll send you the
query and compressed data directly.

On Thu, Mar 19, 2015 at 6:16 PM, Jacques Nadeau <jacques@apache.org> wrote:

> Kernel panic?  Can you try to share the information that causes this?  Are
> you running out of memory?  What type of system are you running on?
>
> On Thu, Mar 19, 2015 at 2:00 PM, Jim Bates <jbates@maprtech.com> wrote:
>
> > On first look I could read all the files, but doing a flatten caused all
> > kinds of problems. The worst was a repeatable kernel panic.
> >
> > I think I'm back to making the initial files smaller in the larger file sets.
> >
> > I have some files that are say 100M in size. Each file is a single line
> > array:
> > {"MyArrayInTheFile":[{"a":"1","b":"2"},{"a":"1","b":"2"},...]}
> > What is the best way to represent that so it can be explored? Do I do what
> > was suggested before and put each array entry on its own line?
> > {"MyArrayInTheFile":[
> > {"a":"1","b":"2"},
> > {"a":"1","b":"2"},
> > ...
> > ]}
> >
> > What works best for the 0.8 code?
> >
> >
> > On Thu, Mar 19, 2015 at 12:59 PM, Jim Bates <jbates@maprtech.com> wrote:
> >
> > > Ok, went to drill-0.8.0.31020-1 and it was 1000% better.
> > >
> > > On Thu, Mar 19, 2015 at 12:16 PM, Sudhakar Thota <sthota@maprtech.com>
> > > wrote:
> > >
> > >> I got the same issue; engineering recommended I use drill-0.8.0.
> > >>
> > >> Sudhakar Thota
> > >> Sent from my iPhone
> > >>
> > >> > On Mar 19, 2015, at 9:22 AM, Jim Bates <jbates@maprtech.com> wrote:
> > >> >
> > >> > I constantly, constantly, constantly hit this.
> > >> >
> > >> > I have json files that are just a huge collection of an array of json
> > >> > objects.
> > >> >
> > >> > example
> > >> > "MyArrayInTheFile":
> > >> > [{"a":"1","b":"2","c":"3"},{"a":"1","b":"2","c":"3"},...]
> > >> >
> > >> > My issue is in exploring the data; I hit this:
> > >> >
> > >> > Query failed: Query stopped., Record was too large to copy into
> > >> > vector. [ 39186288-2e01-408c-b886-dcee0a2c25c5 on maprdemo:31010 ]
> > >> >
> > >> > I can explore csv, tab, maprdb, hive at fairly large data sets and
> > >> > limit the response to what fits in my system limitations, but not
> > >> > json in this format.
> > >> >
> > >> > The two options I have come up with to move forward are:
> > >> >
> > >> >   1. I strip out 90% of the array values in a file and explore that to
> > >> >   get to my view, then go to a larger system and see if I have enough
> > >> >   to get the job done.
> > >> >   2. Move to the larger system and explore there, taking resources
> > >> >   that don't need to be spent on a science project.
> > >> >
> > >> > Hoping the smart people have a different option for me,
> > >> >
> > >> > Jim
> > >>
> > >
> > >
> >
>
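The workaround suggested in the thread, putting each array entry on its own line so the reader can process records one at a time instead of one huge value, can be sketched in Python. This is a minimal illustration, not Drill's own tooling; the function name is made up, it emits bare objects one per line (newline-delimited JSON) rather than keeping the wrapping array, and the `MyArrayInTheFile` key follows the example above:

```python
import json

def explode_array_file(text, key="MyArrayInTheFile"):
    """Turn a single-line {"MyArrayInTheFile":[...]} document into
    newline-delimited JSON: one array element per line, so each record
    can be read independently instead of as one giant value."""
    doc = json.loads(text)
    # Compact separators keep each record on a single short line.
    return "\n".join(json.dumps(obj, separators=(",", ":")) for obj in doc[key])

if __name__ == "__main__":
    sample = '{"MyArrayInTheFile":[{"a":"1","b":"2"},{"a":"1","b":"2"}]}'
    print(explode_array_file(sample))
```

For a 100M single-line file like the one described, streaming the output to a new file and querying that instead would avoid materializing the whole array as a single record.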
