drill-user mailing list archives

From Jim Bates <jba...@maprtech.com>
Subject Re: How do I make json files less painful
Date Thu, 19 Mar 2015 16:38:55 GMT
I'll give that a shot but it is troubling because actual data size is not
the direct issue.

One file that is 43M works fine, but one that is 5M is too large. I'll have
to dig into some logging when time is more plentiful.

On Thu, Mar 19, 2015 at 11:28 AM, Matt <bsg075@gmail.com> wrote:

> Is each file a single json array object?
>
> If so, would converting the files to a format with one line per record be a
> potential solution?
>
> Example using jq (http://stedolan.github.io/jq/): jq -c '.[]'
>
>
> On 19 Mar 2015, at 12:22, Jim Bates wrote:
>
>  I constantly, constantly, constantly hit this.
>>
>> I have json files that are just a huge collection of an array of json
>> objects
>>
>> example
>> "MyArrayInTheFile":
>> [{"a":"1","b":"2","c":"3"},{"a":"1","b":"2","c":"3"},...]
>>
>> My issue is in exploring the data, I hit this.
>>
>> Query failed: Query stopped., Record was too large to copy into vector. [
>> 39186288-2e01-408c-b886-dcee0a2c25c5 on maprdemo:31010 ]
>>
>> I can explore csv, tab, maprdb, hive at fairly large data sets and limit
>> the response to what fits in my system limitations but not json in this
>> format.
>>
>> The two options I have come up with to move forward are:
>>
>> 1. I strip out 90% of the array values in a file and explore that to get
>> to my view, then go to a larger system and see if I have enough to get the
>> job done.
>> 2. Move to the larger system and explore there taking resources that
>> don't need to be spent on a science project.
>>
>> Hoping the smart people have a different option for me,
>>
>> Jim
>>
>
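
For reference, the one-record-per-line conversion Matt suggests in the quoted text could also be sketched in Python instead of jq. This is only an illustration under assumptions: the function name array_to_json_lines and its key parameter are hypothetical, and it assumes the file is either a bare JSON array or a single object wrapping the array under a key such as "MyArrayInTheFile". It loads the whole file into memory, which should be acceptable for files in the 5M to 43M range mentioned in this thread.

```python
import json


def array_to_json_lines(src_path, dst_path, key=None):
    """Rewrite a JSON array file as one compact JSON record per line.

    key is the (assumed) name of the field holding the array when the file
    is an object wrapping the array; leave it as None for a bare array.
    Returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)  # reads the entire file into memory

    records = data[key] if key is not None else data
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # Compact separators keep each record on a single short line.
            dst.write(json.dumps(record, separators=(",", ":")) + "\n")
    return len(records)
```

If the array really is wrapped in an object as in the example above, the jq filter would need to name that field as well (jq -c '.MyArrayInTheFile[]'), since '.[]' applied to an object emits its values, which here would be the whole array as a single record.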
