drill-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Report issues with sensitive data
Date Wed, 01 Apr 2015 16:46:11 GMT
One idea is to post a log-synth [1] schema that generates data with the same
shape as your real data.  If you can generate fake data that causes the
same problem, you give developers a huge head start in solving your problem.
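
To make "same shape" concrete, here is a rough sketch in plain Python. It is a
stand-in for the idea, not a log-synth schema, and it does not use log-synth's
own format. The field names (id, events, children, value) and the nesting
depth are assumptions made up for illustration; swap in the structure of the
real records, with made-up values.

```python
import json
import random


def fake_record(rng, max_items=3, max_depth=2):
    """Build one synthetic record with nested arrays (hypothetical field names)."""
    def items(depth):
        # Each level holds a short array of objects; deeper levels nest again.
        out = []
        for _ in range(rng.randint(0, max_items)):
            item = {"value": rng.random()}
            if depth < max_depth:
                item["children"] = items(depth + 1)
            out.append(item)
        return out

    return {"id": rng.randint(1, 10**9), "events": items(1)}


def write_fake_data(path, n_records, seed=0):
    """Write n_records JSON documents, one per line, reproducibly from a seed."""
    rng = random.Random(seed)
    with open(path, "w") as f:
        for _ in range(n_records):
            f.write(json.dumps(fake_record(rng)) + "\n")
```

Calling write_fake_data("fake.json", 1000) would produce a small file to try
the failing query against. A fixed seed means anyone can regenerate exactly
the same file, so only the script needs to be shared, not the data.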

For the record, are you using the recently announced 0.8 version of Drill?


[1] https://github.com/tdunning/log-synth


On Wed, Apr 1, 2015 at 3:29 AM, Alexander Reshetov <
alexander.v.reshetov@gmail.com> wrote:

> Hello all,
>
> I have an 80GB dataset of JSON documents that have many nested arrays.
> I'm trying to flatten it and make some calculations, but I get
> exceptions after reading about 2/3 of the file.
>
> I could (and want to) post an issue in Jira, but I cannot attach my dataset
> because it has sensitive data and it's also too large.
>
> Is there any way to help investigate the issue without posting my dataset?
>
> To give a hint about the issue, I've attached a file with the exception text.
>
