drill-user mailing list archives

From François Méthot <fmetho...@gmail.com>
Subject Re: Drill Performance
Date Thu, 08 Sep 2016 20:25:07 GMT
late reply...

We are trying to run Drill on 200 nodes, but we keep randomly losing
connectivity with certain nodes, which kills the query completely. This
happens maybe 50% of the time.
It depends on how many files get queried, basically on how heavy the query is.

It looks exactly like this problem:

ForemanException: One or more nodes lost connectivity during query
https://issues.apache.org/jira/browse/DRILL-4325

Until we find a solution, we are sticking to a dedicated cluster of a dozen
nodes.

It would be nice to have a query recover from disconnected nodes
and keep gathering results from the valid nodes.

On Thu, Jul 14, 2016 at 11:22 PM, scott <tcots8888@gmail.com> wrote:

> Curious what the biggest is. Has anyone configured more than 100 drillbits
> in a cluster before?
>
> Scott
>
> On 07/14/2016 10:27 AM, Ted Dunning wrote:
>
>> On the right distribution, you can restrict the subset of the cluster that
>> has the data you need to avoid locality variation when Drill only runs on
>> a subset of nodes.
>>
>>
>>
>> On Thu, Jul 14, 2016 at 6:48 AM, François Méthot <fmethot78@gmail.com>
>> wrote:
>>
>>> We have observed that if the number of drillbits is lower than the number
>>> of nodes in our cluster, some minor fragments take longer to complete
>>> their query (we hypothesize that this is because they can't take advantage
>>> of data locality: a fragment has to reach out for data on a different
>>> node). One drillbit per node, with evenly spread data, is the best scenario.
>>>
>>> These results may also vary depending on your hardware, I think.
>>>
>>> On Thu, Jul 7, 2016 at 7:06 PM, Ashish Goel
>>> <ashish.kumar.goel1@gmail.com> wrote:
>>>
>>>> That's an interesting question. I would also be curious to learn more
>>>> about this. Did anyone run any benchmarks around this? It would be
>>>> helpful to understand.
>>>>
>>>> On Thu, Jul 7, 2016 at 11:13 AM, scott <tcots8888@gmail.com> wrote:
>>>>
>>>>> Abdel,
>>>>> I didn't ask about having more than one drillbit per node. I asked about
>>>>> the number of drillbits per cluster. For instance, if I had a 1000 node
>>>>> Hadoop cluster, should I install drillbits on each node? Or, is there some
>>>>> point at which the interaction of 1000 drillbits causes contention
>>>>> resulting in a plateau or decline of performance?
>>>>>
>>>>> Thanks,
>>>>> Scott
>>>>>
>>>>> On Thu, Jul 7, 2016 at 5:00 PM, Abdel Hakim Deneche
>>>>> <adeneche@maprtech.com> wrote:
>>>>>
>>>>>> I'm not sure you'll get any performance improvement from running more
>>>>>> than a single drillbit per cluster node.
>>>>>>
>>>>>> On Thu, Jul 7, 2016 at 9:47 AM, scott <tcots8888@gmail.com> wrote:
>>>>>>
>>>>>>> Follow up question: Is there a sweet spot for DRILL_MAX_DIRECT_MEMORY
>>>>>>> and DRILL_HEAP settings?
>>>>>>>
>>>>>>> On Wed, Jul 6, 2016 at 2:42 PM, scott <tcots8888@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>> Does anyone know if there is a maximum number of drillbits recommended
>>>>>>>> in a Drill cluster? For example, I've observed that in a Solr Cloud,
>>>>>>>> the performance tapers off for ingest at around 16 JVM instances. Is
>>>>>>>> there a similar practical limitation to the number of drillbits I
>>>>>>>> should cluster together?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Scott
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Abdelhakim Deneche
>>>>>>
>>>>>> Software Engineer
>>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Ashish
>>>>
>>>>
>
