drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacques Nadeau <jacq...@apache.org>
Subject Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?
Date Tue, 30 Jun 2015 22:39:07 GMT
What is your sense of the use of this interface? I would hope new uses
would be scarce. Do you think this will be a frequent occurrence that most
devs will experience?
On Jun 30, 2015 3:35 PM, "Abdel Hakim Deneche" <adeneche@maprtech.com>
wrote:

> I started this discussion while fixing DRILL-3312
> <https://issues.apache.org/jira/browse/DRILL-3312>, it took me quite some
> time to debug and find the root of an error I was seeing. I first thought
> about fixing those few places that call buffer() without checking if the
> buffer is null or not, but this won't prevent new code from repeating the
> same mistakes.
>
>
> On Tue, Jun 30, 2015 at 3:19 PM, Jacques Nadeau <jacques@apache.org>
> wrote:
>
> > Lots of different comments on this thread.  Here are few of my thoughts:
> >
> > I think you need to differentiate the different tiers of interfaces a
> > little bit more.
> >
> > BufferAllocator.buffer() was meant to be a low level advanced interface.
> > It should be used rarely and when done so, only with extreme care.  Most
> > devs shouldn't use it.  Having an advanced interface seems fine here.
> This
> > was how the interface was envisioned.   It looks like there are currently
> > 16 classes that reference this code not including vectors.  (There
> probably
> > should be less, it looks like some code duplication contributes to this.)
> >
> > All the vector classes also access this code.  Vectors have two methods:
> > allocateNew and allocateNewSafe.  allocateNew has an exception behavior
> and
> > allocateNewSafe has a return value behavior.  These are main APIs that
> are
> > used extensively throughout the code and the ones we should be most
> focused
> > on.  I like giving the developer the choice of behavior (thus the two
> > options).  It sounds like others are saying to remove this choice
> (because
> > code is not managing it correctly).
> >
> > If there are bugs in the 16 classes on how they use the advanced
> interface,
> > this doesn't seem like a big challenge to correct rather than bulk
> > modifying the API.  If people are using the wrong method on the vector
> > classes (or the vector classes aren't correctly behaving), that doesn't
> > seem like something we should address at the allocator API level.
> >
> > Hakim, can you be more specific about your concern?  It doesn't seem like
> > we're talking about thousands of places here.
> >
> > Regarding checked versus unchecked...  I'm strongly in support of using
> > checked exceptions for most things.  However, I'm against it for an out
> of
> > memory condition.  (Note how the JVM also treats this as an unchecked
> > exception.) It causes early catching when, in most cases, we really want
> to
> > bubble way up the stack and fail the query.  I know of very few places
> > where we can actually locally manage an out of memory condition so why
> > write huge amounts of code the bubble the exception all the way to where
> we
> > can actually handle it.
> >
> > A quick sidebar to Jason's "Hash agg out of memory" comment
> > Two thoughts here.  I'm not sure this is a meaningful message.  Where we
> > run out of memory may have nothing to do with where we are using most
> > memory.  And if we did to decide this was useful, this can be done just
> as
> > easily with unchecked exceptions as checked exception.  What I'm want to
> > avoid is adding throws OutOfMemoryException in 1000 places.  The reality
> is
> > everybody should assume that an OutOfMemory can occur pretty much
> anytime.
> >
> > TL;DR
> >
> > In general, I think this discussion is too abstract and thus has the
> danger
> > of trending towards a religious discussion (I already see a bit of that
> > fervor in some of the responses.).  The number of places we are talking
> > about are not that large and are easy to find.  Let's solve this by
> looking
> > at specifics.  For example, if people think checked would be better,
> > someone should go through and make an example changeset for everyone to
> > review.  My guess is, the only that it works will be if very quickly, the
> > handling code converts the checked exception into an unchecked
> > UserException.  But I'm more than happy to be proven wrong.
> >
> > On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <dbarclay@maprtech.com>
> > wrote:
> >
> > > Hanifi GUNES wrote:
> > >
> > >> - We would end up having to add it to almost everything everywhere
> > >> Why would one propagate the checked exception for no reason? And why
> > would
> > >> one not propagate the exception for a good reason like robustness? I
> > agree
> > >> that one has to avoid propagating the checked exception for no reason
> > >> however I support propagating it for a good reason.
> > >>
> > >> The added benefit of raising a checked exception is reminding as well
> as
> > >> enforcing devs to handle it and be more cautious about this particular
> > >> event. I find this compile-time safety check invaluable for
> robustness.
> > >>
> > >
> > > +(1 times <some large number>)
> > >
> > > Daniel
> > >
> > >
> > >
> > >>
> > >> - Or constantly wrapping it with RuntimeException to get around that
> > >> If it has to be done, I would recommend relying on a helper to do so.
> > >> There
> > >> is not much of man-work involved here.
> > >>
> > >>
> > >> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
> >:
> > >>
> > >>  +1 to Hanifi's
> > >>>
> > >>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
> > >>> altekrusejason@gmail.com
> > >>>
> > >>>>
> > >>>>  wrote:
> > >>>
> > >>>  +1 to Hanifi's comments, I think it makes much more sense to have
a
> > >>>>
> > >>> number
> > >>>
> > >>>> of sites where the operators are explicitly catching a checked
OOM
> > >>>> exception and either decide to handle it or produce a message like
> > "Hash
> > >>>> Aggregate does not support our of memory conditions". This would
be
> > >>>> particularly useful for debugging queries, as the user exception
can
> > >>>> provide context information about the current operation. This way
> > users
> > >>>>
> > >>> can
> > >>>
> > >>>> have some idea about the part of their query that might be causing
> an
> > >>>> excessive strain on system resources. I understand that there are
> also
> > >>>> cases where operators competing for memory can make it a toss up
to
> > >>>> which
> > >>>> will actually fail, but this would at least be a step to give more
> > >>>>
> > >>> detailed
> > >>>
> > >>>> information to users.
> > >>>>
> > >>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg@apache.org>
> wrote:
> > >>>>
> > >>>>  I would propose throwing a checked exception encouraging explicit
> and
> > >>>>> consistent handling of this event. Each sub-system has liberty
to
> > >>>>>
> > >>>> decide
> > >>>
> > >>>> if
> > >>>>
> > >>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
> > >>>>>
> > >>>> Also
> > >>>
> > >>>> if
> > >>>>
> > >>>>> at some point a sub-system needs to communicate with its callers
> via
> > a
> > >>>>> different mechanism such like using flags (boolean, enum etc)
or
> > >>>>>
> > >>>> raising
> > >>>
> > >>>> an
> > >>>>
> > >>>>> unchecked exception that's still fine, just handle the exception.
> If
> > >>>>>
> > >>>> there
> > >>>>
> > >>>>> is a need to suppress the checked exception that's fine too,
just
> > use a
> > >>>>> helper method.
> > >>>>>
> > >>>>> Either way, returning *null* sounds problematic in many ways
i) it
> is
> > >>>>> implicit ii) unsafe iii) its handling logic is repetitive iv)
it is
> > >>>>> semantically unclean to make null mean something - even worse
> > something
> > >>>>> context specific.
> > >>>>>
> > >>>>>
> > >>>>> -Hanifi
> > >>>>>
> > >>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <
> > adeneche@maprtech.com
> > >>>>>
> > >>>> :
> > >>>>
> > >>>>>
> > >>>>>  I guess that would fix the issue too. But we may still run
into
> > >>>>>>
> > >>>>> situations
> > >>>>>
> > >>>>>> where the caller will pass a flag to "mute" the exception
and not
> > >>>>>>
> > >>>>> handle
> > >>>>
> > >>>>> the case anyway.
> > >>>>>>
> > >>>>>> If .buffer() unconditionally throws an exception, can't
the
> caller,
> > >>>>>>
> > >>>>> who
> > >>>
> > >>>> wants to, just catch that and handle it properly ?
> > >>>>>>
> > >>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
> > >>>>>>
> > >>>>> chriswestin42@gmail.com>
> > >>>>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>>  No, but we should do something close to that.
> > >>>>>>>
> > >>>>>>> There are cases where the caller can handle the inability
to get
> > >>>>>>>
> > >>>>>> more
> > >>>
> > >>>> memory, and may be able to go to disk. However, you are correct
> > >>>>>>>
> > >>>>>> that
> > >>>
> > >>>> there
> > >>>>>>
> > >>>>>>> are many that can't handle an OOM, and that fail to
check.
> > >>>>>>>
> > >>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException,
> I
> > >>>>>>>
> > >>>>>> would
> > >>>>>
> > >>>>>> suggest that the buffer() call take a flag that indicates
whether
> > >>>>>>>
> > >>>>>> or
> > >>>
> > >>>> not
> > >>>>>
> > >>>>>> it
> > >>>>>>
> > >>>>>>> should throw if it is unable to fulfill the request.
This way,
> the
> > >>>>>>>
> > >>>>>> call
> > >>>>
> > >>>>> sites that can handle an OOM can pass in the flag to return
null,
> > >>>>>>>
> > >>>>>> and
> > >>>
> > >>>> the
> > >>>>>
> > >>>>>> rest can pass in the flag value to throw, and not have
to have any
> > >>>>>>>
> > >>>>>> checking
> > >>>>>>
> > >>>>>>> code.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche
<
> > >>>>>>> adeneche@maprtech.com
> > >>>>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>  our current implementations of BufferAllocator.buffer(int,
int)
> > >>>>>>>>
> > >>>>>>> returns
> > >>>>>
> > >>>>>> null when it cannot allocate the buffer.
> > >>>>>>>>
> > >>>>>>>> But looking through the code, there are many places
that don't
> > >>>>>>>>
> > >>>>>>> check
> > >>>>
> > >>>>> if
> > >>>>>
> > >>>>>> the
> > >>>>>>>
> > >>>>>>>> allocated buffer is null before trying to access
it which will
> > >>>>>>>>
> > >>>>>>> throw
> > >>>>
> > >>>>> a
> > >>>>>
> > >>>>>> NullPointerException.
> > >>>>>>>>
> > >>>>>>>> ValueVectors' allocateNewSafe() seem to be the
only place that
> > >>>>>>>>
> > >>>>>>> handle
> > >>>>
> > >>>>> the
> > >>>>>>
> > >>>>>>> null in a specific manner.
> > >>>>>>>>
> > >>>>>>>> Should we update the allocators' implementation
to throw an
> > >>>>>>>> OutOfMemoryRuntimeException instead of returning
null ? this has
> > >>>>>>>>
> > >>>>>>> the
> > >>>>
> > >>>>> added
> > >>>>>>>
> > >>>>>>>> benefit of displaying a proper out of memory error
message to
> the
> > >>>>>>>>
> > >>>>>>> user.
> > >>>>>
> > >>>>>>
> > >>>>>>>> Thanks!
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>>
> > >>>>>>>> Abdelhakim Deneche
> > >>>>>>>>
> > >>>>>>>> Software Engineer
> > >>>>>>>>
> > >>>>>>>>    <http://www.mapr.com/>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Now Available - Free Hadoop On-Demand Training
> > >>>>>>>> <
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>>
> > >>>>>> Abdelhakim Deneche
> > >>>>>>
> > >>>>>> Software Engineer
> > >>>>>>
> > >>>>>>    <http://www.mapr.com/>
> > >>>>>>
> > >>>>>>
> > >>>>>> Now Available - Free Hadoop On-Demand Training
> > >>>>>> <
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Abdelhakim Deneche
> > >>>
> > >>> Software Engineer
> > >>>
> > >>>    <http://www.mapr.com/>
> > >>>
> > >>>
> > >>> Now Available - Free Hadoop On-Demand Training
> > >>> <
> > >>>
> > >>>
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >
> > > --
> > > Daniel Barclay
> > > MapR Technologies
> > >
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message