trafodion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rohit Jain <rohit.j...@esgyn.com>
Subject RE: question on NULL representation in DB?
Date Fri, 30 Oct 2015 19:32:28 GMT
Are we forgetting that we have to move toward supporting versioning, and
there are many reasons to support that which I am not going to get into
here.  But suffice is to say that HBase allows multiple versions of rows.
So you cannot possibly be thinking about always deleting the row when the
column value is updated to NULL.  Only if the user wants one version can you
do that.  But then you have not bought yourself any relief if we have any
possibility at all that a customer could choose multiple versions.

Rohit

-----Original Message-----
From: Qifan Chen [mailto:qifan.chen@esgyn.com]
Sent: Friday, October 30, 2015 1:19 PM
To: dev <dev@trafodion.incubator.apache.org>
Subject: Re: question on NULL representation in DB?

Pretty much and in certain situations, IN list will be mapped to RANGE SPEC.

*Between*

This can be verified by looking what will happen during normalization and
preCodeGen stage.  The two methods to check is Between::transformNode()
and Between::preCodeGen().

void Between::transformNode(NormWA & normWARef,
                            ExprValueId & locationOfPointerToMe,
                            ExprGroupId & introduceSemiJoinHere,
                            const ValueIdSet & externalInputs)
{
  DBGSETDBG( "TRANSFORM_DEBUG" )
  DBGIF(
    unp = "";
    unparse(unp);
    cerr << (Int32)getOperatorType() << " "
         << (Int32)getValueId() << " "
         << (void *)this << " "
         << unp << endl;
  )

  // ---------------------------------------------------------------------
  // Transform the operands of the Between
  // ---------------------------------------------------------------------
  ItemExpr::transformNode(normWARef, locationOfPointerToMe,
                          introduceSemiJoinHere, externalInputs);

  ItemExpr *tfm = transformIntoTwoComparisons();  <===================

ItemExpr * Between::preCodeGen(Generator * generator)
{
  if (nodeIsPreCodeGenned())
    return getReplacementExpr();

  // transform "A BETWEEN B AND C" to "A >= B AND A <= C"
  ItemExpr * newExpr =
    generator->getExpGenerator()->createExprTree(
      "@A1 >= @A2 AND @A1 <= @A3", 0, 3, child(0), child(1), child(2));

  newExpr->bindNode(generator->getBindWA());
  setReplacementExpr(newExpr->preCodeGen(generator));
  markAsPreCodeGenned();
  return getReplacementExpr();
}


*IN list*

ItemExpr *convertINvaluesToOR(ItemExpr *lhs, ItemExpr *rhs) called
from processINlist().

Sounds like both, if in AND and OR form, can be mapped to the FilterList
filter.


Thanks --Qifan


On Fri, Oct 30, 2015 at 12:44 PM, Eric Owhadi <eric.owhadi@esgyn.com> wrote:

> I have seen between and in be normalized to regular combination of
> <col><op><val> ANDed orORed.
> Isn't that always the case? I was expecting this behavior to happen , and
> then not worry about between and in?
>
> -----Original Message-----
> From: Qifan Chen [mailto:qifan.chen@esgyn.com]
> Sent: Friday, October 30, 2015 12:41 PM
> To: dev <dev@trafodion.incubator.apache.org>
> Subject: Re: question on NULL representation in DB?
>
> We also need to consider other types of SQL predicates to be pushdown
> down,
> such as BETWEEN and IN list.
>
> Thanks --Qifan
>
> On Fri, Oct 30, 2015 at 12:19 PM, Eric Owhadi <eric.owhadi@esgyn.com>
> wrote:
>
> > No, I am working on a solution to keep current 0xFF trick while not
> > requiring expression re-evaluation and pushing up null columns...
> > Eric
> >
> > -----Original Message-----
> > From: Suresh Subbiah [mailto:suresh.subbiah60@gmail.com]
> > Sent: Friday, October 30, 2015 6:39 AM
> > To: dev@trafodion.incubator.apache.org
> > Subject: Re: question on NULL representation in DB?
> >
> > Hi,
> >
> > Summarising personal correspondence with Anoop on this question "
> > there are
> > 2 ways in which traf puts and detect null value in hbase:
> > either a missing value or a column
> >
> > value with null indicator prefix.
> >
> >
> >
> > During insert, we use the first method of not inserting that column.
> >
> > But during update, we use the second method of putting in null
> > indicator as the value of that column.
> >
> >
> >
> > We create rowid with null values or part of key with null values by
> > putting in null indicator and zeroing
> >
> > out remainder of the field. That way 2 null values will be compared
> > equal and null will sort high."
> >
> >
> > Are we thinking that if we implemented an update which sets a column
> > value to be NULL as a HBase Delete of that cell then predicate
> > pushdown need not check for null values again? It will be some work to
> > split out an Update as Put of NonNull values and then a Delete of null
> > values. Will need an expression to be evaluated at Update time. I
> > suppose we have a choice, pay the cost of expression eval during select
> or
> > during update.
> >
> >
> > Thanks
> >
> > Suresh
> >
> > On Thu, Oct 29, 2015 at 1:34 PM, Selva Govindarajan <
> > selva.govindarajan@esgyn.com> wrote:
> >
> > > By default, the null values are not inserted into hbase. If the
> > > column is nullable, the first additional byte determines if the
> > > column value is null or not. When a value is inserted into nullable
> > > column the first byte is always 0x00. When the column value is
> > > updated to null, the existing column value will be replaced with
> > > 0xFF in the first byte because it is not possible to switch to
> > > delete cell value in the midst of data execution.
> > >
> > > Selva
> > >
> > > -----Original Message-----
> > > From: Eric Owhadi [mailto:eric.owhadi@esgyn.com]
> > > Sent: Thursday, October 29, 2015 11:06 AM
> > > To: dev@trafodion.incubator.apache.org
> > > Subject: question on NULL representation in DB?
> > >
> > > Reading the code, I have a hard time understanding the various ways
> > > NULLs are represented in the DB for non-aligned format.
> > >
> > > I see comments in the code suggesting that nullable columns have the
> > > first value byte representing if the value is null, but I also see
> > > special cases all over the place that take care of null as being
> > > totally absent cells.
> > >
> > > The former method (adding a first byte indicating a null) having
> > > consequences on predicate push down -> need to re-do predicate
> > > evaluation at trafodion layer to deal with null semantic.
> > >
> > >
> > >
> > > But I am not sure why we have this special situation of coding null
> > > with a byte, instead of always dealing with nulls as being “absent”
> > > cell? I am sure there is a reason, but I just could not figure it
> > > out…
> > >
> > > Someone can help?
> > >
> > > Eric
> > >
> >
>
>
>
> --
> Regards, --Qifan
>



-- 
Regards, --Qifan

Mime
View raw message