asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: [jira] [Commented] (ASTERIXDB-1201) RTree built on the optional field refuses to load the NULL value when executing the bulk load
Date Wed, 02 Dec 2015 00:50:01 GMT
Whoops! Actually I am wrong on the above, I spoke too soon...
THIS is what we insert into the RTree when the attribute is null, so
it does contain all of the keys in the primary:

tid0:(1, 34)[f0:(0, 9) {AInt64: {1}}f1:(9, 10) {null}f2:(10, 11)
{null}f3:(11, 12) {null}f4:(12, 13) {null}
 tid1:(34, 67)[f0:(0, 9) {AInt64: {2}}f1:(9, 10) {null}f2:(10, 11)
{null}f3:(11, 12) {null}f4:(12, 13) {null}
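
The distinction being checked here (does the secondary index carry every primary key, or only those whose indexed attribute is non-null?) can be sketched roughly as follows. This is an illustrative Python sketch, not AsterixDB code: the record layout, the `skip_nulls` flag, and both function names are hypothetical, and `None` stands in for NULL.

```python
# Illustrative only: these records mirror the three rows from the bug report,
# with None standing in for a NULL (absent optional) field.
records = [
    {"fa": 1, "fb": None, "fc": None},
    {"fa": 2, "fb": 3, "fc": None},
    {"fa": 3, "fb": None, "fc": (4.0, 5.0)},
]

def secondary_entries(records, field, skip_nulls):
    """Return the (secondary_key, primary_key) pairs a loader would be fed."""
    entries = []
    for rec in records:
        value = rec[field]
        if value is None and skip_nulls:
            continue  # partial-index behaviour: NULL rows are not indexed
        entries.append((value, rec["fa"]))
    return entries

def covers_all_primary_keys(records, entries):
    """True when every primary key appears among the secondary entries."""
    return {rec["fa"] for rec in records} == {pk for _, pk in entries}

full = secondary_entries(records, "fc", skip_nulls=False)
partial = secondary_entries(records, "fc", skip_nulls=True)
```

With `skip_nulls=False` the entries cover all three primary keys, which matches the tuples printed above; with `skip_nulls=True` only the record with `fa` = 3 would be indexed.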

I'm wondering how this affects internal node splits, though? I'll keep
investigating...
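
To make the concern concrete: the stack trace in the original report fails in `adjustMBR`, where the value provider is asked for the value of a NULL field. A minimal sketch of that situation, assuming a node's bounding rectangle is computed from four numeric key fields per tuple, might look like the following. Everything here is hypothetical Python, not the Hyracks implementation, and skipping NULL keys is only one possible policy, not necessarily what AsterixDB does.

```python
class NullKeyError(Exception):
    """Stand-in for a 'value provider for NULL is not implemented' failure."""

def to_double(value):
    # Hypothetical value provider: it only knows how to handle numeric fields.
    if value is None:
        raise NullKeyError("no value provider for NULL")
    return float(value)

def adjust_mbr(tuples, skip_nulls=False):
    """Compute (min_x, min_y, max_x, max_y) covering the tuples in a node.

    Each tuple is (x1, y1, x2, y2); a NULL key is (None, None, None, None).
    With skip_nulls=True, NULL keys do not contribute to the rectangle.
    Returns None when no tuple contributes.
    """
    mbr = None
    for t in tuples:
        if skip_nulls and all(v is None for v in t):
            continue
        x1, y1, x2, y2 = (to_double(v) for v in t)
        if mbr is None:
            mbr = (x1, y1, x2, y2)
        else:
            mbr = (min(mbr[0], x1), min(mbr[1], y1),
                   max(mbr[2], x2), max(mbr[3], y2))
    return mbr

# Two NULL keys plus the point (4.0, 5.0) stored as a degenerate rectangle.
node = [(None, None, None, None), (None, None, None, None), (4.0, 5.0, 4.0, 5.0)]
```

In this sketch, `adjust_mbr(node)` raises as soon as it reaches a NULL key, while `adjust_mbr(node, skip_nulls=True)` returns the rectangle of the single non-null entry. A node holding only NULL keys would have no rectangle at all, which is the case that makes internal node splits worth checking.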

-Ian

On Tue, Dec 1, 2015 at 3:58 PM, Ian Maxon <imaxon@uci.edu> wrote:
> I don't think we know what we do with NULL values in an RTree in
> general, though. After briefly discussing this with Mike, Till, Yingyi,
> and Abdullah this morning, the takeaway (as I understood it) was a
> concern about whether the RTree is a full secondary index or
> something like a partial index that only covers records where the
> attribute isn't null. I ran through Jianfeng's test code briefly in
> the debugger to see what's going on.
>
> From the bulk load case, at least, I think the latter is true. What I
> did to check this was just to pretty print every tuple that got fed
> into the RTree and BTree bulk loaders.
>
> For the secondary BTree we can see we insert the key anyway regardless
> of the presence of the attribute:
>
> TC: 1
>  tid0:(1, 19)[f0:(0, 9) {AInt64: {3}}f1:(9, 10) {null}
>
> TC: 2
>  tid0:(1, 19)[f0:(0, 9) {AInt64: {1}}f1:(9, 10) {null}
>  tid1:(19, 45)[f0:(0, 9) {AInt64: {2}}f1:(9, 18) {AInt64: {3}}
>
> But for the RTree this is all we insert:
>
> tid0:(1, 66)[f0:(0, 9) {AInt64: {3}}f1:(9, 18) {ADouble: {4.0}}f2:(18,
> 27) {ADouble: {5.0}}f3:(27, 36) {ADouble: {4.0}}f4:(36, 45) {ADouble:
> {5.0}}
>
> This makes me wonder what was up with the original RTree bulkload code
> to make this happen, because that looks like a perfectly fine tuple to
> me...
>
> - Ian
>
> On Tue, Dec 1, 2015 at 2:40 PM, Ildar Absalyamov
> <ildar.absalyamov@gmail.com> wrote:
>> As far as I can see, the patch I was working on has not been merged into master yet.
>> So unless Jianfeng was working off the release-0.8.8 branch, it should not be the cause.
>>
>>> On Dec 1, 2015, at 10:01, Chen Li <chenli@gmail.com> wrote:
>>>
>>> Maybe Ildar?
>>>
>>> On Mon, Nov 30, 2015 at 4:05 PM, Jianfeng Jia (JIRA) <jira@apache.org>
>>> wrote:
>>>
>>>>
>>>>    [
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032758#comment-15032758
>>>> ]
>>>>
>>>> Jianfeng Jia commented on ASTERIXDB-1201:
>>>> -----------------------------------------
>>>>
>>>> Hi devs,
>>>>
>>>> I submitted issue 1201, which happens on master
>>>> (48706305724f6e2580b5a6716a709cebce2b40c0), but it’s not reproducible on
>>>> the latest master.
>>>> Basically, it builds an RTree index on a nullable field. The older version
>>>> complained about the NULL values.
>>>> I’m wondering if anyone fixed this problem intentionally. If so, what is the
>>>> meaning of NULL with respect to the RTree index?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>>
>>>>
>>>>
>>>>> RTree built on the optional field refuses to load the NULL value when
>>>> executing the bulk load
>>>>>
>>>> ---------------------------------------------------------------------------------------------
>>>>>
>>>>>                Key: ASTERIXDB-1201
>>>>>                URL:
>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1201
>>>>>            Project: Apache AsterixDB
>>>>>         Issue Type: Bug
>>>>>         Components: Storage
>>>>>           Reporter: Jianfeng Jia
>>>>>           Assignee: Ian Maxon
>>>>>
>>>>> When I build an RTree index on an optional field, it throws a "Value
>>>> provider for type NULL is not implemented" exception when executing the bulk
>>>> load.
>>>>> Here is the reproducible script:
>>>>> {code}
>>>>> drop dataverse test if exists;
>>>>> create dataverse test;
>>>>> use dataverse test;
>>>>> create type t_record as closed {
>>>>> fa : int64,
>>>>> fb: int64?,
>>>>> fc : point?
>>>>> }
>>>>> create dataset ds_set (t_record) primary key fa;
>>>>> create index bidx on ds_set(fb) type btree;
>>>>> create index cidx on ds_set(fc) type rtree;
>>>>> insert into dataset ds_set ( [{"fa":1}, {"fa":2, "fb":3}, {"fa":3,
>>>> "fc":point("4.0,5.0")}]);
>>>>> load dataset ds_set
>>>>> using localfs
>>>>> (("path"="172.17.0.2:///data/twitter/test.adm"),("format"="adm"));
>>>>> {code}
>>>>> The "insert" and "load" statements are run separately.
>>>>> The test.adm uses the same three records:
>>>>> {code}
>>>>> {"fa":1}
>>>>> {"fa":2, "fb":3}
>>>>> {"fa":3, "fc":point("4.0,5.0")}
>>>>> {code}
>>>>> The insert statement works fine. The error happens in the "load"
>>>> statement only:
>>>>> {code}
>>>>> Caused by:
>>>> org.apache.hyracks.algebricks.common.exceptions.NotImplementedException:
>>>> Value provider for type NULL is not implemented
>>>>>  at
>>>> org.apache.asterix.dataflow.data.nontagged.valueproviders.AqlPrimitiveValueProviderFactory$1.getValue(AqlPrimitiveValueProviderFactory.java:64)
>>>>>  at
>>>> org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.adjustMBRImpl(RTreeNSMFrame.java:132)
>>>>>   at
>>>> org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.adjustMBR(RTreeNSMFrame.java:153)
>>>>>   at
>>>> org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.propagateBulk(RTree.java:954)
>>>>>   at
>>>> org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.end(RTree.java:937)
>>>>>   at
>>>> org.apache.hyracks.storage.am.lsm.rtree.impls.LSMRTree$LSMRTreeBulkLoader.end(LSMRTree.java:584)
>>>>>   at
>>>> org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.close(IndexBulkLoadOperatorNodePushable.java:107)
>>>>>   ... 7 more
>>>>> {code}
>>>>> The BTree index works fine if I remove the RTree index.
>>>>
>>>>
>>>>
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.3.4#6332)
>>>>
>>
>> Best regards,
>> Ildar
>>
