db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Masood Mortazavi <masoodmortaz...@gmail.com>
Subject Re: Derby query optimizations / (materialized) Views and Queries
Date Tue, 14 Jun 2011 17:13:08 GMT
On Tue, Jun 14, 2011 at 6:09 AM, Rick Hillegas <rick.hillegas@oracle.com>wrote:

> Hi Masood,
> Some comments inline...
> On 6/13/11 7:30 PM, Masood Mortazavi wrote:
>> On Mon, Jun 13, 2011 at 7:18 PM, Masood Mortazavi <
>> masoodmortazavi@gmail.com <mailto:masoodmortazavi@gmail.com>> wrote:
>>    On Mon, Jun 13, 2011 at 6:39 PM, Masood Mortazavi
>>    <masoodmortazavi@gmail.com <mailto:masoodmortazavi@gmail.com>> wrote:
>>        I have a Derby table (T) with 10^7 records.
>>        This table has 5 columns.
>>        I have defined a Derby view (V) on one of those columns.
>>        To produce (V) all of (T) needs to be scanned.
>>        There are about 2000 records that the view (V) contains.
>>        Now, I "cross" (T) with (V), with an IN clause involving (V).
>>        I have three questions:
>>        [1] Is the optimizer smart enough to to scan only O( (T)*(V) )
>>        times (with the V comparisons involving in-memory
>>        "materialization" of (V)), or will it be O( (T)*(T) ) or
>>        something else?
>>  I think this depends on the view, the query, and the statistics on the
> underlying table. I'm sure that Derby's view materialization could be
> improved.
>>        [2] What is the support that Derby may be offering
>>        for materialized views?
>>  View materialization wasn't a focus of the release. It is
> possible that the automatic calculation of statistics may improve the
> performance of materialized views in some cases.
>>        [3] If the answer to [1] is the larger number, would
>>        materializing (V) change things or improve them drastically,
>>        i.e. would Derby save the large (T) sized scan to produce (V),
>>        or even better, would it do that and also bring all of (V),
>>        which is rather small, into memory (in large enough chunks or
>>        completely) for the purposes of the cross involving IN?
>>        With best regards to all good Derby friends,
>>        - m.
>>    I solve the materialization issue by a simple exporting of the
>>    view's select query and then importing it.
>>    It works well.
>>    I will rerun my IN cross query, against the now materialized table
>>    to see how much better or worse it does in this particular case.
>>    The general gist of my questions above still remain.
>>    Regards,
>>    Masood
>> OK.
>> As described above, I used Derby's wonderful system utility procedures to
>> produce a materialization of the view in question, above.
>> The cross query became faster than a blizzard.
> You may get better performance by using an INSERT ... SELECT statement
> rather than exporting the view and re-importing it. That will eliminate the
> redundant writing of the view to an external file. You may also investigate
> using a temporary table to store the intermediate result (for details, see
> the Reference Guide section on DECLARE GLOBAL TEMPORARY TABLE).
> Hope this helps,
> -Rick

This is all very helpful.
Thank you very much.
Yes, INSERT . . . SELECT would have been very helpful and more concise.
We may have other uses for the extra data file, in a Hadoop environment.
Given the statistics of the source data, doing an "IN" against a
materialized view turned out to be a lot faster, as expected.

Thanks again and with very best regards,

By the way, not supporting materialized views is not a big deal.
I'm not suggesting it should be supported.

View raw message