db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Noll <dan...@nuix.com>
Subject Filtering using the sort order on two columns
Date Tue, 04 Aug 2009 06:30:03 GMT
Hi all.

I have a table and index like this:

       field     VARCHAR(255) NOT NULL,
       text      VARCHAR(255) NOT NULL,
       docfreq   INTEGER NOT NULL

   CREATE INDEX TermsFieldAndText ON Terms (field, text)

Here's some example data.  Table is shown in sorted order as it would
be listed in the index above:

| field     | text      | docfreq      |
| body      | hello     | 10           |
| tag       | home      | 100          |
| tag       | work      | 80           |

For a given input term (let's say tag:home) I want to find the _next_
entry in the table (so it should return tag:work).  My naive
implementation does this:

   SELECT field, text, docfreq
   FROM Terms
   WHERE (field = ? AND text > ?) OR field > ?
   ORDER BY field ASC, text ASC

This does the right thing as far as the output is concerned, but Derby
does a table scan.  Looking at the query, my impression would be that
in the worse case scenario, Derby should do two index scans.  Or, if
it's smart, one index scan.

I'm aware that I might be able to work around this by having instead a
field with the concatenation of both fields with
CONCAT(field,'\0',text).  But I wanted to check first whether there is
a way I can maintain my relatively understandable table structure
before resorting to hacks.  Basically I just need the data in index
order, starting from a given point... so I figure there is a simple


Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

View raw message