gora-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: HSQLDB woes...
Date Mon, 08 Nov 2010 18:55:05 GMT

We recently used Berkeley XMLDB and it was actually pretty performant. I'm not sure about
its license, though; we used it internally at JPL.


On 11/8/10 11:26 AM, "Enis Söztutar" <enis.soz@gmail.com> wrote:

From my experience the SQL backend is a major headache. Writing a SQL backend
is actually much, much harder than writing the HBase or Cassandra backend,
since we need very custom code for each SQL server. Plus, there is some code
for dealing with HSQL embedded mode.

I completely agree that we should switch to another zero-conf backend for
tests and for Nutch. However, I am not sure about BerkeleyDB.
If we can implement a data store easily, that would be great.


On Fri, Nov 5, 2010 at 8:31 AM, Andrzej Bialecki <ab@getopt.org> wrote:

> Hi,
> The HSQL-based SqlStore exhibits awful performance when used with Nutch.
> I believe this is related to the way LOBs are handled in HSQL - even for
> a tiny crawl of 50 pages the size of the .lob file is in the order of
> 100MB. Actually, after reaching this point the performance of any
> updates drops dramatically so it becomes nearly unusable.
> Of course, HSQL was never meant to be used as a serious backend...
> still, perhaps there are alternatives that could give us a better
> behavior for small / embedded use - and for small operations in the
> order of a few thousand records I think we should be able to come up
> with something better...
> I tried to integrate the H2 database (www.h2database.com), but gave up
> after I discovered that it doesn't support Blob.setBinaryStream(..) -
> there are workarounds for this in H2, but it would complicate the code
> too much...
> Any suggestions / comments? Maybe it's time for a BerkeleyDB DataStore?
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com

Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
