gora-dev mailing list archives

From Enis Söztutar <enis....@gmail.com>
Subject Re: HSQLDB woes...
Date Mon, 08 Nov 2010 18:26:56 GMT
From my experience, the SQL backend is a major headache. Writing a SQL backend
is actually much harder than writing
the HBase or Cassandra backend, since we need very custom code for each SQL
server. Plus, there is some code just for
dealing with HSQL embedded mode.

I completely agree that we should switch to another zero-conf backend for tests
and for Nutch. However, I am not sure about BerkeleyDB.
If we can implement a data store easily, that would be great.


On Fri, Nov 5, 2010 at 8:31 AM, Andrzej Bialecki <ab@getopt.org> wrote:

> Hi,
> The HSQL-based SqlStore exhibits awful performance when used with Nutch.
> I believe this is related to the way LOBs are handled in HSQL - even for
> a tiny crawl of 50 pages the size of the .lob file is in the order of
> 100MB. Actually, after reaching this point the performance of any
> updates drops dramatically so it becomes nearly unusable.
> Of course, HSQL was never meant to be used as a serious backend...
> still, perhaps there are alternatives that could give us a better
> behavior for small / embedded use - and for small operations in the
> order of a few thousand records I think we should be able to come up
> with something better...
> I tried to integrate the H2 database (www.h2database.com), but gave up
> after I discovered that it doesn't support Blob.setBinaryStream(..) -
> there are workarounds for this in H2, but it would complicate the code
> too much...
> Any suggestions / comments? Maybe it's time for a BerkeleyDB DataStore?
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
