jackrabbit-oak-issues mailing list archives

From "Julian Reschke (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (OAK-1941) RDB: decide on table layout
Date Fri, 18 Jul 2014 11:57:04 GMT

     [ https://issues.apache.org/jira/browse/OAK-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-1941:
--------------------------------

    Comment: was deleted

(was: The attached patch attempts to count changes to the _collisions table.

(work in progress)

[~mreutegg] note that I had to change NodeDocument to ignore my counter property -- question:
what's the extensibility story for properties added by the persistence layer? _modCount is
already in the list, although in theory the DocumentMK does not know what it is...)

> RDB: decide on table layout
> ---------------------------
>
>                 Key: OAK-1941
>                 URL: https://issues.apache.org/jira/browse/OAK-1941
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: rdbmk
>            Reporter: Julian Reschke
>             Fix For: 1.1
>
>         Attachments: OAK-1941-cmodcount.diff
>
>
> The current approach is to serialize the Document using JSON, and then to store either
> (a) the full JSON in a VARCHAR column, or, if that column isn't wide enough, (b) to store
> it in a BLOB (optionally gzipped).
> For debugging purposes, the inline VARCHAR always gets populated with the start of the
> JSON serialization.
> However, with Oracle we are limited to 4000 bytes (which may be far fewer characters due
> to non-ASCII overhead), so many document instances will use what was initially thought to
> be the exception case.
> Questions:
> 1) Do we stick with JSON or do we attempt a different serialization? It might make sense
> with respect to both length and performance. There might also be some code to borrow from
> the off-heap serialization code.
> 2) Do we get rid of the "dual" strategy, and just always use the BLOB? The indirection
> might make things more expensive, but then the total column width would drop considerably.
> -- How can we do good benchmarks on this?
> (This all assumes that we stick with a model where all code is the same between database
> types, except for the DDL statements; of course it's also conceivable to add more
> vendor-specific special cases into the Java code)
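
For illustration only, the "dual" strategy described above could be sketched roughly as
follows. This is not the actual Oak code: the class DualColumnSketch, the Row holder and the
methods encode/decode are hypothetical names, and the sketch assumes the inline limit is
counted in UTF-8 bytes (as with the 4000-byte Oracle limit) and that the BLOB is always
gzipped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of the dual VARCHAR/BLOB layout; not taken from Oak.
final class DualColumnSketch {

    /** Hypothetical row shape: an inline text column plus an optional binary column. */
    static final class Row {
        final String inline; // full JSON, or only its start when blob != null
        final byte[] blob;   // gzipped UTF-8 JSON, or null when the JSON fit inline

        Row(String inline, byte[] blob) {
            this.inline = inline;
            this.blob = blob;
        }
    }

    /** Number of bytes a code point occupies in UTF-8. */
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    /** Longest prefix of s whose UTF-8 form fits in maxBytes, without splitting a code point. */
    static String prefixWithinBytes(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int len = utf8Length(cp);
            if (bytes + len > maxBytes) break;
            bytes += len;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    /** Case (a): JSON fits inline. Case (b): gzip into the blob, keep the start inline for debugging. */
    static Row encode(String json, int maxInlineBytes) {
        byte[] utf8 = json.getBytes(StandardCharsets.UTF_8);
        if (utf8.length <= maxInlineBytes) {
            return new Row(json, null);
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(utf8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Row(prefixWithinBytes(json, maxInlineBytes), bos.toByteArray());
    }

    /** Reads the document back: the blob wins when present, otherwise the inline column is complete. */
    static String decode(Row row) {
        if (row.blob == null) {
            return row.inline;
        }
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(row.blob))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a 4000-byte limit, a document consisting mostly of two-byte characters spills into the
blob at roughly 2000 characters, which is the non-ASCII overhead mentioned in the
description. Question 2 would correspond to dropping the first branch in encode and always
writing the blob.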



--
This message was sent by Atlassian JIRA
(v6.2#6252)
