polygene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Niclas Hedhman <nic...@hedhman.org>
Subject JOOQ EntityStore
Date Sun, 11 Jun 2017 03:41:24 GMT
Gang,

This is just a heads-up and request for feedback.... I am very close to
finish the first pass at the JOOQ Entity Store, which uses JOOQ to handle
SQL details across SQL dialects.

The intent is that an "enterprisey" developer would go, "Yeah, that could
work..." rather than the current "Use SQL as Key/Value store" approach that
we did initially.

The structure is as follows (JOOQ generated queries);

The TYPES table keeps mapping between Mixin types and table names. In
general, the Class.getSimpleName() is used as table name, but if there is a
conflict, then a "_1", "_2" and so on is added.

create table if not exists "POLYGENE"."TYPES"(
    "_identity" varchar null,
    "_table_name" varchar null,
    "_created_at" timestamp null,
    "_modified_at" timestamp null
)

The ENTITIES table is containing the meta data about the entities. This is
effectively the built-in data in EntityState. The "_value_id" field is the
foreign key into the generated mixin tables. This is done this way to
support the "Identity+Value" view of entities that we may have later, and
that history can be preserved (but not fully implemented yet), and the
"_deleted_at" field is there for this same reason, and also not used
initially.

create table if not exists "POLYGENE"."ENTITIES"(
    "_identity" varchar null,
    "_app_version" varchar null,
    "_value_id" varchar null,
    "_version" varchar null,
    "_type" varchar null,
    "_modified_at" timestamp null,
    "_created_at" timestamp null,
    "_deleted_at" timestamp null
)

Then for each Mixin type, two tables are created. The first one is to store
Properties and Associations. The second is to store the ManyAssociations
and NamedAssociations. I chose a single table, to reduce the amount of
tables that needs to be managed, but perhaps it should be one table per
many-to-many relation in the model.

The primary table for the ES testcase (TestEntity) looks like;

create table "POLYGENE"."TestEntity"(
    "_identity" varchar null,
    "_created_at" timestamp null,
    "instantValue" timestamp with time zone null,
    "bigIntegerValue" varchar null,
    "bigDecimalValue" varchar null,
    "dateTimeValue" timestamp with time zone null,
    "localDateTimeValue" timestamp null,
    "localDateValue" date null,
    "localTimeValue" time null,
    "duractionValue" varchar null,
    "periodValue" varchar null,
    "valueProperty" varchar null,
    "unsetName" varchar null,
    "emptyName" varchar null,
    "name" varchar null,
    "booleanValue" boolean null,
    "intValue" integer null,
    "longValue" bigint null,
    "floatValue" float null,
    "doubleValue" double null,
    "association" varchar null,
    "unsetAssociation" varchar null
)

and the assocs table;

create table if not exists "POLYGENE"."TestEntity_ASSOCS"(
    "_identity" varchar null,
    "_name" varchar null,
    "_index" varchar null,
    "_reference" varchar null
)

The "_name" is the name of the association in the mixin, say "children" for

    ManyAssociation<Person> children();

and the "_index" is the position inside the ManyAssociation or the key/name
of NamedAssociations.

For both of these tables, the "_identity" is the "_value_id" in the
ENTITIES table above.


Composite get() results in 2 queries, one for properties/assocs and one for
the _ASSOCS tables.

select *
    from "POLYGENE"."ENTITIES"
    left outer join "POLYGENE"."TestEntity" on "_value_id" =
"TestEntity"."_identity"
    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'

select *
    from "POLYGENE"."ENTITIES"
    join "POLYGENE"."TestEntity_ASSOCS" on "_value_id" =
"TestEntity_ASSOCS"."_identity"
    where "ENTITIES"."_identity" = 'ca22d692-5461-4756-8316-5a214d055aa5'

In each of those queries, there is a "join" line per Mixin type, so highly
composed

The idea is that ValueComposites are still going to be JSON serialized into
fields and leave it in the hands of developers to decide whether to go with
more entities or use values. This serialization is not yet implemented and
the main part that is not working yet.
Other things still outstanding;
  * Primary Key constraints
  * Caching of TYPES, to reduce queries
  * Caching of queries (there is only 2 queries per primaryType)
  * Design improvements, not totally happy with the implementation.
  * History support
  * ALTER TABLE and Migration (big one)
  * Allow more JOOQ config to be specified

Interestingly enough, the current design (without history) preserves state
of sub-types, which I think is no longer supported in other ES
implementations. This is just a side-effect of how the storage is
structured.

So, this is on the brink of operational (6 of 8 tests in
AbstractEntityStoreTest passes), and before digging into the serialization
of Values, I would like to check with you guys if this sounds reasonable,
and if there are any ways to make this even more tempting.

I am for instance seeking to change the "entitystore-sql" to another name
(for instance sqlkv) or even remove it completely, and this store take the
"entitystore-sql" name. Otherwise, I think too many people will discard
Polygene after seeing "SQL ES" key/value storage pattern.

Personally, I think that this is fairly clean and something I think
SQL-happy people would consider an alternative to Hibernate.


I am committing/pushing this to branch "es-jooq" branch shortly, if you
want to take a closer look and feel free to help on implementation.


P.S. I am also keen on improving/expanding the ES test suite. It is
currently a bit rudimentary, and I think we should try to improve that.


Cheers
-- 
Niclas Hedhman, Software Developer
http://polygene.apache.org - New Energy for Java

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message