gora-dev mailing list archives

From Maria Podorvanova <podorvanova.ma...@gmail.com>
Subject Re: Add datastore for Elasticsearch. Outreachy Week 7 Report
Date Tue, 19 Jan 2021 09:49:44 GMT
Hi,

Thank you for your comments.

I will take a look at your links, but my question was a bit different.
The problem is that the foreign key "boss" is represented in Avro as a UNION
of three types: STRING, NULL and RECORD. Your answer addresses how to handle
the last case (RECORD), but I was asking how to handle the STRING case.
AFAIU, STRING refers to the Employee's primary key type, so that you could
write "boss: '123'" instead of specifying the whole object.
Should I make an additional GET request in this case?
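
To make the three cases concrete, here is a minimal sketch of how the get
path might dispatch on the stored value. It uses plain Java maps in place of
the Elasticsearch response and the Avro schema, so `BossFieldSketch`,
`Employee`, `fromSource` and `fetchByKey` are hypothetical names for
illustration only, not Gora or Elasticsearch APIs. The extra lookup in the
STRING branch is an assumption about one possible design, not a settled
answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class BossFieldSketch {

    /** Hypothetical stand-in for the Employee bean; not a Gora class. */
    public static final class Employee {
        public final String name;
        public final Employee boss;

        Employee(String name, Employee boss) {
            this.name = name;
            this.boss = boss;
        }
    }

    /**
     * Builds an Employee from a document source map.
     * fetchByKey is an assumed lookup (for example, one extra GET by id)
     * that returns the source map for a key, or null if nothing is found.
     */
    @SuppressWarnings("unchecked")
    public static Employee fromSource(Map<String, Object> source,
                                      Function<String, Map<String, Object>> fetchByKey) {
        if (source == null) {
            return null; // the referenced document does not exist
        }
        Object rawBoss = source.get("boss");
        Employee boss;
        if (rawBoss == null) {
            // NULL branch: no boss recorded
            boss = null;
        } else if (rawBoss instanceof String) {
            // STRING branch: only the key is stored, so look the document up
            boss = fromSource(fetchByKey.apply((String) rawBoss), fetchByKey);
        } else if (rawBoss instanceof Map) {
            // RECORD branch: the object is embedded, so deserialize recursively
            boss = fromSource((Map<String, Object>) rawBoss, fetchByKey);
        } else {
            throw new IllegalArgumentException(
                "Unexpected type for boss: " + rawBoss.getClass());
        }
        return new Employee((String) source.get("name"), boss);
    }

    public static void main(String[] args) {
        // In-memory map standing in for the index, keyed by primary key
        Map<String, Map<String, Object>> index = new HashMap<>();
        Map<String, Object> alice = new HashMap<>();
        alice.put("name", "Alice");
        index.put("123", alice);

        Map<String, Object> bob = new HashMap<>();
        bob.put("name", "Bob");
        bob.put("boss", "123"); // key-only reference

        Employee e = fromSource(bob, index::get);
        System.out.println(e.name + " reports to " + e.boss.name);
        // prints "Bob reports to Alice"
    }
}
```

Note that a key-only reference pointing back at the same employee would
recurse forever in this sketch, so a real implementation would probably need
a depth limit or lazy loading.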

Regards,
Maria

On Tue, 19 Jan 2021 at 08:53, John Mora <jhnmora000@gmail.com> wrote:

> Hi Maria,
>
> Thanks for the update.
>
> Some comments:
>
>
> https://github.com/podorvanova/gora/blob/gora-664/gora-elasticsearch/src/main/java/org/apache/gora/elasticsearch/store/ElasticsearchStore.java#L192
>
> Please add the index mappings when you create the elasticsearch index.
>
>
> https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-create-index.html#java-rest-high-create-index-request-mappings
>
> You can use the Field mappings parsed from the XML file.
>
>
> https://github.com/podorvanova/gora/blob/gora-664/gora-elasticsearch/src/main/java/org/apache/gora/elasticsearch/mapping/ElasticsearchMapping.java#L28
>
> Regarding your question, Elasticsearch supports complex datatypes:
>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html
> https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
>
> You can use the RethinkDB datastore as an example and store recursively
> the fields of the embedded objects.
>
>
> https://github.com/apache/gora/blob/b45581a371d2d69c472c37793efa085436056c9b/gora-rethinkdb/src/main/java/org/apache/gora/rethinkdb/store/RethinkDBStore.java#L448
>
> Give it a try first and let me know if you get stuck.
>
> Alternatively, if the first option is not feasible, you can serialize the
> embedded objects as byte array, example:
>
>
> https://github.com/apache/gora/blob/master/gora-solr/src/main/java/org/apache/gora/solr/store/SolrStore.java#L735
> https://www.elastic.co/guide/en/elasticsearch/reference/current/binary.html
>
> Best regards,
> John.
>
> On Sat, 16 Jan 2021 at 8:02, Maria Podorvanova (<
> podorvanova.maria@gmail.com>) wrote:
>
>> Hi,
>>
>> Report #7
>> Period: January 10 - January 16
>> Activities:
>> - Fixed authentication [1]:
>>
>>    1. Set up the password for the Elasticsearch container properly
>>    2. Set the default username for the Elasticsearch container server in
>>    gora.properties
>>    3. Added exceptions for missing arguments in authentication
>>
>> - Added a parameter for the XSD validation [2]:
>>
>>    1. Defined a parameter for the XSD validation
>>    2. Added a test case for the parameter
>>    3. Made ElasticsearchStore read mapping file from properties, not
>>    configuration
>>
>> - Implemented some basic Input-Output operations for schema management
>> [3]:
>>
>>    1. Implemented the delete, get and put methods
>>    2. Implemented the newInstance and getUnionSchema utility methods
>>    3. Implemented basic serialization/deserialization for primitive Avro
>>    types
>>
>>
>> Here are links to the commits:
>> [1]
>> https://github.com/apache/gora/commit/679b6d8f0a27b7a7be99b6e8773327d482b9996b
>> [2]
>> https://github.com/apache/gora/commit/0f17849a383ef5f29e650eda22fb4d3022578f43
>> [3]
>> https://github.com/apache/gora/commit/474a3946ebfde25732fe16d6546aa479fc6509a0
>>
>> This week I started working on serialization/deserialization. While
>> testing the get method, I found that a UNION field can be a combination of
>> NULL, STRING or another RECORD for external table references (e.g. boss
>> for Employee). Could you explain what I should do in this case? I see two
>> possible approaches: 1) deserialize recursively if the field value is a
>> RECORD; 2) make another request for the STRING case, where I have only the
>> key of the external object.
>>
>> Regards,
>> Maria
>>
>
