gora-dev mailing list archives

From Maria Podorvanova <podorvanova.ma...@gmail.com>
Subject Re: Add datastore for Elasticsearch. Outreachy Week 7 Report
Date Thu, 21 Jan 2021 10:38:16 GMT
Hi,

Okay, I will do that then. Thanks.

Regards,
Maria

On Thu, 21 Jan 2021 at 03:33, John Mora <jhnmora000@gmail.com> wrote:

> Hi Maria,
>
> Sorry for the late reply. Let's keep it simple. You can throw an exception
> when you receive a STRING and only process RECORD cases in UNION.
>
> Example:
>
> https://github.com/apache/gora/blob/b45581a371d2d69c472c37793efa085436056c9b/gora-lucene/src/main/java/org/apache/gora/lucene/store/LuceneStore.java#L349
>
> Regards,
> John
>
> On Tue, 19 Jan 2021 at 04:49, Maria Podorvanova <podorvanova.maria@gmail.com> wrote:
>
>> Hi
>>
>> Thank you for your comments.
>>
>> I will take a look at your links, but my question was a bit different.
>> The problem is that the foreign key "boss" is represented in Avro as a UNION of
>> three types: STRING, NULL and RECORD. Your answer covers how to
>> handle the last case (RECORD), but I was asking how to handle
>> the STRING case. AFAIU, STRING refers to the Employee's primary key type, so
>> that you could write "boss: '123'" instead of specifying the whole object.
>> Should I make an additional GET request for this case?
>>
>> Regards,
>> Maria
>>
>> On Tue, 19 Jan 2021 at 08:53, John Mora <jhnmora000@gmail.com> wrote:
>>
>>> Hi Maria,
>>>
>>> Thanks for the update.
>>>
>>> Some comments:
>>>
>>>
>>> https://github.com/podorvanova/gora/blob/gora-664/gora-elasticsearch/src/main/java/org/apache/gora/elasticsearch/store/ElasticsearchStore.java#L192
>>>
>>> Please add the index mappings when you create the Elasticsearch index.
>>>
>>>
>>> https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-create-index.html#java-rest-high-create-index-request-mappings
>>>
>>> You can use the field mappings parsed from the XML file.
>>>
>>>
>>> https://github.com/podorvanova/gora/blob/gora-664/gora-elasticsearch/src/main/java/org/apache/gora/elasticsearch/mapping/ElasticsearchMapping.java#L28
>>>
>>> Regarding your question, Elasticsearch supports complex datatypes:
>>>
>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html
>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
>>>
>>> You can use the RethinkDB datastore as an example and recursively store
>>> the fields of the embedded objects.
>>>
>>>
>>> https://github.com/apache/gora/blob/b45581a371d2d69c472c37793efa085436056c9b/gora-rethinkdb/src/main/java/org/apache/gora/rethinkdb/store/RethinkDBStore.java#L448
>>>
>>> Give it a try first and let me know if you get stuck.
>>>
>>> Alternatively, if the first option is not feasible, you can serialize
>>> the embedded objects as a byte array, for example:
>>>
>>>
>>> https://github.com/apache/gora/blob/master/gora-solr/src/main/java/org/apache/gora/solr/store/SolrStore.java#L735
>>>
>>> https://www.elastic.co/guide/en/elasticsearch/reference/current/binary.html
>>>
>>> Best regards,
>>> John.
>>>
>>> On Sat, 16 Jan 2021 at 08:02, Maria Podorvanova <podorvanova.maria@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Report #7
>>>> Period: January 10 - January 16
>>>> Activities:
>>>> - Fixed authentication [1]:
>>>>
>>>>    1. Set up password to Elasticsearch container properly
>>>>    2. Set default Elasticsearch container server’s username in
>>>>    gora.properties
>>>>    3. Added exceptions for missing arguments in authentication
>>>>
>>>> - Added a parameter for the XSD validation [2]:
>>>>
>>>>    1. Defined a parameter for the XSD validation
>>>>    2. Added a test case for the parameter
>>>>    3. Made ElasticsearchStore read mapping file from properties, not
>>>>    configuration
>>>>
>>>> - Implemented some basic Input-Output operations for schema management
>>>> [3]:
>>>>
>>>>    1. Implemented delete, get and put methods
>>>>    2. Implemented newInstance and getUnionSchema utility methods
>>>>    3. Implemented basic serialization/deserialization for primitive
>>>>    AVRO types
>>>>
>>>>
>>>> Here are links to the commits:
>>>> [1]
>>>> https://github.com/apache/gora/commit/679b6d8f0a27b7a7be99b6e8773327d482b9996b
>>>> [2]
>>>> https://github.com/apache/gora/commit/0f17849a383ef5f29e650eda22fb4d3022578f43
>>>> [3]
>>>> https://github.com/apache/gora/commit/474a3946ebfde25732fe16d6546aa479fc6509a0
>>>>
>>>> This week I have started work on serialization/deserialization. While
>>>> testing the get method, I found that the UNION case could be a combination
>>>> of NULL, STRING or another RECORD for external table references (e.g. boss
>>>> for Employee). Could you explain to me what I should do in this case? I see
>>>> two possible cases here: 1) Deserialize recursively if the field value is a
>>>> RECORD. 2) Make another request for the STRING case, where I have only the
>>>> key for the external object.
>>>>
>>>> Regards,
>>>> Maria
>>>>
>>>
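
Editorial sketch of the approach agreed on in this thread: within a UNION, deserialize RECORD values recursively, pass NULL through, and throw when a bare STRING key arrives. This is not the code from LuceneStore or ElasticsearchStore. To stay self-contained it uses a hypothetical SimpleSchema class as a stand-in for Avro's Schema and plain Maps as stand-ins for stored documents; every name below is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only. SimpleSchema is a hypothetical stand-in for Avro's Schema, not a Gora or Avro class. */
final class UnionFieldSketch {

    enum Type { NULL, STRING, RECORD, UNION }

    static final class SimpleSchema {
        final Type type;
        final Map<String, SimpleSchema> fields = new LinkedHashMap<>(); // used by RECORD only
        final List<SimpleSchema> branches;                               // used by UNION only

        SimpleSchema(Type type, SimpleSchema... branches) {
            this.type = type;
            this.branches = List.of(branches);
        }
    }

    /** Employee { name: STRING, boss: UNION(NULL, STRING, Employee) }, as described in the thread. */
    static SimpleSchema employeeSchema() {
        SimpleSchema employee = new SimpleSchema(Type.RECORD);
        employee.fields.put("name", new SimpleSchema(Type.STRING));
        employee.fields.put("boss", new SimpleSchema(Type.UNION,
                new SimpleSchema(Type.NULL), new SimpleSchema(Type.STRING), employee));
        return employee;
    }

    /** Dispatches on the schema type; a null stored value always deserializes to null. */
    static Object deserialize(SimpleSchema schema, Object value) {
        if (value == null) {
            return null;
        }
        switch (schema.type) {
            case STRING:
                return value.toString();
            case RECORD:
                return deserializeRecord(schema, (Map<?, ?>) value);
            case UNION:
                return deserializeUnion(schema, value);
            default:
                throw new IllegalArgumentException("Unexpected non-null value for " + schema.type);
        }
    }

    /** Copies each declared field, recursing into embedded records. */
    static Map<String, Object> deserializeRecord(SimpleSchema record, Map<?, ?> source) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, SimpleSchema> field : record.fields.entrySet()) {
            result.put(field.getKey(), deserialize(field.getValue(), source.get(field.getKey())));
        }
        return result;
    }

    /** Handles only NULL and RECORD values; a bare STRING key (e.g. "123") is rejected. */
    static Object deserializeUnion(SimpleSchema union, Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Map) {
            for (SimpleSchema branch : union.branches) {
                if (branch.type == Type.RECORD) {
                    return deserializeRecord(branch, (Map<?, ?>) value);
                }
            }
        }
        throw new UnsupportedOperationException(
                "Only NULL and RECORD values are handled in a UNION, got: "
                        + value.getClass().getSimpleName());
    }
}
```

Calling UnionFieldSketch.deserializeRecord(UnionFieldSketch.employeeSchema(), document) with an embedded boss object yields a nested map, while a document containing "boss": "123" raises UnsupportedOperationException. Rejecting the STRING case avoids the extra GET request discussed above, at the cost of not supporting key-only references.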
