gora-dev mailing list archives

From "John Mora (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-411) Add exists(key) to DataStore interface
Date Tue, 19 Mar 2019 05:24:00 GMT

    [ https://issues.apache.org/jira/browse/GORA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795698#comment-16795698 ]

John Mora commented on GORA-411:

Hi [~alfonso.nishikawa], [~lewismc].

I would like to work on this issue as a warm-up task for my GSoC 2019 application.

I added the method _*public*_ _*boolean exists(K key) throws GoraException*_ to the *DataStore*
interface and implemented a default behavior in the *DataStoreBase* class as follows:

  public boolean exists(K key) throws GoraException {
    return get(key, new String[0]) != null;
  }

For testing, I added the following case:

  public static void testExistsEmployee(DataStore<String, Employee> dataStore)
    throws Exception {
    Employee employee = DataStoreTestUtil.createEmployee();
    String ssn = employee.getSsn().toString();
    dataStore.put(ssn, employee);

This naive approach seems to work (the tests pass), so my next step would be to analyze each
backend and find a more suitable custom implementation for each one. First, though, I would like
to know whether the test case above is sufficient for this new method. Do you know of other edge
cases that should also be checked?
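
As a rough illustration of the kind of edge cases in question, the sketch below exercises a get-based default exists() for a key that was never written, a key that was written, a key that was deleted, and a key that was re-inserted after deletion. MiniStore, MapStore, and ExistsEdgeCases are hypothetical stand-ins written for this example. They are not Gora classes, and the real DataStore interface has different signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a data store; NOT the Gora DataStore interface.
interface MiniStore<K, T> {
  T get(K key);
  void put(K key, T value);
  boolean delete(K key);

  // Get-based fallback, mirroring the naive default described above.
  default boolean exists(K key) {
    return get(key) != null;
  }
}

// Map-backed implementation used only to exercise the default method.
class MapStore<K, T> implements MiniStore<K, T> {
  private final Map<K, T> rows = new HashMap<>();

  @Override public T get(K key) { return rows.get(key); }
  @Override public void put(K key, T value) { rows.put(key, value); }
  @Override public boolean delete(K key) { return rows.remove(key) != null; }
}

class ExistsEdgeCases {
  // Throws if the condition does not hold, so a failure is never silent.
  static void check(boolean condition, String message) {
    if (!condition) throw new AssertionError(message);
  }

  public static void main(String[] args) {
    MiniStore<String, String> store = new MapStore<>();

    // A key that was never written must not exist.
    check(!store.exists("missing"), "never-written key reported as existing");

    // A key that was written must exist.
    store.put("ssn-1", "employee");
    check(store.exists("ssn-1"), "written key reported as missing");

    // A deleted key must no longer exist.
    check(store.delete("ssn-1"), "delete of an existing key returned false");
    check(!store.exists("ssn-1"), "deleted key reported as existing");

    // A re-inserted key must exist again.
    store.put("ssn-1", "employee");
    check(store.exists("ssn-1"), "re-inserted key reported as missing");
  }
}
```

Against a real backend, the same checks would probably also need a flush between the write and the read, depending on how that store buffers writes.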

> Add exists(key) to DataStore interface
> --------------------------------------
>                 Key: GORA-411
>                 URL: https://issues.apache.org/jira/browse/GORA-411
>             Project: Apache Gora
>          Issue Type: Improvement
>          Components: gora-core, storage
>            Reporter: Alfonso Nishikawa
>            Priority: Minor
>             Fix For: 0.9
> NUTCH-1679 needs to check whether some rows exist, and they are proposing to use {{store.get(TableUtil.reverseUrl(url))}}.
> This will have a considerable impact on performance, since every column will be fetched.
> Some datastores (like HBase) implement a call that just checks whether a row exists, so no data
> is transferred over the network.
> If a datastore can't handle an "exists" call, it can default to a get.
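
The trade-off in the description can be sketched as follows: a store with no cheaper check inherits a get-based exists(), while a store with a key-only check overrides it and avoids fetching the row. RowStore, FallbackStore, KeyCheckStore, and the rowsFetched counter are all hypothetical names invented for this illustration. They are not part of Gora or HBase.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface for illustration; NOT the Gora DataStore API.
interface RowStore<K> {
  // Fetches every column of the row, or returns null if the row is absent.
  Map<String, String> get(K key);

  // Default: fall back to a full get when no cheaper check is available.
  default boolean exists(K key) {
    return get(key) != null;
  }
}

// Backend without a native existence check: inherits the get-based default.
class FallbackStore implements RowStore<String> {
  final Map<String, Map<String, String>> rows = new HashMap<>();
  int rowsFetched = 0; // counts how many full rows were returned to the caller

  @Override
  public Map<String, String> get(String key) {
    Map<String, String> row = rows.get(key);
    if (row != null) rowsFetched++;
    return row;
  }
}

// Backend with a key-only check: overrides exists() so no row is fetched.
class KeyCheckStore extends FallbackStore {
  @Override
  public boolean exists(String key) {
    return rows.containsKey(key);
  }
}

class ExistsOverrideDemo {
  public static void main(String[] args) {
    FallbackStore fallback = new FallbackStore();
    KeyCheckStore keyCheck = new KeyCheckStore();
    Map<String, String> row = Collections.singletonMap("name", "value");
    fallback.rows.put("row-1", row);
    keyCheck.rows.put("row-1", row);

    fallback.exists("row-1"); // goes through get() and fetches the row
    keyCheck.exists("row-1"); // checks the key only

    System.out.println("fallback rows fetched: " + fallback.rowsFetched);
    System.out.println("key-check rows fetched: " + keyCheck.rowsFetched);
  }
}
```

Both stores give the same answer for any key. Only the amount of data read to produce that answer differs, which is why the default can remain a plain get.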

This message was sent by Atlassian JIRA
