gora-dev mailing list archives

From "Sergey Weiss (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (GORA-227) Failing assertions when putting and getting Values using MemStore#execute
Date Tue, 07 Oct 2014 11:29:33 GMT

    [ https://issues.apache.org/jira/browse/GORA-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161775#comment-14161775 ]

Sergey Weiss edited comment on GORA-227 at 10/7/14 11:28 AM:
-------------------------------------------------------------

Hello!

I have debugged TestGenerator and, from what I saw, it fails because the query is executed
on a different MemStore instance than the one that holds the injected web pages. That is,
when GeneratorJob initializes its mapper and reducer, it creates a new MemStore instance for
each. Each of these two instances holds its own internal map and knows nothing about the
other, or about the MemStore created by TestGenerator (and populated with web pages).

What is the best way to address this issue? Should we amend DataStoreFactory so that it
returns a single MemStore instance, or should all MemStore instances share their state? Any suggestions?
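
To make the question concrete, here is a minimal sketch of the behaviour in plain Java. It
does not use the Gora API: ToyMemStore, ToyStoreFactory, createNew and getShared are
hypothetical names made up for illustration, and the real MemStore and DataStoreFactory may
well differ. It only shows why a freshly created in-memory store sees no data, and how a
factory that caches one instance per name would avoid that.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: none of these types exist in Gora.
class MemStoreIsolationDemo {

  // Hypothetical stand-in for an in-memory store: each instance owns a private map.
  static class ToyMemStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    void put(String key, String value) { rows.put(key, value); }
    String get(String key) { return rows.get(key); }
    int size() { return rows.size(); }
  }

  // Hypothetical factory contrasting the two behaviours discussed above.
  static class ToyStoreFactory {
    private static final Map<String, ToyMemStore> CACHE = new ConcurrentHashMap<>();

    // Behaviour as observed: every call yields a fresh, empty store.
    static ToyMemStore createNew() { return new ToyMemStore(); }

    // One possible fix (an assumption, not Gora's design): one instance per name.
    static ToyMemStore getShared(String name) {
      return CACHE.computeIfAbsent(name, n -> new ToyMemStore());
    }
  }

  public static void main(String[] args) {
    // The test populates its own store; the "mapper" then creates another one.
    ToyMemStore testStore = ToyStoreFactory.createNew();
    testStore.put("com.example:http/", "page");
    ToyMemStore mapperStore = ToyStoreFactory.createNew();
    System.out.println("isolated store size: " + mapperStore.size()); // prints 0

    // With a cached instance, both sides see the same rows.
    ToyStoreFactory.getShared("webpage").put("com.example:http/", "page");
    ToyMemStore sharedView = ToyStoreFactory.getShared("webpage");
    System.out.println("shared store size: " + sharedView.size()); // prints 1
  }
}
```

The trade-off with the cached variant is that state leaks between tests within the same JVM
unless the cache is cleared between them, so it would probably need a reset hook.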


was (Author: sweiss):
Hello!

I have debugged TestGenerator and, from what I saw, it fails due to the fact that query is
being executed on a different MemStore instance rather than one that holds injected web pages.
That is, when GeneratorJob inits its mapper and reducer, it creates new instance of MemStore
for both. Each of this two instances hold their internal map and know nothing about MemStore
created by TestGenerator (and populated with web pages).

What is the best way to address this issue? Should we somehow amend DataStoreFactory to make
it return single instance of MemStore or should all MemStores share their states? Any suggestions?

> Failing assertions when putting and getting Values using MemStore#execute
> -------------------------------------------------------------------------
>
>                 Key: GORA-227
>                 URL: https://issues.apache.org/jira/browse/GORA-227
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: gora-core
>    Affects Versions: 0.3
>         Environment: gora-core 0.3, Nutch 2.x HEAD
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6
>
>
> Test [0] fails with the following useless logging... I need to DEBUG this much more thoroughly
> {code}
> Testcase: testGenerateHighest took 1.845 sec
> 	FAILED
> expected:<2> but was:<0>
> junit.framework.AssertionFailedError: expected:<2> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)
> Testcase: testGenerateHostLimit took 1.207 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)
> Testcase: testGenerateDomainLimit took 1.175 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)
> Testcase: testFilter took 2.31 sec
> 	FAILED
> expected:<3> but was:<0>
> junit.framework.AssertionFailedError: expected:<3> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
> {code}
> However so far I have found commonality in the fact that the tests all use the following code:
> {code}
>   public static ArrayList<URLWebPage> readContents(DataStore<String,WebPage> store,
>       Mark requiredMark, String... fields) throws Exception {
>     ArrayList<URLWebPage> l = new ArrayList<URLWebPage>();
>     Query<String, WebPage> query = store.newQuery();
>     if (fields != null) {
>       query.setFields(fields);
>     }
>     Result<String, WebPage> results = store.execute(query);
>     while (results.next()) {
>       try {
>         WebPage page = results.get();
>         String url = results.getKey();
>         if (page == null)
>           continue;
>         if (requiredMark != null && requiredMark.checkMark(page) == null)
>           continue;
>         l.add(new URLWebPage(TableUtil.unreverseUrl(url), (WebPage)page.clone()));
>       } catch (Exception e) {
>         e.printStackTrace();
>       }
>     }
>     return l;
>   }
> {code}
> and also that the assertions are all of the type
> {code}
>     ArrayList<URLWebPage> fetchList = CrawlTestUtil.readContents(webPageStore, Mark.GENERATE_MARK, FIELDS);
>     // verify we got right amount of records
>     assertEquals(1, fetchList.size());
> {code}
> [0] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
