gora-dev mailing list archives

From "Sergey Weiss (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-227) Failing assertions when putting and getting Values using MemStore#execute
Date Tue, 07 Oct 2014 11:24:34 GMT

    [ https://issues.apache.org/jira/browse/GORA-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161775#comment-14161775 ]

Sergey Weiss commented on GORA-227:
-----------------------------------

Hello!

I have debugged TestGenerator and, from what I saw, it fails because the query is
executed on a different MemStore instance than the one that holds the injected web pages.
That is, when GeneratorJob initializes its mapper and reducer, it creates a new MemStore
instance for each. Each of these two instances holds its own internal map and knows nothing
about the MemStore created by TestGenerator (and populated with web pages).

What is the best way to address this issue? Should we amend DataStoreFactory so that it
returns a single MemStore instance, or should all MemStore instances share their state? Any suggestions?
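
To make the two options easier to discuss, here is a minimal, self-contained sketch in plain
Java. It does not use any Gora classes: ToyMemStore and ToyStoreFactory are hypothetical
stand-ins I made up for illustration, and getShared() is only an assumption about what a
"single instance" lookup might look like, not the real DataStoreFactory API. The first half
reproduces the symptom (a fresh instance sees no data), the second half shows a factory that
hands back one cached instance per name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SharedMemStoreSketch {

  /** Hypothetical stand-in for MemStore: every instance owns a private map. */
  static class ToyMemStore<K, V> {
    private final Map<K, V> map = new ConcurrentSkipListMap<>();

    void put(K key, V value) { map.put(key, value); }

    V get(K key) { return map.get(key); }

    int size() { return map.size(); }
  }

  /** Hypothetical factory; NOT the real DataStoreFactory API. */
  static class ToyStoreFactory {
    // One store per logical name, shared by all callers in this JVM.
    private static final Map<String, ToyMemStore<?, ?>> CACHE = new ConcurrentHashMap<>();

    /** Current behaviour as described above: a new, empty store on every call. */
    static <K, V> ToyMemStore<K, V> createFresh() {
      return new ToyMemStore<>();
    }

    /** Proposed behaviour: return the same instance for the same name. */
    @SuppressWarnings("unchecked")
    static <K, V> ToyMemStore<K, V> getShared(String name) {
      // Callers must agree on K and V for a given name; the cast is unchecked.
      return (ToyMemStore<K, V>) CACHE.computeIfAbsent(name, n -> new ToyMemStore<K, V>());
    }
  }

  public static void main(String[] args) {
    // Symptom: the test populates one store, the job reads from another.
    ToyMemStore<String, String> testStore = ToyStoreFactory.createFresh();
    testStore.put("http://example.org/", "page");
    ToyMemStore<String, String> mapperStore = ToyStoreFactory.createFresh();
    System.out.println("fresh instance sees " + mapperStore.size() + " record(s)");

    // Single-instance option: both lookups resolve to the same object.
    ToyMemStore<String, String> sharedForTest = ToyStoreFactory.getShared("webpage");
    sharedForTest.put("http://example.org/", "page");
    ToyMemStore<String, String> sharedForMapper = ToyStoreFactory.getShared("webpage");
    System.out.println("shared instance sees " + sharedForMapper.size() + " record(s)");
  }
}
```

Note that a static cache like this only helps while the test, mapper and reducer all run in
the same JVM; it would not carry data across separate task processes.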

> Failing assertions when putting and getting Values using MemStore#execute
> -------------------------------------------------------------------------
>
>                 Key: GORA-227
>                 URL: https://issues.apache.org/jira/browse/GORA-227
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: gora-core
>    Affects Versions: 0.3
>         Environment: gora-core 0.3, Nutch 2.x HEAD
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6
>
>
> Test [0] fails with the following unhelpful logging... I need to debug this much more thoroughly
> {code}
> Testcase: testGenerateHighest took 1.845 sec
> 	FAILED
> expected:<2> but was:<0>
> junit.framework.AssertionFailedError: expected:<2> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)
> Testcase: testGenerateHostLimit took 1.207 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)
> Testcase: testGenerateDomainLimit took 1.175 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)
> Testcase: testFilter took 2.31 sec
> 	FAILED
> expected:<3> but was:<0>
> junit.framework.AssertionFailedError: expected:<3> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
> {code}
> However, so far I have found a commonality: the tests all use the following code:
> {code}
>   public static ArrayList<URLWebPage> readContents(DataStore<String,WebPage> store,
>       Mark requiredMark, String... fields) throws Exception {
>     ArrayList<URLWebPage> l = new ArrayList<URLWebPage>();
>     Query<String, WebPage> query = store.newQuery();
>     if (fields != null) {
>       query.setFields(fields);
>     }
>     Result<String, WebPage> results = store.execute(query);
>     while (results.next()) {
>       try {
>         WebPage page = results.get();
>         String url = results.getKey();
>         if (page == null)
>           continue;
>         if (requiredMark != null && requiredMark.checkMark(page) == null)
>           continue;
>         l.add(new URLWebPage(TableUtil.unreverseUrl(url), (WebPage)page.clone()));
>       } catch (Exception e) {
>         e.printStackTrace();
>       }
>     }
>     return l;
>   }
> {code}
> and also that the assertions are all of the type
> {code}
>     ArrayList<URLWebPage> fetchList = CrawlTestUtil.readContents(webPageStore,
>         Mark.GENERATE_MARK, FIELDS);
>     // verify we got right amount of records
>     assertEquals(1, fetchList.size());
> {code}
> [0] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
