gora-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (GORA-206) Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
Date Sun, 03 Mar 2013 01:02:03 GMT

     [ https://issues.apache.org/jira/browse/GORA-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-206:
--------------------------------------

    Attachment: GORA-206.v2.patch

I attach a new patch, which applies cleanly against my local copy of trunk HEAD. 

OK I've worked my way through this so far (BTW hold on to your hat, this is pretty hardcore).

1. Apply your patch against Gora trunk.
2. mvn package
3. Get a clean copy of Nutch 2.x HEAD
4. Run ant (default target is runtime)
5. Manually copy gora-core-0.3-SNAPSHOT and gora-cassandra-0.3-SNAPSHOT over to the Nutch build/lib directory
6. Patch the Nutch build.xml with the following:
{code}
  <!-- ====================================================== -->
  <!-- Generate the Java files from the GORA schemas          -->
  <!-- Will call this automatically later                     -->
  <!-- ====================================================== -->
  <target name="generate-gora-src" depends="init" description="--> compile the avro schema(s) in src/gora/*.avsc">
    <java classname="org.apache.gora.compiler.GoraCompiler">
      <classpath refid="classpath"/>
      <arg value="src/gora/webpage.avsc"/>
      <arg value="${src.dir}"/>
    </java>
  </target>
{code}
7. Patch $NUTCH_HOME/src/gora/webpage.avsc with the patch in http://s.apache.org/UUM
8. Run ant generate-gora-src 
9. cd runtime/local
10. mkdir urls
11. touch urls/seed.txt (then add some URLs to seed.txt)
12. Ensure that the libraries present in ./lib are the same, most current versions that we use over in gora-cassandra
13. Run ./bin/nutch inject urls

I get the following:

{code}
Exception in thread "main" java.lang.ExceptionInInitializerError
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
	at org.apache.gora.util.ReflectionUtils.newInstance(ReflectionUtils.java:76)
	at org.apache.gora.persistency.impl.BeanFactoryImpl.<init>(BeanFactoryImpl.java:53)
	at org.apache.gora.store.impl.DataStoreBase.initialize(DataStoreBase.java:88)
	at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:88)
	at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
	at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
	at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
	at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
	at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
Caused by: java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: Unexpected close marker '}': expected ']' (for ARRAY starting at [Source: java.io.StringReader@2bb04fbf; line: 1, column: 72])
 at [Source: java.io.StringReader@2bb04fbf; line: 1, column: 128]
	at org.apache.avro.Schema.parseJson(Schema.java:996)
	at org.apache.avro.Schema.parse(Schema.java:821)
	at org.apache.nutch.storage.WebPage.<clinit>(WebPage.java:44)
	... 17 more
Caused by: org.codehaus.jackson.JsonParseException: Unexpected close marker '}': expected ']' (for ARRAY starting at [Source: java.io.StringReader@2bb04fbf; line: 1, column: 72])
 at [Source: java.io.StringReader@2bb04fbf; line: 1, column: 128]
	at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929)
	at org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase.java:632)
	at org.codehaus.jackson.impl.JsonParserBase._reportMismatchedEndMarker(JsonParserBase.java:608)
	at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:104)
	at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:179)
	at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeAny(JsonNodeDeserializer.java:193)
	at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:166)
	at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeAny(JsonNodeDeserializer.java:190)
	at org.codehaus.jackson.map.deser.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:52)
	at org.codehaus.jackson.map.deser.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:13)
	at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:1263)
	at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:618)
	at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:593)
	at org.apache.avro.Schema.parseJson(Schema.java:994)
	... 19 more

{code} 

Can someone please double-check that the most recent JSON schema attached to NUTCH-1477 is
syntactically sound? I ran it through an online JSON validator and it seems valid, but evidently
there is a problem somewhere in the generated WebPage Java class, and this is preventing me
from trying out the work you guys have done on storing null-single-type Unions.
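
For anyone who wants to narrow this down, here is a minimal sketch of a check that could be run on the schema string embedded in the generated WebPage class (the one parsed at WebPage.java:44 in the trace above). It is not Avro's or Jackson's parser. The class and method names (BracketCheck, firstMismatch) are hypothetical, and the sample strings are made up for illustration and are not taken from the real schema. It only reports the first close marker that does not match its opener, which is the kind of mismatch the JsonParseException complains about.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper, not part of Gora, Nutch, Avro or Jackson. */
final class BracketCheck {

    /**
     * Returns the 0-based index of the first '}' or ']' that does not match
     * the innermost open marker, the input length if an opener is never
     * closed, or -1 if all markers are balanced.
     */
    static int firstMismatch(String json) {
        Deque<Character> expected = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
                continue; // markers inside string literals do not count
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': expected.push('}'); break;
                case '[': expected.push(']'); break;
                case '}':
                case ']':
                    if (expected.isEmpty() || expected.pop() != c) {
                        return i;
                    }
                    break;
                default:
                    break;
            }
        }
        return expected.isEmpty() ? -1 : json.length();
    }

    public static void main(String[] args) {
        // Illustrative field definitions only (assumed, not the Nutch schema).
        String ok = "{\"name\":\"f\",\"type\":[\"null\",\"string\"]}";
        String broken = "{\"name\":\"f\",\"type\":[\"null\",\"string\"}";
        System.out.println("ok: " + firstMismatch(ok));
        System.out.println("broken: " + firstMismatch(broken));
    }
}
```

If the .avsc file passes a check like this but the string in the generated class does not, that would point at the code generation step rather than at the schema attached to NUTCH-1477.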

Thanks for any help here. It has taken me a while to get this far, but now I've nailed down
exactly what is going on throughout the pipeline and I know where I can make this better.
                
> Verify storage and retrieval of Avro null-single-type Union data type within Gora-Cassandra
> -------------------------------------------------------------------------------------------
>
>                 Key: GORA-206
>                 URL: https://issues.apache.org/jira/browse/GORA-206
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: storage-cassandra
>    Affects Versions: 0.3
>            Reporter: Renato Javier Marroquín Mogrovejo
>            Assignee: Renato Javier Marroquín Mogrovejo
>              Labels: gora-cassandra, gora-core
>             Fix For: 0.3
>
>         Attachments: GORA-206.v1.patch, GORA-206.v2.patch
>
>
> The necessary features should be added to confirm that we are able to support Avro Union data types.
> This refers specifically to null-single-type unions. We will open another issue to address the multi-type unions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
