mahout-user mailing list archives

From Tristan Slominski <tristan.slomin...@gmail.com>
Subject Error: ... overrides final method tokenStream
Date Fri, 06 Apr 2012 13:21:39 GMT
Hello group,

I managed to get Mahout running, which is awesome! But I keep running into
issues that break the Hadoop jobs that Mahout launches.

For example, when I follow the Wikipedia Naive Bayes example, my Hadoop jobs
fail during the wikipediaDataSetCreator step with:

Error: class org.apache.lucene.analysis.ReusableAnalyzerBase overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

So I decided to try the examples in the examples folder within Mahout.

The classify-20newsgroups.sh example works just fine.

Then I tried to run the cluster-reuters.sh example, and the Hadoop jobs broke
with:

Error: class org.apache.mahout.vectorizer.DefaultAnalyzer overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;

I did this on the latest Mahout 0.7 snapshot built from source, and on the
packaged Mahout 0.6.

From reading about it, it appears that the problem stems from the Lucene
project marking the tokenStream method (the one that returns
org.apache.lucene.analysis.TokenStream) as final. So, to at least get it to
run despite that restriction, I tried to find a way to build the
lucene-analysis project from scratch and generate a separate jar without the
final restriction, but I'm somewhat lost in the size of that project right
now.

What are you doing to get around this issue? Am I doing something wrong?
Using the wrong version of something, perhaps? Again, I've built the latest
0.7 snapshot from source and used the packaged Mahout 0.6, with the same
problems.
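
One way to test the "wrong version" possibility would be to print which jar
each class named in the errors is actually loaded from. The following is only
a minimal sketch, not something taken from Mahout or Lucene: the WhichJar
class and its locate method are hypothetical names made up for illustration,
and the default class names are simply the ones from the error messages above
plus the Analyzer base class.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a short note
    // when the class is missing, fails to link, or has no code source.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running static initializers
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            // e.g. a VerifyError such as "overrides final method"
            return "(found but failed to load: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names may be passed as arguments; otherwise use the ones from the errors.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.analysis.ReusableAnalyzerBase",
            "org.apache.mahout.vectorizer.DefaultAnalyzer"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run with the same classpath the failing job sees, this should show whether the
Lucene classes come from the jar I expect or from a different one picked up
earlier on the classpath.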

Cheers,

Tristan
