nutch-dev mailing list archives

From "Xiao Yang (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (NUTCH-650) Hbase Integration
Date Fri, 22 Jan 2010 08:00:22 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803623#action_12803623 ]

Xiao Yang edited comment on NUTCH-650 at 1/22/10 7:59 AM:
----------------------------------------------------------

Some notes on NUTCH-650.patch:
1. The API in hbase-0.20.0-r804408.jar differs from the final release.
2. Avoid some NullPointerException errors.
3. Change the invalid column family name.
4. Add an "id" field to the index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
	at org.apache.lucene.document.Field.<init>(Field.java:279)
	at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
	at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
	at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
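
For readers unfamiliar with item 4, here is a minimal stand-alone sketch of the rule behind that exception: a field must be stored, indexed, or both. It does not use Lucene or Nutch classes. FieldOptionCheck, FieldOptions, OPTIONS, validate and isValid are hypothetical names made up for illustration, and the idea that an unregistered "id" field ends up with neither option set is an assumption about the cause, not something taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone model of the store/index rule; not the Lucene or Nutch implementation. */
public class FieldOptionCheck {

    /** Hypothetical stand-in for the per-field options an indexer hands to Lucene. */
    static final class FieldOptions {
        final boolean stored;
        final boolean indexed;

        FieldOptions(boolean stored, boolean indexed) {
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    // Field name -> options. A field that was never registered has no entry here.
    static final Map<String, FieldOptions> OPTIONS = new LinkedHashMap<>();

    /** Rejects a field that is neither stored nor indexed (assumed: missing options mean neither). */
    static void validate(String name, FieldOptions opts) {
        boolean stored = opts != null && opts.stored;
        boolean indexed = opts != null && opts.indexed;
        if (!stored && !indexed) {
            throw new IllegalArgumentException(
                "it doesn't make sense to have a field that is neither indexed nor stored");
        }
    }

    /** Returns true if the named field passes the check with its registered options. */
    static boolean isValid(String name) {
        try {
            validate(name, OPTIONS.get(name));
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // No options registered for "id" yet, so the check fails.
        System.out.println("before: " + isValid("id"));
        // Register "id" as stored and indexed, in the spirit of item 4.
        OPTIONS.put("id", new FieldOptions(true, true));
        System.out.println("after: " + isValid("id"));
    }
}
```

Running main prints "before: false" and then "after: true". In the real code the check is made by the Lucene Field constructor, as the stack trace shows.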



      was (Author: yangxiao):
    1. API in hbase-0.20.0-r804408.jar is different from the final release.
2. Avoid some NullPointer error
3. Change invalid Column family name
4. Add "id" field to index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
	at org.apache.lucene.document.Field.<init>(Field.java:279)
	at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
	at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
	at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)


  
> Hbase Integration
> -----------------
>
>                 Key: NUTCH-650
>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>             Fix For: 1.1
>
>         Attachments: hbase-integration_v1.patch, hbase_v2.patch, malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch, NUTCH-650.patch, nutch-habase.patch, searching.diff, slash.patch
>
>
> This issue will track nutch/hbase integration

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

