nutch-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: nutch 2.0 (trunk)
Date Tue, 07 Sep 2010 13:02:39 GMT
On 2010-09-07 14:50, Faruk Berksöz wrote:
> Dear all,
>
> When I try to fetch a web page (e.g.
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html ) with the MySQL
> storage definition,
> I see the following error in my Hadoop logs (there is no error with
> HBase):
>
> java.io.IOException: java.sql.BatchUpdateException: Data truncation:
> Data too long for column 'content' at row 1
>      at org.gora.sql.store.SqlStore.flush(SqlStore.java:316)
>      at org.gora.sql.store.SqlStore.close(SqlStore.java:163)
>      at
> org.gora.mapreduce.GoraOutputFormat$1.close(GoraOutputFormat.java:72)
>      at
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
>      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>      at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>
> The type of the column 'content' is BLOB.
> This may matter for future development of Gora.
> Should I file this in the Nutch JIRA, on GitHub for Gora, or not at all?
>
> Environment: Ubuntu 10.04
> JVM: 1.6.0_20
> Nutch 2.0 (trunk)
> MySQL / HBase (0.20.6) / Hadoop (0.20.2), pseudo-distributed

Yes, please create a JIRA issue. Thanks!
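
For whoever picks up the issue: one likely cause, not confirmed in this thread, is the size limit of the MySQL BLOB type, which holds at most 65,535 bytes. Fetched content larger than that would be rejected with exactly this "Data too long" message in strict SQL mode. The sketch below is only an illustration of the limits; the class and method names (BlobSizeCheck, smallestBlobType) are hypothetical and are not part of Gora or Nutch.

```java
// Hypothetical helper, not part of Gora or Nutch: picks the smallest
// MySQL binary column type that can hold a payload of a given size.
class BlobSizeCheck {
    // Maximum payload sizes in bytes, per the MySQL documentation.
    static final long TINYBLOB_MAX = 255L;           // 2^8  - 1
    static final long BLOB_MAX = 65535L;             // 2^16 - 1
    static final long MEDIUMBLOB_MAX = 16777215L;    // 2^24 - 1
    static final long LONGBLOB_MAX = 4294967295L;    // 2^32 - 1

    static String smallestBlobType(long contentLength) {
        if (contentLength < 0) {
            throw new IllegalArgumentException("negative length: " + contentLength);
        }
        if (contentLength <= TINYBLOB_MAX) return "TINYBLOB";
        if (contentLength <= BLOB_MAX) return "BLOB";
        if (contentLength <= MEDIUMBLOB_MAX) return "MEDIUMBLOB";
        if (contentLength <= LONGBLOB_MAX) return "LONGBLOB";
        throw new IllegalArgumentException("too large for any MySQL BLOB type: " + contentLength);
    }

    public static void main(String[] args) {
        // Example sizes are made up; they are not measurements of the page above.
        long[] sizes = {200L, 65535L, 65536L, 20000000L};
        for (long size : sizes) {
            System.out.println(size + " bytes -> " + smallestBlobType(size));
        }
    }
}
```

If this is the cause, a possible workaround until the mapping is fixed would be to widen the column by hand, for example to MEDIUMBLOB. That is a suggestion to verify against the actual page size, not a tested fix.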



-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

