gora-dev mailing list archives

From "Nathan Gass (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-24) Throwing EOFException with MEDIUMBLOB type for inlinks column
Date Fri, 04 Jan 2013 11:26:13 GMT

    [ https://issues.apache.org/jira/browse/GORA-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543807#comment-13543807 ]

Nathan Gass commented on GORA-24:

The reported bug concerns jdbc-type="MEDIUMBLOB", a value that does not appear in Nutch's
default SQL mapping file.
Whether jdbc-type="MEDIUMBLOB" ought to work is not my decision, but a better error
message and some documentation of the supported jdbc-type values would indeed be nice from
a user's perspective.
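
To illustrate what a clearer error message could look like, here is a minimal sketch. It is
not Gora's actual code: the class JdbcTypeCheck and its resolve method are hypothetical names,
and checking against java.sql.JDBCType (available from Java 8 on) is an assumption about how
a jdbc-type attribute might be validated. MEDIUMBLOB is a MySQL column type, not a standard
JDBC type name, so the lookup fails and the message lists the names that would be accepted.

```java
import java.sql.JDBCType;
import java.util.Arrays;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Gora: validates a jdbc-type attribute value
// against the standard JDBC type names defined by java.sql.JDBCType.
public class JdbcTypeCheck {

    /** Returns the matching JDBCType or fails with a message listing valid names. */
    public static JDBCType resolve(String name) {
        try {
            // Enum lookup is case-sensitive, so normalize the attribute value first.
            return JDBCType.valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            String known = Arrays.stream(JDBCType.values())
                    .map(JDBCType::name)
                    .collect(Collectors.joining(", "));
            throw new IllegalArgumentException(
                    "Unsupported jdbc-type \"" + name
                            + "\"; standard JDBC type names are: " + known, e);
        }
    }

    public static void main(String[] args) {
        // BLOB is a standard JDBC type name and resolves normally.
        System.out.println(resolve("BLOB"));
        try {
            // MEDIUMBLOB is MySQL-specific, so this lookup is rejected.
            resolve("MEDIUMBLOB");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at mapping-load time with the list of accepted names would point the user at the
mapping file rather than at an EOFException raised later during a batch update.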

The data truncation exception in the comment is indeed a problem with Nutch, assuming that
Gora does *not* promise to support arbitrary-length data when no length is given. Two
issues track this: NUTCH-1490 and NUTCH-1497.

> Throwing EOFException with MEDIUMBLOB type for inlinks column
> -------------------------------------------------------------
>                 Key: GORA-24
>                 URL: https://issues.apache.org/jira/browse/GORA-24
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: storage-sql
>         Environment: MySQL
>            Reporter: Alexis
>             Fix For: 0.3
> I had an exception with DbUpdaterJob complaining that inlinks column of type BLOB in
> webpage table was not big enough to store all the incoming links. So I changed the column
> definition in gora-sql-mapping.xml from BLOB to MEDIUMBLOB:
>     <field name="inlinks" column="inlinks" jdbc-type="MEDIUMBLOB"/>
> Now I systematically get an exception in the update step:
> java.io.IOException: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
> 	at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:341)
> 	at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
> 	at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> Caused by: java.sql.BatchUpdateException: Error reading from InputStream java.io.EOFException
> 	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2020)
> 	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1451)
> 	at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:329)
> 	... 5 more

