manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wright <>
Subject Re: Crawling MySQL with latest MySQL connector fails
Date Mon, 20 Aug 2012 13:53:25 GMT
I checked in a fix for this ticket - please let me know if it allows
you to successfully crawl MySQL database tables.


On Mon, Aug 20, 2012 at 8:10 AM, Karl Wright <> wrote:
> There's some online chatter about this.  Apparently the JDBC 4.0
> specification was clarified in this regard, and MySQL's implementation
> follows the clarified meaning.  The recommended approach during this
> transition period is to allow the user to select which method they
> want to use to get the column name.  See CONNECTORS-509.
> Karl
> On Mon, Aug 20, 2012 at 8:00 AM, Karl Wright <> wrote:
>> Here's some additional info.
>> The JDBC class "ResultSetMetaData" has two methods: getColumnName(),
>> and getColumnLabel().  For all supported databases, getColumnName()
>> returns the right thing, EXCEPT for MySQL, where you have to use
>> getColumnLabel() instead.
>> I've abstracted the logic that does this in the main database classes
>> that underlie the framework, but for the JDBC Connector I tried to
>> make the connector be independent of any special logic, and instead
>> make it the responsibility of the query writer to know how to work
>> with their database.  Unfortunately, that strategy is failing when it
>> comes to MySQL because the JDBC driver is implemented in a way that is
>> inconsistent with the specification.
>> So, we have two ways forward:
>> (1) Change the logic to use getColumnLabel() always.  If we do this,
>> it will be necessary to test the JDBC connector against a PostgreSQL,
>> MySQL, and MSSQL database before we know what the effects are.  It is
>> possible everything will just work, but it is also possible that such
>> a change would break other people's jobs, and that would be no good.
>> (2) Try to conditionalize the logic so that only for MySQL is
>> getColumnLabel() used.  This is less risky but results in messy code
>> that sooner or later would become unmaintainable.
>> Karl
>> On Mon, Aug 20, 2012 at 6:22 AM, Karl Wright <> wrote:
>>> Hi Shigeki,
>>> This is critical functionality for ManifoldCF.  Quite a lot of
>>> ManifoldCF stuff won't work on MySQL if this is broken - not just
>>> crawling using the JDBC connector.  Are you successfully crawling with
>>> MySQL as the back-end?  If you are, that means that there is a way to
>>> do this right but the JDBC connector is not using it.
>>> I am testing with MySQL JDBC connector 5.1.18 here, which would
>>> indicate that that is the case.
>>> Could you open a ticket describing the problem, and I will look into
>>> this in some detail tonight?  Thanks,
>>> Karl
>>> On Mon, Aug 20, 2012 at 4:21 AM, Shigeki Kobayashi
>>> <> wrote:
>>>> Hi guys.
>>>> I am not sure if everyone has already noticed this, but this is to share
>>>> experimental fact of using MySQL connectors to crawl MySQL data.
>>>> Using "AS" in Select queries in SeedQuery and DataQuery causes error
>>>> depending on versions of MySQL connectors.
>>>> Env:
>>>> - ManifoldCF0.5
>>>> - Solr3.6
>>>> - MySQL5.5
>>>> Example:
>>>>  SeedQuery´╝ÜSELECT idfield AS $(IDCOLUMN) FROM documenttable
>>>> Error Message:
>>>>   Bad seed query; doesn't return $(IDCOLUMN) column. Try using quotes around
>>>> $(IDCOLUMN) variable, e.g. "$(IDCOLUMN)".
>>>> Cause of Error:
>>>>  MySQL connecors of over version 5.1 seem to have a bug that causes error
>>>> when you use "AS" in Select to put an alias for a column.
>>>> Versions of MySQL Connector:
>>>>  mysql-connector-java-5.0.8.jar  -> OK
>>>>  mysql-connector-java-5.1.18.jar -> No Good
>>>>  mysql-connector-java-5.1.21.jar -> No Good
>>>> Exception:
>>>> Using function  (e.g. sysdate() as) or fixed strings  (e.g. " fixed string"
>>>> as) followed by "as" does not cause error.
>>>> Regards,
>>>> Shigeki

View raw message