calcite-dev mailing list archives

From Jesus Camacho Rodriguez <jcama...@apache.org>
Subject Re: Materialized view case sensitivity problem
Date Thu, 24 Aug 2017 19:52:17 GMT
I never hit this issue as we do not go through the JDBC adaptor when we
use the MV rewriting within Hive.

I am not familiar with that code path, but I guess no matter whether it is
MV or a table definition, we should end up doing the same wrt casing column
names, thus there should be no need for case insensitive comparison?

- Jesús



On 8/24/17, 12:19 PM, "Christian Beikov" <christian.beikov@gmail.com> wrote:

>I apparently had a different problem that led me to believe the view 
>was the problem. In fact, the actual query was the problem.
>
>So I have the query for the materialized view "select id as `id`, name 
>as `name` from document" and the query for the normal view "select 
>cast(_MAP['id'] AS bigint) AS `id`, cast(_MAP['name'] AS varchar(255)) 
>AS `name` from elasticsearch_raw.document_index".
>
>Now when I send the query "select id as col1, name as col2 from 
>document", the row type at first is "col1 bigint, col2 varchar(255)" and 
>later it becomes "ID bigint, NAME varchar(255)" which is, to a certain 
>extent, a good thing. The materialization logic determines it can 
>substitute the query, but during the substitution it compares that row 
>type with the one from the view. The Jdbc schema receives the columns in 
>upper case, which is why the row type of the sent query is in upper 
>case. Either the comparison should be case insensitive, or I simply 
>upper case the names of the columns in the view, which is what I did now.
>
>Doing that will unfortunately cause a little mismatch in the ES adapter 
>which expects that the field names have the same case as the fields of 
>the row type. This is why I adapted some rules to extract the correctly 
>cased field name from the _MAP expression.
>
>Now the question is, should the comparison be case insensitive or should 
>I rely on the fact that the JDBC schema will always have upper cased 
>column names?
>
>
>Best regards,
>------------------------------------------------------------------------
>*Christian Beikov*
>On 24.08.2017 at 21:00, Julian Hyde wrote:
>> Rather than "select id, name from document" could you create your view as
>> "select `id`, `name` from document" (or however the back-end system quotes
>> identifiers)? Then "id" would still be in lower-case when the JDBC adapter
>> queries the catalog.
>>
>>> On Aug 24, 2017, at 5:17 AM, Christian Beikov <christian.beikov@gmail.com> wrote:
>>>
>>> My main problem is the row type equality assertion in org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.mutable.MutableRel)
>>>
>>> Imagine I have a table "document" with columns "id" and "name". When the
>>> JdbcSchema reads the structure, it gets column names in upper case. Now I
>>> register a materialized view for a query like "select id, name from document".
>>> The materialized table for that view is in my case a view again defined like
>>> "select ... AS `id`, ... AS `name` from ...".
>>>
>>> The row type of my view correctly is "id, name". The row type of the table
>>> "document" is "ID, NAME" because the JdbcSchema gets upper cased names.
>>> Initially, the row type of the query for the materialized view is also
>>> correct, but during the "trim fields" phase the row type gets replaced with
>>> the types from the table. Is this replacement of field types even correct?
>>>
>>> Because of that, the assertion in the substitution visitor fails. What would
>>> be the appropriate solution for this mismatch?
>>>
>>>
>>> Best regards,
>>> ------------------------------------------------------------------------
>>> *Christian Beikov*
>>> On 24.08.2017 at 12:57, Julian Hyde wrote:
>>>> Or supply your own TableFactory? I'm not quite sure of your use case.
>>>> I've only tested cases where materialized views are "internal",
>>>> therefore they work fine with Calcite's default dialect.
>>>>
>>>> On Thu, Aug 24, 2017 at 3:21 AM, Christian Beikov
>>>> <christian.beikov@gmail.com> wrote:
>>>>> Actually, it seems the root cause is that the materialization uses the
>>>>> wrong configuration.
>>>>>
>>>>> org.apache.calcite.materialize.MaterializationService.DefaultTableFactory#createTable
>>>>> creates a new connection with the default configuration that does TO_UPPER.
>>>>> Would it be ok for it to receive a CalciteConnectionConfig?
>>>>>
>>>>>
>>>>> Best regards,
>>>>> ------------------------------------------------------------------------
>>>>> *Christian Beikov*
>>>>> On 24.08.2017 at 11:36, Christian Beikov wrote:
>>>>>> Seems org.apache.calcite.prepare.CalcitePrepareImpl#prepare2_ misses a
>>>>>> call to
>>>>>> org.apache.calcite.sql.parser.SqlParser.ConfigBuilder#setCaseSensitive
>>>>>> to configure the parser according to the LEX configuration. Is that a
>>>>>> bug or expected?
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> ------------------------------------------------------------------------
>>>>>> *Christian Beikov*
>>>>>> On 24.08.2017 at 11:24, Christian Beikov wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> I have configured Lex.MYSQL_ANSI but when a query gets parsed, the
>>>>>>> column names of select items are "to-upper-cased".
>>>>>>>
>>>>>>> I'm having problems with matching the row types of materialized views
>>>>>>> and the source sql because of that. Any idea how to fix that?
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Best regards,
>>>>>>> ------------------------------------------------------------------------
>>>>>>> *Christian Beikov*
>
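
The open question in this thread, whether the row type comparison during
substitution should ignore the case of field names, can be illustrated with a
minimal, self-contained Java sketch. It does not use Calcite's API:
RowTypeCompare, Field, and sameRowType are hypothetical names invented for
this illustration, and types are plain strings rather than Calcite types. It
only shows how "id, name" and "ID, NAME" differ under the two comparison modes.

```java
import java.util.List;

/** Hypothetical illustration only; not part of Calcite. */
public final class RowTypeCompare {

    /** Stand-in for one field of a row type: a name plus a type label. */
    public static final class Field {
        public final String name;
        public final String type;

        public Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    private RowTypeCompare() {}

    /**
     * Compares two row types field by field, in order.
     * Type labels must match exactly; field names are compared either
     * exactly or ignoring case, depending on caseSensitive.
     */
    public static boolean sameRowType(List<Field> a, List<Field> b,
                                      boolean caseSensitive) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            Field x = a.get(i);
            Field y = b.get(i);
            boolean namesMatch = caseSensitive
                    ? x.name.equals(y.name)
                    : x.name.equalsIgnoreCase(y.name);
            if (!namesMatch || !x.type.equals(y.type)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Row type of the view as described in the thread: lower-case names.
        List<Field> view = List.of(
                new Field("id", "BIGINT"), new Field("name", "VARCHAR(255)"));
        // Row type of the query after the JDBC schema supplies upper-case names.
        List<Field> query = List.of(
                new Field("ID", "BIGINT"), new Field("NAME", "VARCHAR(255)"));

        System.out.println(sameRowType(view, query, true));
        System.out.println(sameRowType(view, query, false));
    }
}
```

With a case-sensitive comparison the two row types are treated as different,
which corresponds to the failing assertion described above; ignoring case
makes them compare equal. Whether relaxing the comparison is the right fix,
as opposed to making both sides use the same casing, is exactly what the
thread leaves open.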

