drill-dev mailing list archives

From Aman Sinha <amansi...@gmail.com>
Subject Re: [DISCUSS] case insensitive storage plugin and workspaces names
Date Tue, 12 Jun 2018 15:53:13 GMT
Yes, that seems ok to me...since the plugin name and workspace are logical
entities and don't correspond to a path.
There could be compatibility issues if certain users have relied on the
case-sensitive names, but those would be temporary.

Aman

On Tue, Jun 12, 2018 at 8:35 AM, Arina Yelchiyeva <
arina.yelchiyeva@gmail.com> wrote:

> To make it clear we have three notions here: storage plugin name, workspace
> (schema) and table name (dfs.root.`/tmp/t`).
> My suggestion is the following:
> Storage plugin names to be case insensitive (DFS vs dfs, INFORMATION_SCHEMA
> vs information_schema).
> Workspace (schema) names to be case insensitive (ROOT vs root, TMP vs
> tmp). Even if a user has two directories, /TMP and /tmp, they can still
> create two workspaces for them, but the workspace names cannot differ only
> by case. For example, tmp and tmp_u.
> Table name case sensitivity is handled per plugin. For example, table
> names (views, tables) in the system plugins (information_schema, sys)
> should be case insensitive. Currently, sys table names are case
> insensitive, while information_schema table names are case sensitive. That
> needs to be synchronized. For file system plugins, table names must stay
> case sensitive, since a table name there is a directory / file name, and
> its case sensitivity depends on the file system.
>
> Kind regards,
> Arina
>
> On Tue, Jun 12, 2018 at 6:13 PM Aman Sinha <amansinha@gmail.com> wrote:
>
> > Drill is dependent on the underlying file system's case sensitivity. On
> > HDFS one can create 'hadoop fs -mkdir /tmp/TPCH' and /tmp/tpch, which are
> > separate directories.
> > These could be set as workspaces in Drill's storage plugin configuration,
> > and we would want the ability to query both. If we change the current
> > behavior, we would want some way, either back-quotes (`) or another
> > mechanism, to support that.
> >
> > RDBMSs seem to have vendor-specific behavior...
> > In MySQL [1] the database name and schema name are case-sensitive on
> > Linux and case-insensitive on Windows. Postgres, by contrast, converts
> > the database name and schema name to lower case by default, but one can
> > put the name in double quotes to make it case-sensitive [2].
> >
> > [1]
> > https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html
> > [2]
> > http://www.postgresqlforbeginners.com/2010/11/gotcha-case-sensitivity.html
> >
> >
> >
> > On Tue, Jun 12, 2018 at 5:01 AM, Arina Yelchiyeva <
> > arina.yelchiyeva@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Currently in Drill we treat storage plugin names and workspaces as
> > > case-sensitive [1].
> > > Names for storage plugins and workspaces are defined by the user, so we
> > > allow creating plugins DFS and dfs, and workspaces tmp and TMP.
> > > I suggest moving to a case-insensitive approach and not allowing two
> > > plugins / workspaces with the same name in different case, for at
> > > least the following reasons:
> > > 1. usually RDBMS schema and table names are case insensitive, and many
> > > users are used to this approach;
> > > 2. in Drill we have the INFORMATION_SCHEMA schema, which is in upper
> > > case, and sys, which is in lower case.
> > > Personally, I find this extremely inconvenient.
> > >
> > > Also we should consider making table names case insensitive for system
> > > schemas (info, sys).
> > >
> > > Any thoughts?
> > >
> > > [1] https://drill.apache.org/docs/lexical-structure/
> > >
> > >
> > > Kind regards,
> > > Arina
> > >
> >
>
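
The rule proposed in this thread (names match regardless of case, and two names that differ only by case cannot both be registered) can be illustrated with a minimal Java sketch. This is not Drill's actual plugin registry code: the class CaseInsensitiveRegistry and its methods fold, register and lookup are hypothetical names invented for illustration, and folding to lower case with Locale.ROOT is an assumption about how names might be normalized.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical registry for storage plugin or workspace names.
 * Illustrative only; not part of Apache Drill's API.
 */
final class CaseInsensitiveRegistry<V> {
    // Keys are stored in folded (lower-case) form so that DFS and dfs collide.
    private final Map<String, V> entries = new HashMap<>();

    // Assumption: fold with Locale.ROOT so the result does not depend on
    // the JVM's default locale.
    private static String fold(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    /** Registers a name; rejects a name that differs from an existing one only by case. */
    void register(String name, V value) {
        String key = fold(name);
        if (entries.containsKey(key)) {
            throw new IllegalArgumentException(
                "Name already in use (case-insensitive): " + name);
        }
        entries.put(key, value);
    }

    /** Looks up a name, ignoring case. */
    Optional<V> lookup(String name) {
        return Optional.ofNullable(entries.get(fold(name)));
    }
}
```

With such a registry, registering tmp for /tmp and tmp_u for /TMP would both succeed and a lookup of TMP would resolve to the tmp entry, while an attempt to register TMP after tmp would be rejected. Table names under a file system plugin would not go through this folding, since the thread proposes keeping them case sensitive.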
