drill-user mailing list archives

From Reed Villanueva <rvillanu...@ucera.org>
Subject Drill “VALIDATION ERROR: A table or view with given name already exists in schema” for empty directory
Date Tue, 04 Dec 2018 20:50:15 GMT
After upgrading Drill on our cluster to drill-1.12.0-mapr, I have been testing
our daily ETL scripts (which all use Drill to convert parquet files to
tsv). A validation error ("*table or view with given name already exists*")
is always thrown when running a `CREATE TABLE` statement on some
empty directories in a writable workspace.


    [Error Id: 6ea46737-8b6a-4887-a671-4bddbea02476 on
mapr002.ucera.local:31010]
    at
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
    at
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
    :
    :
    :
    Caused by: org.apache.drill.common.exceptions.UserRemoteException:
VALIDATION ERROR: A table or view with given name
[/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists
in schema [dfs.etl_internal]


After some brief debugging, I see that the directory in question under the
workspace (i.e. /internal_etl/project/version-2/stages/storage/ACCOUNT/tsv)
*is in fact empty*, yet the statement still throws this error.
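
For anyone wanting to double-check the "empty" claim, here is a minimal sketch of one way to list every entry in a directory, including names starting with `.` or `_` that some tools hide by default. It assumes the cluster file system is reachable as a local path (for example through an NFS mount), which is an assumption and not something stated above; the helper names `list_all_entries` and `hidden_entries` are hypothetical.

```python
import os
import tempfile


def list_all_entries(directory):
    """Return every entry name in `directory`, sorted, including hidden ones."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries)


def hidden_entries(directory):
    """Return only the entries whose names start with '.' or '_'."""
    return [name for name in list_all_entries(directory)
            if name.startswith((".", "_"))]


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than the real workspace path.
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, ".hidden_marker"), "w").close()
        print(list_all_entries(demo_dir))
        print(hidden_entries(demo_dir))
```

If `list_all_entries` returns an empty list for the tsv directory, the directory really has no entries at the file-system level, and the cause has to be looked for elsewhere.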

Looking up the error in the drillbit.log file on the node named in
the error message above, we see:

    2018-12-04 10:13:25,285 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id
23f92019-db56-862f-e7b9-cd51b3e174ae: create table
dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv`
as
    select <a bunch of fields>
    from
dfs.etl_internal.`/internal_etl/project/version-2/stages/storage/ACCOUNT/parquet`
    2018-12-04 10:13:25,406 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,408 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,893 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,894 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,898 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took
0 ms, numFiles: 1
    2018-12-04 10:13:25,905 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.a.d.e.p.s.h.CreateTableHandler - User Error Occurred: A table or
view with given name
[/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists
in schema [dfs.etl_internal]
    org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: A
table or view with given name
[/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv] already exists
in schema [dfs.etl_internal]


    [Error Id: 45177abc-7e9f-4678-959f-f9e0e38bc564 ]
    at
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
~[drill-common-1.12.0-mapr.jar:1.12.0-mapr]
    at
org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.checkTableCreationPossibility(CreateTableHandler.java:326)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at
org.apache.drill.exec.planner.sql.handlers.CreateTableHandler.getPlan(CreateTableHandler.java:90)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:567)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264)
[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
    at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[na:1.8.0_151]
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[na:1.8.0_151]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
    2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.apache.drill.exec.work.WorkManager - Waiting for 0 queries to
complete before shutting down
    2018-12-04 10:13:25,924 [23f92019-db56-862f-e7b9-cd51b3e174ae:foreman]
INFO  o.apache.drill.exec.work.WorkManager - Waiting for 0 running
fragments to complete before shutting down

This error occurs even when running `DROP TABLE [IF EXISTS]
<workspace>.<table path name>` before the `CREATE TABLE` statement.
Furthermore, the configuration of the dfs workspace itself does not
appear to have changed since before the upgrade to drill-1.12; see below:

    :
    :
    "workspaces": {
      "root": {
        "location": "/",
        "writable": false,
        "defaultInputFormat": null,
        "allowAccessOutsideWorkspace": false
      },
      "tmp": {
        "location": "/tmp",
        "writable": true,
        "defaultInputFormat": null,
        "allowAccessOutsideWorkspace": false
      },
      "etl_internal": {
        "location": "/etl/internal",
        "writable": true,
        "defaultInputFormat": null,
        "allowAccessOutsideWorkspace": false
      }
    },
    :
    :
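
To make explicit which file-system directory the table name maps to under this configuration, here is a rough sketch. It assumes that table names are resolved relative to the workspace `location` and that a leading slash in the table name does not escape the workspace; both are assumptions about Drill's behavior, not something confirmed above. The function name `resolve_table_path` is hypothetical.

```python
import posixpath


def resolve_table_path(workspace_location, table_name):
    """Join a workspace location and a table name as a relative lookup would.

    Assumption: a leading slash in the table name is treated as relative
    to the workspace location rather than to the file-system root.
    """
    relative = table_name.lstrip("/")
    return posixpath.normpath(posixpath.join(workspace_location, relative))


if __name__ == "__main__":
    print(resolve_table_path(
        "/etl/internal",
        "/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv",
    ))
```

Under those assumptions, the directory to inspect for leftover files is the joined path, not the table name taken on its own.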

Note that the full process in question is intended to `mv` the directory
contents every day and then `CREATE TABLE` with new data from the current
day (in case that makes a difference). This process had been working fine
when we were using drill-1.11.
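
For reference, the daily process described above could be sketched roughly as follows. This is an illustration, not our actual script: the helper names `move_contents` and `build_statements` are hypothetical, the paths are assumed to be reachable as local paths, and the statements are only built as strings, not sent to Drill.

```python
import os
import shutil
import tempfile


def move_contents(src_dir, dest_dir):
    """Move every entry in src_dir into dest_dir, leaving src_dir in place."""
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        shutil.move(os.path.join(src_dir, name), os.path.join(dest_dir, name))
        moved.append(name)
    return moved


def build_statements(workspace, target_path, source_path, select_list="*"):
    """Return the DROP and CREATE TABLE statements as strings (not executed)."""
    target = f"{workspace}.`{target_path}`"
    source = f"{workspace}.`{source_path}`"
    return [
        f"DROP TABLE IF EXISTS {target}",
        f"CREATE TABLE {target} AS SELECT {select_list} FROM {source}",
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        tsv_dir = os.path.join(root, "tsv")
        os.makedirs(tsv_dir)
        open(os.path.join(tsv_dir, "0_0_0.tsv"), "w").close()
        print(move_contents(tsv_dir, os.path.join(root, "archive")))
    for statement in build_statements(
        "dfs.etl_internal",
        "/internal_etl/project/version-2/stages/storage/ACCOUNT/tsv",
        "/internal_etl/project/version-2/stages/storage/ACCOUNT/parquet",
    ):
        print(statement)
```

One difference worth noting: `os.listdir` also returns dot-prefixed entries, whereas a shell glob such as `mv dir/*` skips them by default, so the two ways of emptying a directory are not always equivalent.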

If anyone with more experience using drill knows what could be happening
here, any opinions or advice would be appreciated.
