spark-issues mailing list archives

From "angerszhu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-28466) FileSystem closed error when to call Hive.moveFile
Date Mon, 22 Jul 2019 01:59:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-28466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

angerszhu updated SPARK-28466:
------------------------------
    Attachment: image-2019-07-22-09-58-55-107.png

>  FileSystem closed error when to call Hive.moveFile
> ---------------------------------------------------
>
>                 Key: SPARK-28466
>                 URL: https://issues.apache.org/jira/browse/SPARK-28466
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.4.0, 3.0.0
>            Reporter: angerszhu
>            Priority: Major
>         Attachments: image-2019-07-22-09-58-19-023.png, image-2019-07-22-09-58-55-107.png
>
>
> When an STS (Spark Thrift Server) session that has run an INSERT-style statement is closed,
> a later CTAS/INSERT in another session triggers Hive.moveFile, where DFSClient runs checkOpen
> and throws java.io.IOException: Filesystem closed.
> **Root cause**:
> The first CTAS/INSERT statement calls Hive.moveFile. That method initializes the field
> SessionState.hdfsEncryptionShim, and initializing this field in turn creates a FileSystem (FS).
> !https://user-images.githubusercontent.com/46485123/61587025-45802600-abb4-11e9-9926-6817f52490a3.png!
> However, this FS is created under the current session's HiveSessionImplwithUGI.sessionUgi.
> Closing that session calls `FileSystem.closeAllForUGI()`, which closes the FS above. When
> another session then runs CTAS/INSERT, it fails with this error because the FS has already
> been closed.
> Why HiveServer2 does not hit this problem:
> - In HiveServer2, each session has its own SessionState, so closing the current session's
> FS is safe.
> - In SparkThriftServer, all sessions interact with Hive through a single HiveClientImpl, which
> has only one SessionState. Every call through HiveClientImpl first goes through **withHiveState**,
> which sets HiveClientImpl's SessionState on the current thread, so the FS held by that
> SessionState is shared by every session.
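
The difference between the two servers can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: ToyFileSystem, ToyFsCache, ToySessionState and SharedSessionStateDemo are hypothetical names invented for illustration and are not Hadoop, Hive or Spark classes. The model assumes only what the description above states: the session state creates its FS lazily on the first move and keeps it, and closing a session closes every FS created for that session's user.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: every class below is hypothetical, not a Hadoop/Hive/Spark API.
class SharedSessionStateDemo {

    /** Stand-in for a file system handle that fails once it has been closed. */
    static final class ToyFileSystem {
        private boolean open = true;

        void close() {
            open = false;
        }

        // Same idea as the checkOpen step in the report: reject calls on a closed handle.
        void rename(String src, String dst) throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
        }
    }

    /** Stand-in for a file system cache keyed by the session user. */
    static final class ToyFsCache {
        private final Map<String, ToyFileSystem> byUser = new HashMap<>();

        ToyFileSystem get(String user) {
            return byUser.computeIfAbsent(user, u -> new ToyFileSystem());
        }

        // Called when a session closes: closes the handle created for that user.
        void closeAllForUser(String user) {
            ToyFileSystem fs = byUser.remove(user);
            if (fs != null) {
                fs.close();
            }
        }
    }

    /** Stand-in for a session state that creates its FS on the first move and keeps it. */
    static final class ToySessionState {
        private ToyFileSystem shimFs;

        void moveFile(ToyFsCache cache, String currentUser, String src, String dst)
                throws IOException {
            if (shimFs == null) {
                shimFs = cache.get(currentUser); // bound to whichever user ran first
            }
            shimFs.rename(src, dst);
        }
    }

    /** Returns "ok" if session B's move succeeds, otherwise the exception message. */
    static String run(boolean sharedState) {
        ToyFsCache cache = new ToyFsCache();
        ToySessionState stateA = new ToySessionState();
        // sharedState = true models one state for all sessions; false models one per session.
        ToySessionState stateB = sharedState ? stateA : new ToySessionState();
        try {
            stateA.moveFile(cache, "userA", "/tmp/a", "/warehouse/a"); // session A: INSERT
            cache.closeAllForUser("userA");                            // session A closes
            stateB.moveFile(cache, "userB", "/tmp/b", "/warehouse/b"); // session B: INSERT
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared state:      " + run(true));
        System.out.println("per-session state: " + run(false));
    }
}
```

In this model, run(true) returns "Filesystem closed" because session B reuses the handle that was closed together with session A, while run(false) returns "ok" because each session state creates its own handle.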



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

