hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-23526) Beeline may throw the misleading exception
Date Thu, 04 Jun 2020 13:14:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23526?focusedWorklogId=441268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-441268 ]

ASF GitHub Bot logged work on HIVE-23526:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jun/20 13:13
            Start Date: 04/Jun/20 13:13
    Worklog Time Spent: 10m 
      Work Description: belugabehr commented on pull request #1029:
URL: https://github.com/apache/hive/pull/1029#issuecomment-638838456


   @dengzhhu653 By default (Hive 2.3 and earlier), beeline buffers all of the results before
displaying them to the user.
   
   With a query like `select * from a limit 500000`, if there are that many rows in the table,
it will have to buffer 500_000 rows and fit them all into 512MB of memory (plus all the
other stuff beeline stores).  And let's be honest, no human is going to look through that
many rows manually.  
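
   A rough sketch of the arithmetic behind that limit. It assumes "512MB" means 512 * 1024 * 1024 bytes and that the entire heap is free for row data, which overstates the real budget since beeline keeps other state there too. The function name is illustrative only.

```python
# Rough per-row memory budget when the whole result set is buffered.
# Assumptions: 512MB is read as 512 * 1024 * 1024 bytes, and all of it is
# available for rows (in practice other client state takes a share).
HEAP_BYTES = 512 * 1024 * 1024
ROW_COUNT = 500_000


def bytes_per_row(heap_bytes: int, row_count: int) -> int:
    """Return the whole number of bytes available per buffered row."""
    return heap_bytes // row_count


if __name__ == "__main__":
    # Roughly 1 KB per row, before any per-object overhead is counted.
    print(bytes_per_row(HEAP_BYTES, ROW_COUNT))
```

   Under these assumptions each row gets about 1 KB, so a table with wide rows can exhaust the heap well before all 500,000 rows are buffered.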
   
   You're better off disabling buffering, and streaming out the results to a CSV for further
processing:
   
   Something like:
   `beeline -u "jdbc:hive2://..." -e "select * from a limit 500000" --outputformat=csv2 --incremental=true > results.csv`
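
   The difference between buffered and incremental output can be illustrated with a small sketch. This is not Beeline's implementation: `fetch_rows` is a hypothetical stand-in for a server-side cursor, and the two writer functions only model the general idea of holding every row in memory versus writing each row as it arrives.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence, TextIO


def fetch_rows(n: int) -> Iterator[Sequence[object]]:
    """Hypothetical stand-in for a cursor: yields one row at a time."""
    for i in range(n):
        yield (i, f"name-{i}")


def write_buffered(rows: Iterable[Sequence[object]], out: TextIO) -> int:
    """Collect every row first, then write; memory grows with the row count."""
    buffered = list(rows)  # the entire result set is held in memory here
    csv.writer(out).writerows(buffered)
    return len(buffered)


def write_incremental(rows: Iterable[Sequence[object]], out: TextIO) -> int:
    """Write each row as it arrives; only one row is held at a time."""
    writer = csv.writer(out)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    sink = io.StringIO()
    written = write_incremental(fetch_rows(5), sink)
    print(written)
```

   Both functions produce the same CSV text; only the peak memory use differs, which is the reason to prefer the incremental path for large result sets.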
   
   https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=82903124#HiveServer2Clients-Separated-ValueOutputFormats
   
   https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions
   
   This is not a problem with Thrift.  Maybe Thrift should better handle the OOM Exception,
but that will have to be addressed in that project, not Hive.  As long as the OOM Exception
is being propagated up, and the Statement is closed (which I believe it is) then Beeline is
handling this appropriately.
   
   I just visually traced the trunk code and it looks like the OOM is being handled correctly.
Details should be printed with verbose logging enabled.  I think this method is a bit spaghetti
and needs some TLC, but it should be working.
   
   I am not in favor of this change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 441268)
    Time Spent: 4.5h  (was: 4h 20m)

> Beeline may throw the misleading exception
> ------------------------------------------
>
>                 Key: HIVE-23526
>                 URL: https://issues.apache.org/jira/browse/HIVE-23526
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>         Environment: Hive 1.2.2
>            Reporter: Zhihua Deng
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HIVE-23526.2.patch, HIVE-23526.3.patch, HIVE-23526.patch, outofsequence.log
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Sometimes we can see 'out of sequence response' message in beeline, for example:
> Error: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response (state=08S01,code=0)
> java.sql.SQLException: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response
> at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:198)
> at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:217)
> at org.apache.hive.beeline.Commands.execute(Commands.java:891)
> at org.apache.hive.beeline.Commands.sql(Commands.java:713)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:976)
> at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:816)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:774)
> at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:487)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:470)
> and there is no other message to help figure it out, even with --verbose. This makes the
> problem puzzling, as beeline does not have a concurrency problem on the underlying thrift transport.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
