drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <Mridul.Cho...@rbs.com.INVALID>
Subject Drill Join query optimization
Date Fri, 04 Nov 2016 08:53:07 GMT
Hello,

I am working on a proof of concept to optimize the performance of the join queries executed
through drill. The underlying storage is a NO-SQL based database - Mongo DB. The time it takes
to return the result of the join query is very high (46 seconds). Upon further analysis, as
per the physical plan of the query it is observed that both the left side (15 lakh records)
and right side table (13 lakh) are fully scanned and it takes 24 seconds and 20 seconds respectively
for the same.

Here is the query -

select ta.[SOME_COLUMN] from mongo.Test.TABLEA ta INNER JOIN mongo.Test.TABLEB ta ON ta.Id
= tb.Id and ta.Id ='123'

Records in table A : 15 Lakhs

Records in table B : 13 Lakhs

Filter condition : Id is indexed field in both the tables (ascending)

Drill Plan shows Hash-join being performed as shown in the plan (as attached above)

I have the following questions -

1)     Why does drill fetch all the records into the memory even though filter condition on
the indexed column was provided for one of the table ?

2)     If the Drill planner/optimizer is intelligent then records could have been filtered
on the second table as well based on the join condition (to reduce the dataset and hence result
in faster execution time)

3)     Are there any performance tuning options available to bring down the query response
time and generate a better query plan ?



I would appreciate if you can provide me pointers/directions or answers to the above questions.

Please feel free if you need any additional information.

Thanks,

Mridul





*********************************************************************************** 
The Royal Bank of Scotland plc. Registered in Scotland No 90312. 
Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. 
Authorised by the Prudential Regulation Authority and regulated 
by the Financial Conduct Authority and Prudential Regulation Authority. 
The Royal Bank of Scotland N.V. is authorised and regulated by the 
De Nederlandsche Bank and has its seat at Amsterdam, the 
Netherlands, and is registered in the Commercial Register under 
number 33002587. Registered Office: Gustav Mahlerlaan 350, 
Amsterdam, The Netherlands. The Royal Bank of Scotland N.V. and 
The Royal Bank of Scotland plc are authorised to act as agent for each 
other in certain jurisdictions. 
  
This e-mail message is confidential and for use by the addressee only. 
If the message is received by anyone other than the addressee, please 
return the message to the sender by replying to it and then delete the 
message from your computer. Internet e-mails are not necessarily 
secure. The Royal Bank of Scotland plc and The Royal Bank of Scotland 
N.V. including its affiliates ("RBS group") does not accept responsibility 
for changes made to this message after it was sent. For the protection
of RBS group and its clients and customers, and in compliance with
regulatory requirements, the contents of both incoming and outgoing
e-mail communications, which could include proprietary information and
Non-Public Personal Information, may be read by authorised persons
within RBS group other than the intended recipient(s). 

Whilst all reasonable care has been taken to avoid the transmission of 
viruses, it is the responsibility of the recipient to ensure that the onward 
transmission, opening or use of this message and any attachments will 
not adversely affect its systems or data. No responsibility is accepted 
by the RBS group in this regard and the recipient should carry out such 
virus and other checks as it considers appropriate. 

Visit our website at www.rbs.com 
***********************************************************************************  

Mime
  • Unnamed multipart/mixed (inline, None, 0 bytes)
View raw message