drill-dev mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-1482) Tpch 3 over text files for SF100 causes some of the drillbits(JVM) to crash
Date Thu, 02 Oct 2014 17:25:33 GMT
Rahul Challapalli created DRILL-1482:

             Summary: Tpch 3 over text files for SF100 causes some of the drillbits(JVM) to crash
                 Key: DRILL-1482
                 URL: https://issues.apache.org/jira/browse/DRILL-1482
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
            Reporter: Rahul Challapalli


We created views on top of the TPC-H text data (SF 100) so that we would not need to modify
the original TPC-H queries. The views and the query are below.

create view nation as select cast(columns[0] as int) n_nationkey, columns[1] n_name, cast(columns[2]
as int) n_regionkey, columns[3] n_comment from `nation_text`;
create view region as select cast(columns[0] as int) r_regionkey, columns[1] r_name, columns[2]
r_comment from `region_text`;
create view part as select cast(columns[0] as int) p_partkey, columns[1] p_name, columns[2]
p_mfgr, columns[3] p_brand, columns[4] p_type, cast(columns[5] as int) p_size, columns[6]
p_container, cast(columns[7] as double) p_retailprice, columns[8] p_comment from `part_text`;
create view supplier as select cast(columns[0] as int) s_suppkey, columns[1] s_name, columns[2]
s_address, cast(columns[3] as int) s_nationkey, columns[4] s_phone, cast(columns[5] as double)
s_acctbal, columns[6] s_comment from `supplier_text`;
create view partsupp as select cast(columns[0] as int) ps_partkey, cast(columns[1] as int)
ps_suppkey, cast(columns[2] as int) ps_availqty, cast(columns[3] as double) ps_supplycost,
columns[4] ps_comment from `partsupp_text`;
create view customer as select cast(columns[0] as int) c_custkey, columns[1] c_name, columns[2]
c_address, cast(columns[3] as int) c_nationkey, columns[4] c_phone, cast(columns[5] as double)
c_acctbal, columns[6] c_mktsegment, columns[7] c_comment from `customer_text`;
create view orders as select cast(columns[0] as int) o_orderkey, cast(columns[1] as int) o_custkey,
columns[2] o_orderstatus, cast(columns[3] as double) o_totalprice, cast(columns[4] as date) o_orderdate,
columns[5] o_orderpriority, columns[6] o_clerk, cast(columns[7] as int) o_shippriority, columns[8]
o_comment from `orders_text`;
create view lineitem as select cast(columns[0] as int) l_orderkey, cast(columns[1] as int)
l_partkey, cast(columns[2] as int) l_suppkey, cast(columns[3] as int) l_linenumber, cast(columns[4]
as double) l_quantity, cast(columns[5] as double) l_extendedprice, cast(columns[6] as double)
l_discount, cast(columns[7] as double) l_tax, columns[8] l_returnflag, columns[9] l_linestatus,
cast(columns[10] as date) l_shipdate, cast(columns[11] as date) l_commitdate, cast(columns[12]
as date) l_receiptdate, columns[13] l_shipinstruct, columns[14] l_shipmode, columns[15] l_comment
from `lineitem_text`;
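
The view definitions above all follow one pattern: each positional entry of `columns` is given a name, and numeric or date fields are wrapped in a cast. As a rough sketch only, the statements could be generated from a column list like this. The helper name `make_view` and its argument layout are assumptions made for illustration and are not part of Drill or of the original report.

```python
# Hypothetical helper (not part of Drill): builds a CREATE VIEW statement
# over a delimited text table, casting selected positional columns.

def make_view(view_name, source_table, columns):
    """columns is a list of (column_name, sql_type_or_None) in file order."""
    select_items = []
    for index, (name, sql_type) in enumerate(columns):
        ref = f"columns[{index}]"
        if sql_type is not None:
            # Only typed columns get a cast; the rest stay as raw text.
            ref = f"cast({ref} as {sql_type})"
        select_items.append(f"{ref} {name}")
    return (
        f"create view {view_name} as select "
        + ", ".join(select_items)
        + f" from `{source_table}`;"
    )

# Reproduces the `region` view from the report as a single line.
region_ddl = make_view(
    "region",
    "region_text",
    [("r_regionkey", "int"), ("r_name", None), ("r_comment", None)],
)
print(region_ddl)
```

The generated text is the same statement as the `region` view above, without the line wrapping introduced by the email.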

-- TPCH Query 3

select
  l.l_orderkey,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
  o.o_orderdate,
  o.o_shippriority
from
  customer c,
  orders o,
  lineitem l
where
  c.c_mktsegment = 'HOUSEHOLD'
  and c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and o.o_orderdate < date '1995-03-25'
  and l.l_shipdate > date '1995-03-25'
group by
  l.l_orderkey,
  o.o_orderdate,
  o.o_shippriority
order by
  revenue desc,
  o.o_orderdate
limit 10;

The cluster has 8 drillbits running with DRILL_MAX_DIRECT_MEMORY="32G". After the query has
run for around 45 seconds, 3 of the 8 drillbits go down due to a JVM crash. I have attached
the log files. Let me know if you need more information.
