trafodion-dev mailing list archives

From Hans Zeller <>
Subject RE: Tables left over from regression test runs
Date Wed, 02 May 2018 15:38:08 GMT

My two cents: +1 on Dave's suggestion to clean up the tests. I think some of these leftover
tables happen when people comment out the cleanup code for debugging and then accidentally
check that change into git.

About speeding up regressions: I really like Ming's idea (
) of storing multiple Trafodion tables in a single HBase table and think that this could potentially
speed things up by a lot.

One command that is particularly slow is "drop schema cascade". Is there a way to speed this
up, maybe by using a flavor of the "cleanup" command instead?



-----Original Message-----
From: Qifan Chen <> 
Sent: Wednesday, May 2, 2018 7:32 AM
Subject: Re: Tables left over from regression test runs

There is actually a case open against HBase asking for a quick way to disable a table before
dropping it:  The case has been open since 2011.

Given that creating and dropping a table is slow, it may be a good idea to promote table
reuse in general.  We have been doing this for most of the HIVE tests, which use HIVE tables
created during local Hadoop setup time.

To allow HBase table reuse, we may need to name these tables more precisely, such as sb_056_t1
for a table used in SEABASE/TEST056.
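
As an illustration of the naming idea above, here is a minimal Python sketch. The helper name,
the suite-to-prefix mapping, and the zero-padded three-digit test number are assumptions
inferred from the single example sb_056_t1; none of this is existing Trafodion tooling.

```python
# Hypothetical mapping from regression suite name to a short prefix.
# Only the "SEABASE" -> "sb" pair comes from the example in this thread.
SUITE_PREFIXES = {"SEABASE": "sb"}


def reusable_table_name(suite: str, test_number: int, local_name: str) -> str:
    """Build a name such as sb_056_t1 for table t1 used in SEABASE/TEST056."""
    prefix = SUITE_PREFIXES[suite.upper()]
    # Zero-pad the test number to three digits so names sort by test.
    return f"{prefix}_{test_number:03d}_{local_name.lower()}"


print(reusable_table_name("SEABASE", 56, "t1"))
```

A name built this way identifies the owning test at a glance, which is what would make reuse
across tests (and later pattern-based cleanup) practical.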

On memory used in the region servers, my understanding is that once a table is closed, the
memory taken by that table is subject to GC.

Thanks --Qifan

From: Sean Broeder <>
Sent: Wednesday, May 2, 2018 8:16:07 AM
Subject: RE: Tables left over from regression test runs

To accomplish what Dave is seeking, it seems the tables should at least be disabled.
Then, if you really want to go back and look at the contents, you could re-enable the tables,
but in the meantime the extra memory would be freed up in the region server.

If the tables have a common name for a given test, then you might be able to leverage a pattern
match with a disable_all command and disable them all in a single statement at the end of
the test.
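
A rough Python sketch of the pattern-match idea: given a per-test prefix, it previews which
tables a pattern would cover and builds the text of an HBase shell disable_all command, which
accepts a regular expression. The prefix "sb_056_" and the table list are made up for
illustration, and the real HBase names of Trafodion tables may carry catalog and schema
qualifiers, so the pattern would need adjusting for an actual instance.

```python
import re


def tables_for_test(all_tables, prefix):
    """Return the tables whose full name starts with the given test prefix."""
    pattern = re.compile(re.escape(prefix) + r".*")
    return [name for name in all_tables if pattern.fullmatch(name)]


def disable_all_command(prefix):
    """Build the HBase shell command text that disables every matching table."""
    return f"disable_all '{re.escape(prefix)}.*'"


# Hypothetical table names, following the sb_056_t1 style suggested earlier.
tables = ["sb_056_t1", "sb_056_t2", "sb_057_t1"]
print(tables_for_test(tables, "sb_056_"))
print(disable_all_command("sb_056_"))
```

Previewing the match in Python first is only a convenience; the HBase shell evaluates the
pattern itself, and for a simple prefix like this one the two regex dialects should agree.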


-----Original Message-----
From: Sandhya Sundaresan <>
Sent: Tuesday, May 1, 2018 9:14 PM
Subject: Re: Tables left over from regression test runs

Agree that each test needs to be a "good citizen" and clean up all its tables. Some tests have
the "-noCleanup" option that skips the final cleanup step. That's a really useful step to have
in every test if possible. But in some cases tables are created and dropped mid-test too.
For those there is no choice but to modify the test if any kind of debugging needs to be done
that needs the tables to stay around.

Thanks for looking into these,  Dave.


From: Dave Birdsall <>
Sent: Tuesday, May 1, 2018 5:03:00 PM
Subject: RE: Tables left over from regression test runs


Regarding why stopping hbase takes a long time: I was watching the HBase log today while doing
a swstophbase. It was doing individual region closes on each table. It took a long time to
get through all of them. Of course, one can always just kill the HMaster process (I sometimes
do this) but that sometimes results in not being able to bring the instance up again, with
loss of any working data. So that's risky.

Regarding time to drop tables: I'm noticing that many of the tests that don't drop tables
at the end do so at the beginning. If they are run on a clean instance, that's fast (because
it fails fast or it has "drop if exists"). If they are run on an instance where they have
been run before, we pay the cost of dropping the table anyway. Agreed, for Jenkins it's better
because we just throw the instance away after one run. For developers who are keeping test
tables around, it's not so good.

Regarding the convenience of having objects around when there's a need to debug something:
I've been unlucky at this. Almost always, the particular object I need is in a test that cleans
up its objects. So I end up having to recreate it from a stripped down version of the test
script. I suspect this is true more often than not. So I haven't found this particular argument

Regarding speeding up HBase drop: Yes, that would be a great idea.


-----Original Message-----
From: Anoop Sharma <>
Sent: Tuesday, May 1, 2018 4:51 PM
Subject: RE: Tables left over from regression test runs

yes, it is true that some tests do not drop all the tables that are created as part of that
This is not always intentional and at times it is because one missed cleaning them up.

But there are some advantages of not dropping tables at the end of a test run.

- dropping hbase tables takes a non-trivial amount of time.  dropping all tables will increase
the time it takes to run a test.
  This will also impact Jenkins as it runs tests after init traf which cleans up everything
- is there a way to make dropping a table or a whole schema faster? Using concurrent
drops? Or drop without disable (disable is where most of the time is spent due to mem flush)?
There is an hbase jira on the drop issue but no one has volunteered to fix it.
- some tables are permanent (like from QAT) that should not be cleaned up
- many tests drop tables at the beginning of the test or have an 'if not exists' clause.
- one advantage of not dropping a table at the end is that sometimes an issue could be diagnosed
without having to recreate the table and associated dependent objects.
- if the only objects on a dev instance are regression tests, then doing ilh_trafinit will
be much faster to clean up everything after full regressions.
  But this would also nuke any non-regression traf objects so one need to be careful about
- should we also find out why stopping hbase takes a long time? Is there something that can
be done to 'stop abrupt' on dev platform?
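
To make the "concurrent drops" question in the list above concrete, here is a minimal Python
sketch that issues drops from a small thread pool. The drop_one callable is hypothetical (it
stands in for whatever actually drops a single table), the worker count is arbitrary, and
whether this helps at all depends on where the time really goes on the HBase side.

```python
from concurrent.futures import ThreadPoolExecutor


def drop_concurrently(table_names, drop_one, max_workers=4):
    """Call drop_one(name) for every table, several at a time.

    Results are returned in the same order as table_names. Any exception
    raised by drop_one is re-raised when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(drop_one, table_names))


# Stand-in for a real drop; it only reports what it would have dropped.
def fake_drop(name):
    return f"dropped {name}"


print(drop_concurrently(["t1", "t2", "t3"], fake_drop))
```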


-----Original Message-----
From: Dave Birdsall <>
Sent: Tuesday, May 1, 2018 3:57 PM
Subject: Tables left over from regression test runs


I've noticed after running full regressions that there are a boatload of tables that don't
get cleaned up.

These tables occupy regions in our instance's region server and I think may cause excessive
memory usage and/or increasingly long times when stopping HBase.

So, I'm thinking about cleaning up some of our regression tests to drop these tables when
they finish.

Does anyone object to this? Or is there some pressing need to keep any of these tables around
after regressions complete?


