db-derby-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Db-derby Wiki] Update of "Nirmal/GSoC 2010- My Proposal" by Nirmal
Date Sun, 04 Apr 2010 16:43:54 GMT

The "Nirmal/GSoC 2010- My Proposal" page has been changed by Nirmal.


New page:
#format wiki
#language en
== Google Summer of Code 2010 - Project Proposal ==

|| Project || Derby-4587 - Add tools for improved analysis and understanding of query plans and execution statistics ||
|| Student Name  || Nirmal Fernando ||
|| E-mail || nirmal070125@gmail.com ||
|| IM || nirmal070125 (Google Talk) ||

=== Abstract ===

Apache Derby, an Apache DB subproject, is an open source relational database implemented entirely
in Java and available under the Apache License, Version 2.0. Derby is based on Java, JDBC and
SQL. Quite frequently, users of Derby have trouble understanding
   * how the optimizer translates their query into a query plan
   * how much execution time and which resources each step of the query plan consumes
The objective of this project is to provide visual displays that help people understand how
their query is being run.

=== Description ===

Derby is known as a system that is easy to install, deploy and maintain in production environments.
The latest official release of Derby was released in 2009. Apache Derby (hereafter referred to
as Derby) has an active developer community whose members dedicate hours of their time to
improving Derby and fixing the bugs that users report. The Derby developer community is really
helpful and easy to work with.

Users of an open source project like Derby are the most important group: they are the ones who
take the software an extra mile beyond where it stands at any given moment. That is why the
developers pay great attention to the issues and improvements raised by users of Derby. This
project addresses one such improvement that users of Derby are expecting. Quite frequently,
they have trouble understanding the steps the optimizer follows when it translates a query
into a query plan, as well as details such as the execution time and resource utilization at
each step of the query plan. Derby already has low-level features that capture and record this
information, such as logQueryPlan and the XPLAIN tables. This project will therefore develop a
new tool that processes the query plan and execution statistics and presents them in a more
comprehensible fashion, i.e. as a graphical interpretation of the query.

After going through some possible ways of implementing this tool, which were suggested by my
mentor for this project, Mr. Bryan Pendleton, we arrived at the following idea, which displays
the query plan in a web browser.

Figure 1: Process view of the tool http://img682.imageshack.us/img682/9691/derbya.png

 * XSLT Style Sheet:
First I will study Derby to identify all the possible constructs that can occur in a query,
such as the name of a table, a sort, a hash join etc. Next I plan to write the template so
that it iterates over the steps followed when executing the query, looks up the construct that
occurred at each step, and displays its image and details (execution time, resource usage).
The image and details are obtained from the XML file emitted for the query plan.

 * Raw XML Document:
The XML document should contain a predefined set of tags for all the possible constructs of a
query mentioned above, and each tag (child) should contain sub-children that describe the
image and details. The document will be created at run time, in the order in which the query
is executed.

 * Next I will link the raw XML document and the XSL style sheet. This step is fairly easy; I
just need to provide a reference to the XSL style sheet inside the XML document.

 * After that, an XSLT-compliant browser will transform the XML document generated from the
query plan into an XHTML page, which will show the graphical query plan and its details.

 * We plan to approach the complete graphical query explainer incrementally, starting
from very basic queries.
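
The XML-plus-style-sheet pipeline described in the list above can be tried outside a browser with the XSLT processor that ships with the JDK. The following is only a minimal sketch under stated assumptions: the tag and attribute names (`plan`, `step`, `type`, `timeMs`, `rows`), the style sheet name `plan.xsl` and the image file naming are all hypothetical, since the real XML vocabulary for this project is not yet defined. The `xml-stylesheet` processing instruction is what a browser would follow to find the style sheet; here the style sheet is passed to the transformer explicitly.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PlanToXhtml {

    // Hypothetical raw XML for a two-step plan. The processing instruction links the
    // document to its style sheet for a browser; all names are illustrative only.
    public static final String PLAN_XML =
        "<?xml version=\"1.0\"?>\n"
      + "<?xml-stylesheet type=\"text/xsl\" href=\"plan.xsl\"?>\n"
      + "<plan>\n"
      + "  <step type=\"hash-join\" timeMs=\"12\" rows=\"40\"/>\n"
      + "  <step type=\"table-scan\" timeMs=\"7\" rows=\"1000\"/>\n"
      + "</plan>";

    // Hypothetical style sheet: one list item per step, with an image chosen by the
    // step type followed by the step's details.
    public static final String PLAN_XSL =
        "<xsl:stylesheet version=\"1.0\"\n"
      + "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n"
      + "  <xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>\n"
      + "  <xsl:template match=\"/plan\">\n"
      + "    <html xmlns=\"http://www.w3.org/1999/xhtml\"><body><ol>\n"
      + "      <xsl:for-each select=\"step\">\n"
      + "        <li>\n"
      + "          <img src=\"{@type}.png\" alt=\"{@type}\"/>\n"
      + "          <xsl:value-of select=\"concat(@type, ': ', @timeMs, ' ms, ', @rows, ' rows')\"/>\n"
      + "        </li>\n"
      + "      </xsl:for-each>\n"
      + "    </ol></body></html>\n"
      + "  </xsl:template>\n"
      + "</xsl:stylesheet>";

    /** Applies the given XSLT style sheet to the given XML and returns the result as text. */
    public static String render(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render(PLAN_XML, PLAN_XSL));
    }
}
```

Running the transformation in Java like this is a convenient way to write automated tests for the style sheet, while end users would simply open the XML file in an XSLT-compliant browser.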

== Things Done So Far ==

 * Checked out the Derby trunk from SVN
 * Set up my development and testing environment
 * Built Derby from source and got familiar with its folder structure and functionality
 * Searched for and went through the query plan explain features of other well-known open
source database management systems, such as PostgreSQL (with the PgAdmin tool) and MySQL.
PgAdmin provides a good graphical explanation of a query and supports more functionality than
MySQL. So I checked the possibility of using the PgAdmin tool with Derby, and found that
PgAdmin does not allow other database management systems to be used.
 * I have currently assigned myself to the issue Derby-4406 - Wrong order when using ORDER BY
on non-deterministic function [2], and hope to fix it soon.

== Deliverables ==

 * Test cases required to verify the functionality of the graphical query explain tool.
 * Code of the tool that emits the data in XML tagged format.
 * Code of the tool that formats XML data into visual information in a browser.
 * XSLT style sheet which templates the graphical output.
 * Documentation of new tools and screen-shots of the graphical output for sample queries.

== Development Schedule ==

 * 10th May 2010 – Finish reading the documentation and analysing the existing Derby code
base, paying particular attention to the areas relevant to the project.
 * 15th May 2010 – Finalize the process view after getting comments from the developer
community and from my mentor, and finalize the way a user will view the graphical explainer.
 * 23rd May 2010 – Identify all the possible constructs that can occur in a query and
finish developing the test cases required to test the tool.
 * 6th July 2010 – Build a tool (a small Java program using standard JDBC) which can
read the query execution data for a SQL statement from the Derby XPLAIN tables and emit the
data in XML tagged format. This involves specifying the XML schema for the data, writing the
program to produce the data, and building tests for it.
 * 12th July 2010 – Submit work for the mid-term evaluations.
 * 31st July 2010 – Build a tool to format the XML-formatted data into visual information
in a browser, using an XSLT style sheet. This involves conceptualizing the visual display,
designing and writing the XSLT style sheet, and building tests for it.
 * 9th August 2010 – Finish testing the new tools, create documentation for them and update
the Derby documentation.
 * 16th August 2010 – Submit work for the final evaluations.
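
The XML-emitting tool in the schedule above could start from something like the following sketch. It is an illustration under assumptions, not the planned implementation: the rows are passed in as plain maps so that the example runs without a Derby database, whereas the real tool would fill them from a JDBC query against the XPLAIN tables. The element names `plan` and `step` and the column names `operation` and `returnedRows` are hypothetical placeholders, not actual XPLAIN column names, and every column name must be a valid XML element name.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class PlanXmlWriter {

    /**
     * Emits one <step> element per row, in the order given. Each map entry becomes a
     * child element named after the column, with the column value as its text.
     */
    public static String toXml(List<Map<String, String>> rows) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element plan = doc.createElement("plan");
        doc.appendChild(plan);
        for (Map<String, String> row : rows) {
            Element step = doc.createElement("step");
            for (Map.Entry<String, String> column : row.entrySet()) {
                Element field = doc.createElement(column.getKey());
                field.setTextContent(column.getValue()); // special characters are escaped on output
                step.appendChild(field);
            }
            plan.appendChild(step);
        }
        // An identity transformer serializes the DOM tree to text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical row; a real tool would read this from a java.sql.ResultSet.
        Map<String, String> scan = new LinkedHashMap<>();
        scan.put("operation", "TABLESCAN");
        scan.put("returnedRows", "1000");
        System.out.println(toXml(List.of(scan)));
    }
}
```

Building the document through the DOM API rather than by string concatenation leaves escaping to the serializer, which matters once query text or table names containing characters such as `<` or `&` end up in the output.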

== About Me ==

I am Nirmal Fernando, a 3rd year undergraduate at the Department of Computer Science and
Engineering, University of Moratuwa, Sri Lanka. I am very competent in the Java programming
language, OOP, relational database schemas, SQL, XML, XSL, and data structures and algorithms.
I took a course module in Database Management Systems a few months back, which will help
me to understand the underlying concepts fairly easily.
I created an extension for OpenOffice.org as my level-3-semester-I project [3].

== Community Interaction ==

The Apache Derby developer community is the main community behind this project. I highly
appreciate the comments and ideas of the expert developers of Derby and consider them a great
opportunity to learn and to contribute more and more to the improvement of Derby.
I have already subscribed to the derby-dev mailing list and am getting continuous feedback
and ideas regarding the project as well as the other issues of Derby.

== Resources/References ==

[1] Project Idea on Apache JIRA: https://issues.apache.org/jira/browse/DERBY-4587

[2] Derby-4406-Involvements: https://issues.apache.org/jira/browse/DERBY-4406

[3] OpenOffice.org extension: http://extensions.services.openoffice.org/en/project/MatrixManipulator

'''Thank you !!'''
