xml-xindice-users mailing list archives

From "jcplerm" <jcpl...@ameritech.net>
Subject Re: XML-RPC-based enhancements
Date Tue, 05 Aug 2003 20:37:23 GMT
I forgot to include one important comment on the "windowed query" capability:

Without it, one of my major concerns has been to avoid the "out of memory error"
when running an XPath query on large document databases.
That type of exception occurs because Xindice always aggregates all qualifying
documents into a single DOM tree, before serializing it to String and
sending this String to the client.

With the "windowed" query, the server needs memory only for the number of
elements determined by the size of the window.
It just occurs to me now that MyQuery.java can be further optimized to serialize
each DOM object to a String as soon as it qualifies for the [N..N+size] window.
This way, the maximum amount of memory used is determined by the
largest object being searched on the server.
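
A minimal sketch of that idea, not the actual MyQuery.java change: each match is turned into text as soon as it falls inside the window, and iteration stops once the window is full. The class name WindowSerializer and the use of a plain Iterator<String> in place of Xindice's NodeSet are assumptions made only for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical helper: streams a [start, start+length) window into a result string. */
final class WindowSerializer {

    /**
     * @param matches already-serialized matches, standing in for a NodeSet
     * @param start   index of the first match to include (negative is treated as 0)
     * @param length  window size; values below 1 mean "no limit"
     */
    static String serializeWindow(Iterator<String> matches, int start, int length) {
        if (start < 0) start = 0;
        StringBuilder body = new StringBuilder();
        int count = 0;
        int curr = 0;
        while (matches.hasNext()) {
            // Stop as soon as the window is full; later matches are never touched.
            if (length >= 1 && curr >= start + length) break;
            String match = matches.next();
            if (curr >= start) {
                body.append(match);   // append the text right away, keep no DOM tree
                count++;
            }
            curr++;
        }
        return "<result count=\"" + count + "\">" + body + "</result>";
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("<a/>", "<b/>", "<c/>", "<d/>").iterator();
        System.out.println(serializeWindow(it, 1, 2));
    }
}
```

In this sketch the result string still grows with the window size, so the saving is only in not building one aggregate DOM tree first.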

There is always room for more enhancements, but at least with this 
implementation, one can be sure to be safe with respect to 
"out of memory" exceptions. No matter how large the potential match for
an XPath query, you will only get a slice of the total result set at a time.

jlerm
  ----- Original Message ----- 
  From: jcplerm 
  To: xindice-users@xml.apache.org 
  Sent: Tuesday, August 05, 2003 3:20 PM
  Subject: XML-RPC-based enhancements


  This is to report a few enhancements I implemented that I think are useful,
  especially in web environments, and I hope they can somehow become
  standard offerings.

  They are related to the following optimizations:

  1) Ability to do a "windowed" query on the server:
      Given an XPath query, return only elements N through N+size of the entire
      result set produced by that query.
      The elements are selected on the server, and only the qualifying ones are
      returned to the client.

  2) Ability to just check for existence:
      Given an XPath query, return just a true/false value indicating whether the
      XPath query found something or not.
      This implementation returns something like: <result>true</result>

  3) Ability to just obtain a count of the elements qualified by an XPath query
      (without any documents):
      Given an XPath query, return something like: <result count="N"/>

  4) Ability to query just document ids:
      Given an XPath query, return something like: 
                  <result>
                          <key>f0324bhj24bj23g4j</key>
                          <key>23i43468kj72b68b</key>
                  </result>

  5) Ability to query on attributes:
      For instance, if the query is: /person/@name
      Return something like:
                  <result>
                          <name>name1</name>
                          <name>name2</name>
                  </result>
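
As an illustration of how a web client might use the windowed query from item 1, here is a rough paging sketch. It assumes that the root element of each response carries a count attribute with the number of items in that window. The names PagingSketch, WindowedQuery, fetchAll and fakeServer are hypothetical, and the in-memory fake server only stands in for a real XML-RPC call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical paging loop over a windowed query; not part of Xindice. */
final class PagingSketch {

    /** Stand-in for a client call such as queryAsString(xpath, start, length). */
    interface WindowedQuery {
        String query(String xpath, int start, int length);
    }

    // Matches the count attribute on the <result> root element.
    private static final Pattern COUNT =
            Pattern.compile("<result\\b[^>]*\\bcount=\"(\\d+)\"");

    /** Reads the count attribute of the root element; returns 0 if it is absent. */
    static int countOf(String resultXml) {
        Matcher m = COUNT.matcher(resultXml);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }

    /** Fetches pages of pageSize items until a short page signals the end. */
    static List<String> fetchAll(WindowedQuery q, String xpath, int pageSize) {
        if (pageSize < 1) throw new IllegalArgumentException("pageSize must be >= 1");
        List<String> pages = new ArrayList<>();
        int start = 0;
        while (true) {
            String page = q.query(xpath, start, pageSize);
            int n = countOf(page);
            if (n > 0) pages.add(page);
            if (n < pageSize) break;      // last (possibly empty) page reached
            start += pageSize;
        }
        return pages;
    }

    /** In-memory fake server, used only to exercise the loop. */
    static WindowedQuery fakeServer(List<String> data) {
        return (xpath, start, length) -> {
            int end = Math.min(data.size(), start + length);
            StringBuilder body = new StringBuilder();
            for (int i = start; i < end; i++) body.append(data.get(i));
            return "<result count=\"" + Math.max(0, end - start) + "\">" + body + "</result>";
        };
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("<p>1</p>", "<p>2</p>", "<p>3</p>", "<p>4</p>", "<p>5</p>");
        for (String page : fetchAll(fakeServer(data), "/person", 2)) {
            System.out.println(page);
        }
    }
}
```

When the total is an exact multiple of the page size, the loop makes one extra request that returns an empty window; a count query (item 3) issued first would avoid that.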

  Below are two pieces of code: the first one ("MyXMLDBCollection.java") is the client
  code based on CollectionImpl.java that does XML-RPC invocations on the server, for the
  various types of calls listed above. The methods return content as String (which is
  enough for me, but can be modified to support other types). The content always comes
  back from the server as text, and this way I can process it directly with SAX without
  incurring the overhead of XMLResourceImpl, which seems to always parse the content into
  DOM before giving back String or SAX. This is client code, so no rebuild of the server
  is needed for this part.
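
To make the "process it directly with SAX" point concrete, here is a small sketch that reads the keys out of a response shaped like the one in item 4, using only the JDK's SAX parser. KeyResultParser and parseKeys are hypothetical names, not part of the code posted below, and the response shape is assumed to be <result><key>...</key></result>.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical client-side helper: extracts <key> values without building a DOM. */
final class KeyResultParser {

    static List<String> parseKeys(String resultXml) {
        final List<String> keys = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text;   // non-null only while inside a <key> element

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("key".equals(qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so accumulate.
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("key".equals(qName) && text != null) {
                    keys.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(resultXml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse result XML", e);
        }
        return keys;
    }

    public static void main(String[] args) {
        String xml = "<result><key>f0324bhj24bj23g4j</key><key>23i43468kj72b68b</key></result>";
        System.out.println(parseKeys(xml));
    }
}
```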

  The second piece ("MyQuery.java") is server code. You need to compile it and rebuild
  the server. Please note that the name of this class is referenced in the
  "runRemoteCommand" calls of MyXMLDBCollection.java.

  New parameters to identify different options (such as to query just for existence, just
  a count, just the keys, and the start/length of the result set) were defined in class
  MyQuery.java.

  I am sharing this for your review, and hopefully something like it will be incorporated
  into Xindice.

  The ability to filter content on the server, without downloading everything to the client,
  is essential for optimal performance. This implementation is not based on standards,
  but works well for me. Hopefully it will make it into the XML:DB standard, which
  apparently does not provide any way of doing what I'm describing in this message.

  I am already using the code and, as I said, it works for my own needs.
  But if its acceptance could be widened and somehow supported as standard, that
  would be great.

  Thanks,

  jlerm


  ============  section of MyXMLDBCollection.java ==========


      
      // Should return something like: <result count="N"/>
      public String queryCountAsString(String xpathQuery) throws XMLDBException {

          String result = null;

          try {
              checkOpen();

              Hashtable params = new Hashtable();
              Hashtable ns = new Hashtable();
              params.put(RPCDefaultMessage.COLLECTION, collPath);
              params.put(RPCDefaultMessage.TYPE, "XPath");
              params.put(RPCDefaultMessage.NAMESPACES, ns);
              params.put(RPCDefaultMessage.QUERY, xpathQuery);
              params.put(MyQuery.QUERY_COUNT, MyQuery.QUERY_COUNT);

              result = (String) runRemoteCommand("MyQuery", params);
          } catch (Exception e) {
              throw FaultCodes.createXMLDBException(FaultCodes.QRY_PROCESSING_ERROR, "Query error", e);
          }

          return result;
      }
      
      // Should return either <result>true</result> or <result>false</result>
      public String queryExistsAsString(String xpathQuery) throws XMLDBException {

          String result = null;

          try {
              checkOpen();

              Hashtable params = new Hashtable();
              Hashtable ns = new Hashtable();
              params.put(RPCDefaultMessage.COLLECTION, collPath);
              params.put(RPCDefaultMessage.TYPE, "XPath");
              params.put(RPCDefaultMessage.NAMESPACES, ns);
              params.put(RPCDefaultMessage.QUERY, xpathQuery);
              params.put(MyQuery.QUERY_EXISTS, MyQuery.QUERY_EXISTS);

              result = (String) runRemoteCommand("MyQuery", params);
          } catch (Exception e) {
              throw FaultCodes.createXMLDBException(FaultCodes.QRY_PROCESSING_ERROR, "Query error", e);
          }

          return result;
      }
      
      // Should return something like: <result><key>key1</key><key>key2</key></result>
      public String queryKeysAsString(String xpathQuery, int start, int length) throws XMLDBException {

          String result = null;

          try {
              checkOpen();

              Hashtable params = new Hashtable();
              Hashtable ns = new Hashtable();
              params.put(RPCDefaultMessage.COLLECTION, collPath);
              params.put(RPCDefaultMessage.TYPE, "XPath");
              params.put(RPCDefaultMessage.NAMESPACES, ns);
              params.put(RPCDefaultMessage.QUERY, xpathQuery);
              params.put(MyQuery.QUERY_KEYS, MyQuery.QUERY_KEYS);
              params.put(MyQuery.START, String.valueOf(start));
              params.put(MyQuery.LENGTH, String.valueOf(length));

              result = (String) runRemoteCommand("MyQuery", params);
          } catch (Exception e) {
              throw FaultCodes.createXMLDBException(FaultCodes.QRY_PROCESSING_ERROR, "Query error", e);
          }

          return result;
      }

         
      // Returns the matches in the window [start, start+length) wrapped in a <result> element
      public String queryAsString(String xpathQuery, int start, int length) throws XMLDBException {

          String result = null;

          //System.out.println("MyXMLDBCollection.queryAsString(): starting (xpathQuery=" + xpathQuery + ")");

          try {
              checkOpen();

              Hashtable params = new Hashtable();
              Hashtable ns = new Hashtable();
              params.put(RPCDefaultMessage.COLLECTION, collPath);
              params.put(RPCDefaultMessage.TYPE, "XPath");
              params.put(RPCDefaultMessage.NAMESPACES, ns);
              params.put(RPCDefaultMessage.QUERY, xpathQuery);
              params.put(MyQuery.START, String.valueOf(start));
              params.put(MyQuery.LENGTH, String.valueOf(length));

              result = (String) runRemoteCommand("MyQuery", params);
          } catch (Exception e) {
              throw FaultCodes.createXMLDBException(FaultCodes.QRY_PROCESSING_ERROR, "Query error", e);
          }

          //System.out.println("MyXMLDBCollection.queryAsString(): finished (result=" + result + ")");

          return result;
      }
      

  ============ end of MyXMLDBCollection.java ============


  =============  MyQuery.java ============
  package org.apache.xindice.server.rpc.messages;

  import org.apache.xindice.core.Collection;
  import org.apache.xindice.core.data.NodeSet;
  import org.apache.xindice.server.rpc.RPCDefaultMessage;
  import org.apache.xindice.xml.NamespaceMap;
  import org.apache.xindice.xml.TextWriter;
  import org.apache.xindice.xml.dom.DBNode;
  import org.apache.xindice.xml.dom.DocumentImpl;
  import org.apache.xindice.xml.dom.ElementImpl;
  import org.apache.xindice.core.data.Key;
  import org.apache.xindice.xml.NodeSource;

  import org.w3c.dom.Document;
  import org.w3c.dom.Element;
  import org.w3c.dom.Node;
  import org.w3c.dom.Text;
  import org.w3c.dom.Attr;

  import java.util.Enumeration;
  import java.util.Hashtable;

  /**
   * Executes a query against a document or collection
   */

  public class MyQuery extends Query {
   
      // Constants for new parameters
      public static final String QUERY_COUNT = "QUERY_COUNT";
      public static final String QUERY_EXISTS = "QUERY_EXISTS";
      public static final String QUERY_KEYS = "QUERY_KEYS";
      public static final String START = "START";
      public static final String LENGTH = "LENGTH";

      public Hashtable execute(Hashtable message) throws Exception {

          if (!message.containsKey(COLLECTION)) {
              throw new Exception(MISSING_COLLECTION_PARAM);
          }

          if (!message.containsKey(TYPE)) {
              throw new Exception(MISSING_TYPE_PARAM);
          }

          if (!message.containsKey(QUERY)) {
              throw new Exception(MISSING_QUERY_PARAM);
          }

          Collection col = getCollection((String) message.get(COLLECTION));
          NodeSet ns = null;

          if (message.containsKey(NAME)) {

              ns = col.queryDocument((String) message.get(TYPE), (String) message.get(QUERY),
                      mapNamespaces((Hashtable) message.get(NAMESPACES)), (String) message.get(NAME));

          } else {

              ns = col.queryCollection((String) message.get(TYPE), (String) message.get(QUERY),
                      mapNamespaces((Hashtable) message.get(NAMESPACES)));
          }

          // Get the start and length parameters, if any
          int start = 0;
          if (message.containsKey(START)) {
              String strStart = (String) message.get(START);
              start = Integer.parseInt(strStart);
          }
          // -1 means "no limit"
          int length = -1;
          if (message.containsKey(LENGTH)) {
              String strLength = (String) message.get(LENGTH);
              length = Integer.parseInt(strLength);
          }

          Hashtable result = new Hashtable();
          // Invoke the right query wrapper
          if (message.containsKey(QUERY_EXISTS)) {
              result.put(RESULT, existsQueryWrapper(ns));
          }
          else if (message.containsKey(QUERY_COUNT)) {
              result.put(RESULT, countQueryWrapper(ns));
          }
          else if (message.containsKey(QUERY_KEYS)) {
              result.put(RESULT, keysQueryWrapper(ns, start, length));
          }
          else {
              result.put(RESULT, queryWrapper(ns, start, length));
          }
          
          return result;
      }

      /**
       * Maps a Hashtable containing namespace definitions into a Xindice
       * NamespaceMap.
       *
       * @param namespaces
       * @return A NamespaceMap
       */
      protected NamespaceMap mapNamespaces(Hashtable namespaces) {
          NamespaceMap nsMap = null;
          // Tolerate clients that omit the NAMESPACES parameter entirely
          if (namespaces != null && namespaces.size() > 0) {
              nsMap = new NamespaceMap();

              Enumeration keys = namespaces.keys();
              while (keys.hasMoreElements()) {
                  String key = (String) keys.nextElement();
                  nsMap.setNamespace(key, (String) namespaces.get(key));
              }
          }

          return nsMap;
      }

      /**
       * Standard query wrapper
       */
      private String queryWrapper(NodeSet ns, int start, int length) throws Exception {

          // Make sure start makes sense
          if (start < 0) start = 0;

          // Turn the NodeSet into a document.
          DocumentImpl doc = new DocumentImpl();

          Element root = doc.createElement("result");
          doc.appendChild(root);
          int count = 0;
          int curr = 0;
          while (ns != null && ns.hasMoreNodes()) {
              Node n = ns.getNextNode();

              if (curr >= start && (length < 1 || curr < start + length)) {
                  // Within the desired window, so process node

                  if (n.getNodeType() == Node.DOCUMENT_NODE) {
                      n = ((Document) n).getDocumentElement();
                  }
                  if (n.getNodeType() != Node.ATTRIBUTE_NODE) {
                      // If it's an element, add it to the result

                      if (n instanceof DBNode) {
                          ((DBNode) n).expandSource();
                      }

                      root.appendChild(doc.importNode(n, true));
                  }
                  else {
                      // or transform an attribute into an element with the same name
                      Attr attr = (Attr) n;
                      String attrName = attr.getName();
                      String attrValue = attr.getValue();
                      Node newNode = doc.createElement(attrName);
                      Node newTextNode = doc.createTextNode(attrValue);
                      newNode.appendChild(newTextNode);
                      root.appendChild(newNode);
                  }
                  count++;
              }
              curr++;
          }

          root.setAttribute("count", Integer.toString(count));

          return TextWriter.toString(doc);
      }


      /**
       * Key query wrapper
       */
      private String keysQueryWrapper(NodeSet ns, int start, int length) throws Exception {

          // Make sure start makes sense
          if (start < 0) start = 0;

          // Turn the NodeSet into a document.
          DocumentImpl doc = new DocumentImpl();

          Element root = doc.createElement("result");
          doc.appendChild(root);
          int count = 0;
          int curr = 0;
          while (ns != null && ns.hasMoreNodes()) {
              Node n = ns.getNextNode();

              if (curr >= start && (length < 1 || curr < start + length)) {
                  // Within the desired window, so process node

                  if (n.getNodeType() == Node.DOCUMENT_NODE) {
                      n = ((Document) n).getDocumentElement();
                  }
                  // Only nodes that carry source information can yield a key;
                  // attributes and non-DBNode results are skipped instead of
                  // causing a ClassCastException.
                  if (n.getNodeType() != Node.ATTRIBUTE_NODE && n instanceof DBNode) {
                      DBNode dbNode = (DBNode) n;
                      dbNode.expandSource();

                      String key = null;
                      NodeSource nodeSource = dbNode.getSource();
                      if (nodeSource != null) {
                          Key k = nodeSource.getKey();
                          if (k != null) {
                              key = k.toString();
                          }
                      }

                      // Emit a <key> element only when a key was actually found,
                      // so that "count" matches the number of <key> children
                      if (key != null) {
                          Element newElement = doc.createElement("key");
                          Text newText = doc.createTextNode(key);
                          newElement.appendChild(newText);
                          root.appendChild(newElement);
                          count++;
                      }
                  }
              }
              curr++;
          }

          root.setAttribute("count", Integer.toString(count));

          return TextWriter.toString(doc);
      }


      /**
       * Count query wrapper
       */
      private String countQueryWrapper(NodeSet ns) throws Exception {

          int count = 0;
          while (ns != null && ns.hasMoreNodes()) {
              ns.getNextNode();   // advance; the node itself is not needed
              count++;
          }

          return "<result count=\"" + count + "\"/>";
      }

      /**
       * "Exists" query wrapper
       */
      private String existsQueryWrapper(NodeSet ns) throws Exception {

          String result = null;

          // A null NodeSet is treated as "no match", as in the other wrappers
          if (ns != null && ns.hasMoreNodes()) {
              result = "<result>true</result>";
          }
          else {
              result = "<result>false</result>";
          }

          return result;
      }

  }
  ================= end of MyQuery.java =========================