lucene-solr-user mailing list archives

From "sangraal aiken" <sangr...@gmail.com>
Subject Re: Doc add limit
Date Mon, 31 Jul 2006 15:32:16 GMT
Those are some great ideas Chris... I'm going to try some of them out.  I'll
post the results when I get a chance to do more testing. Thanks.

At this point I can work around the problem by ignoring Solr's response but
this is obviously not ideal. I would feel better knowing what is causing the
issue as well.

-Sangraal



On 7/29/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>
> : Sure, the method that does all the work updating Solr is the
> : doUpdate(String s) method in the GanjaUpdate class I'm pasting below.
> : It's hanging when I try to read the response... the last output I
> : receive in my log is Got Reader...
>
> I don't have the means to try out this code right now ... but i can't see
> any obvious problems with it (there may be somewhere that you are opening
> a stream or reader and not closing it, but i didn't see one) ... i notice
> you are running this client on the same machine as Solr (hence the
> localhost URLs). Did you by any chance try running the client on a separate
> machine to see if the number of updates before it hangs changes?
>
> my money is still on a filehandle resource limit somewhere ... if you are
> running on a system that has "lsof" (on some Unix/Linux installations you
> need sudo/su root permissions to run it) you can use "lsof -p ####" to
> look up what files/network connections are open for a given process.  You
> can try running that on both the client pid and the Solr server pid once
> it's hung -- You'll probably see a lot of Jar files in use for both, but
> if you see more than a few XML files open by the client, or more than
> 1 TCP connection open by either the client or the server, there's your
> culprit.
>
> I'm not sure what Windows equivalent of lsof may exist.
>
> Wait ... i just had another thought....
>
> You are using InputStreamReader to deal with the InputStreams of your
> remote XML files -- but you aren't specifying a charset, so it's using
> your system default which may be different from the charset of the
> original XML files you are pulling from the URL -- which (i *think*)
> means that your InputStreamReader may in some cases fail to read all of
> the bytes of the stream, which might leave some dangling filehandles (i'm
> just guessing on that part ... i'm not actually sure what happens in that
> case).
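
A minimal sketch of the charset suggestion above, assuming the remote XML is UTF-8 (the pasted code already writes the file with getBytes("UTF8")). The class name CharsetReadSketch and the helper readUtf8 are hypothetical and not part of GanjaUpdate. StandardCharsets and try-with-resources need Java 7 or later; on an older JVM the charset name "UTF-8" can be passed to the InputStreamReader constructor instead. This only removes the platform default charset as a variable; it is not a confirmed fix for the hang.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class CharsetReadSketch {

    // Hypothetical helper: reads the whole stream as UTF-8 text instead of
    // relying on the platform default charset.
    static String readUtf8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream)
        // even if readLine() throws.
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for conn.getInputStream().
        byte[] xml = "<doc>caf\u00e9</doc>".getBytes(StandardCharsets.UTF_8);
        String text = readUtf8(new ByteArrayInputStream(xml));
        System.out.println(text.trim().equals("<doc>caf\u00e9</doc>"));
    }
}
```

In the pasted class, the same change would apply to each place that wraps conn.getInputStream() in an InputStreamReader.
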
>
> What if you simplify your code (for the purposes of testing) and just put
> the post-transform version ganja-full.xml in a big ass String variable in
> your java app and just call GanjaUpdate.doUpdate(bigAssString) over and
> over again ... does that cause the same problem?
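
The repeated-post test described above could be sketched roughly as follows. Everything here is an assumption for illustration: RepeatPostSketch and repeat are hypothetical names, the Function argument stands in for the private GanjaUpdate.doUpdate method, and the stub in main returns a placeholder string rather than a real Solr response. It needs Java 8 or later for the lambda. The point is only to log a running count so the iteration where a hang occurs is visible.

```java
import java.util.function.Function;

class RepeatPostSketch {

    // Calls `post` with the same payload `times` times and returns how many
    // calls returned normally. `post` is a stand-in for the real update call.
    static int repeat(Function<String, String> post, String payload, int times) {
        int completed = 0;
        for (int i = 1; i <= times; i++) {
            String result = post.apply(payload);
            completed++;
            // If the client hangs, the last line logged shows how far it got.
            System.out.println("update " + i + " returned "
                    + result.length() + " chars");
        }
        return completed;
    }

    public static void main(String[] args) {
        // Placeholder payload; in the real test this would be the
        // post-transform XML held in one String.
        String bigString = "<add><doc><field name=\"id\">1</field></doc></add>";
        // Stub poster; replace with a call into the real update code.
        int done = repeat(xml -> "stub-response", bigString, 5);
        System.out.println("completed " + done + " updates");
    }
}
```

If the count stops at the same iteration with a fixed in-memory payload, the download and transform steps can be ruled out as the cause.
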
>
>
> :
> : ----------
> :
> : package com.iceninetech.solr.update;
> :
> : import com.iceninetech.xml.XMLTransformer;
> :
> : import java.io.*;
> : import java.net.HttpURLConnection;
> : import java.net.URL;
> : import java.util.logging.Logger;
> :
> : public class GanjaUpdate {
> :
> :   private String updateSite = "";
> :   private String XSL_URL = "http://localhost:8080/xsl/ganja.xsl";
> :
> :   private static final File xmlStorageDir = new
> : File("/source/solr/xml-dls/");
> :
> :   final Logger log = Logger.getLogger(GanjaUpdate.class.getName());
> :
> :   public GanjaUpdate(String siteName) {
> :     this.updateSite = siteName;
> :     log.info("GanjaUpdate is primed and ready to update " + siteName);
> :   }
> :
> :   public void update() {
> :     StringWriter sw = new StringWriter();
> :
> :     try {
> :       // transform gawkerInput XML to SOLR update XML
> :       XMLTransformer transform = new XMLTransformer();
> :       log.info("About to transform ganjaInput XML to Solr Update XML");
> :       transform.transform(getXML(), sw, getXSL());
> :       log.info("Completed ganjaInput/SolrUpdate XML transform");
> :
> :       // Write transformed XML to Disk.
> :       File transformedXML = new File(xmlStorageDir, updateSite+".sml");
> :       FileWriter fw = new FileWriter(transformedXML);
> :       fw.write(sw.toString());
> :       fw.close();
> :
> :       // post to Solr
> :       log.info("About to update Solr for site " + updateSite);
> :       String result = this.doUpdate(sw.toString());
> :       log.info("Solr says: " + result);
> :       sw.close();
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :   }
> :
> :   public File getXML() {
> :     String XML_URL = "http://localhost:8080/" + updateSite + "/ganja-full.xml";
> :
> :     // check for file
> :     File localXML = new File(xmlStorageDir, updateSite + ".xml");
> :
> :     try {
> :       if (localXML.createNewFile() && localXML.canWrite()) {
> :         // open connection
> :         log.info("Downloading: " + XML_URL);
> :         URL url = new URL(XML_URL);
> :         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :         conn.setRequestMethod("GET");
> :
> :         // Read response to File
> :         log.info("Storing XML to File" + localXML.getCanonicalPath());
> :         FileOutputStream fos = new FileOutputStream(
> :             new File(xmlStorageDir, updateSite + ".xml"));
> :
> :         BufferedReader rd = new BufferedReader(new InputStreamReader(
> : conn.getInputStream()));
> :         String line;
> :         while ((line = rd.readLine()) != null) {
> :           line = line + '\n'; // add break after each line. It preserves formatting.
> :           fos.write(line.getBytes("UTF8"));
> :         }
> :
> :         // close connections
> :         rd.close();
> :         fos.close();
> :         conn.disconnect();
> :         log.info("Got the XML... File saved.");
> :       }
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :
> :     return localXML;
> :   }
> :
> :   public File getXSL() {
> :     StringBuffer retVal = new StringBuffer();
> :
> :     // check for file
> :     File localXSL = new File(xmlStorageDir, "ganja.xsl");
> :
> :     try {
> :       if (localXSL.createNewFile() && localXSL.canWrite()) {
> :         // open connection
> :         log.info("Downloading: " + XSL_URL);
> :         URL url = new URL(XSL_URL);
> :         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :         conn.setRequestMethod("GET");
> :         // Read response
> :         BufferedReader rd = new BufferedReader(new InputStreamReader(
> : conn.getInputStream()));
> :         String line;
> :         while ((line = rd.readLine()) != null) {
> :           line = line + '\n';
> :           retVal.append(line);
> :         }
> :         // close connections
> :         rd.close();
> :         conn.disconnect();
> :
> :         log.info("Got the XSLT.");
> :
> :         // output file
> :         log.info("Storing XSL to File" + localXSL.getCanonicalPath());
> :         FileOutputStream fos = new FileOutputStream(
> :             new File(xmlStorageDir, "ganja.xsl"));
> :         fos.write(retVal.toString().getBytes());
> :         fos.close();
> :         log.info("File saved.");
> :       }
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :     return localXSL;
> :   }
> :
> :   private String doUpdate(String sw) {
> :     StringBuffer updateResult = new StringBuffer();
> :     try {
> :       // open connection
> :       log.info("Connecting to and preparing to post to SolrUpdate servlet.");
> :       URL url = new URL("http://localhost:8080/update");
> :       HttpURLConnection conn = (HttpURLConnection) url.openConnection();
> :       conn.setRequestMethod("POST");
> :       conn.setRequestProperty("Content-Type",
> :           "application/octet-stream");
> :       conn.setDoOutput(true);
> :       conn.setDoInput(true);
> :       conn.setUseCaches(false);
> :
> :       // Write to server
> :       log.info("About to post to SolrUpdate servlet.");
> :       DataOutputStream output = new DataOutputStream(
> :           conn.getOutputStream());
> :       output.writeBytes(sw);
> :       output.flush();
> :       output.close();
> :       log.info("Finished posting to SolrUpdate servlet.");
> :
> :       // Read response
> :       log.info("Ready to read response.");
> :       BufferedReader rd = new BufferedReader(new InputStreamReader(
> : conn.getInputStream()));
> :       log.info("Got reader....");
> :       String line;
> :       while ((line = rd.readLine()) != null) {
> :         log.info("Writing to result...");
> :         updateResult.append(line);
> :       }
> :       rd.close();
> :
> :       // close connections
> :       conn.disconnect();
> :
> :       log.info("Done updating Solr for site" + updateSite);
> :     } catch (Exception e) {
> :       e.printStackTrace();
> :     }
> :
> :     return updateResult.toString();
> :   }
> : }
> :
> :
> : On 7/28/06, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> : >
> : >
> : > : I'm sure... it seems like solr is having trouble writing to a tomcat
> : > : response that's been inactive for a bit. It's only 30 seconds though,
> : > : so I'm not entirely sure why that would happen.
> : >
> : > but didn't you say you don't have this problem when you use curl --
> : > just your java client code?
> : >
> : > Did you try Yonik's python test client? or the java client in Jira?
> : >
> : > looking over the java client code you sent, it's not clear if you are
> : > reading the response back, or closing the connections ... can you post
> : > a more complete sample app that exhibits the problem for you?
> : >
> : >
> : >
> : > -Hoss
> : >
> : >
> :
>
>
>
> -Hoss
>
>
