lucene-java-user mailing list archives

From Ankit Murarka <ankit.mura...@rancoretech.com>
Subject Re: Stream Closed Exception and Lock Obtain Failed Exception while reading the file in chunks iteratively.
Date Mon, 02 Sep 2013 06:10:31 GMT
There's a reason why the writer is opened every time inside the while
loop. I usually open the writer in the main method itself, as you
suggested, and pass a reference to it. However, what I have observed is
that if my file contains more than 4 lakh (400,000) lines, the
writer.addDocument(doc) call fails with an OutOfMemoryError, although
the JVM is given enough heap.

Please see my other query on this mailing list about the same issue,
under the heading:
"Files greater than 20 MB not getting Indexed. No files generated except 
write.lock even after 8-9 minutes. "

So, as a precautionary measure, I read 300000 lines from a
document, add them to the writer, and then read the next 300000 lines
and add them to the writer with the same filename and path. This process
is repeated until the whole file has been read.
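
The chunked read described above can be sketched with the JDK alone. This is a minimal sketch under stated assumptions, not the code from this thread: ChunkedLineReader, forEachChunk and ChunkSink are hypothetical names, and the sink stands in for whatever adds one chunk to the index (for example, a call to addDocument on a single IndexWriter that stays open for the whole run). The file is opened once and read forward only, so no mark/reset is needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ChunkedLineReader {

    /** Hypothetical callback: receives one chunk of lines at a time. */
    interface ChunkSink {
        void accept(int chunkNumber, List<String> lines) throws IOException;
    }

    /**
     * Reads the source line by line and hands every chunkSize lines to the sink.
     * The last chunk may be shorter. Returns the number of chunks delivered.
     */
    static int forEachChunk(Reader source, int chunkSize, ChunkSink sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        List<String> buffer = new ArrayList<>();
        // One reader for the whole file; it is closed exactly once, at the end.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() == chunkSize) {
                    sink.accept(chunks++, buffer);
                    buffer = new ArrayList<>(); // start a fresh chunk
                }
            }
            if (!buffer.isEmpty()) {
                sink.accept(chunks++, buffer); // trailing partial chunk
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "a\nb\nc\nd\ne\n";
        int n = forEachChunk(new StringReader(text), 2,
                (num, lines) -> System.out.println(num + " " + lines));
        System.out.println(n + " chunks");
    }
}
```

In an indexer, the sink would build a new document per chunk (with the same path value each time) and pass it to the one open writer, so only one chunk of lines is held in memory at a time.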

If there's another alternative, I will be more than happy to know.

As of now, I still get the Stream Closed exception and
LockObtainFailedException. Any help on this will be deeply appreciated.
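
One possible source of the Stream Closed error, offered as an assumption rather than a confirmed diagnosis: the same document, holding a Reader-backed field, is added again on later iterations after that reader has already been consumed and closed. The JDK-only sketch below (StreamClosedDemo and its methods are hypothetical names) shows only the underlying Java behavior, namely that a BufferedReader cannot be read again once it has been closed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class StreamClosedDemo {

    /** Reads the reader to the end and closes it, as a one-shot consumer would. */
    static int consume(BufferedReader reader) throws IOException {
        int lines = 0;
        try {
            while (reader.readLine() != null) {
                lines++;
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    /** Returns the message of the exception raised by a second pass, or null if none. */
    static String secondPassError(String content) {
        BufferedReader reader = new BufferedReader(new StringReader(content));
        try {
            consume(reader); // first pass succeeds and closes the reader
            consume(reader); // second pass reads from a closed reader
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondPassError("line 1\nline 2\n")); // prints "Stream closed"
    }
}
```

If this is what happens in the indexing loop, creating a fresh document (and a fresh reader, if one is needed) for every chunk would avoid reusing a closed stream.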

On 9/1/2013 5:46 PM, Erick Erickson wrote:
> I really recommend you restructure your program, it's hard to follow.
>
> For instance, you open a new IndexWriter every time through the
> while (flags)
> loop. You only close it in the
> if (iwcTemp1.getConfig().getOpenMode() == OpenMode.CREATE_OR_APPEND) {
> case. That may be the root of your problem right there.
>
> I'd probably open the IndexWriter once in the main method, probably make
> it a member variable that's opened in an init method and closed in a
> finish method or some such.
>
> I'd separate the loop that traverses the directory structure away from the
> code that indexes the data.
>
> FWIW,
> Erick
>
>
> On Sun, Sep 1, 2013 at 7:28 AM, Ankit Murarka <ankit.murarka@rancoretech.com> wrote:
>
>> Hello.
>> I am stuck on a problem and have been continuously getting the exceptions
>> Stream Closed and LockObtainFailedException.
>>
>> I am trying to read the complete document line by line. Once I have
>> read 100000 lines, I add the doc to the writer, then read the
>> next 100000 lines and repeat the same process until the complete file is
>> read.
>>
>> I cannot read the entire document line by line in one go due to memory
>> constraints.
>>
>> Can anyone kindly let me know what might be wrong with this code:
>>
>> The Stream Closed exception occurs when I try to add the doc to the writer for
>> iterations > 1. In the first iteration everything goes fine. The lock exception
>> follows the Stream Closed exception.
>>
>> Code Snippet:
>>
>> package com.iorg;
>>
>> import org.apache.lucene.analysis.Analyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.document.LongField;
>> import org.apache.lucene.document.StringField;
>> import org.apache.lucene.document.TextField;
>> import org.apache.lucene.index.IndexCommit;
>> import org.apache.lucene.index.IndexWriter;
>> import org.apache.lucene.index.IndexWriterConfig.OpenMode;
>> import org.apache.lucene.index.IndexWriterConfig;
>> import org.apache.lucene.index.IndexableField;
>> import org.apache.lucene.index.LiveIndexWriterConfig;
>> import org.apache.lucene.index.LogDocMergePolicy;
>> import org.apache.lucene.index.LogMergePolicy;
>> import org.apache.lucene.index.MergePolicy;
>> import org.apache.lucene.index.MergePolicy.OneMerge;
>> import org.apache.lucene.index.MergeScheduler;
>> import org.apache.lucene.index.Term;
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.FSDirectory;
>> import org.apache.lucene.util.Version;
>>
>> import com.rancore.CustomAnalyzerForCaseSensitive;
>>
>> import java.io.BufferedReader;
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileNotFoundException;
>> import java.io.FileReader;
>> import java.io.IOException;
>> import java.io.InputStreamReader;
>> import java.io.LineNumberReader;
>> import java.util.Date;
>> import java.util.Iterator;
>>
>> public class MainClass2 {
>>    public static void main(String[] args) {
>>
>>      String indexPath = args[0];  //Place where indexes will be created
>>      String docsPath=args[1];    //Place where the files are kept.
>>
>>     final File docDir = new File(docsPath);
>>     if (!docDir.exists() || !docDir.canRead()) {
>>        System.out.println("Document directory '" +docDir.getAbsolutePath()+
>> "' does not exist or is not readable, please check the path");
>>        System.exit(1);
>>      }
>>      Date start = new Date();
>>     try {
>>        System.out.println("Indexing to directory ONLY '" + indexPath +
>> "'..."+docsPath);
>>       Directory dir = FSDirectory.open(new File(indexPath));
>>       Analyzer analyzer=new CustomAnalyzerForCaseSensitive(Version.LUCENE_44);
>>       IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_44, analyzer);
>>       iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
>>
>>        if(args[2].trim().equalsIgnoreCase("OverAll")){
>>            System.out.println("IN");
>>            indexDocs(docDir,true,dir,iwc);
>>        }
>>        Date end = new Date();
>>       System.out.println(end.getTime() - start.getTime() + " total milliseconds");
>>
>>      } catch (IOException e) {
>>        System.out.println(" caught a " + e.getClass() +
>>         "\n with message: " + e.getMessage());
>>      }
>>      catch(Exception e)
>>      {
>>          e.printStackTrace();
>>      }
>>   }
>>
>>    //Over All
>> static void indexDocs(File file,boolean flag,Directory
>> dir,IndexWriterConfig iwc)
>>    throws IOException {
>>        FileInputStream fis = null;
>>   if (file.canRead()) {
>>      if (file.isDirectory())
>>      {
>>       String[] files = file.list();
>>       System.out.println("size of list is "  + files.length);
>>        if (files != null) {
>>          for (int i = 0; i<  files.length; i++) {
>>              System.out.println("Invoked for  "  +  i  +  "and  "  +
>> files[i]);
>>            indexDocs(new File(file, files[i]),flag,dir,iwc);
>>          }
>>        }
>>     }
>>      else {
>>          boolean flags=true;
>>        try {
>>          fis = new FileInputStream(file);
>>       } catch (FileNotFoundException fnfe) {
>>         fnfe.printStackTrace();
>>       }
>>        try {
>>            Document doc = new Document();
>>            LineNumberReader lnr=new LineNumberReader(new FileReader(file));
>>            Field pathField = new StringField("path", file.getPath(),
>> Field.Store.YES);
>>            doc.add(pathField);
>>            String line=null;
>>            System.out.println("**INITIALIZING");
>>            int i=0;
>>            doc.add(new StringField("TT",file.getName(),Field.Store.YES));
>>            BufferedReader br=new BufferedReader(new InputStreamReader(fis));
>>             doc.add(new TextField("DD", br));
>>            while(flags)
>>            {
>>                System.out.println("Looping");
>>                IndexWriter iwcTemp1=new IndexWriter(dir,iwc);
>>            while( null != (line = lnr.readLine()) ){
>>                  i++;
>>              StringField sf=new StringField("EEE",line.trim(),Field.Store.YES);
>>                  doc.add(sf);
>>                if(i%10000==0)
>>                {
>>                    System.out.println("Breaking" +  i);
>>                    lnr.mark(i);
>>                    break;
>>                }
>>                sf=null;
>>            }
>>            if(line==null)
>>            {
>>                System.out.println("FALSE");
>>                flags=false;
>>            }
>>            System.out.println("Total value is  "  +  i);
>>            if (iwcTemp1.getConfig().getOpenMode() == OpenMode.CREATE_OR_APPEND) {
>>                try
>>                {
>>                  iwcTemp1.addDocument(doc);
>>                    iwcTemp1.commit();
>>                    iwcTemp1.close();
>>                }catch(Throwable t)
>>                {
>>                    lnr.close();
>>                        br.close();
>>                        fis.close();
>>                       t.printStackTrace();
>>                }
>>
>>            } else {
>>                try
>>                {
>>
>>              System.out.println("updating " + file);
>>              iwcTemp1.updateDocument(new Term("path", file.getPath()), doc);
>>                }catch(Exception e)
>>                {
>>                    e.printStackTrace();
>>                }
>>            }
>>            System.out.println("END OF WHILE  ");
>>            lnr.reset();
>>            }//end of While
>>        }catch (Exception e) {
>>           e.printStackTrace();
>>        }finally {
>>         fis.close();
>>        }
>>      }
>>    }
>> }
>> }
>>
>>
>> Kindly guide me as to where the possible problem lies. I have been trying to
>> figure it out, but to no avail.
>>
>> --
>> Regards
>>
>> Ankit Murarka
>>
>> "What lies behind us and what lies before us are tiny matters compared
>> with what lies within us"
>>
>>
>>      
>    


-- 
Regards

Ankit Murarka

"What lies behind us and what lies before us are tiny matters compared with what lies within
us"



