lucene-general mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Train Lucene with topic-defined files
Date Sun, 22 Jun 2014 23:46:04 GMT
Hi benglish,

Your code looks good to me. Once you have an index, you can call train() on
it, as I described earlier. For training and testing, please refer to the
TestLuceneIndexClassifier program in the blog post I pointed you to.
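
For reference, here is a minimal sketch of what calling train() on such an
index might look like. It assumes the Lucene 4.x classification module
(lucene-classification, which arrived after 4.0, plus lucene-analyzers-common)
is on the classpath, and it reuses the "content" and "category" field names
and the "indexCorpus/" path from your code. The class name TrainOnIndex and
the classify() helper are made up for illustration and are not taken from the
blog post; later major Lucene versions changed this API, so treat it as a
starting point rather than the definitive way to do it.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.classification.ClassificationResult;
import org.apache.lucene.classification.SimpleNaiveBayesClassifier;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.SlowCompositeReaderWrapper;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;

// Hypothetical helper class, for illustration only.
public class TrainOnIndex {

    // Trains a Naive Bayes classifier on the given index and returns the
    // category it assigns to the supplied text.
    static String classify(Directory index, String text) throws IOException {
        // Use the same analyzer that was used to build the index.
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
        DirectoryReader reader = DirectoryReader.open(index);
        try {
            // The 4.x classifier expects an AtomicReader, so wrap the
            // (possibly multi-segment) directory reader.
            AtomicReader atomicReader = SlowCompositeReaderWrapper.wrap(reader);

            SimpleNaiveBayesClassifier classifier = new SimpleNaiveBayesClassifier();
            // "content" holds the analyzed text, "category" holds the label.
            classifier.train(atomicReader, "content", "category", analyzer);

            // Classify while the reader is still open.
            ClassificationResult<BytesRef> result = classifier.assignClass(text);
            return result.getAssignedClass().utf8ToString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Directory index = FSDirectory.open(new File("indexCorpus/"));
        try {
            System.out.println(classify(index, "text of an unseen document"));
        } finally {
            index.close();
        }
    }
}
```

The classifier reads its statistics from the index when assignClass() is
called, so the reader has to stay open until classification is done.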

Good luck!

Koji
-- 
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

(2014/06/23 1:50), benglish wrote:
> Dear Koji,
>
> I deeply appreciate your kind help.
> I have created the index with something like the code below. Could you
> tell me whether this code is correct, and also help me with the rest of the
> code, in particular how to feed the index into the training part of the
> Naive Bayes classifier?
>
>
> private static void BuildIndex() throws IOException
> {
>          Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
>
>          File myIndex = new File("indexCorpus/");
>          Directory index = FSDirectory.open(myIndex);
>
>          IndexWriterConfig config =
>              new IndexWriterConfig(Version.LUCENE_40, analyzer);
>          IndexWriter w = new IndexWriter(index, config);
>
>
>          File dir = new File("trainCorpus/");
>          File[] directoryListing = dir.listFiles();
>          int x = directoryListing.length; //#files in a directory
>          for(int i = 0; i < x; i++)
>          {
>              String mypath = directoryListing[i].toString();
>              Path wiki_path = Paths.get(mypath); //new file is chosen
>                 ....
>
>              addDoc(w, eachFileContent, eachFileCategory);
>
>          }
>
>          w.close();
> }
>
> private static void addDoc(IndexWriter w, String content, String category)
>          throws IOException {
>          org.apache.lucene.document.Document doc =
>              new org.apache.lucene.document.Document();
>          doc.add(new org.apache.lucene.document.TextField("content", content,
>              Field.Store.YES));
>
>          doc.add(new StringField("category", category, Field.Store.YES));
>
>          w.addDocument(doc);
>      }
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Train-Lucene-with-topic-defined-files-tp4141979p4143329.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>



