lucene-solr-user mailing list archives

From Erik Hatcher <erik.hatc...@gmail.com>
Subject Re: indexing java byte code in classes / jars
Date Fri, 08 May 2015 20:19:19 GMT
Oh, and sorry, I omitted a couple of details:

# creating the “java” core/collection
bin/solr create -c java 

# I ran this from my Solr source code checkout, so that SolrLogFormatter.class just happened to be handy

	Erik




> On May 8, 2015, at 4:11 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:
> 
> What kinds of searches do you want to run? Are you trying to extract class names, method
> names, and such, and make those searchable? If that’s the case, you need some kind of “parser”
> to reverse engineer that information from .class and .jar files before feeding it to Solr;
> that step would happen before Solr’s analysis. Java itself comes with a javap command that can do this.
> Whether this is the “best” way to go for your scenario I don’t know, but here’s an
> interesting example pasted below (using Solr 5.x).
> 
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
> 
> 
> javap build/solr-core/classes/java/org/apache/solr/SolrLogFormatter.class > test.txt
> bin/post -c java test.txt
> 
> now search for "coreInfoMap" http://localhost:8983/solr/java/browse?q=coreInfoMap
> 
> I tried to be cleverer and use the stdin option of bin/post, like this: 
> javap build/solr-core/classes/java/org/apache/solr/SolrLogFormatter.class | bin/post -c java -url http://localhost:8983/solr/java/update/extract -type text/plain -params "literal.id=SolrLogFormatter" -out yes -d
> but something isn’t working right with the stdin detection like that (it does work to `cat test.txt | bin/post…` though, hmmm)
> 
> test.txt looks like this, `cat test.txt`:
> Compiled from "SolrLogFormatter.java"
> public class org.apache.solr.SolrLogFormatter extends java.util.logging.Formatter {
>  long startTime;
>  long lastTime;
>  java.util.Map<org.apache.solr.SolrLogFormatter$Method, java.lang.String> methodAlias;
>  public boolean shorterFormat;
>  java.util.Map<org.apache.solr.core.SolrCore, org.apache.solr.SolrLogFormatter$CoreInfo> coreInfoMap;
>  public java.util.Map<java.lang.String, java.lang.String> classAliases;
>  static java.lang.ThreadLocal<java.lang.String> threadLocal;
>  public org.apache.solr.SolrLogFormatter();
>  public void setShorterFormat();
>  public java.lang.String format(java.util.logging.LogRecord);
>  public void appendThread(java.lang.StringBuilder, java.util.logging.LogRecord);
>  public java.lang.String _format(java.util.logging.LogRecord);
>  public java.lang.String getHead(java.util.logging.Handler);
>  public java.lang.String getTail(java.util.logging.Handler);
>  public java.lang.String formatMessage(java.util.logging.LogRecord);
>  public static void main(java.lang.String[]) throws java.lang.Exception;
>  public static void go() throws java.lang.Exception;
>  static {};
> }
> 
>> On May 8, 2015, at 3:31 PM, Mark <javamark@gmail.com> wrote:
>> 
>> I'm looking to use Solr to search over the byte code in classes and jars.
>> 
>> Does anyone know or have experience of Analyzers, Tokenizers, and Token
>> Filters for such a task?
>> 
>> Regards
>> 
>> Mark
> 
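
As a rough sketch of the “parser” step mentioned in the quoted reply, the javap text could be reduced to member names before anything is posted to Solr. The function names javap_methods and javap_fields are hypothetical, and the patterns assume the default javap layout shown in test.txt above (one indented member per line); other JDK versions or javap flags may format things differently.

```shell
#!/bin/sh
# Hypothetical helpers: read javap output on stdin, print one member name per line.
# Assumes the one-member-per-line layout shown in test.txt in this thread.

# Method and constructor names: the token immediately before the first "(".
javap_methods() {
  sed -n 's/^[^(]*[[:space:]]\([^[:space:](]*\)(.*$/\1/p'
}

# Field names: the last token before ";" on lines that contain neither "("
# (methods) nor "{" (the class declaration and the static initializer).
javap_fields() {
  sed -n -e '/(/d' -e '/{/d' \
      -e 's/^.*[[:space:]]\([^[:space:];]*\);[[:space:]]*$/\1/p'
}

# Small demonstration on a trimmed copy of the output quoted in this thread.
sample='Compiled from "SolrLogFormatter.java"
public class org.apache.solr.SolrLogFormatter extends java.util.logging.Formatter {
  long startTime;
  public boolean shorterFormat;
  public java.lang.String format(java.util.logging.LogRecord);
  public static void main(java.lang.String[]) throws java.lang.Exception;
  static {};
}'

echo "methods:"
printf '%s\n' "$sample" | javap_methods
echo "fields:"
printf '%s\n' "$sample" | javap_fields
```

Something like `javap Some.class | javap_methods` would then give one name per line, which could be sent to a dedicated field; how that field is defined in the schema is left open here.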

