lucene-dev mailing list archives

From "Diego Ceccarelli (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8542) Integrate Learning to Rank into Solr
Date Sat, 16 Jan 2016 17:18:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103260#comment-15103260 ]

Diego Ceccarelli edited comment on SOLR-8542 at 1/16/16 5:18 PM:
-----------------------------------------------------------------

Hi Ishan, thanks for pointing out SOLR-8183. I didn't know about it, and it seems quite related.

We can plug in RankLib by creating a new class that represents the new LTR model and extends [ModelMetadata|https://github.com/bloomberg/lucene-solr/blob/trunk-learning-to-rank-plugin/solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/ModelMetadata.java],
for example:

{code:java}
public class RankLibModel extends ModelMetadata {

    Ranker rankLibRanker;
    RankerFactory rankerFactory = new RankerFactory();
    // this constructor is missing; we will need a way to create a datapoint
    DenseDataPoint documentFeatures = new DenseDataPoint();

    public RankLibModel(String name, String type, List<Feature> features,
            String featureStoreName, Collection<Feature> allFeatures,
            NamedParams params) {
        super(name, type, features, featureStoreName, allFeatures, params);
        // the file containing the model is a parameter
        String ranklibModelFile = getParams().getParam("model-file");
        // load the model
        rankLibRanker = rankerFactory.loadModel(ranklibModelFile);
    }

    @Override
    public float score(float[] modelFeatureValuesNormalized) {
        // set the feature vector in the datapoint object
        documentFeatures.setFeatureVector(modelFeatureValuesNormalized);
        // predict the score using the ranklib model (cast in case eval returns a double)
        return (float) rankLibRanker.eval(documentFeatures);
    }

}
{code}
	
This code will load a particular RankLib model from the file specified in the model store
configuration.
If you send Solr a model configuration file like this:

{code}
{
    "type":"org.apache.solr.ltr.ranking.RankLibModel",
    "name":"ranklib-GBDT",
    "features":[
        {"name":"isInStock"},
        {"name":"price"},
        {"name":"originalScore"},
        {"name":"productNameMatchQuery"}
    ],
    "params":{
        "model-file":"/data/ranking/ranking-GBDT.txt"
    }
}
{code}

The plugin will create a RankLib model from the file {{/data/ranking/ranking-GBDT.txt}},
and you'll be able to use it at ranking time by its name, {{ranklib-GBDT}}, by adding the
{{ltr}} rerank query ({{rq}}) to the request:

{code}
http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr model=ranklib-GBDT reRankDocs=25}
{code}
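
The {{rq}} value in the request above contains spaces, braces, and an exclamation mark, so a client that builds the URL programmatically has to percent-encode it. The following is a minimal sketch of that step, not part of the plugin: the class {{LtrQueryUrl}} and its method names are hypothetical, and it only assumes the standard {{java.net.URLEncoder}} (Java 10 or later for the {{Charset}} overload).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of the LTR plugin) that builds an encoded rerank URL. */
final class LtrQueryUrl {

    /** Builds the local-params string for the rerank query, e.g. {!ltr model=svm reRankDocs=25}. */
    static String rerankParam(String model, int reRankDocs) {
        return "{!ltr model=" + model + " reRankDocs=" + reRankDocs + "}";
    }

    /** Percent-encodes one query-string value. */
    static String encode(String value) {
        // URLEncoder does form encoding and emits '+' for spaces; a literal '+'
        // in the input is already encoded as %2B, so this swap is safe.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Assembles the full query URL; baseUrl is the collection URL without a trailing slash. */
    static String queryUrl(String baseUrl, String q, String model, int reRankDocs) {
        return baseUrl + "/query?indent=on&q=" + encode(q)
                + "&wt=json&rq=" + encode(rerankParam(model, reRankDocs));
    }

    public static void main(String[] args) {
        System.out.println(
                queryUrl("http://localhost:8983/solr/techproducts", "test", "ranklib-GBDT", 25));
    }
}
```

Encoding each value separately, rather than the whole URL at once, keeps the {{&}} and {{=}} separators of the query string intact.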

At query time, the features {{isInStock}}, {{price}}, {{originalScore}}, and {{productNameMatchQuery}}
will be computed and passed to the {{score(float[] modelFeatureValuesNormalized)}} method
to get the new predicted score for each document. If RankLib's licence is compatible,
I think we could integrate this into the plugin. Any comments?
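
To make the query-time flow concrete, here is a minimal, self-contained sketch of reranking with a pluggable scoring function. It is only an illustration under stated assumptions: {{ScoringModel}}, {{ScoredDoc}}, and {{RerankSketch}} are hypothetical names, not classes from the plugin or from RankLib, and the feature values are assumed to be already extracted and normalized. Only the first {{reRankDocs}} hits are rescored; the rest keep their original order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an LTR model: maps a feature vector to a score. */
interface ScoringModel {
    float score(float[] modelFeatureValuesNormalized);
}

/** Hypothetical hit: an id, its already-extracted features, and its current score. */
final class ScoredDoc {
    final String id;
    final float[] features;
    float score;

    ScoredDoc(String id, float[] features, float originalScore) {
        this.id = id;
        this.features = features;
        this.score = originalScore;
    }
}

final class RerankSketch {

    /** Rescores the first reRankDocs hits with the model and re-sorts only that prefix. */
    static List<ScoredDoc> rerank(List<ScoredDoc> hits, ScoringModel model, int reRankDocs) {
        int n = Math.min(reRankDocs, hits.size());
        List<ScoredDoc> head = new ArrayList<>(hits.subList(0, n));
        for (ScoredDoc doc : head) {
            doc.score = model.score(doc.features);
        }
        // List.sort is stable, so ties keep their original relative order.
        head.sort(Comparator.comparingDouble((ScoredDoc d) -> d.score).reversed());
        List<ScoredDoc> result = new ArrayList<>(head);
        result.addAll(hits.subList(n, hits.size()));
        return result;
    }

    public static void main(String[] args) {
        // Toy linear model over two features, standing in for a loaded RankLib ranker.
        ScoringModel model = f -> 2.0f * f[0] - 0.5f * f[1];
        List<ScoredDoc> hits = List.of(
                new ScoredDoc("a", new float[] {0f, 0.2f}, 3.0f),
                new ScoredDoc("b", new float[] {1f, 0.9f}, 2.0f),
                new ScoredDoc("c", new float[] {1f, 0.1f}, 1.0f));
        for (ScoredDoc d : rerank(hits, model, 2)) {
            System.out.println(d.id + " " + d.score);
        }
    }
}
```

A RankLib-backed model would slot in as another {{ScoringModel}} implementation whose {{score}} method delegates to the loaded ranker.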



> Integrate Learning to Rank into Solr
> ------------------------------------
>
>                 Key: SOLR-8542
>                 URL: https://issues.apache.org/jira/browse/SOLR-8542
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joshua Pantony
>            Assignee: Christine Poerschke
>            Priority: Minor
>         Attachments: README.md, README.md, SOLR-8542-branch_5x.patch, SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into Solr. Solr Learning to Rank (LTR) provides a way for you to extract features directly inside Solr for use in training a machine learned model. You can then deploy that model to Solr and use it to rerank your top X search results. This concept was previously presented by the authors at Lucene/Solr Revolution 2015 ( http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson, and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached documentation as a github MD file, but are happy to convert to a desired format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indices. To test the plugin with
> the techproducts example, please follow these steps:
> h4. 1. compile solr and the examples 
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts 
> h4. 3. stop it and install the plugin:
>    
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>     
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore' --data-binary "@./contrib/ltr/example/techproducts-features.json" -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore' --data-binary "@./contrib/ltr/example/techproducts-model.json" -H 'Content-type:application/json'
> h4. 6. have fun !
> *access to the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_ 
> *access to the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true
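
Step 5 above uploads the features and the model with curl. The same PUT request could be built from Java, sketched here with the JDK's {{java.net.http}} client (Java 11 or later). This is an illustration only: the class {{StoreUpload}} and its method are hypothetical, the {{"[]"}} body in {{main}} is a placeholder rather than a real feature definition, and the request is built but not sent, so the sketch runs without a Solr instance.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper (not part of the LTR plugin) mirroring the curl uploads in step 5. */
final class StoreUpload {

    /**
     * Builds a PUT request for a store endpoint.
     * store is "fstore" (features) or "mstore" (models), as in the URLs above.
     */
    static HttpRequest buildUpload(String collectionUrl, String store, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(collectionUrl + "/schema/" + store))
                .header("Content-type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder body; a real call would pass the contents of the features JSON file.
        HttpRequest request =
                buildUpload("http://localhost:8983/solr/techproducts", "fstore", "[]");
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```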



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

