lucene-dev mailing list archives

From "Anurag Sharma (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-6478) need docs / tests of the "rules" as far as collection names go
Date Sun, 02 Nov 2014 12:45:34 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193817#comment-14193817 ]

Anurag Sharma edited comment on SOLR-6478 at 11/2/14 12:45 PM:
---------------------------------------------------------------

A unit test covering allowed and disallowed collection names is attached.

The W3C URI specification (http://www.w3.org/Addressing/URL/uri-spec.html) defines the set of
characters that are valid in a URI. The code currently has no filter that disallows any
character. The W3C guideline could be used to filter some characters out of collection names.
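
A filter of that kind might look like the sketch below. It is only an illustration: it assumes the "unreserved" set from RFC 3986 (letters, digits, "-", ".", "_", "~") as a stand-in for the W3C guideline, and the function names are hypothetical, not anything in Solr.

```python
import string

# Assumption: RFC 3986 "unreserved" characters stand in for the W3C guideline.
# Solr itself applied no such filter at the time of this comment.
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def disallowed_chars(name):
    """Return the characters in `name` outside UNRESERVED, in first-seen order."""
    seen = []
    for ch in name:
        if ch not in UNRESERVED and ch not in seen:
            seen.append(ch)
    return seen


def is_uri_safe_name(name):
    """Hypothetical check: non-empty and made only of unreserved characters."""
    return bool(name) and not disallowed_chars(name)


if __name__ == "__main__":
    for candidate in ["collection_1", "rand chars", "a&b"]:
        print(repr(candidate), is_uri_safe_name(candidate), disallowed_chars(candidate))
```

A name that passes this check needs no percent-encoding in a request URL; whether Solr should reject everything else is the open question of this issue.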

Query parameters containing special characters or whitespace can be sent percent-encoded when
making API calls. Here is an example that creates the collection "rand chars {£ & $ 1234567890-+=`~@\}":

{code}
$ curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=rand%20chars%20%7B%C2%A3%20%26%20%24%201234567890-%2B%3D%60~%40%7D&numShards=1&collection.configName=myconf&indent=true&wt=json'

{
  "responseHeader":{
    "status":0,
    "QTime":28509},
  "success":{
    "":{
      "responseHeader":{
        "status":0,
        "QTime":22011},
      "core":"rand chars {£ & $ 1234567890-+=`~@\}_shard1_replica1"}}}

{code}
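
The encoded name in the curl command above can be reproduced programmatically. The sketch below uses Python's standard `urllib.parse`; the helper name `build_create_url` is hypothetical, and it assumes the backslash shown before "}" in the comment text is markup escaping and not part of the name, since the encoded URL contains no `%5C`.

```python
from urllib.parse import quote, unquote, urlencode

# Assumption: the backslash shown before "}" in the comment is markup escaping,
# not part of the name; the encoded URL in the curl example contains no %5C.
NAME = "rand chars {£ & $ 1234567890-+=`~@}"

# Host, port and path taken from the curl example in this comment.
BASE_URL = "http://localhost:8983/solr/admin/collections"


def build_create_url(name, config_name="myconf", num_shards=1):
    """Build a Collections API CREATE URL with every parameter percent-encoded."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", str(num_shards)),
        ("collection.configName", config_name),
        ("indent", "true"),
        ("wt", "json"),
    ]
    # quote_via=quote encodes spaces as %20 (not "+"); "~" is kept literal.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe="~")


if __name__ == "__main__":
    encoded = quote(NAME, safe="~")
    print(encoded)
    print(unquote(encoded) == NAME)  # round trip recovers the original name
    print(build_create_url(NAME))
```

Encoding the name with `quote` instead of `quote_plus` keeps spaces as `%20`, which matches the form used in the curl command.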


> need docs / tests of the "rules" as far as collection names go
> --------------------------------------------------------------
>
>                 Key: SOLR-6478
>                 URL: https://issues.apache.org/jira/browse/SOLR-6478
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>              Labels: difficulty-medium, impact-medium
>         Attachments: SOLR-6478.patch
>
>
> historically, the rules for "core" names have been vague but implicitly defined based
> on the rule that it had to be a valid directory path name - but i don't know that we've ever
> documented anywhere what the rules are for a "collection" name when dealing with the Collections
> API.
> I haven't had a chance to try this, but i suspect that using the Collections API you
> can create any collection name you want, and the zk/clusterstate.json data will all be fine,
> and you'll then be able to request anything you want from that collection as long as you properly
> URL escape it in your request URLs ... but we should have a test that tries to do this, and
> document any actual limitations that pop up and/or fix those limitations so we really can
> have arbitrary collection names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
