lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler" <...@thetaphi.de>
Subject RE: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src: java/org/apache/solr/core/CorePropertiesLocator.java test/org/apache/solr/core/TestSolrXmlPersistor.java
Date Fri, 12 Jul 2013 12:21:37 GMT
I checked all possibilities we have, also by reading source code of JDK:

 

We can have „one migration strategy”:

-          We forbid writing with OutputStream, so we won’t produce new ASCII-only files
and we write new files as UTF-8. Older Solr versions no longer can read those files, but this
is not a problem.

-          We forbid reading with InputStream, because that one can no longer read files written
as UTF-8 without escapes.

-          We allow only reading by Reader and the Reader must be UTF-8 -> this allows
to still load old properties files loaded by older Solr versions (because when they were written
in the old format, the reader code allows also Unicode escapes with \u, so “old-style”
files are still parseable.

 

So this would allow us to enforce the Reader/Writer API, if the charset to be used is UTF-8

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Uwe Schindler [mailto:uwe@thetaphi.de] 
Sent: Friday, July 12, 2013 2:09 PM
To: dev@lucene.apache.org
Subject: RE: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src: java/org/apache/solr/core/CorePropertiesLocator.java
test/org/apache/solr/core/TestSolrXmlPersistor.java

 

Hi,

 

There are several reasons why I added this:

The biggest issue is consistence: we should decide to use either one or the other, but not
mixed. The recent commits on Solr were wrong in that case, because the wrote the files using
a defined reader but read it using the InputStream and similar problems. The commit was to
prevent this.

 

I agree, both properties files formats are defined in JDK6 docs, but the “original” one
defined by Sun once a while back was specified to be: “ISO-8859-1 and Unicode escapes”.
And almost all the properties files out there are using this encoding, including all of the
ones included in the JDK (see your JDK folder, all properties files there are in this format,
one example: $JAVA_HOME/jre/lib/deploy/messages_ja.properties for a crazy one). I agree, for
newer developments one should use a newer format, but the problem we have in Solr is that
we are no longer able to read old properties files – which were always written in the Sun
original specification format (see http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html
for the format).

 

The commit was just there to make it consistent and enforce consistence – that’s all.
We can discuss about that, maybe only enforce this for Solr. For the core load/save the properties
files are written by machines only, so nobody edits them by hand. Lucene does not use properties
files, that was the hole thing.

 

Robert, please, before complaining again – don’t think about fucking Unicode only, think
about standards defined long time ago (and the properties file format is one of those). This
is just for consistency. The reasoning behind the whole thing is similar to my complaints
about XML: XML also needs to be read through an InputStream because they are binary and charsetless
(charset is part of the fileformat; application/xml and not text/xml is the MIME type). Properties
files are somehow also binary J and were defined in the past to use Unicode Escapes and ISO-8859-1
charset (http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Properties.html - that’s
the oldest one I got).

 

In general we should not use properties files at all, so I would personally forbid them completely,
but Solr used them for longer time now.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Friday, July 12, 2013 1:41 PM
To: dev@lucene.apache.org
Subject: Re: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src: java/org/apache/solr/core/CorePropertiesLocator.java
test/org/apache/solr/core/TestSolrXmlPersistor.java

 

Where is this 'officially defined one' according to JVM spec? Please give a reference, as
the Reader api is just as well defined as the InputStream API.

If we want consistency, I want the Reader one!
Just so you know, i totally disagree with this. I refuse to use native2ascii. 
I think these commits banning the Reader API should be reverted.

On Fri, Jul 12, 2013 at 4:53 AM, Uwe Schindler <uwe@thetaphi.de> wrote:

Yes, and be sure to do the opposite when reading properties files! Otherwise it is not consistent
and loading/saving and exceptions may happen.

I am working on forbidden-apis to ensure we are consistent everywhere. The binary properties
file format is the officially defined one according to JVM spec.


Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----

> From: Alan Woodward [mailto:alan@flax.co.uk]
> Sent: Friday, July 12, 2013 10:47 AM
> To: dev@lucene.apache.org
> Subject: Re: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src:
> java/org/apache/solr/core/CorePropertiesLocator.java
> test/org/apache/solr/core/TestSolrXmlPersistor.java
>
> Does this fix it?
>
> @@ -79,9 +76,7 @@ public class CorePropertiesLocator implements
> CoresLocator {
>      OutputStream os = null;
>      try {
>        os = new FileOutputStream(propfile);
> -      Writer writer = new OutputStreamWriter(os, Charsets.UTF_8);
> -      p.store(writer, "Written by CorePropertiesLocator on " + new Date());
> -      writer.close();
> +      p.store(os, "Written by CorePropertiesLocator on " + new Date());
>      }
>      catch (IOException e) {
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 12 Jul 2013, at 09:35, Uwe Schindler wrote:
>
> > Hi,
> >
> > you have tob e careful: If you store properties with a writer but load it with
> InputStream, the code is different. Properties files have a defined charset of
> ISO-8859-1:
> >
> > "The load(Reader) / store(Writer, String) methods load and store
> properties from and to a character based stream in a simple line-oriented
> format specified below. The load(InputStream) / store(OutputStream,
> String) methods work the same way as the load(Reader)/store(Writer,
> String) pair, except the input/output stream is encoded in ISO 8859-1
> character encoding. Characters that cannot be directly represented in this
> encoding can be written using Unicode escapes as defined in section 3.3 of
> The Java™ Language Specification; only a single 'u' character is allowed in an
> escape sequence. The native2ascii tool can be used to convert property files
> to and from other character encodings."
> >
> > So be sure to be consistent when loading/saving! If we previously (in older
> Solr version) used the InputStream methods to load/store core props, we
> should use ISO-8859-1 to load/store to be compatible with older versions!
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: romseygeek@apache.org [mailto:romseygeek@apache.org]
> >> Sent: Friday, July 12, 2013 10:26 AM
> >> To: commits@lucene.apache.org
> >> Subject: svn commit: r1502468 - in /lucene/dev/trunk/solr/core/src:
> >> java/org/apache/solr/core/CorePropertiesLocator.java
> >> test/org/apache/solr/core/TestSolrXmlPersistor.java
> >>
> >> Author: romseygeek
> >> Date: Fri Jul 12 08:25:36 2013
> >> New Revision: 1502468
> >>
> >> URL: http://svn.apache.org/r1502468
> >> Log:
> >> SOLR-4914: Close OutputStreamWriter properly, use
> >> System.getProperty("line.separator") instead of \n
> >>
> >> Fixes Windows test failures.
> >>
> >> Modified:
> >>
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java
> >>
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java
> >>
> >> Modified:
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java
> >> URL:
> >> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/
> >> apa
> >>
> che/solr/core/CorePropertiesLocator.java?rev=1502468&r1=1502467&r2=15
> >> 0
> >> 2468&view=diff
> >>
> ==========================================================
> >> ====================
> >> ---
> >> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProperti
> >> esL
> >> ocator.java (original)
> >> +++
> lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/CoreProp

> >> +++ ert iesLocator.java Fri Jul 12 08:25:36 2013

> >> @@ -20,6 +20,7 @@ package org.apache.solr.core;  import
> >> com.google.common.base.Charsets;  import
> >> com.google.common.collect.Lists;  import
> >> org.apache.solr.common.SolrException;
> >> +import org.apache.solr.util.IOUtils;
> >> import org.slf4j.Logger;
> >> import org.slf4j.LoggerFactory;
> >>
> >> @@ -27,6 +28,7 @@ import java.io.File; import
> >> java.io.FileInputStream; import java.io.FileOutputStream; import
> >> java.io.IOException;
> >> +import java.io.OutputStream;
> >> import java.io.OutputStreamWriter;
> >> import java.io.Writer;
> >> import java.util.Date;
> >> @@ -56,14 +58,7 @@ public class CorePropertiesLocator imple
> >>         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
> >>                                 "Could not create a new core in " + cd.getInstanceDir()
> >>                               + "as another core is already defined there");
> >> -      try {
> >> -        Properties p = buildCoreProperties(cd);
> >> -        Writer writer = new OutputStreamWriter(new
> >> FileOutputStream(propFile), Charsets.UTF_8);
> >> -        p.store(writer, "Written by CorePropertiesLocator on " + new
> Date());
> >> -      }
> >> -      catch (IOException e) {
> >> -        logger.error("Couldn't persist core properties to {}: {}",
> >> propFile.getAbsolutePath(), e);
> >> -      }
> >> +      writePropertiesFile(cd, propFile);
> >>     }
> >>   }
> >>
> >> @@ -75,14 +70,25 @@ public class CorePropertiesLocator imple
> >>   public void persist(CoreContainer cc, CoreDescriptor... coreDescriptors) {
> >>     for (CoreDescriptor cd : coreDescriptors) {
> >>       File propFile = new File(new File(cd.getInstanceDir()),
> >> PROPERTIES_FILENAME);
> >> -      try {
> >> -        Properties p = buildCoreProperties(cd);
> >> -        Writer writer = new OutputStreamWriter(new
> >> FileOutputStream(propFile), Charsets.UTF_8);
> >> -        p.store(writer, "Written by CorePropertiesLocator on " + new
> Date());
> >> -      }
> >> -      catch (IOException e) {
> >> -        logger.error("Couldn't persist core properties to {}: {}",
> >> propFile.getAbsolutePath(), e);
> >> -      }
> >> +      writePropertiesFile(cd, propFile);
> >> +    }
> >> +  }
> >> +
> >> +  private void writePropertiesFile(CoreDescriptor cd, File propfile)  {
> >> +    Properties p = buildCoreProperties(cd);
> >> +    OutputStream os = null;
> >> +    try {
> >> +      os = new FileOutputStream(propfile);
> >> +      Writer writer = new OutputStreamWriter(os, Charsets.UTF_8);
> >> +      p.store(writer, "Written by CorePropertiesLocator on " + new Date());
> >> +      writer.close();
> >> +    }
> >> +    catch (IOException e) {
> >> +      logger.error("Couldn't persist core properties to {}: {}",
> >> propfile.getAbsolutePath(), e);
> >> +    }
> >> +    finally {
> >> +      if (os != null)
> >> +        IOUtils.closeQuietly(os);
> >>     }
> >>   }
> >>
> >>
> >> Modified:
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java
> >> URL:
> >> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/
> >> apa
> >>
> che/solr/core/TestSolrXmlPersistor.java?rev=1502468&r1=1502467&r2=150
> >> 2
> >> 468&view=diff
> >>
> ==========================================================
> >> ====================
> >> ---
> >> lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolrXmlP
> >> ersi
> >> stor.java (original)
> >> +++ lucene/dev/trunk/solr/core/src/test/org/apache/solr/core/TestSolr

> >> +++ Xml Persistor.java Fri Jul 12 08:25:36 2013

> >> @@ -70,8 +70,8 @@ public class TestSolrXmlPersistor {
> >>
> >>     SolrXMLCoresLocator persistor = new SolrXMLCoresLocator(new
> >> File("testfile.xml"), solrxml, null);
> >>     assertEquals(persistor.buildSolrXML(cds),
> >> -          "<solr><cores>\n"
> >> -        + "    <core name=\"testcore\" instanceDir=\"instance/dir/\"/>\n"
> >> +          "<solr><cores>" + SolrXMLCoresLocator.NEWLINE
> >> +        + "    <core name=\"testcore\" instanceDir=\"instance/dir/\"/>"
+
> >> SolrXMLCoresLocator.NEWLINE
> >>         + "</cores></solr>");
> >>   }
> >>
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 


Mime
View raw message