any23-commits mailing list archives

From lewi...@apache.org
Subject svn commit: r5368 - /release/any23/1.0/RELEASE-NOTES.txt
Date Fri, 16 May 2014 20:37:59 GMT
Author: lewismc
Date: Fri May 16 20:37:59 2014
New Revision: 5368

Log:
add RELEASE-NOTES.txt

Added:
    release/any23/1.0/RELEASE-NOTES.txt

Added: release/any23/1.0/RELEASE-NOTES.txt
==============================================================================
--- release/any23/1.0/RELEASE-NOTES.txt (added)
+++ release/any23/1.0/RELEASE-NOTES.txt Fri May 16 20:37:59 2014
@@ -0,0 +1,373 @@
+                           Apache Any23 1.0
+                             Release Notes
+                         09/05/2014 (dd/mm/yyyy)
+
+Sub-task
+
+    [ANY23-148] - Programmes Ontology
+
+Bug
+
+    [ANY23-100] - Issue with RDFa extractor while processing nested properties
+    [ANY23-135] - Any23 RDFa Extractor ignores multiple prefix and property statements
+    [ANY23-136] - Some RDFa tests have incorrect expected results
+    [ANY23-168] - RDFa properties in <meta> elements not picked up
+    [ANY23-170] - Dependency error org.apache.commons:commons-csv:1.0-SNAPSHOT-rev1148315
+    [ANY23-172] - Fix minor issues with Any23 0.9.0 RC
+    [ANY23-173] - Please delete old releases from mirroring system
+    [ANY23-174] - Incorrect RDFa extractions
+    [ANY23-203] - Update version revisions from 0.9.1 to 1.0
+
+Improvement
+
+    [ANY23-65] - Update to RDFa extraction stylesheet
+    [ANY23-128] - html-rdfa11 extractor fails on mailto: anchors
+    [ANY23-130] - Improve aesthetics of the output format when straying from default java.io.PrintStream
+    [ANY23-137] - RDFa parser implementation proposal
+    [ANY23-179] - Improve Javadoc and throwing of IllegalArgumentException in Any23#createDocumentSource
+    [ANY23-180] - Create an Apache hosted jail running an Any23 service instance
+    [ANY23-181] - Upgrade NekoHTML to 1.9.20
+
+New Feature
+
+    [ANY23-134] - Create o.a.a.extractor.tika Parser and Extractor implementations
+    [ANY23-177] - Add support for JSON-LD
+
+Task
+
+    [ANY23-162] - Add package.java for all LKIFCore classes
+
+                           Apache Any23 0.9.0
+                             Release Notes
+                         28/10/2013 (dd/mm/yyyy)
+
+Sub-task
+
+    [ANY23-142] - LKIF-Core Vocabulary
+    [ANY23-143] - LRICore Vocabulary
+
+Bug
+
+    [ANY23-111] - Any23 raises an unmanaged exception from the Microdata parser
+    [ANY23-115] - Empty spans seem to break ANY23
+    [ANY23-161] - Fix service file generation
+    [ANY23-165] - "Invalid content" error if TITLE precedes encoding declaration in the document
+    [ANY23-171] - form.html not in correct location in service.
+
+Improvement
+
+    [ANY23-47] - Migrate basic-crawler classes to org.apache.nutch
+    [ANY23-164] - office-scraper ExcelExtractorFactory.java to accept application/x-tika-ooxml and application/x-tika-msoffice formats
+
+New Feature
+
+    [ANY23-120] - Split CLI tools out into a new module
+
+Task
+
+    [ANY23-122] - Cleanup Distribution Mirrors
+
+                           Apache Any23 0.8.0
+                             Release Notes
+                         01/05/2013 (dd/mm/yyyy)
+                         
+Sub-task
+
+    [ANY23-109] - Missing tika-config.xml in o.a.a.mime
+    [ANY23-110] - DOAP Vocabulary
+
+Bug
+
+    [ANY23-44] - error when parsing a document from http://www.afdsi.org/docs/test/html/RDFa/_food-stream_.htm
+    [ANY23-78] - Download page links are broken
+    [ANY23-108] - Broken schema.org microdata extraction
+    [ANY23-112] - Fix incubation disclaimer
+    [ANY23-113] - Remove dependencies from parent pom.xml file
+    [ANY23-116] - Empty values are skipped when reading tab separated CSV.
+    [ANY23-156] - Add logging dependencies to plugins and service
+
+Improvement
+
+    [ANY23-2] - Add support for hreview-aggregate microformat.
+    [ANY23-26] - Upgrade dependency to Apache Tika 1.2
+    [ANY23-46] - Update Any23 web service
+    [ANY23-83] - Remove hardcoded formats throughout Any23 to make it useful as a library
+    [ANY23-101] - Use RDFFormat.NQUADS in nquads module
+    [ANY23-139] - Simplify site deploy plugging the maven-scm-publish-plugin
+    [ANY23-144] - Implement comprehensive naming of o.a.a.api.vocab classes
+
+New Feature
+
+    [ANY23-4] - Integrate W3C's RDFa test suite and pass all tests
+    [ANY23-85] - Split NQuads out into its own module
+    [ANY23-96] - Add user agent string to basic-crawler
+    [ANY23-117] - Split Mime type detection out into its own module
+    [ANY23-118] - Split Encoding detection out into its own module
+
+Task
+
+    [ANY23-41] - Write basic-crawler plugin documentation
+    [ANY23-125] - Drop the Incubating DISCLAIMER
+                         
+
+                             Apache Any23 0.7.0-incubating
+                              Release Notes
+                              25/06/2012
+
+Sub-task
+
+    [ANY23-25] - Update all Maven POM's in trunk
+    [ANY23-31] - Move any23 site documentation out of trunk and into its own SVN directory
+    [ANY23-53] - Bad Web Service documentation
+
+Bug
+
+    [ANY23-14] - Add support for Extractor sub results
+    [ANY23-20] - The Any23 PluginManager fails handling resource paths containing spaces.
+    [ANY23-34] - Plugin Integration Test Fails
+    [ANY23-37] - LGPL'ed components cannot be included in distribution packages
+    [ANY23-42] - Fix issue where RDFa11Parser.java is not resolving relative URIs correctly
+    [ANY23-49] - N3/NQ parsers ignoring stopAtFirstError flag
+    [ANY23-58] - HCardExtractor infinite loop and memory exhaustion
+    [ANY23-62] - ExtractionResultImpl loses all issues generated by sub extractions
+    [ANY23-73] - The ToolRunner CLI driver -p (--plugins-dir) option doesn't work because it is parsed after the Tool list loading
+    [ANY23-77] - Facing an infinite loop problem in version 0.6.1 - Verify
+    [ANY23-78] - Download page links are broken
+    [ANY23-87] - Bogus argument in o.a.a.cli.CrawlerTest
+    [ANY23-88] - any23 script -v or --version option doesn't display actual version
+    [ANY23-94] - The Microdata CLI tool doesn't work anymore
+    [ANY23-95] - Activate the IgnoreAccidentalRDFa filter for the Any23 Service instance
+    [ANY23-97] - The test suite was not running all tests, minor regressions occurred
+
+Improvement
+
+    [ANY23-18] - Add a new extractor for RDFa using java-rdfa
+    [ANY23-28] - Document munging of Any23 history to CHANGES.txt
+    [ANY23-32] - replace hardcoded bash script with generated via appassembler
+    [ANY23-33] - Replace proprietary SUN imports from Any23 classes.
+    [ANY23-45] - Improve issue verification support in Extractor tests
+    [ANY23-50] - Simplify plugin loading avoiding the classpath scanning
+    [ANY23-56] - Change repo-ext to Any23 SVN mirror repo.
+    [ANY23-63] - The Any23 web service doesn't return the Issue Report generated by activated Extractors, hiding major metadata issues
+    [ANY23-64] - Improve CLI usage aesthetics
+    [ANY23-70] - Establish searchable list archives
+    [ANY23-71] - improve the current CLI engine
+    [ANY23-74] - Disable domain triple generation in default configuration
+    [ANY23-75] - Improve runtime of the Microdata extractor on documents with many relations.
+    [ANY23-76] - Improve runtime of the Microformat extractor on documents with many relations.
+    [ANY23-82] - Don't use explicit reference to Log4j classes
+    [ANY23-86] - Better logging in SiteCrawlerTest
+
+New Feature
+
+    [ANY23-9] - Prepare a dedicated homepage for Any23
+    [ANY23-29] - Migrate code base to ASF infrastructure
+    [ANY23-57] - Create Any23 History documentation and add to site
+    [ANY23-59] - Create KEYS file for Any23
+    [ANY23-68] - Create Powered By documentation/page
+    [ANY23-102] - Any23 DOAP file
+
+Task
+
+    [ANY23-21] - Migrate all packages and classes to ORG.APACHE.ANY23
+    [ANY23-27] - Import revisions r1547 to r1607 from Google Code SVN to ASF SVN
+    [ANY23-36] - Merge GCode specific CHANGES.txt report in main changes.xml
+    [ANY23-39] - Write Down Overall Architecture Document to help new developers maintaining the Any23 core
+    [ANY23-48] - Update Documentation (Site + READMEs) to reflect changes in shell script usage
+    [ANY23-52] - Remove non ASF logos from Any23 Service page
+    [ANY23-66] - Fix Javadoc
+
+==========================================================================
+
+                             Apache Any23 0.6.1
+                              Release Notes
+
+Fixes
+
+ * Improved MIMEType detection for CSV input. [172, 176]
+
+==========================================================================
+
+                             Apache Any23 0.6.0
+                              Release Notes
+
+Fixes
+
+ * Fixed several bugs. [151, 153, 154, 155, 156, 164, 168]
+ * Removed unused Apache Any23 dependencies. [162]
+ * Introduced parent POM dependencyManagement. [163]
+ * Minor code refactoring. [142]
+ * Updated project documentation. [161]
+
+Enhancements
+
+ * Added support for Microdata [114, 141, 144, 145, 152, 157]
+ * Added RDFa 1.1 support for new prefix specification. [143]
+ * Added CSV Extractor (RDFizer). [150, 165]
+ * Added HTML/META Extractor. [148, 149]
+ * Improved Configuration programmatic management. [147]
+ * Added several flags to control metadata triples generation. [146]
+ * Improved nesting relationship explicitation in Microformat extractors. [80]
+ * Major Extractor interface refactoring. [160, 167]
+ * Improved TagSoup Extractor based error reporting. [159]
+ * Added command-line tool to print out the Apache Any23 declared vocabularies. [114]
+
+==========================================================================
+
+                              Apache Any23 0.6.0-M2
+                                Release Notes
+
+The release 0.6.0-M2 introduces major fixes to the M1 milestone
+[154, 155, 156] and improves Configuration [147] and Microdata
+error management [157].
+
+==========================================================================
+
+                             Apache Any23 0.6.0-M1
+                               Release Notes
+
+The release 0.6.0-M1 is an early preview of the
+Microdata support. [114]
+
+==========================================================================
+
+                             Apache Any23 0.5.0
+                              Release Notes
+
+Fixes
+
+ * Fixed wrong conversion of a generic XML file to RDF. [131]
+ * Fixed usage of 'base' tag when resolving relative URIs
+   in RDFa. [75]
+ * Fixed error parsing Turtle data. [87]
+ * Fixed issue with escaping in NQuads parser. [126]
+ * Fixed XML DTD validation attempt. [95]
+ * Fixed concurrent modification exception in
+   ExtractionContentBlocker filter. [86]
+ * Fixed mime type detection of direct input when source
+   contains blank chars. [83, 90]
+ * Fixed reporting when producing no triples. [79]
+ * Fixed any23-service packaging, added profile for excluding
+   embedded dependencies. [113]
+
+Enhancements
+
+ * Improved extraction report: added list of 
+   activated extractors. [89]
+ * Improved extraction of HTML link element. [133]
+ * Added XPath HTML extractor. [124]
+ * Added HRecipe Microformat extractor. [103]
+ * Added plugin support for Apache Any23. [111]
+ * Implemented HTML Scraper Plugin. [123]
+ * Upgraded to Sesame 2.4.0. [136]
+ * Upgraded to Jetty 8.0.0 [138]
+ * Upgraded maven-site-plugin. [85]
+ * Added flags to exclude metadata triples [134]
+ * Added removal of CSS related triples. [135]
+ * Improved overall documentation. [130]
+ * Overall POM refactoring. [125]
+
+==========================================================================
+
+                             Apache Any23 0.4.0 
+                              Release Notes
+
+* The any23-service module has been separated from the any23-core module,
+  the Ant build system has been dropped. [Issue 44]
+* Added support for HTML metadata (RDFa / Microformats) validation
+  and correction (validator). [Issue 77]
+* Added flag to disable the nesting relationship property 
+  enrichment. [Issue 67]
+* Improved coverage of Microformats tests. [Issue 65]
+* Improved documentation. [Issue 44]
+* Various code consolidation. [Issues 68, 69, 70, 71, 72, 73, 74, 77]
+
+==========================================================================
+
+                             Apache Any23 0.3.0
+                              Release Notes
+
+* Added detection and enrichment of nested microformats. [Issue #61]
+* Added detection and support of N-Quads as input and output format. [Issue #7]
+* General Improvements in RDFa extraction. [Issue #12, Issue #14]
+* Added support of Turtle embedded in HTML script tag. [Issue #62]
+* Improvement in encoding support. [Issue #43]
+* Improvement in Core API. [Issue #27]
+* Improved support for Species Microformat. [Issue #63]
+* General Code prettification.
+
+==========================================================================
+
+                             Apache Any23 0.2.2
+                              Release Notes
+
+* Fixed dependency management on Maven. A second level dependency of Xerces
+  introduced a conflict on the javax.xml.transform API, causing wrong XSLT
+  transformations within the RDFa extractor.
+
+==========================================================================
+
+                             Apache Any23 0.2.1
+                              Release Notes
+
+* Major fix on Tika configuration management. This fix solves the
+  auto detection of the main Semantic Web related formats.
+
+==========================================================================
+
+                            Apache Any23 0.2
+                             Release Notes
+
+============
+Introduction
+============
+
+This release features a redesigned API and incorporates enhancements and
+bug fixes that have accumulated since the 0.1 release.
+Apart from some new or changed dependencies on the underlying libraries,
+this version comes with improved unit test coverage and other features
+such as automatic charset encoding detection and improved documentation.
+A Maven build system has been introduced.
+
+
+==================================
+Summary of major changes since 0.1
+==================================
+
+* Redesigned Java API
+    - Input from string, stream, file, or URI
+    - Allow choosing which extractors to use
+    - Report origin of triples (document/extractor) to client processors
+    - Various processors/serializers for extracted triples
+* Added flexible command-line tool for easy testing
+* Vastly improved website and documentation
+* Media type and encoding detection via Apache Tika
+* Switched RDF library from Jena to Sesame
+* Added Maven build
+* Better RDF extraction from Microformats
+* Extractors now come with an example file to document typical input and output
+* Major refactoring
+* Lots and lots of bugfixes
+
+=================
+Supported formats
+=================
+
+* RDF/XML
+* Notation3 and Turtle
+* N-Triples
+* RDFa
+
+Various microformats; see http://sindice.com/developers/microformat for Sindice Microformats support.
+
+===================
+Dependency Upgrade
+===================
+
+CyberNeko Html parser has been upgraded to 1.9.14.
+
+Apache Tika 0.3 has been replaced with 0.6, which brings the
+new support for automatic encoding detection.
+
+EOF
+
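Note on the 0.2.2 entry above (Xerces second-level dependency conflicting on the javax.xml.transform API): a conflict of that kind can usually be spotted by asking JAXP which TransformerFactory implementation it actually selected and where that class was loaded from. The following is only a minimal diagnostic sketch using standard JDK calls; the class name TransformerProbe and its two helper methods are hypothetical and are not part of Any23.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper (not part of Any23): reports which XSLT
// implementation JAXP resolves on the current classpath.
public class TransformerProbe {

    // Fully qualified class name of the TransformerFactory that JAXP selected.
    public static String implementationName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Location (typically a jar URL) the implementation class was loaded from.
    // Classes provided by the JDK itself may report no code source.
    public static String implementationSource() {
        CodeSource source = TransformerFactory.newInstance()
                .getClass()
                .getProtectionDomain()
                .getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(JDK built-in)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("TransformerFactory: " + implementationName());
        System.out.println("Loaded from:        " + implementationSource());
    }
}
```

If the reported location points at a jar pulled in transitively rather than at the JDK or the intended XSLT library, an exclusion or a pinned version in the POM is the usual next step; running `mvn dependency:tree` shows which dependency brought the jar in.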


