Subject: svn commit: r41520 - /dev/any23/2.4/
Date: Mon, 21 Sep 2020 03:02:49 -0000
To: commits@any23.apache.org
From: lewismc@apache.org
Reply-To: any23-dev@any23.apache.org
Message-Id: <20200921030249.9365717A1C9@svn01-us-east.apache.org>

Author: lewismc
Date: Mon Sep 21 03:02:49 2020
New Revision: 41520

Log: Stage Apache Any23 RC#1 dev artifacts

Added: dev/any23/2.4/ dev/any23/2.4/RELEASE-NOTES.txt
dev/any23/2.4/apache-any23-2.4-src.tar.gz (with props) dev/any23/2.4/apache-any23-2.4-src.tar.gz.asc dev/any23/2.4/apache-any23-2.4-src.tar.gz.sha512 dev/any23/2.4/apache-any23-2.4-src.zip (with props) dev/any23/2.4/apache-any23-2.4-src.zip.asc dev/any23/2.4/apache-any23-2.4-src.zip.sha512 dev/any23/2.4/apache-any23-cli-2.4.tar.gz (with props) dev/any23/2.4/apache-any23-cli-2.4.tar.gz.asc dev/any23/2.4/apache-any23-cli-2.4.tar.gz.sha512 dev/any23/2.4/apache-any23-cli-2.4.zip (with props) dev/any23/2.4/apache-any23-cli-2.4.zip.asc dev/any23/2.4/apache-any23-cli-2.4.zip.sha512 dev/any23/2.4/check.sh (with props) Added: dev/any23/2.4/RELEASE-NOTES.txt ============================================================================== --- dev/any23/2.4/RELEASE-NOTES.txt (added) +++ dev/any23/2.4/RELEASE-NOTES.txt Mon Sep 21 03:02:49 2020 @@ -0,0 +1,738 @@ + Apache Any23 2.4 + Release Notes + 20/09/2020 (dd/mm/yyyy) + +Sub-task + + [ANY23-146] - CEN Metalex Vocabulary + [ANY23-149] - Expand SCHEMAORG Vocab + [ANY23-150] - Implement all vocab.sindice.net Vocabularies + [ANY23-269] - Support auto.schema.org + [ANY23-270] - Support bib.schema.org + +Bug + + [ANY23-427] - http://semanticweb.org/ down causes tests to fail + [ANY23-428] - RDFa parse issue if vocab not defined with trailing slash + [ANY23-430] - Microdata and HTML's attribute case + [ANY23-441] - TikaEncodingDetector: guessEncoding may throw an ArrayIndexOutOfBoundsException + [ANY23-446] - Fix bugs in Jsoup + [ANY23-449] - Fix the online microdata test failure + [ANY23-453] - Upgrade jsonld-java to 0.13.1 + +New Feature + + [ANY23-5] - Add support for archive input. 
+ [ANY23-6] - Integrate MetaX support + +Improvement + + [ANY23-51] - Full support for rel-tag's + [ANY23-178] - Add fully annotated Javadoc to o.a.any23.source.* + [ANY23-183] - Address javac warning's in Any23 code base + [ANY23-202] - Add analytics on any23.org landing page + [ANY23-254] - Demo frontend should provide interactive CLI usage examples + [ANY23-281] - Build Policeman's Forbidden API Checker into Maven config + [ANY23-426] - Address Javadoc WARNING's + [ANY23-439] - Replace commons-lang with commons-lang3 + [ANY23-440] - any23 configuration documentation has a wrong property name + [ANY23-442] - Move HTML preprocessing logic from BaseRDFExtractor to semargl Extractors + [ANY23-443] - Improve efficiency of RDFa Extractor + [ANY23-444] - Update all dependencies and plugins + [ANY23-450] - Update Maven deps and plugin versions + +Wish + + [ANY23-225] - Fix Javadoc WARNING's in Any23 codebase + +Task + + [ANY23-72] - Evaluate the introduction of Aether as to improve the Any23 plugin management system + [ANY23-429] - Website Build Fails due to Javadoc issues + [ANY23-431] - Upgrade jsoup to v1.12.1 + [ANY23-432] - Upgrade owlapi to v5.1.11 + [ANY23-433] - Upgrade rdf4j to v3.0.0 + [ANY23-434] - Upgrade tika to v1.22 + [ANY23-435] - Upgrade httpclient to v4.5.10 + [ANY23-436] - Upgrade commons-csv to v1.7 + [ANY23-437] - Upgrade snakeyaml to v1.25 + [ANY23-438] - Upgrade slf4j-api to v1.7.28 + [ANY23-448] - Move service and plugins out of core + + Apache Any23 2.3 + Release Notes + 10/02/2019 (dd/mm/yyyy) + +Sub-task + + [ANY23-184] - Update Javadoc in o.a.a.extractor.microdata.* + [ANY23-356] - Update dependencies + [ANY23-357] - Resolve mockito deprecation warnings + [ANY23-358] - Resolve junit.framework deprecation warnings & RDFa11Parser deprecation warnings + [ANY23-359] - Resolve org.apache.commons.io.IOUtils deprecation warning + [ANY23-360] - Resolve Xerces deprecation warnings + [ANY23-361] - Resolve Tika deprecation warning + [ANY23-362] - Resolve 
rdf4j deprecation warnings + [ANY23-363] - Update httpclient/httpcore to version 4.5.6/4.4.10 + [ANY23-364] - Resolve POI deprecation warnings + [ANY23-365] - Resolve additional warnings + [ANY23-366] - Resolve additional warnings in build + [ANY23-369] - Resolve overlapping classes + [ANY23-388] - It should be possible to configure the NTriplesWriter to use unicode points + [ANY23-404] - Make MicrodataExtractor compliant with default registry + [ANY23-405] - Parse microdata property values correctly + [ANY23-407] - Allow microdata itemids to be created from relative URLs + [ANY23-408] - Use document IRI as default namespace in microdata strict mode + [ANY23-409] - Allow multiple microdata itemtype values + [ANY23-410] - Fix microdata itemrefs + +Bug + + [ANY23-13] - Verify why the maven-changelog-plugin doesn't work properly + [ANY23-16] - Property URI generation for Microdata/schema.org + [ANY23-17] - problem detecting media type for turtle content with comment at the top + [ANY23-55] - any23 is not following the redirection + [ANY23-67] - Microdata extraction using obsolete RDF conversion scheme + [ANY23-154] - Not able to extract microdata in few test cases + [ANY23-167] - Microdata itemscope properties incorrectly attached + [ANY23-169] - Incorrect interpretation of relative and absolute paths with Microdata + [ANY23-188] - NPE when ICBMExtractor#getDescription()#getExtractorLabel() called + [ANY23-237] - Fix RDFa test 0087: stylesheet reserved word is stripped out + [ANY23-245] - Infinite loop on some malformed markup + [ANY23-322] - Any23 embedded service is broken + [ANY23-329] - master branch broken with pom.xml any23 version + [ANY23-331] - Tool service implementations declared in wrong module? 
+ [ANY23-334] - SingleDocumentExtraction.createExtractionContext() uses UUID as defaultLanguage + [ANY23-336] - Parsing json-ld content takes prohibitively long time + [ANY23-337] - BenchmarkTripleHandler does not report accurate extraction interval times + [ANY23-338] - Json-ld comment parsing fails in rare cases + [ANY23-339] - Microdata extractor can sometime merge two different itemscopes into one + [ANY23-340] - Any23 extraction does not pass Nutch plugin test + [ANY23-344] - MicrodataExtractor not resolving urls correctly + [ANY23-345] - MicrodataExtractorTest has a duplicated test + [ANY23-346] - rdf4j versions 2.3.0, 2.3.1 contain a regression: we need to switch back to version 2.2.4 + [ANY23-347] - RDFParseException: the prefix "pw" is not bound + [ANY23-348] - IllegalArgumentException in MicrodataExtractor + [ANY23-349] - MicrodataExtractor errors for links that are telephone numbers + [ANY23-350] - RDFParseException: "icon" must be followed by ' = ' character + [ANY23-351] - NullPointerException in HCardExtractor + [ANY23-353] - RDFParseException: datatype rdf:langString requires a language tag + [ANY23-367] - latest.stable.released property is never used and out of date + [ANY23-368] - Jenkins builds are failing after running out of disk space + [ANY23-372] - LGPL-licensed transitive dependency + [ANY23-373] - Web page /install.html: software version variable was not decoded. 
+ [ANY23-376] - IllegalArgumentException: invalid property name '' + [ANY23-377] - Microdata extractor replaces empty strings with "Null" + [ANY23-378] - JsonParseException caused by trailing commas in JSON-LD + [ANY23-379] - RDFa SAXParseException: invalid XML character + [ANY23-380] - RDFa SAXParseException: attribute was already specified + [ANY23-381] - JsonParseException: Illegal unquoted character + [ANY23-382] - Distinguish between fatal and recoverable json-ld parsing errors + [ANY23-383] - JsonParseException: Unexpected character 0x2028 + [ANY23-386] - Item's properties are in the wrong item since the 2.2 + [ANY23-387] - Possible OutOfMemoryError with bad deeply nested HTML + [ANY23-389] - RDFa extraction breaks when base element uses relative href + [ANY23-391] - ICAL vocab uses class "vcalendar" instead of "Vcalendar" + [ANY23-392] - Launching maven-jetty-plugin: Problem accessing /apache-any23-service/resources/form.html + [ANY23-395] - any23.org 500 Internal Server Error + [ANY23-406] - Cannot suppress Tika warnings + [ANY23-411] - Use Content-Type to help determine encoding + [ANY23-415] - NTriplesExtractor tries all text/plain files, causing numerous fatal issues + [ANY23-416] - NTriplesExtractor does not recognize "application/n-triples" mimetype + [ANY23-420] - Handle Json+ld extraction failure + [ANY23-425] - iCal, jCal, xCal extractors aren't listed in META-INF/services + +New Feature + + [ANY23-81] - Interactive web service + +Improvement + + [ANY23-38] - Use a single logging tool: slf4j + [ANY23-190] - any23.org homepage busted on IE11 + [ANY23-212] - Improve naming convention for service output files + [ANY23-215] - Forward slashes in URL's should not be escaped in RDF output + [ANY23-231] - Make JSON Reporting output pretty print + [ANY23-240] - Option to process html tags as spaces in Microdata + [ANY23-323] - Update Eclipse RDF4J version to 2.3 + [ANY23-332] - Plugin-specific properties shouldn't be declared in 
default-configuration.properties + [ANY23-341] - Remove dependency on defunct commons-httpclient 3.1 + [ANY23-343] - Upgrade to jsonld-java v 0.12.0 + [ANY23-352] - Update to rdf4j version 2.3.2 + [ANY23-354] - Clean up dependencies + [ANY23-355] - Deprecate RDFa11Parser since Rio implementations are used instead + [ANY23-374] - Invalid nested item takes out everything + [ANY23-385] - Improve charset detection for (x)html documents + [ANY23-390] - Implement ICal, JCal, XCal extractors + [ANY23-393] - Any23 master to build under JDK 10.X + [ANY23-394] - JSON-LD Extractions Flag Errors in Google's Structured Data Tooling + [ANY23-396] - Overhaul WriterFactory API + [ANY23-399] - Upgrade Apache parent POM to version 21 + [ANY23-401] - Upgrade to Tika 1.19.1 + [ANY23-402] - Deprecate JSONWriter, JSONWriterFactory + [ANY23-403] - Upgrade to RDF4J 2.4.0 + [ANY23-414] - Support reverse itemprops in microdata + [ANY23-418] - Take another look at encoding detection + [ANY23-419] - Add J2EE dependencies such that service runs under JDK11 + [ANY23-424] - Update dependencies + +Test + + [ANY23-422] - Error message when any23 cli tool used + +Task + + [ANY23-333] - Augment use of Any23PluginManager in How to Register a Plugin documentation + [ANY23-423] - Update POM for the move to gitbox. 
+ + Apache Any23 2.2 + Release Notes + 25/01/2018 (dd/mm/yyyy) + +Sub-task + + [ANY23-155] - Test failure: testRunOnHTTPResource(org.apache.any23.cli.MicrodataParserTest) + [ANY23-267] - Entire extractions fail due to "The element type 'meta' must be terminated by the matching end-tag " + [ANY23-268] - Entire extraction task fails due to "Element type "t.length" must be followed by either attribute specifications, ">" or "/>" + +Bug + + [ANY23-12] - characters are wrongly encoded in rdfxml output + [ANY23-131] - Nested Microdata are not extracted + [ANY23-140] - Revise Any23 tests to remove fetching of web content + [ANY23-166] - Parsing crashes with attributes that don't use quotes + [ANY23-201] - Service Regularly Times Out on DBPedia Queries + [ANY23-227] - not extracting opengraph rdfa + [ANY23-228] - Invalid URI + [ANY23-230] - any23.org redirects to single slash URI + [ANY23-256] - MicrodataParserTest failing locally but not on Jenkins + [ANY23-260] - Get Any23 listed as an Application capable of using DBPedia + [ANY23-266] - Fix Issues with Failing WebService Examples + [ANY23-271] - Address "...The entity "raquo" was referenced, but not declared" SAXParseException + [ANY23-273] - The content of elements must consist of well-formed character data or markup - no bogus comments + [ANY23-303] - JsonLdError: loading remote context failed: http://schema.org/ + [ANY23-306] - Absent binaries for version 2.0 + [ANY23-312] - Triple sub-pred-null should not be added into outcome. Change traversing method. 
+ [ANY23-314] - Service fails to return extraction in case of extraction error + [ANY23-316] - Yaml parser does not handle intentional null value + [ANY23-317] - Any23 fails when dealing with JavaScript + [ANY23-318] - ExtractionException handling in BaseRDFExtractor.java kills entire extraction + [ANY23-326] - parsing unclosed meta and input tags fails + +New Feature + + [ANY23-8] - Write a separate tool for RDFa/microformat detection tool usable in crawlers + [ANY23-233] - Add local extraction cache to Any23 service + +Improvement + + [ANY23-106] - Gracefully shut down Any23 service + [ANY23-213] - Implement JSON reporting for the Any23 service + [ANY23-214] - ë (e-umlaut or diaeresis) not decoded in RDF output + [ANY23-249] - Update all W3C and other Standards Compliance within Any23 + [ANY23-280] - Refactor ContentExtractor to improve extraction flexibility + [ANY23-291] - JSON-LD should be looked up in entire HTML document, not just in + [ANY23-298] - Revisit the OGP.java vocabulary and update it + [ANY23-309] - "Scraper" misspelled as "Scarper" on Downloads webpage + [ANY23-319] - Upgrade jsonld-java dependency to 0.11.1 + [ANY23-324] - Replace net.sourceforge.nekohtml with jsoup + [ANY23-325] - Any23 incompatible with http://rdfa.info/test-suite/# + +Test + + [ANY23-320] - Address @Ignore tests in Any23 + +Wish + + [ANY23-210] - Address 1.0 Release Review Discrepancies + +Task + + [ANY23-40] - Complete Documentation for Plugin Management system + + + Apache Any23 2.1 + Release Notes + 14/09/2017 (dd/mm/yyyy) + +Bug + + [ANY23-244] - Broken Links on Web-Site + [ANY23-282] - Replacement for all Sindice namespaces and URI's + [ANY23-304] - Add extractor for OpenIE + [ANY23-305] - Missing appender in command line tool + [ANY23-308] - Adding option "-d" to yaml file parsing gives error + [ANY23-310] - Rover displays wrong statistical values + +Improvement + + [ANY23-206] - Overhaul Any23 site documentation + [ANY23-301] - Forward all logs into STDERR stream + 
+New Feature + + [ANY23-257] - Support OWL as an input format + +Task + + [ANY23-283] - access to analysis.apache.org + + Apache Any23 2.0 + Release Notes + 03/02/2017 (dd/mm/yyyy) +Sub-task + + [ANY23-243] - Overhaul and update README.txt + +Bug + + [ANY23-79] - No execute permissions in command line tool + [ANY23-92] - NQuadsParser does not require whitespace between elements + [ANY23-99] - NQuadsWriter should force ASCII in OutputStream constructor + [ANY23-153] - Automatically Generate EARL reports for Any23 RDF Parsers + [ANY23-176] - DOC: Apache Any23 Installation Guide + [ANY23-200] - Build revision is not correctly defined + [ANY23-219] - rover does not work with -f nquads option + [ANY23-235] - NQuads links broken on Supported Formats Page + [ANY23-236] - Port Any23 site to Apache CMS + [ANY23-248] - NTriplesWriter on hadoop : issue with MIME type/Upgrade sesame dependencies to 2.7.14 + [ANY23-252] - JSON-LD format MIME type is not detected + [ANY23-253] - JSON-LD cannot be processed by Rover + [ANY23-255] - apache-any23-quads dependency should not be test in core pom.xml + [ANY23-265] - ThreadSafety issue in ItemPropValue + [ANY23-272] - Service fails to start with any23server.bat + [ANY23-277] - Any23 master branch will not build due to lacking maven-assembly-plugin + [ANY23-279] - Fix EmbeddedJSONLDExtractor ExtractorDescription getDescription() implementation + [ANY23-296] - Tar complains about groupid value being too big + [ANY23-302] - rover JSON output is not valid + +Improvement + + [ANY23-80] - Split out command line tools into a separate module + [ANY23-163] - VocabPrinter tool broken with No writer factory available for RDF format N-Quads (mimeTypes=text/x-nquads; ext=nq) + [ANY23-185] - Add missing element attributes to HTMLMetaExtractor + [ANY23-207] - Implement Microformats2 + [ANY23-246] - Add Open Graph Protocol and Facebook prefixes to popular.prefixes + [ANY23-247] - FIX Attribute name "itemscope" associated with an element 
type "html" must be followed by the ' = ' character. + [ANY23-250] - Upgrade to Tika 1.7 + [ANY23-261] - Tiny typo in Data Extraction documentation source example + [ANY23-263] - Upgrade to Tika 1.14 + [ANY23-274] - Change any23.microdata.ns.default configuration value to http://schema.org + [ANY23-276] - Upgrade sesame dependencies to RDF4J + [ANY23-278] - Upgrade all Maven plugin versions in parent pom.xml + [ANY23-293] - Package log4j configuration with core appassembler + [ANY23-297] - Any23 doesn't build under JDK1.8 + [ANY23-299] - Missing YAML to RDF parser + [ANY23-300] - Ignore NetBeans configuration files + +Task + + [ANY23-141] - Upgrade OpenRDF Sesame to 2.7.0 + [ANY23-242] - Address issues with 1.1 #1 RC + +Wish + + [ANY23-19] - Abstract away any specific RDF APIs + [ANY23-226] - Extract JSON-LD embedded in HTML + + Apache Any23 1.1 + Release Notes + 15/10/2014 (dd/mm/yyyy) +Bug + + [ANY23-205] - Remove xrefs from Any23 site and replace with Git(hub) links + [ANY23-220] - Run crawler plugin on Apache Any23 site + [ANY23-234] - No writer factory available for RDF format N-Quads (mimeTypes=text/x-nquads; ext=nq) + +Improvement + + [ANY23-157] - Update Any23 site to accommodate move to Git. + [ANY23-197] - Extract embedded json-ld from html documents + [ANY23-204] - fix url encoding problem : PR#3 + [ANY23-209] - Bug in site generation + [ANY23-221] - Enable JSON-LD as an input format for the WebService at any23.org + [ANY23-238] - Fix generation of BNode name for microdata when 'itemid' is given without a value. 
+ +New Feature + + [ANY23-7] - Performance test suite + [ANY23-160] - [SECURITY] Frame injection vulnerability in published Javadoc + +Task + + [ANY23-222] - Push 1.1-SNAPSHOT artifacts to the Any23 website + + + Apache Any23 1.0 + Release Notes + 09/05/2014 (dd/mm/yyyy) + +Sub-task + + [ANY23-148] - Programmes Ontology + +Bug + + [ANY23-100] - Issue with RDFa extractor while processing nested properties + [ANY23-135] - Any23 RDFa Extractor ignores multiple prefix and property statements + [ANY23-136] - Some RDFa tests have incorrect expected results + [ANY23-168] - RDFa properties in elements not picked up + [ANY23-170] - Dependency error org.apache.commons:commons-csv:1.0-SNAPSHOT-rev1148315 + [ANY23-172] - Fix minor issues with Any23 0.9.0 RC + [ANY23-173] - Please delete old releases from mirroring system + [ANY23-174] - Incorrect RDFa extractions + [ANY23-203] - Update version revisions from 0.9.1 to 1.0 + +Improvement + + [ANY23-65] - Update to RDFa extraction stylesheet + [ANY23-128] - html-rdfa11 extractor fails on mailto: anchors + [ANY23-130] - Improve aesthetics of the output format when straying from default java.io.PrintStream + [ANY23-137] - RDFa parser implementation proposal + [ANY23-179] - Improve Javadoc and throwing of IllegalArgumentException in Any23#createDocumentSource + [ANY23-180] - Create an Apache hosted jail running an Any23 service instance + [ANY23-181] - Upgrade NekoHTML to 1.9.20 + +New Feature + + [ANY23-134] - Create o.a.a.extractor.tika Parser and Extractor implementations + [ANY23-177] - Add support for JSON-LD + +Task + + [ANY23-162] - Add package.java for all LKIFCore classes + + Apache Any23 0.9.0 + Release Notes + 28/10/2013 (dd/mm/yyyy) + +Sub-task + + [ANY23-142] - LKIF-Core Vocabulary + [ANY23-143] - LRICore Vocabulary + +Bug + + [ANY23-111] - Any23 raises an unmanaged exception from the Microdata parser + [ANY23-115] - Empty spans seem to break ANY23 + [ANY23-161] - Fix service file generation + [ANY23-165] - "Invalid 
content" error if TITLE precedes encoding declaration in the document + [ANY23-171] - form.html not in correct location in service. + +Improvement + + [ANY23-47] - Migrate basic-crawler classes to org.apache.nutch + [ANY23-164] - office-scraper ExcelExtractorFactory.java to accept application/x-tika-ooxml and application/x-tika-msoffice formats + +New Feature + + [ANY23-120] - Split CLI tools out into a new module + +Task + + [ANY23-122] - Cleanup Distribution Mirrors + + Apache Any23 0.8.0 + Release Notes + 01/05/2013 (dd/mm/yyyy) + +Sub-task + + [ANY23-109] - Missing tika-config.xml in o.a.a.mime + [ANY23-110] - DOAP Vocabulary + +Bug + + [ANY23-44] - error when parsing a document from http://www.afdsi.org/docs/test/html/RDFa/_food-stream_.htm + [ANY23-78] - Download page links are broken + [ANY23-108] - Broken schema.org microdata extraction + [ANY23-112] - Fix incubation disclaimer + [ANY23-113] - Remove dependencies from parent pom.xml file + [ANY23-116] - Empty values are skipped when reading tab separated CSV. + [ANY23-156] - Add logging dependencies to plugins and service + +Improvement + + [ANY23-2] - Add support for hreview-aggregate microformat. 
+ [ANY23-26] - Upgrade dependency to Apache Tika 1.2 + [ANY23-46] - Update Any23 web service + [ANY23-83] - Remove hardcoded formats throughout Any23 to make it useful as a library + [ANY23-101] - Use RDFFormat.NQUADS in nquads module + [ANY23-139] - Simplify site deploy plugging the maven-scm-publish-plugin + [ANY23-144] - Implement comprehensive naming of o.a.a.api.vocab classes + +New Feature + + [ANY23-4] - Integrate W3C's RDFa test suite and pass all tests + [ANY23-85] - Split NQuads out into its own module + [ANY23-96] - Add user agent string to basic-crawler + [ANY23-117] - Split Mime type detection out into its own module + [ANY23-118] - Split Encoding detection out into its own module + +Task + + [ANY23-41] - Write basic-crawler plugin documentation + [ANY23-125] - Drop the Incubating DISCLAIMER + + + Apache Any23 0.7.0-incubating + Release Notes + 25/06/2012 + +Sub-task + + [ANY23-25] - Update all Maven POM's in trunk + [ANY23-31] - Move any23 site documentation out of trunk and into its own SVN directory + [ANY23-53] - Bad Web Service documentation + +Bug + + [ANY23-14] - Add support for Extractor sub results + [ANY23-20] - The Any23 PluginManager fails handling resource paths containing spaces. 
+ [ANY23-34] - Plugin Integration Test Fails + [ANY23-37] - LGPL'ed components cannot be included in distribution packages + [ANY23-42] - Fix issue in RDFa11Parser.java is not resolving relative URIs correctly + [ANY23-49] - N3/NQ parsers ignoring stopAtFirstError flag + [ANY23-58] - HCardExtractor infinite loop and memory exhaustion + [ANY23-62] - ExtractionResultImpl loses all issues generated by sub extractions + [ANY23-73] - The ToolRunner CLI driver -p (--plugins-dir) option doesn't work because parsed after the Tool list loading + [ANY23-77] - Facing an infinite loop problem in version 0.6.1 - Verify + [ANY23-78] - Download page links are broken + [ANY23-87] - Bogus argument in o.a.a.cli.CrawlerTest + [ANY23-88] - any23 script -v or --version option doesn't display actual version + [ANY23-94] - The Microdata CLI tool doesn't work anymore + [ANY23-95] - Activate the IgnoreAccidentalRDFa filter for the Any23 Service instance + [ANY23-97] - The test suite was not running all tests, minor regressions occurred + +Improvement + + [ANY23-18] - Add a new extractor for RDFa using java-rdfa + [ANY23-28] - Document munging of Any23 history to CHANGES.txt + [ANY23-32] - replace hardcoded bash script with generated via appassembler + [ANY23-33] - Replace proprietary SUN imports from Any23 classes. + [ANY23-45] - Improve issue verification support in Extractor tests + [ANY23-50] - Simplify plugin loading avoiding the classpath scanning + [ANY23-56] - Change repo-ext to Any23 SVN mirror repo. + [ANY23-63] - The Any23 web service doesn't return the Issue Report generated by activated Extractors, hiding major metadata issues + [ANY23-64] - Improve CLI usage aesthetics + [ANY23-70] - Establish searchable list archives + [ANY23-71] - improve the current CLI engine + [ANY23-74] - Disable domain triple generation in default configuration + [ANY23-75] - Improve runtime of the Microdata extractor on documents with many relations. 
+ [ANY23-76] - Improve runtime of the Microformat extractor on documents with many relations. + [ANY23-82] - Don't use explicit reference to Log4j classes + [ANY23-86] - Better logging in SiteCrawlerTest + +New Feature + + [ANY23-9] - Prepare a dedicated homepage for Any23 + [ANY23-29] - Migrate code base to ASF infrastructure + [ANY23-57] - Create Any23 History documentation and add to site + [ANY23-59] - Create KEYS file for Any23 + [ANY23-68] - Create Powered By documentation/page + [ANY23-102] - Any23 DOAP file + +Task + + [ANY23-21] - Migrate all packages and classes to ORG.APACHE.ANY23 + [ANY23-27] - Import revisions r1547 to r1607 from Google Code SVN to ASF SVN + [ANY23-36] - Merge GCode specific CHANGES.txt report in main changes.xml + [ANY23-39] - Write Down Overall Architecture Document to help new developers maintaining the Any23 core + [ANY23-48] - Update Documentation (Site + READMEs) to reflect changes in shell script usage + [ANY23-52] - Remove non ASF logos from Any23 Service page + [ANY23-66] - Fix Javadoc + +========================================================================== + + Apache Any23 0.6.1 + Release Notes + +Fixes + + * Improved MIMEType detection for CSV input. [172, 176] + +========================================================================== + + Apache Any23 0.6.0 + Release Notes + +Fixes + + * Fixed several bugs. [151, 153, 154, 155, 156, 164, 168] + * Removed unused Apache Any23 dependencies. [162] + * Introduced parent POM dependencyManagement. [163] + * Minor code refactoring. [142] + * Updated project documentation. [161] + +Enhancements + + * Added support for Microdata [114, 141, 144, 145, 152, 157] + * Added RDFa 1.1 support for new prefix specification. [143] + * Added CSV Extractor (RDFizer). [150, 165] + * Added HTML/META Extractor. [148, 149] + * Improved Configuration programmatic management. [147] + * Added several flags to control metadata triples generation. 
[146] + * Improved nesting relationship explicitation in Microformat extractors. [80] + * Major Extractor interface refactoring. [160, 167] + * Improved TagSoup Extractor based error reporting. [159] + * Added command-line tool to print out the Apache Any23 declared vocabularies. [114] + +========================================================================== + + Apache Any23 0.6.0-M2 + Release Notes + +The release 0.6.0-M2 introduces major fixes on M1 milestone +[154, 155, 156] and improves Configuration [147] and Microdata + error management[157]. + +========================================================================== + + Apache Any23 0.6.0-M1 + Release Notes + +The release 0.6.0-M1 is an early preview of the +Microdata support. [114] + +========================================================================== + + Apache Any23 0.5.0 + Release Notes + +Fixes + + * Fixed wrong conversion of a generic XML file to RDF. [131] + * Fixed usage of 'base' tag when resolving relative URIs + in RDFa. [75] + * Fixed error parsing Turtle data. [87] + * Fixed issue with escaping in NQuads parser. [126] + * Fixed XML DTD validation attempt. [95] + * Fixed concurrent modification exception in + ExtractionContentBlocker filter. [86] + * Fixed mime type detection of direct input when source + contains blank chars. [83, 90] + * Fixed reporting when producing no triples. [79] + * Fixed any23-service packaging, added profile for excluding + embedded dependencies. [113] + +Enhancements + + * Improved extraction report: added list of + activated extractors. [89] + * Improved extraction of HTML link element. [133] + * Added XPath HTML extractor. [124] + * Added HRecipe Microformat extractor. [103] + * Added plugin support for Apache Any23. [111] + * Implemented HTML Scraper Plugin. [123] + * Upgraded to Sesame 2.4.0. [136] + * Upgraded to Jetty 8.0.0 [138] + * Upgraded maven-site-plugin. 
[85] + * Added flags to exclude metadata triples [134] + * Added removal of CSS related triples. [135] + * Improved overall documentation. [130] + * Overall POM refactoring. [125] + +========================================================================== + + Apache Any23 0.4.0 + Release Notes + +* The any23-service module has been separated from the any23-core module, + the Ant build system has been dropped. [Issue 44] +* Added support for HTML metadata (RDFa / Microformats) validation + and correction (validator). [Issue 77] +* Added flag to disable the nesting relationship property + enrichment. [Issue 67] +* Improved coverage of Microformats tests. [Issue 65] +* Improved documentation. [Issue 44] +* Various code consolidation. [Issues 68, 69, 70, 71, 72, 73, 74, 77] + +========================================================================== + + Apache Any23 0.3.0 + Release Notes + +* Added detection and enrichment of nested microformats. [Issue #61] +* Added detection and support of N-Quads as input and output format. [Issue #7] +* General Improvements in RDFa extraction. [Issue #12, Issue #14] +* Added support of Turtle embedded in HTML script tag. [Issue #62] +* Improvement in encoding support. [Issue #43] +* Improvement in Core API. [Issue #27] +* Improved support for Species Microformat. [Issue #63] +* General Code prettification. + +========================================================================== + + Apache Any23 0.2.2 + Release Notes + +* Fixed dependency management on Maven. A second level dependency of Xerces + introduced a conflict on the java.xml.transform API causing wrong XSLT + transformations within RDFa extractor. + +========================================================================== + + Apache Any23 0.2.1 + Release Notes + +* Major fix to Tika configuration management. This fix solves the + auto detection of the main Semantic Web related formats. 
+ +========================================================================== + + Apache Any23 0.2 + Release Notes + +============ +Introduction +============ + +This release features a redesigned API and incorporates enhancements and +bug fixes that have accumulated since the 0.1 release. +Apart from some new or changed dependencies on the underlying libraries, +this version comes with an improved unit test coverage and other features +like the automatic charset encoding detection and an improved documentation. +Maven build system has been introduced. + + +================================== +Summary of major changes since 0.1 +================================== + +* Redesigned Java API + - Input from string, stream, file, or URI + - Allow choosing which extractors to use + - Report origin of triples (document/extractor) to client processors + - Various processors/serializers for extracted triples +* Added flexible command-line tool for easy testing +* Vastly improved website and documentation +* Media type and encoding detection via Apache Tika +* Switched RDF library from Jena to Sesame +* Added Maven build +* Better RDF extraction from Microformats +* Extractors now come with an example file to document typical in- and output +* Major refactoring +* Lots and lots of bugfixes + +================= +Supported formats +================= + +* RDF/XML +* Notation3 and Turtle +* N-Triples +* RDFa + +Various microformats, see http://sindice.com/developers/microformat on Sindice Microformats support. + +=================== +Dependency Upgrade +=================== + +CyberNeko Html parser has been upgraded to 1.9.14. + +Apache Tika 0.3 has been replaced with 0.6, with the +new support for the automatic encoding detection. + +EOF + Added: dev/any23/2.4/apache-any23-2.4-src.tar.gz ============================================================================== Binary file - no diff available. 
Propchange: dev/any23/2.4/apache-any23-2.4-src.tar.gz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: dev/any23/2.4/apache-any23-2.4-src.tar.gz.asc ============================================================================== --- dev/any23/2.4/apache-any23-2.4-src.tar.gz.asc (added) +++ dev/any23/2.4/apache-any23-2.4-src.tar.gz.asc Mon Sep 21 03:02:49 2020 @@ -0,0 +1,16 @@ +-----BEGIN PGP SIGNATURE----- + +iQIzBAABCgAdFiEE23tRmRIcCKXI9AUrOkcX8Ei66/YFAl9oEsoACgkQOkcX8Ei6 +6/YC0BAAlyAqWrg3TZBgO4g1cKPqemifVX3/SGE+US6ddKNITebSIWDutXjXPzlg +2rGmwvYGYoscjetU8XcZo/UWXUadNv0KU8C0bqTW5EWRIT7yVks8rbQ0WyOehrsB +YpizRzC/IWdEg/xS1zkngYdRh2mKtF4aFdM5b0brJDlrrmW3Kbp85FpA8KrzimwT +qfSEndd4Bdos7RbBa1W4pGDK9CQmgkCaoANg0LKjiTIsoFLA7nGIaq/xHr1ConWb +XQp0NZxUL5acCmKXZFL+yS798ymEuyOligGehc2BdI/DPIoBi5vrC2SZfu1aU3Js +TU+VeScDbg89mg1tGuKFCh//BBgo3YFytiESq1IJDnW2Bj3dATmuVSTudyhd0f6A +J2edPPjzYUFujhEzazBG+GGDA5oI8P2FrPV5+FvfpcQtGiazMVwvxKMvfXphVylO +FSM1CxPeD4EXTRBQhC0vAYEd0rMTQwtw8cnYaHMc5YjgIZMUnEPo7ksHeAVvYX4t +5sZfAIBxLMH0T5zbHoq8prqfkvh4sfspPtT8/sHirknyE/PoOrK6HXJ0jPX5v1/4 +ec3677xva6DN049bWiu8A2KcxzOW84lvOipFjThCOHptb9IPZrE88QSRntGBnVpQ +Ay2K7gf25e6kc3rf3v3C6xddOL6lHAuh8TtaeILFvYI18kpL29o= +=Ug51 +-----END PGP SIGNATURE----- Added: dev/any23/2.4/apache-any23-2.4-src.tar.gz.sha512 ============================================================================== --- dev/any23/2.4/apache-any23-2.4-src.tar.gz.sha512 (added) +++ dev/any23/2.4/apache-any23-2.4-src.tar.gz.sha512 Mon Sep 21 03:02:49 2020 @@ -0,0 +1,4 @@ +./apache-any23-2.4-src.tar.gz: 6FFA755B E07EFC64 2E45D14D 35E619E4 F9AD2663 + 2D29CE04 B80582BF 05FC5B0B F316895D D535E18C + 4BA5CB54 A19BED15 7A774F31 636A14ED 64F49669 + 555CBAAB Added: dev/any23/2.4/apache-any23-2.4-src.zip ============================================================================== Binary file - no diff available. 
Propchange: dev/any23/2.4/apache-any23-2.4-src.zip ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: dev/any23/2.4/apache-any23-2.4-src.zip.asc ============================================================================== --- dev/any23/2.4/apache-any23-2.4-src.zip.asc (added) +++ dev/any23/2.4/apache-any23-2.4-src.zip.asc Mon Sep 21 03:02:49 2020 @@ -0,0 +1,16 @@ +-----BEGIN PGP SIGNATURE----- + +iQIzBAABCgAdFiEE23tRmRIcCKXI9AUrOkcX8Ei66/YFAl9oEsoACgkQOkcX8Ei6 +6/aV4g//Yq4S/JKUMei3vxUpBkK94rC1m9qqes+as7w/SjDiM1bAAcR80EV/UaqW +rwO5PJAxr2ckYKrsDCs7d8DWcXlov0Yga0GNiRhHu2tcZnKkF1K/VQhHtQrvTW/T +lv4mM8FUeoCIWtVM4uoIz/eL1CcsVb6/cYxWscZ0+nU+vMI0cq9n/DcjUdNgzju2 +/AHbdevGyzRvWEJdl8a9lDKNDoEXzJT+8Ic7aU84NV9M4wZXvM1MuvyQr0P/gFMe +MRNJKPlDskqKh9A1e8LwUXJJRfLf8ZcvopKDWOcPvHoL8qyK6XqyN10kPG+b97aM +jJ3Y1EGrZ2/UosmJx4hLKvyUJ37me8H8lRUSuN7W2mbBSUBoZdT3mh70eHqOVrAB +0MNavIspLIzIM53PVF7fs7xLnkuGAU+PxBrejXTE5grKeayoBn/Eqnx9QcflZ++j +Pw01TN4h3twmcmCf+0sFq49Jj2XRDfuQSyoKLY3vEE3g+b1530jqSeDnx3kPx26/ +/wmw/f5wRPcUtc35oCdr403YD3ANOCl+ty6X6e2ID6fxy49EK72hcLMLQWb++OR/ +dm/QFNbrIvBg4xO0HmS9g5tzWYAWo1ATU+ouraiceRL6iIViGowbu2QOx0Gk8VLJ +6mD6On/NrYVqCsP4mEWRKTh7QU6YueRptjNRhjMf99o2yQNbJH0= +=f80S +-----END PGP SIGNATURE----- Added: dev/any23/2.4/apache-any23-2.4-src.zip.sha512 ============================================================================== --- dev/any23/2.4/apache-any23-2.4-src.zip.sha512 (added) +++ dev/any23/2.4/apache-any23-2.4-src.zip.sha512 Mon Sep 21 03:02:49 2020 @@ -0,0 +1,4 @@ +./apache-any23-2.4-src.zip: C0906AF6 B4730B9A 000E51B6 898456F4 AAF63A5C + 6B8D86B5 B50C3CF2 00EDD6B3 5E0DC64C C9AAD819 + 843CE9E2 176A07F0 E5BAC078 69EAAAB2 22683F78 + 03B2FF35 Added: dev/any23/2.4/apache-any23-cli-2.4.tar.gz ============================================================================== Binary file - no diff available. 
Propchange: dev/any23/2.4/apache-any23-cli-2.4.tar.gz ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: dev/any23/2.4/apache-any23-cli-2.4.tar.gz.asc ============================================================================== --- dev/any23/2.4/apache-any23-cli-2.4.tar.gz.asc (added) +++ dev/any23/2.4/apache-any23-cli-2.4.tar.gz.asc Mon Sep 21 03:02:49 2020 @@ -0,0 +1,16 @@ +-----BEGIN PGP SIGNATURE----- + +iQIzBAABCgAdFiEE23tRmRIcCKXI9AUrOkcX8Ei66/YFAl9oEwAACgkQOkcX8Ei6 +6/bkghAAot8kEXQDF47gZQhKf1k4REoZhQX0+iFbquVsF/4DZ4EYMpcRDkAJYGSa +b9++tznG48VJafdFJGuR9uxRUpwcksUWMz+wvQ+EpP7iRW056SworKj1sxrRnTfq +a//kNas2B+b5i/9aNf0cbQQZfOna58zdj5gvNGMKTjUKQz3F5k/JikrQ4MA0Kwbg +0gt0Bq+kkW+NILit3CP/Dr57Fn7FZOK0cuZu+/JpDuk39WkfIbAEmhoKEqJ9pwEC +kt2rpjdwAfXL2tyBPm2A4zp6UyNyWYGwdqZPTneBoeOow3UOrmcnrrt94DrZZ509 +J/ErhzkUs+h3Ma5d4d1U9r6P4SFFzcocxJXohreIImbtaiYzBE0jhzmeH9bf7JUV +cl6R3XF8bqL7QiWeFxCOvp1k+Rd5TslUTrNmVgBfjgdWLmddRvTJAW1W2xh3BzCW +NHWYXKGQEV0qKTcNjGKhEscNt0zDCCDmZBtGvvCUP9D4tQ1zL7YXiAaEL2RSIKPr +H3R6TCaf7kqfiRV+8w7wMD/Iy7H+aXVDtcfT4EcS6G0dcpjI1CAM/MZ5w+tHa5Zs +wfkGwZLUotkVT7TRhuP0v6NEMTe+kQLOkkHsz0Q3wX1zmz6G3nU9sneCusXbNQBC +MigaWorLacQI2DfXfZPcMlil4ZAyUPez+Woal+9gEy911HcSX6o= +=9P1K +-----END PGP SIGNATURE----- Added: dev/any23/2.4/apache-any23-cli-2.4.tar.gz.sha512 ============================================================================== --- dev/any23/2.4/apache-any23-cli-2.4.tar.gz.sha512 (added) +++ dev/any23/2.4/apache-any23-cli-2.4.tar.gz.sha512 Mon Sep 21 03:02:49 2020 @@ -0,0 +1,4 @@ +./apache-any23-cli-2.4.tar.gz: 97B3BEDD 83D35C67 9E22FE37 CA467F8E D51DF6FC + F207706C 33A4AC4A 4CE66E8E 830B642B 2B688BDA + 9BD650C3 0E0DBBCB BA10F738 D91C3FC0 48C3825A + 36626991 Added: dev/any23/2.4/apache-any23-cli-2.4.zip ============================================================================== Binary file - no diff available. 
Propchange: dev/any23/2.4/apache-any23-cli-2.4.zip ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: dev/any23/2.4/apache-any23-cli-2.4.zip.asc ============================================================================== --- dev/any23/2.4/apache-any23-cli-2.4.zip.asc (added) +++ dev/any23/2.4/apache-any23-cli-2.4.zip.asc Mon Sep 21 03:02:49 2020 @@ -0,0 +1,16 @@ +-----BEGIN PGP SIGNATURE----- + +iQIzBAABCgAdFiEE23tRmRIcCKXI9AUrOkcX8Ei66/YFAl9oEwIACgkQOkcX8Ei6 +6/YR/A/+MVFZ8k2U1Vtb/dM6Y/5NlCgc8INseP/qP3M0sXbS4dFmYo4qUfSJomaF +rOMFHCVzdkx4kcr+K+cVfJfas7mjJd+qhrTzpVauFD4s5mgBvbzM3Xwr49xSnLRd +l1tsmcEovfbWLZYwtBPYogBYm885SQUEYzoahgQDNsIuKlm/7+Nk9lQlqZvzmwdF +Fg45A6qYnHeGczzKmlox38ez8h+imGT1cDt7hxmEfQGZ7J19gmPtFmcDMbakvJFw +xqc17Gvj1bsMau+sgkmDhxM5JvXOAOiFrA+l6TV1SHdfuW1QuTdbFtKqZdNaKcTy +imbeV78AvvbBsQ8pPY2TzfefTE8eYqPXV6MYeZpZOD0erZqHqAIWvuMRcovaPGa5 +rDY1RYuF3d9Smt29Ixg2MsFIPGoChAFxW3BC5A0wmyaHMh2TOlHfSzmqU789VSPv +SrFT6n+KtMimzeN19E8vAL//ivURG//ABFSgShGAG/jKDhbMjgMtVg7d+pL5S4G5 +mpA6eSeThZzIF54r/lh9DlOU92BJxwZ35sh+rS75rT7yWUaf318vqioT9GQQYkOd +z8Id1g3yR45FBtlXmhNmw+o0C0ftriK3ouc7jif9KjugyhQillv+xH8egDkjA4R+ +d6X4gZxuFP7ZBzTNEw9eLVGrhzQz6OLKOMZuH2vDQ5ZcRlGuM3M= +=Mo1m +-----END PGP SIGNATURE----- Added: dev/any23/2.4/apache-any23-cli-2.4.zip.sha512 ============================================================================== --- dev/any23/2.4/apache-any23-cli-2.4.zip.sha512 (added) +++ dev/any23/2.4/apache-any23-cli-2.4.zip.sha512 Mon Sep 21 03:02:49 2020 @@ -0,0 +1,4 @@ +./apache-any23-cli-2.4.zip: 3FFE9932 AE957ADE 17F554A8 825F50EA 63EB4225 + 2C30C877 FAF19530 B4583C54 CA89F9D9 A8CBD12F + A235D585 E522A4FB FF07A559 826DE20C 833E0D3C + A1706496 Added: dev/any23/2.4/check.sh ============================================================================== --- dev/any23/2.4/check.sh (added) +++ dev/any23/2.4/check.sh Mon Sep 21 03:02:49 2020 @@ -0,0 +1,5 @@ +#!/bin/bash +for file in `find . 
-type f -iname '*.zip'` +do + gpg --print-md SHA512 ${file} > ${file}.sha512 +done \ No newline at end of file Propchange: dev/any23/2.4/check.sh ------------------------------------------------------------------------------ svn:executable = *
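
Editorial note: check.sh above only writes the .sha512 files. The staged digests use the gpg --print-md layout (file name, a colon, then uppercase hex groups wrapped over several lines), so they cannot be compared directly with the output of a checksum tool. The following is a minimal sketch of how a reviewer might verify an artifact against its staged digest. The helper names normalize_digest and verify_sha512 are hypothetical and not part of this commit, and the sketch assumes that GNU sha512sum is installed and that file names contain no colon.

```shell
#!/bin/bash
# Hypothetical helper (not part of check.sh): read a digest in
# `gpg --print-md SHA512` layout on stdin and print it as a single
# lowercase hex string.
normalize_digest() {
    # Drop the "name:" prefix on the first line; continuation lines have
    # no colon and pass through unchanged. Then strip all whitespace and
    # lowercase the hex digits.
    sed 's/^[^:]*://' | tr -d ' \t\r\n' | tr 'A-F' 'a-f'
}

# Hypothetical helper: succeed only if FILE matches the digest in FILE.sha512.
# Assumes GNU sha512sum is on the PATH.
verify_sha512() {
    local file="$1"
    local expected actual
    expected="$(normalize_digest < "${file}.sha512")"
    actual="$(sha512sum "${file}" | cut -d' ' -f1)"
    [ "${expected}" = "${actual}" ]
}
```

With both functions sourced, a call such as `verify_sha512 apache-any23-2.4-src.zip && echo OK` returns success only when the recomputed digest matches the staged one.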