maven-issues mailing list archives

From "Michael Osipov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MRESOLVER-90) HTML content in POM: Maven should validate content before storing in local repo
Date Mon, 01 Jul 2019 20:21:00 GMT

    [ https://issues.apache.org/jira/browse/MRESOLVER-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876468#comment-16876468 ]

Michael Osipov commented on MRESOLVER-90:
-----------------------------------------

There is always room for improvement, but someone has to implement this ;-)

bq. You could change the default so checksums are validated by default

I tried, it was pulled back for compat reasons. I will retry for 3.7.0.

bq. You could first download the checksums. If the downloaded checksum contains HTML,
it is not a checksum, and any further download for that artifact could be aborted right away
with an error.

What if the checksum file contains just {{123}} or something else, but not HTML?
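
To illustrate the concern: an HTML-only check would let {{123}} through, whereas a check on the checksum format would reject both. The following is a minimal sketch, assuming the checksum file carries a hex digest as its first whitespace-delimited token. The class and method names are hypothetical and are not part of Maven Resolver.

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Maven Resolver.
class ChecksumFormatCheck {

    // Expected number of hex characters per algorithm (digest bits / 4).
    private static final Map<String, Integer> HEX_LENGTH = Map.of(
            "MD5", 32,
            "SHA-1", 40,
            "SHA-256", 64,
            "SHA-512", 128);

    private static final Pattern HEX = Pattern.compile("[0-9a-fA-F]+");

    /**
     * Returns true if the first whitespace-delimited token of the file content
     * is a hex string of the length expected for the given algorithm.
     * Assumption: the digest comes first, optionally followed by a file name.
     */
    static boolean looksLikeChecksum(String algorithm, String fileContent) {
        Integer expected = HEX_LENGTH.get(algorithm);
        if (expected == null || fileContent == null) {
            return false;
        }
        String trimmed = fileContent.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        String token = trimmed.split("\\s+")[0];
        return token.length() == expected && HEX.matcher(token).matches();
    }
}
```

Such a check only says whether the content is shaped like a checksum. It does not replace comparing the digest against the downloaded artifact.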

bq. You could try to detect whether the content is HTML (which is quite easy). Assuming the
expected type is not "html" or "xhtml", you could consider it invalid

Based on the content type, or by sniffing the content?
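
For the sniffing variant, a rough sketch could inspect only the first bytes of the response. It assumes that HTML can be recognized by a leading {{<!DOCTYPE html}} or {{<html}} once whitespace and leading comments are skipped (the example in the issue description starts with a comment). The class name {{HtmlSniffer}} is hypothetical and not part of Maven Resolver.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Maven Resolver.
class HtmlSniffer {

    /**
     * Returns true if the given prefix of a download looks like an HTML page.
     * Leading whitespace, a UTF-8 byte order mark, and comments are skipped.
     */
    static boolean looksLikeHtml(byte[] head) {
        int start = 0;
        // Skip a UTF-8 byte order mark if present.
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            start = 3;
        }
        // ISO-8859-1 maps every byte to one char, so decoding cannot fail.
        String text = new String(head, start, head.length - start, StandardCharsets.ISO_8859_1)
                .toLowerCase(Locale.ROOT);

        int pos = 0;
        while (true) {
            while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            if (!text.startsWith("<!--", pos)) {
                break;
            }
            int end = text.indexOf("-->", pos + 4);
            if (end < 0) {
                // Comment does not end within the sniffed prefix: undecided, treat as not HTML.
                return false;
            }
            pos = end + 3;
        }
        return text.startsWith("<!doctype html", pos) || text.startsWith("<html", pos);
    }
}
```

A sniffer like this only catches HTML. Content such as {{123}} still passes, so it narrows the problem rather than solving it.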

bq. You could at least add a validation for POM files. We know that POM files are XML and
we even have a parser that can validate a POM. Therefore, for POMs we could reject entirely
invalid content before putting it persistently into the local repo

The POMs are already parsed by the model builder/parser, so this would duplicate processing
work, which would hurt performance.
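
For comparison, a check that stops at the first element would be much cheaper than a second full parse, though whether the remaining overhead is acceptable is an open question. The sketch below assumes that a POM can be recognized by its root element {{project}} and uses the JDK's StAX API. The class name {{PomRootCheck}} is hypothetical and not part of Maven or Maven Resolver.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Maven or Maven Resolver.
class PomRootCheck {

    /**
     * Reads the stream only up to the first start element and returns true
     * if that element is named "project". Content that cannot be read as XML
     * up to that point yields false.
     */
    static boolean hasProjectRoot(InputStream in) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs and external entities are disabled as a precaution.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return "project".equals(reader.getLocalName());
                    }
                }
                return false;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

This does not validate the POM against its model. It only rejects content that is obviously something else, such as a login page.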

Please look at
{{org.eclipse.aether.connector.basic.BasicRepositoryConnector.get(Collection<? extends ArtifactDownload>, Collection<? extends MetadataDownload>)}}
as well as
{{org.eclipse.aether.connector.basic.BasicRepositoryConnector.GetTaskRunner.fetchChecksum(URI, File)}}.

This is a starting point to improve things.

> HTML content in POM: Maven should validate content before storing in local repo
> -------------------------------------------------------------------------------
>
>                 Key: MRESOLVER-90
>                 URL: https://issues.apache.org/jira/browse/MRESOLVER-90
>             Project: Maven Resolver
>          Issue Type: New Feature
>    Affects Versions: 1.4.0
>         Environment: both with maven 3.6.0 in CMD or in Eclipse 4.9.0
>            Reporter: Jörg Hohwiller
>            Assignee: Michael Osipov
>            Priority: Major
>
> For some odd reason, errors sometimes just happen and a Maven repo delivers an HTML
> error or login page in response to a request for a POM or JAR file. It seems that if the
> status code is valid, Maven (or whatever is under the hood, maybe even Aether?) saves the
> result without any sanity check or validation.
> Therefore I frequently end up with "POM" or "JAR" files in my local repo that are not
> XML but HTML nonsense.
>  
> Example:
> {code:java}
> <!--
>    DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.
>  
>     Copyright (c) 2007 Sun Microsystems Inc. All Rights Reserved
>  
>     The contents of this file are subject to the terms
>     of the Common Development and Distribution License
>     (the License). You may not use this file except in
>     compliance with the License.
>     You can obtain a copy of the License at
>     https://opensso.dev.java.net/public/CDDLv1.0.html or
>     opensso/legal/CDDLv1.0.txt
>     See the License for the specific language governing
>     permission and limitations under the License.
>     When distributing Covered Code, include this CDDL
>     Header Notice in each file and include the License file
>     at opensso/legal/CDDLv1.0.txt.
>     If applicable, add the following below the CDDL Header,
>     with the fields enclosed by brackets [] replaced by
>     your own identifying information:
>     "Portions Copyrighted [year] [name of copyright owner]"
>     $Id: index.html,v 1.2 2008/06/25 05:48:51 qcheng Exp $
> -->
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
> <html>
> <head>
> <title>Please Wait While Redirecting to Login page</title>
> <script language="JavaScript"> <!--
> function redirectToAuth() {
>     var params = getQueryParameters();
>     var url = 'UI/Login';
>     if (params != '') {
>         url += params;
>     }
>     top.location.replace(url);
> }
> function getQueryParameters() {
>     var loc = '' + location;
>     var idx = loc.indexOf('?');
>     if (idx != -1) {
>         return loc.substring(idx);
>     } else {
>         return '';
>     }
> }
> //-->
> </script>
> </head>
> <body bgcolor="#FFFFFF" onLoad="redirectToAuth();">
> </body>
> </html>
> {code}
> I would expect Maven to verify the content before officially placing it in the correct
> location inside the local Maven repository on my disk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
