From Peter <...@citylink.dinoex.sub.org>
Subject [users@httpd] Apache converts GZIPed data into UTF-8 - bug or feature?
Date Sun, 14 Apr 2019 15:06:40 GMT

Hello,

 Configuring a REVERSE PROXY, I try to *relocate* the "mountpoint"
URL; i.e. change the filepath, so that http://myhost/stage/myapp will
reach the backend server as http://backend/myapp.

This seems to me a fairly common task, as one often has an app server
running its app at its server root, while it needs to be published
under a specific path.

The doc says one can do it this way:

<Location /stage>
   ProxyPass "http://backend:5970"
   ProxyPassReverse "http://backend:5970"
</Location>

I found that this doesn't help me much, because it does not
relocate the URLs in the body of a document. To solve this,
I included "proxy_html_module" and, following the instructions
in "extra/proxy-html.conf", activated these features:

LoadFile        /usr/local/lib/libxml2.so
LoadModule      proxy_html_module       libexec/apache24/mod_proxy_html.so
LoadModule      xml2enc_module          libexec/apache24/mod_xml2enc.so

and added this to my Location-Container:

  ProxyHTMLEnable On
  ProxyHTMLURLMap http://backend:5970/ /stage/
  ProxyHTMLURLMap / /stage/

This nicely solved my problem, but now weird errors appeared, which
took me a night to hunt down. I finally figured the problem is
the xml2enc_module, which does very serious damage: When the backend
sends a CSS stylesheet file, it looks this way:

From Backend to Apache:

> HTTP/1.1 200 OK
> Last-Modified: Sun, 14 Apr 2019 05:53:41 GMT
> Content-Type: text/css
> Content-Length: 20
> Content-Encoding: gzip
> Vary: Accept-Encoding
> Connection: keep-alive
> Server: thin

From Apache to Client:

> HTTP/1.1 200 OK
> Date: Sun, 14 Apr 2019 06:39:43 GMT
> Server: thin
> Last-Modified: Sun, 14 Apr 2019 05:53:41 GMT
> Content-Type: text/css;charset=utf-8           (!!!)
> Content-Encoding: gzip
> Vary: Accept-Encoding
> Content-Length: 24                             (!!!)
> Keep-Alive: timeout=15, max=100
> Connection: Keep-Alive

We can see that the Content-Type was modified to mention "utf-8", and
the size has increased from 20 to 24 bytes.

Let's look at the content:

From Backend to Apache:

        0x00f0:                 1f8b 0800 e4ca b25c 0003        .......\..
        0x0100:  0300 0000 0000 0000 0000                 ..........

This is the correct 20-byte hexcode of a gzip'd file of length 0.

From Apache to Client:

        0x0140:                           1fc2 8b08 00c3            ......
        0x0150:  a4c3 8ac2 b25c 0003 0300 0000 0000 0000  .....\..........
        0x0160:  0000                                     ..

This is obviously valid UTF-8 text.
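
To check what kind of conversion this is, here is a minimal Python sketch.
It assumes (this is a guess, not something the server logs confirm) that
the body was read as ISO-8859-1 and re-encoded as UTF-8. The variable
names are made up for illustration; the byte strings are copied from the
two hexdumps above.

```python
# Bytes sent by the backend, copied from the first hexdump.
original = bytes.fromhex("1f8b0800e4cab25c0003" "03000000000000000000")

# Bytes received by the client, copied from the second hexdump.
observed = bytes.fromhex(
    "1fc28b0800c3" "a4c38ac2b25c00030300000000000000" "0000"
)

# Assumption: every input byte is taken as one ISO-8859-1 character
# and then written out again as UTF-8.
mangled = original.decode("latin-1").encode("utf-8")

# Bytes >= 0x80 need two bytes in UTF-8, so each of them adds one byte.
grown = [f"{b:02x}" for b in original if b >= 0x80]

print(len(original), len(mangled))  # 20 24
print(mangled == observed)          # True
print(grown)                        # ['8b', 'e4', 'ca', 'b2']
```

Under that assumption the four high bytes account exactly for the growth
from 20 to 24 bytes seen in Content-Length.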

But no browser can make anything of this, because it cannot
be reverted to the original gzip data, which is not a charset,
it is binary!
What we get instead is a load error in the Web Developer, or,
if we try to load the CSS-file directly, it says:

  Content Encoding Error
  The page you are trying to view cannot be shown because it uses an
  invalid or unsupported form of compression.

(Not very helpful either, so it takes quite a while of searching
around if one is not specifically involved in Web technology
and does this just for fun.)
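
The browser's complaint can be reproduced outside the browser. The
following Python sketch feeds both byte strings to the standard gzip
module; try_gunzip is a hypothetical helper name, and the Latin-1 to
UTF-8 step is the same assumption as before, not a confirmed description
of what the module does internally.

```python
import gzip

# The 20 bytes sent by the backend (from the hexdump above).
original = bytes.fromhex("1f8b0800e4cab25c000303000000000000000000")

# Assumed conversion: ISO-8859-1 in, UTF-8 out.
mangled = original.decode("latin-1").encode("utf-8")


def try_gunzip(data: bytes) -> str:
    """Report whether data decompresses as a gzip stream."""
    try:
        return f"ok, {len(gzip.decompress(data))} bytes"
    except OSError as exc:  # gzip errors derive from OSError
        return f"failed: {type(exc).__name__}"


print(try_gunzip(original))  # ok, 0 bytes
print(try_gunzip(mangled))
```

The first call decompresses to an empty file; the second fails because
the gzip magic number 1f 8b has already become 1f c2 8b.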

The easy workaround is to switch off that xml2enc_module. But then
there are these annoying warnings when starting the server:
[Sun Apr 14 09:24:06.153900 2019] [proxy_html:notice] [pid 48178] AH01425: I18n support in
mod_proxy_html requires mod_xml2enc. Without it, non-ASCII characters in proxied pages are
likely to display incorrectly.

(Uh, hm. It does *not* mention _bytes_ in _gzip_ data that are
delivered incorrectly _WITH_ it.)

Anyway, I think this is so bogus that bogus is no longer a word for it.
Why is this happening, and what is to blame?

~~~~~~~~~~~~~
Server says:
Version: Apache/2.4.39 (FreeBSD) PHP/7.2.17 mod_scgi/1.15 OpenSSL/1.0.2o-freebsd
Server Built: unknown
Server loaded APR Version: 1.6.5
Compiled with APR Version: 1.6.5
Server loaded APU Version: 1.6.1
Compiled with APU Version: 1.6.1
Module Magic Number: 20120211:84
[and lots more of such; in case any is of interest for this matter, just ask]

$ pkg which /usr/local/lib/libxml2.so
/usr/local/lib/libxml2.so was installed by package libxml2-2.9.8

$ uname -a
FreeBSD myhost 11.2-RELEASE-p9 FreeBSD 11.2-RELEASE-p9 #0 r343946M#C51:240: Thu Mar 28 03:44:30
UTC 2019     root@myhost:/usr/src/sys/i386/compile/E1R11V1  i386

