httpd-users mailing list archives

From MegaBrutal <megabru...@gmail.com>
Subject [users@httpd] Rejecting requests for unknown virtual hosts
Date Sat, 22 Oct 2011 10:52:05 GMT
Hello all,

What I'd like to do is serve requests only for my exact virtual hosts, and
deny all requests that either supply an unknown virtual host or don't supply
one at all (that is, don't send a "Host:" header). This is my usual tactic
on all of my servers. I do it primarily to keep out at least those
vulnerability scanner bots that are primitive enough to probe by IP address
rather than by domain name, and so don't supply a valid hostname in their
"Host:" header.

While I've read some documentation on virtual hosting, mostly the official
guide on the Apache website <http://httpd.apache.org/docs/2.2/vhosts/name-based.html>,
I still couldn't achieve what I wanted. I thought I had; I only noticed the
problem today. Most tutorials don't deal with the problem I found.

If the following text is TL;DR for you, please jump to my questions at the
end. If you're curious why I asked them, continue to read.

What I tried:

# Include the virtual host configurations:
Include sites-enabled/

# Deny all unknown virtual host names
<VirtualHost *:80>
    ServerName *
    DocumentRoot /var/www
    <Location />
        Order allow,deny
#        Allow from googlebot.com
        Allow from 127.0.0.1
    </Location>
    SetEnvIf Remote_Addr "127\.0\.0\.1" localhostlog
    CustomLog "/var/log/apache2/access.log" combined env=localhostlog
    CustomLog "/var/log/apache2/reject.log" vhost_combined env=!localhostlog
    ErrorLog "/var/log/apache2/reject_error.log"
</VirtualHost>


There is an include for my virtual hosts; their configs are stored in
separate files. Then there is a "default" virtual host config that is
supposed to take effect when the request doesn't match any of my defined
virtual hosts. It is still a valid virtual host, though, one that only I can
access from localhost. I keep my phpMyAdmin there, for example, so it is
inaccessible to outsiders. I use it myself through an SSH tunnel. Since no
one else has access to the server and I don't run any proxies, I can be
quite sure that no one besides me can access my stuff.

(Irrelevant sidenote, just for the curious: sometimes I still allow access
to GoogleBot, just to present it with a robots.txt file that forbids it to
crawl anything. I did this when I got an IP address for a VPS that was
previously used by someone dumb enough not to point a domain name at his
site, or to leave his domain name pointing to his old IP. Many users were
looking for the guy's site on my server, most of them referred by Google. I
thought it might help Google drop its index entries more quickly if I showed
it a valid site with a denying robots.txt. That's the story of the
commented-out "Allow" line for Googlebot.)

Also I log these invalid requests to reject.log (it's better if I can see
and analyze the hopeless attempts of vulnerability scanners either way),
except the valid requests from localhost which are logged to access.log.
Requests for the respective virtual hosts are also logged to distinct files.

So what's the problem? Everything seems to work. Valid virtual hosts are
served correctly, and requests are logged to the correct log files. Clients
that supply an invalid virtual host are presented with a cute,
well-deserved 403. So what's the problem?

Vulnerability scanners that don't merely send an invalid virtual hostname,
*but send no "Host:" header at all*, are still served by my first virtual
host, not by this last virtual host that would give a 403 by design. As I
have just learned, this is the expected, documented behaviour:

> If no matching virtual host is found, then *the first listed virtual host*
> that matches the IP address will be used.
> (Apache website on Name-based Virtual Host Support
> <http://httpd.apache.org/docs/2.2/vhosts/name-based.html>)

OK, so make it the first virtual host config! Naive! If I put my
all-rejecting virtual host before the include for my specific virtual hosts,
then all requests are served by the rejecting virtual host, even requests
for my legit virtual host names. But this is also the expected behaviour:

> Now when a request arrives, the server will first check if it is using an IP
> address that matches the NameVirtualHost
> <http://httpd.apache.org/docs/2.2/mod/core.html#namevirtualhost>.
> If it is, then it will look at each <VirtualHost>
> <http://httpd.apache.org/docs/2.2/mod/core.html#virtualhost> section
> with a matching IP address and try to find one where the
> ServerName <http://httpd.apache.org/docs/2.2/mod/core.html#servername> or
> ServerAlias matches the requested hostname. If it finds one, then it uses
> the configuration for that server.
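
To make my reading of the two quoted passages concrete, here is a small Python model of the selection rule. It is only a sketch of the documentation text, not Apache's actual implementation: the `VirtualHost` class, the `select_vhost` function and the example hostnames are all made up for illustration, names are compared literally (no wildcard handling), and IP/port matching is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VirtualHost:
    """Hypothetical stand-in for one <VirtualHost> section."""
    server_name: str
    aliases: List[str] = field(default_factory=list)


def select_vhost(vhosts: List[VirtualHost], host_header: Optional[str]) -> VirtualHost:
    """Pick a vhost following the quoted rule (simplified model).

    Walk the list in configuration order and return the first vhost whose
    ServerName or ServerAlias equals the requested hostname. If nothing
    matches, or no Host header was sent, fall back to the first listed vhost.
    """
    if not vhosts:
        raise ValueError("at least one virtual host is required")
    if host_header is not None:
        # Naive port stripping; good enough for plain hostnames.
        wanted = host_header.split(":")[0].lower()
        for vhost in vhosts:
            names = [vhost.server_name] + vhost.aliases
            if wanted in (name.lower() for name in names):
                return vhost
    # "the first listed virtual host ... will be used"
    return vhosts[0]
```

In this model, calling `select_vhost(vhosts, None)` always returns `vhosts[0]`, which is the case that bites me: a request without a "Host:" header lands on whichever virtual host happens to be listed first.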

Apache tries to find a suitable virtual host config by looking from top to
bottom. Of course, "*" matches everything, so the all-rejecting virtual host
config catches all requests and the other virtual hosts are never checked.

It is interesting to note that even my first virtual host gives a 400 (Bad
Request) response to any request lacking the "Host:" header. I don't know
the reason, but I have no problem with it, since this is what I originally
wanted: to reject requests without a "Host:" header. The problem is that,
since the request is still processed by one of my legit virtual hosts, it is
logged to that virtual host's own access log file and not to reject.log.
Secondly, the legit virtual host reveals its name in the text of the 400
response:

> Bad Request
>
> Your browser sent a request that this server could not understand.
>
> Apache/2.2.16 (Debian) Server at my_legit_virtual_host.domain.tld Port 80

Why should I help the vulnerability scanner bot by telling it a valid
virtual hostname it didn't know before?
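
For anyone who wants to reproduce this, here is a minimal Python sketch that sends a request with no "Host:" header and returns the status line. The function names, the default HTTP version and the address in the usage note are my own assumptions for illustration; it uses only the standard `socket` module.

```python
import socket
from typing import Optional


def build_request(path: str = "/", host: Optional[str] = None,
                  version: str = "HTTP/1.1") -> bytes:
    """Build a raw GET request; the Host header is omitted when host is None."""
    lines = [f"GET {path} {version}"]
    if host is not None:
        lines.append(f"Host: {host}")
    lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_request(address: str, port: int = 80, **kwargs) -> str:
    """Send the request to address:port and return the response status line."""
    with socket.create_connection((address, port), timeout=5) as sock:
        sock.sendall(build_request(**kwargs))
        response = b""
        # Read until the server closes the connection.
        while chunk := sock.recv(4096):
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("latin-1")
```

Something like `send_request("192.0.2.10")` (a placeholder address) sends the host-less request, while `send_request("192.0.2.10", host="unknown.example")` sends one with an unknown hostname, so the two cases can be compared in the logs.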

Sorry for the long, elaborate e-mail. My questions are:

   - How can I reject all requests without a "Host:" header, while still
   logging them to the reject.log file?
   - Is there a directive (or is one planned for a future release) that tells
   Apache never to match host-less requests to any of the virtual hosts, and
   just refuse them by itself? (While still allowing those invalid requests
   to be logged.)


Thanks for your help in advance,
MegaBrutal
