httpd-users mailing list archives

From Ruben Safir <>
Subject Re: [users@httpd] Re: mirror a html site
Date Sun, 24 Dec 2017 23:37:01 GMT
On 12/24/2017 01:49 PM, Miguel González wrote:
> On 12/24/17 12:53 AM, Good Guy wrote:
>> On 23/12/2017 10:26, Miguel González wrote:
>>>   A hosting company with their builder tool created a static HTML site
>>> that can't be downloaded.
>> Did you try this tool?
>> <>
>> If not please provide a link of the site because there is no such thing
>> as "can't be downloaded" when the site is visible to the public.
> What I mean is that the company doesn't provide any FTP access to
> download the files.
> I did use httrack and at least I could keep a backup of the website (not
> complete, because it wasn't able to download links with Spanish characters).
> Unfortunately, as I said, it creates folders for the CDN entries, so the
> mirrored copy of the website ends up with a structure of subfolders,
> one for each CDN.
> For the time being I am using wget -mkEp, which still uses the CDN
> entries from the company. It's not the best solution, but in case they
> turn off the CDNs it will be much "easier" to change links manually.
> thanks!

How well scraping works depends largely on the amount of JavaScript garbage
on the pages.  The straight HTML and page source can be pulled fairly easily
with LWP or w3m.
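
For what it's worth, here is a rough sketch of the same idea in Python rather
than LWP, using only the standard library. It pulls the raw page source (no
JavaScript is run) and groups every href/src by hostname, which should show
which assets live on the builder's CDN hosts and would need their links
rewritten. The names LinkCollector, group_by_host and fetch are made up for
this example, and the hostnames are placeholders, not the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect absolute URLs from href/src attributes in static HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def group_by_host(html, base_url):
    """Return {hostname: [urls]} for every link or asset in the markup."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    hosts = {}
    for url in collector.urls:
        hosts.setdefault(urlparse(url).netloc, []).append(url)
    return hosts


def fetch(url):
    """Download the raw page source; no JavaScript is executed."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Something like group_by_host(fetch(page_url), page_url) would then list the
CDN hostnames alongside the site's own. Anything the page builds with
JavaScript at load time will not show up, which is the limitation mentioned
above.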

So many immigrant groups have swept through our town
that Brooklyn, like Atlantis, reaches mythological
proportions in the mind of the world - RI Safir 1998

DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002 - Leadership Development in Free Software - Unpublished Archive - coins!

Being so tracked is for FARM ANIMALS and extermination camps,
but incompatible with living as a free human being. -RI Safir 2013
