[personal profile] contrarianarchon
Achievement get! I have successfully used wget to download the entirety of a smallish WordPress site, because it took my fancy and I had no other sensible way to archive the whole thing. (I spent so long mourning not knowing how to use it, and it turned out to be pretty simple, thanks at least in part to [personal profile] brin_bellway having already worked out many of the obvious pitfalls and very helpfully posted the details.)

...

This is more power than I should have. I suspect I will be using up a worrying amount of data capacity on this... For a good cause, of course, but even so.

Date: 2020-06-16 08:35 pm (UTC)
From: [personal profile] mindstalk
wget has many options. I noted the ones that might be of interest to me:

-o logfile
-nd #no directories
-nH #no host directory
-P prefix #directory prefix, i.e. where to save to, default .
--header=header-line
--referer=url
-U agent-string #or --user-agent=agent-string
-r #recursive, default depth 5
-l depth #default 5
-k #--convert-links for local
-m #--mirror, same as -r -N -l inf --no-remove-listing (the last is an FTP thing)
-N #--timestamping
-p #--page-requisites, dependencies even if beyond depth
-H #--span-hosts
-np #--no-parent, never ascend

So for mirroring I often want wget -mk -nH, or wget -mk -nH -np to stay below the starting directory.
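As a rough sketch of how those flags might fit together, here is a small shell helper that only prints a mirroring command rather than running it, so it can be checked before any data gets used. The function name mirror_cmd, the archive directory, the wget.log file name, and the example.com URL are all placeholders made up for illustration; only the flags come from the list above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a wget mirroring command.
# "archive", "wget.log" and the URL below are illustrative placeholders.
mirror_cmd() {
    url=$1
    dest=${2:-archive}   # where to save; defaults to ./archive
    # -m   mirror (-r -N -l inf --no-remove-listing)
    # -k   convert links so the local copy is browsable
    # -p   also fetch page requisites (images, CSS)
    # -nH  do not create a host-named directory
    # -np  never ascend above the starting directory
    # -P   directory prefix to save under
    # -o   write wget's messages to a log file
    printf 'wget -m -k -p -nH -np -P %s -o wget.log %s\n' "$dest" "$url"
}

# Inspect the command first; paste it into the shell once it looks right.
mirror_cmd "https://example.com/blog/"
```

Passing a second argument changes the save directory, e.g. mirror_cmd "https://example.com/blog/" backups. Printing rather than executing is just a cautious choice here, given how much a recursive download can pull in.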
