Using Wget for Recursive FTP

We have found that using wget to recursively grab FTP contents is useful in the following situations:

  • You cannot use rsync or scp due to restricted or no shell access on the remote server
  • You need to recursively get directories and globbing with mget * isn't working as expected
  • Transferring files to an intermediate workstation first is not feasible due to time/filesize constraints

If all of these apply to you, then a recursive wget via FTP is probably your best bet. On the surface, recursive wget is quite simple:

wget -r

The problem here is two-fold. If you simply run that command as given, wget recreates the remote layout locally: it creates a directory named after the host and, beneath it, every directory in the remote path. It's easy enough to move the content to the correct location after it has been copied, but we'd rather do it all in one step! So, assuming you have switched to the directory where you want the files copied (e.g. public_html), the following command performs the download without creating those extra directories:

wget -r -nH --cut-dirs=4

In the above example, --cut-dirs is set to 4 because we are traversing down 4 directories to reach the source data (/path/to/web/content/), so be sure to modify that number based on your particular case. The -nH option tells wget not to include the host portion of the URL in the local path.
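
To make the effect of these two options concrete, here is a minimal sketch that models where a downloaded file should end up. It does not call wget; it only imitates the path handling described above. The function name local_path, the host ftp.example.com and the file index.html are placeholders invented for illustration.

```shell
#!/bin/sh
# Rough model of the local path wget -r writes to (illustration only).
# Arguments: host, remote file path, --cut-dirs count, "yes" if -nH is given.
# local_path and all example values are hypothetical.
local_path() {
    host=$1 remote=$2 cut=$3 no_host=$4
    file=${remote##*/}      # last component: the file name
    dirs=${remote%/*}       # everything before the file name
    dirs=${dirs#/}          # drop the leading slash
    # Remove one leading directory per --cut-dirs count.
    while [ "$cut" -gt 0 ] && [ -n "$dirs" ]; do
        case $dirs in
            */*) dirs=${dirs#*/} ;;
            *)   dirs= ;;
        esac
        cut=$((cut - 1))
    done
    if [ -n "$dirs" ]; then
        out=$dirs/$file
    else
        out=$file
    fi
    # Without -nH the host name becomes the top-level directory.
    if [ "$no_host" != yes ]; then
        out=$host/$out
    fi
    printf '%s\n' "$out"
}

# Plain "wget -r": host directory plus the full remote path.
local_path ftp.example.com /path/to/web/content/index.html 0 no
# prints: ftp.example.com/path/to/web/content/index.html

# "wget -r -nH --cut-dirs=4": the file lands in the current directory.
local_path ftp.example.com /path/to/web/content/index.html 4 yes
# prints: index.html
```

Subdirectories below the cut point are kept, so in this model a remote file at /path/to/web/content/images/logo.png would be saved as images/logo.png.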

Finally, you can also add -nc to ensure that any locally existing files are not overwritten by duplicates coming in from the remote server.
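
Putting the pieces together, the whole procedure could be wrapped in a small shell function along these lines. This is only a sketch: fetch_recursive is a made-up name, and the URL and destination in the example call are placeholders, not a real server.

```shell
#!/bin/sh
# Sketch: run the recursive download with the options discussed above.
#   $1 = FTP URL of the remote directory (placeholder in the example below)
#   $2 = number of leading remote directories to drop (--cut-dirs)
#   $3 = local directory to download into
# fetch_recursive is a hypothetical helper, not part of wget.
fetch_recursive() {
    url=$1 cut=$2 dest=$3
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$dest" || exit 1
        # -r: recurse, -nH: no host directory, -nc: keep existing local files
        wget -r -nH --cut-dirs="$cut" -nc "$url"
    )
}

# Example call (placeholder host and path):
# fetch_recursive "ftp://ftp.example.com/path/to/web/content/" 4 "$HOME/public_html"
```

The function returns a non-zero status if the destination directory cannot be entered, so nothing is downloaded into the wrong place.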