I’m porting a few sites from Jekyll over to WordPress and needed a list of the old URLs so I could add redirects (HTTP 301s). A bit of UNIX-fu made this simple.
Find all the HTML pages:
find _site/ -iname "*.html" > url-list.txt
Edit the file and replace the leading _site/ on each line with the base URL of the new WordPress site, for example http://example.com/:
emacs url-list.txt
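If you would rather not edit the list by hand, the same substitution can probably be done with sed. This is only a sketch: rewrite_prefix is a hypothetical helper name, http://example.com/ is the placeholder domain from above, and it assumes every line starts with _site/ exactly as find printed it.

```shell
# Hypothetical helper: swap the leading "_site/" on every line for a base URL.
# Assumes each line starts with "_site/" and that the base URL contains
# no "|" or "&" characters (they would confuse the sed expression).
rewrite_prefix() {
  base_url=$1   # e.g. http://example.com/ (placeholder domain)
  list_file=$2  # e.g. url-list.txt
  sed "s|^_site/|${base_url}|" "$list_file"
}
```

Something like rewrite_prefix http://example.com/ url-list.txt > urls.tmp && mv urls.tmp url-list.txt would then update the list in place without relying on sed -i, which behaves differently on GNU and BSD systems.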
Pipe each URL into curl so the WordPress blog is hit with every old address. This causes the Redirections plugin to log each request, so I can go through them one by one. (-I makes curl send a HEAD request and print only the HTTP headers.)
cat url-list.txt | xargs curl -I
Finally, once the redirects are set up, I can rerun the last command and check that all the URLs are redirecting correctly:
HTTP/1.1 301 Moved Permanently
Date: Fri, 15 Apr 2011 23:24:33 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.7
Location: /category/blog/page/2/
Vary: Accept-Encoding
Connection: close
Content-Type: text/html
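Reading full header dumps gets tedious with a long list, so one option is to print only the status code for each URL and filter out the ones that already return 301. The sketch below assumes url-list.txt holds one full URL per line; check_status and not_redirected are hypothetical helper names, not part of curl or WordPress.

```shell
# Print "<status> <url>" for each URL in the given list file.
# -s silences progress output, -I sends a HEAD request, -o /dev/null
# discards the headers, and -w '%{http_code}' prints just the status code.
check_status() {
  while IFS= read -r url; do
    code=$(curl -s -o /dev/null -I -w '%{http_code}' "$url")
    printf '%s %s\n' "$code" "$url"
  done < "$1"
}

# Keep only the lines whose status code is something other than 301.
not_redirected() {
  awk '$1 != "301"'
}
```

Running check_status url-list.txt | not_redirected should then list only the URLs that still need a redirect; an empty result suggests every URL in the list is answering with a 301.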