Category: migrate

Migration from WordPress to Second Crack

The first step in migrating from WordPress to Second Crack was to move all the posts from the WordPress database to .md files on disk. This can be a cumbersome task, and I have not found a good way to automate it.
The process I followed was to go through the posts one by one and convert each of them to Markdown by hand. You can use your favourite search engine to find a converter that suits you; I used a JavaScript solution that ran inside my web browser. The blog I migrated from did not have any pages, so there was no need for me to migrate those.

Posts

Using the query below, I fetched the posts from the database (wp_ is the default WordPress table prefix; replace it with the prefix of your own tables). The query returns all posts, including drafts, and ignores all revisions:

SELECT post_date, post_name, post_content, post_status, guid FROM wp_posts WHERE post_type = 'post';
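
Since the query returns both published posts and drafts, it can help to decide up front where each row should end up on disk. The following is only a minimal sketch: the function name planExport, the posts and drafts folders and the YYYYMMDD-slug.md file name are assumptions made for illustration, so adjust them to the layout your Second Crack installation actually expects.

```javascript
// Hypothetical helper: suggests a target file for one row of the query above.
// The 'posts'/'drafts' folders and the YYYYMMDD-slug.md name are assumptions.
function planExport(row) {
    // post_date is assumed to be a 'YYYY-MM-DD HH:MM:SS' string; keep the date digits only.
    var datePart = row.post_date.slice(0, 10).replace(/-/g, '');
    // Fall back to a placeholder when the row has no slug yet.
    var slug = row.post_name || 'untitled';
    var folder = row.post_status === 'publish' ? 'posts' : 'drafts';
    return folder + '/' + datePart + '-' + slug + '.md';
}
```

The converted Markdown of each post can then be saved under the suggested name.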

Permalinks

After you have completed the migration of the posts, the hard part is done. What remains is to handle the permalinks that changed in the migration. My blog used to run on an Apache server with the mod_rewrite module enabled to make prettier URLs possible.
Unfortunately, the permalink structure of Second Crack is fixed. Anyone who had bookmarked or linked to one of my old blog URLs would have ended up on an ugly 404 Not Found page. To solve this problem I created a custom 404 page with URL detection.

Configuring a custom HTML page to show when a page cannot be found is easy on both Apache and nginx.

#Apache (this is already in the .htaccess provided by Second Crack):
ErrorDocument 404 /404.html

#nginx:
error_page 404 /404.html;

In the www folder of the blog I created a 404.html document with the same template as the other blog pages. The blog content is replaced with a simple statement that the requested document could not be found.

Now if someone visited an old URL, they would at least see a pretty error page. But because the posts follow a fixed, static URL structure, I saw a possibility to redirect the visitor to the correct page in the new structure.
I started with the post permalinks and issued the following SQL query to fetch the old permalinks and convert them to the Second Crack post structure. Keep in mind that the WordPress permalinks end with a / and the Second Crack permalinks do not.

SELECT CONCAT(DATE_FORMAT(post_date, '/%Y/%m/'), post_name, '/'), CONCAT(DATE_FORMAT(post_date, '/%Y/%m/%d/'), post_name) FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post';
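
The same mapping can also be built in JavaScript when you only have the raw post_date and post_name values at hand. This is a minimal sketch; the function name buildPermalinkPair is made up for this example, and it assumes the date arrives as a 'YYYY-MM-DD HH:MM:SS' string.

```javascript
// Builds the [old WordPress permalink, new Second Crack permalink] pair for one post.
// Assumption: postDate is a 'YYYY-MM-DD HH:MM:SS' string.
function buildPermalinkPair(postDate, postName) {
    var parts = postDate.slice(0, 10).split('-'); // [year, month, day]
    // Old structure: /year/month/slug/ (with trailing slash)
    var oldUrl = '/' + parts[0] + '/' + parts[1] + '/' + postName + '/';
    // New structure: /year/month/day/slug (without trailing slash)
    var newUrl = '/' + parts[0] + '/' + parts[1] + '/' + parts[2] + '/' + postName;
    return [oldUrl, newUrl];
}
```

Each returned pair has the shape of one entry in the permalink list used by the 404 page.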

JavaScript solution

These permalinks are used in a piece of JavaScript on the 404 page, which attempts to determine whether the URL that was not found is an old permalink:

var oldWordpressPermalinks = new Array(
        new Array('/2012/11/unfreezing-putty-after-pressing-ctrls/','/2012/11/01/unfreezing-putty-after-pressing-ctrls'),
        new Array('/2012/11/installing-mpd-on-windows/','/2012/11/16/installing-mpd-on-windows'),
        new Array('/2012/11/decrease-video-file-size-using-ffmpeg/','/2012/11/19/decrease-video-file-size-using-ffmpeg'),
        ...
);

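// Assumption: path holds the path of the page that was not found.
var path = window.location.pathname;

// Walk over every entry, including the last one.
for (var iter = 0; iter < oldWordpressPermalinks.length; iter++) {
    if (oldWordpressPermalinks[iter][0] == path) {
        window.location.href = 'http://' + window.location.hostname + oldWordpressPermalinks[iter][1];
        break;
    }
}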

You can see it for yourself, just visit a non-existing page on this blog.
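
As a variation on the loop, the lookup could also be written with an object keyed by the old permalink. This is a minimal sketch and not what runs on this blog; the names permalinkMap and resolveOldPermalink are made up for the example, and it additionally accepts old URLs that arrive without their trailing slash.

```javascript
// Hypothetical alternative to the array loop: an object keyed by the old permalink.
var permalinkMap = {
    '/2012/11/unfreezing-putty-after-pressing-ctrls/': '/2012/11/01/unfreezing-putty-after-pressing-ctrls',
    '/2012/11/installing-mpd-on-windows/': '/2012/11/16/installing-mpd-on-windows'
};

// Returns the new path for an old permalink, or null when the path is unknown.
function resolveOldPermalink(path, map) {
    // Accept the old URL with or without its trailing slash.
    var key = path.charAt(path.length - 1) === '/' ? path : path + '/';
    return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : null;
}

// Only attempt the redirect when running inside a browser.
if (typeof window !== 'undefined') {
    var target = resolveOldPermalink(window.location.pathname, permalinkMap);
    if (target) {
        // replace() keeps the 404 page out of the browser history.
        window.location.replace(target);
    }
}
```

Keeping the lookup in a separate function makes it possible to check the mapping without opening a browser.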

Category and tag links in WordPress can be processed in a similar manner. Second Crack does not support categories, so the assumption here is that a tag exists with the same name as the category. There is also no check whether the tag page that is being redirected to actually exists.

// check for URLs of the format /category/howto/page/2 (the page/2 part is optional)
// the slug may contain hyphens; it must be followed by a slash or the end of the path
var catPatt = /^\/category\/([a-zA-Z0-9-]+)(?:\/|$)/;
var catRes = catPatt.exec(path);
if (catRes) { // match
    window.location.href = 'http://' + window.location.hostname + '/tagged-' + catRes[1] + '.html';
}

// check for /tag/mantisbt, with or without a trailing slash
var tagPatt = /^\/tag\/([a-zA-Z0-9-]+)\/?$/;
var tagRes = tagPatt.exec(path);
if (tagRes) { // match
    window.location.href = 'http://' + window.location.hostname + '/tagged-' + tagRes[1] + '.html';
}
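
The two checks share the same target format, so they could be folded into a single helper. The following is a minimal sketch under that assumption; the function name resolveTagPage is made up for the example, and like the code above it does not verify that the tag page exists.

```javascript
// Maps an old /category/<name>/... or /tag/<name> path to the matching tag page.
// Returns null when the path is neither a category nor a tag URL.
function resolveTagPage(path) {
    // The name must be followed by a slash or by the end of the path.
    var match = /^\/(?:category|tag)\/([a-zA-Z0-9-]+)(?:\/|$)/.exec(path);
    return match ? '/tagged-' + match[1] + '.html' : null;
}

// Only attempt the redirect when running inside a browser.
if (typeof window !== 'undefined') {
    var tagPage = resolveTagPage(window.location.pathname);
    if (tagPage) {
        window.location.href = tagPage;
    }
}
```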

There are some other WordPress URLs that are currently not covered, such as /2013/01. Handling those is more complicated, since physical folders with the same name also exist on disk.

RSS Feed

The last item on the list for the migration is the RSS feed. In my WordPress configuration the feed was located at /feed/. I have configured my nginx web server to permanently redirect it to /rss.xml, the location of the RSS feed in Second Crack.

#Apache:
Redirect 301 /feed/ /rss.xml

#nginx:
rewrite ^/feed/$ /rss.xml permanent;

Place the nginx rule in your server block (or the Apache rule in your .htaccess) and all subscribers to your old blog will still be able to access the new feed.
