I’ve been writing blog posts (of varying quality) since 2006. One of the things that I always dread about moving blog platforms is the export and import process. This morning I managed to get all the blog posts across surprisingly quickly.
The posts last lived in a simple MySQL database (simple because I built it quickly in 2010). I still haven’t turned off my old hosting, which provides a PHP MySQL interface, so I was able to export all that history to a JSON file in one click.
A few lines of Python later and there is a simple script that opens the JSON file and outputs a new Markdown file for each blog post, complete with front matter. The Gist is on GitHub.
import json
import os

# Load the one-click JSON export of the old MySQL database
with open('oldblog.json') as f:
    data = json.load(f)

# Oldest first; keep only posts whose status is exactly 'published'
sorted_pages = sorted(data, key=lambda k: int(k['id']))
posts = (x for x in sorted_pages if x['page_status'] == 'published')

for post in posts:
    print post['id'] + " " + post['page_title']

    # One folder per year, taken from the first four characters of the datetime
    filename = "blog/%s/%s.md" % (post['page_datetime'][:4], post['page_url'])
    if not os.path.exists(os.path.dirname(filename)):
        os.makedirs(os.path.dirname(filename))

    file = open(filename, "w")

    # Front matter (colons in the title are swapped out so the YAML stays valid)
    file.write("---\n")
    file.write("title: %s\n" % post['page_title'].replace(':','-').encode('utf-8', 'ignore'))
    file.write("type: article" + "\n")
    file.write("tags: %s\n" % post['page_category'])
    file.write("date: %s\n" % post['page_datetime'])
    if post['page_header_image']:
        headImage = post['page_header_image'].replace('/assets/uploads/', '/_assets/img/blog/imported/')
        file.write("leadImage: %s\n" % headImage)
    file.write("---\n")

    # Post body
    file.write("%s\n" % post['page_content'].encode('utf-8', 'ignore'))
    file.close()
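The script above is Python 2 (note the `print` statement and the manual `.encode` calls). If you wanted to run the same idea on Python 3, a rough sketch might look like the one below. It is not the original Gist: the function names `load_pages` and `export_posts` and the `out_dir` parameter are made up for illustration, and it assumes the JSON export has the same field names used above.

```python
import json
import os


def load_pages(path="oldblog.json"):
    """Read the exported JSON file (hypothetical helper, not in the original Gist)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def export_posts(pages, out_dir="blog"):
    """Write one Markdown file per published post and return the paths written."""
    written = []
    ordered = sorted(pages, key=lambda p: int(p["id"]))
    published = (p for p in ordered if p["page_status"] == "published")

    for post in published:
        # One folder per year, taken from the first four characters of the datetime.
        year = post["page_datetime"][:4]
        path = os.path.join(out_dir, year, post["page_url"] + ".md")
        os.makedirs(os.path.dirname(path), exist_ok=True)

        front_matter = [
            "---",
            # Colons would break the unquoted YAML value, so swap them out.
            "title: %s" % post["page_title"].replace(":", "-"),
            "type: article",
            "tags: %s" % post["page_category"],
            "date: %s" % post["page_datetime"],
        ]
        if post.get("page_header_image"):
            lead_image = post["page_header_image"].replace(
                "/assets/uploads/", "/_assets/img/blog/imported/"
            )
            front_matter.append("leadImage: %s" % lead_image)
        front_matter.append("---")

        # Opening the file as UTF-8 replaces the per-string encode() calls.
        with open(path, "w", encoding="utf-8") as out:
            out.write("\n".join(front_matter) + "\n")
            out.write(post["page_content"] + "\n")
        written.append(path)

    return written
```

Calling `export_posts(load_pages())` from the folder containing `oldblog.json` should produce the same `blog/<year>/<slug>.md` layout as the original script.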
The other thing I dread about moving platforms is the words in those old posts… Each time I flick back through the older posts, I remember that I really don’t recommend reading anything pre-2010… I mean, teenage me had far too much time on his hands, and decided to use that to write mostly drivel…
Currently reading through blog posts I wrote when I was 16… Blegh. Trying to work out if I want to put them back online for the history of it all…
— James Doc (@jamesdoc) 1 September 2018
But it feels wrong to just delete them. So they are there. For the histories. It made sense to include a health warning at the top of those posts, so I’ve added a new filter to my 11ty config to warn people if they stumble upon them:
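// True when the post's date is earlier than the given ISO date string
// (DateTime is assumed to be Luxon's, already required in the config)
eleventyConfig.addFilter("dateComparison", (dateObj, compareWith) => {
  return DateTime.fromJSDate(dateObj) < DateTime.fromISO(compareWith)
})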
You can see it being used in my post template file.
Post changelog
- 2022-01-13 – Tidy up a number of blog posts for tags
- 2020-05-17 – Decouple gulp from SCSS generation
- 2018-12-24 – Generate (but not use yet) RWD images
- 2018-09-01 – Importing all the old blog posts