Importing Drupal blogs to Jekyll
I had about 5 Drupal blogs that I wanted to export to Jekyll. Most of them were a mix of blog and HTML website, although the biggest one also used Drupal to generate some info pages, like the About page. They ranged from 1 to 200 posts. Of course, the biggest blog had comments. Obviously, I started with the smallest blogs and worked my way up.
I started with the jekyll-import plug-in on GitHub, which creates a _posts directory with your blog posts and Jekyll-ized, date-prefixed file names (the YYYY-MM-DD-title form that Jekyll expects).
Front matter is added to each post, including categories, layout, and created entries. Breaking that down:
- all tags become categories (are there categories in Drupal? I only ever used tags). That makes for a very funky directory structure and makes references to your other posts a pain to type, so one of the first things I had to do was go in manually, pick one or two categories, and change the rest to tags. If you've got bigger blogs, you will probably want to mod the conversion plug-in to do that part for you, especially if you do any cross-linking of your own posts.
- the layout was variously "story" or "blog". Since my theme used "post" as the blog post layout, I just symlinked both "story" and "blog" to "post" and was done with that problem 5 seconds later.
- created is, I'm sure, a date/time stamp in a format that I don't immediately recognize, but the post date is also in the filename, so I'm assuming there's no issue there.
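For the bigger blogs, the category-to-tag cleanup from the first bullet could be scripted instead of done by hand. Here is a minimal sketch in Python, not something I actually ran on these blogs. It assumes the categories are written as a YAML block list (a categories: line followed by "- name" lines) and that the post has no tags: entry yet. The function names and the keep parameter are made up for illustration.

```python
def _yaml_list(key, items):
    """Render a key and its items as a YAML block list (empty list -> no lines)."""
    if not items:
        return []
    return [f"{key}:"] + [f"- {item}" for item in items]


def split_categories(front_matter, keep=1):
    """Keep the first `keep` categories and move the rest under `tags:`.

    Assumption: categories appear as a block list, e.g.
        categories:
        - jekyll
        - drupal
    Front matter without a `categories:` line is returned unchanged.
    """
    kept_lines, cats = [], []
    insert_at = None
    in_cats = False
    for line in front_matter.splitlines():
        stripped = line.strip()
        if stripped == "categories:":
            in_cats = True
            insert_at = len(kept_lines)  # remember where the block was
            continue
        if in_cats and stripped.startswith("- "):
            cats.append(stripped[2:].strip())
            continue
        in_cats = False  # any other line ends the categories block
        kept_lines.append(line)
    if insert_at is None:
        return front_matter
    block = _yaml_list("categories", cats[:keep]) + _yaml_list("tags", cats[keep:])
    kept_lines[insert_at:insert_at] = block
    return "\n".join(kept_lines)
```

Which categories to keep is still a human decision. This sketch just keeps the first one or two in file order, so you would want to eyeball the result.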
The last detail is the post text itself, which comes through with ^M characters (carriage returns) scattered through it. The ^M characters cause problems if they appear in the excerpt, so I remove them wholesale while I'm editing the categories/tags, which is a quick keyboard-macro search and replace in Emacs.
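If you would rather not do the ^M removal by hand, something like the following sketch would handle it for a whole directory. It works on raw bytes so that Python's own newline translation does not get in the way. The clean_posts name and the "_posts" default are my own choices for illustration, and it assumes every regular file in that directory is a post you want cleaned.

```python
from pathlib import Path


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every carriage return (the ^M that Emacs shows) from the data."""
    return data.replace(b"\r", b"")


def clean_posts(posts_dir="_posts"):
    """Strip carriage returns from every file in posts_dir.

    Returns the number of files that were actually rewritten.
    """
    changed = 0
    for path in sorted(Path(posts_dir).iterdir()):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        cleaned = strip_carriage_returns(raw)
        if cleaned != raw:
            path.write_bytes(cleaned)
            changed += 1
    return changed
```

Run it from the site root on a copy (or a clean git checkout) first, since it rewrites files in place.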
Speaking of excerpts, the manual ones you put into Drupal do transfer. They are usually marked with <!--break -->, which can be reused by adding excerpt_separator: <!--break --> to your page front matter. Sometimes a chunk of excerpt text has been auto-generated in the conversion and added to the page front matter. The rest of the time there is no excerpt, and the whole "By default this is the first paragraph of content in the post" claim is an inconsistent lie (as of Jekyll 3.8.5). So, if you're a wordy blogger, you need to fix the excerpt thing, too, for your blog lists.
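The manual-excerpt case is the easy one to automate. As a rough sketch (again, not something I ran), this adds the excerpt_separator line to the front matter of any post whose body contains the break marker. It assumes the front matter is delimited by lines containing only ---, and the function name is hypothetical. Posts with no marker are left alone, so the missing-excerpt ones still need a human.

```python
MARKER = "<!--break -->"  # the marker exactly as it appears in the imported posts


def add_excerpt_separator(post, marker=MARKER):
    """Add `excerpt_separator: <marker>` to the front matter if the body uses it.

    Returns the post unchanged when there is no front matter, when the body
    does not contain the marker, or when a separator is already set.
    """
    lines = post.split("\n")
    if not lines or lines[0].strip() != "---":
        return post  # no front matter at all
    end = None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            end = i  # closing delimiter of the front matter
            break
    if end is None:
        return post  # front matter never closes; leave it for a human
    front, body = lines[1:end], lines[end + 1:]
    if marker not in "\n".join(body):
        return post
    if any(line.startswith("excerpt_separator:") for line in front):
        return post
    lines.insert(end, f"excerpt_separator: {marker}")
    return "\n".join(lines)
```

Setting excerpt_separator once in _config.yml instead would cover the whole site, but the per-post version matches what I described above.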
Starting to sound like a lot of manual work, eh? Of course, you could either mod the import plug-in or write a supplemental post-processing script, because very little of this needs human-guided decision making. I was just doing it by hand for my 1, 5, and 20 post blogs. With my 200 post blog I might decide that writing and debugging a script is less work than just doing it with keyboard macros...
I forgot to mention that the import also creates all manner of category/tag directories and date directories, filled with post stubs that redirect and refresh to the real posts in _posts. I just deleted all of those and let Jekyll recreate whatever directories it wants.
Lastly, I was not using my Drupal in the most standard of manners, so sometimes I had to go hunting on the server for where I had squirreled away my images, although some images came along for the ride as you'd hope and expect them to.