Generating a Jekyll Site on Mercurial (Hg) Push

As we mentioned in our last post, we plan to share aspects of how we moved to Jekyll. This is the first of these posts. With a database-based solution, updating is easy; when the user updates content through the user interface, the new content is served the next time the page or post is requested. With a static site, however, an “update” is technically any change to the underlying files from which the site is generated. Typically, though, an update is marked by a source control commit and a push to a master repository. GitHub Pages, the product for which Jekyll was developed, uses this as a flag to regenerate the site.

For a few different reasons, we did not wish to host our sites on GitHub. We generally use Mercurial (Hg) for our source code control, and the master repository resides on a different server than the one from which the site is served. Since each site must be regenerated whenever its master repository receives a push, we considered a few different options.

  1. When a push occurs, regenerate the site on the Hg server, then use scp to delete the old files from and copy the new files to the web server.
  2. Set up a sandbox on the Hg server that updates and regenerates each time a push occurs, and run rsync on the web server to check for updates every so often.
  3. When a push occurs, notify the web server, and have it regenerate the site.

The first option had the potential to run afoul of SSH rate limits, and to require much more data transfer than option 3. The second option had the advantage of running a process local to the Hg server, but would have required disk space that we didn’t really need to use; and, as Jekyll regenerates all the pages in a site, rsync would likely have ended up transferring all the data for every update anyway, losing one of its benefits. The third option required Jekyll to be installed on the web server, and would use that server for processing, potentially taking cycles that could otherwise be used to serve web pages.

Eventually, we decided to go with option 3. On the Hg server, in the master repository for each site, we put the following in .hg/hgrc (the following examples are for this site):

[hooks]
incoming = /opt/jobs/notify.tech-blog.sh

…and then, notify.tech-blog.sh

#!/bin/bash
ssh user@web.server touch /opt/jobs/jekyll/.tech-blog
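
Since every site's repository needs the same kind of hook, one possible variation is a single script that takes the site name as an argument. This is only a sketch, not the per-site script shown above: the notify_site function, the /opt/jobs/notify.sh path, and the SSH_CMD override (handy for a dry run with echo) are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Hypothetical generic notifier: one script for every site.
# WEB_HOST and TRIGGER_DIR default to the values from the example above;
# SSH_CMD is an illustrative override so the command can be dry-run with "echo".
WEB_HOST="${WEB_HOST:-user@web.server}"
TRIGGER_DIR="${TRIGGER_DIR:-/opt/jobs/jekyll}"
SSH_CMD="${SSH_CMD:-ssh}"

notify_site() {
  local site="$1"
  if [ -z "$site" ]; then
    echo "usage: notify_site SITE" >&2
    return 2
  fi
  # Create (or refresh) the trigger file for this site on the web server.
  $SSH_CMD "$WEB_HOST" touch "$TRIGGER_DIR/.$site"
}

# Only act when a site name is supplied, e.g. from .hg/hgrc (hypothetical path):
#   incoming = /opt/jobs/notify.sh tech-blog
if [ $# -gt 0 ]; then
  notify_site "$1"
fi
```

Mercurial also provides a changegroup hook, which runs once per group of changesets pushed rather than once per changeset; it may be worth considering if a large push would otherwise open many connections.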

That is the only logic required on the Hg server. Now, over on the web server, we need logic to regenerate the site and make it live. Since we have multiple sites, we wrote a script that has a few variables, so it could be duplicated for other sites. The following is tech-blog.sh:

##
## DJS Consulting Tech Blog Jekyll Build Script
##
## This will check out, build, and replace a Jekyll-generated site. Just update
## the parts under the "Env" heading for the specific site.
##

## Env
REPO=jekyll.tech-blog
DEST=/path/to/public/techblog
TRIGGER=.tech-blog
## /Env

cd /opt/jobs/jekyll

if [ -e "$TRIGGER" ]
then
  rm "$TRIGGER"

  ## Check out the site and build it
  hg clone "ssh://user@hg.server/HgPath/$REPO" "$REPO"
  cd "$REPO"
  jekyll build

  ## Copy it to the proper directory
  cd _site
  rm -r "$DEST"/*
  cp -r * "$DEST"

  ## Clean up
  cd ../..
  rm -r "$REPO"
fi

This script isn’t perfect; it should check the exit code from the Jekyll build process and send a notification when a build fails. However, with Jekyll being the same on both development and production, and a single committer, this is fine for our purposes.
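
As a rough idea of what that exit-code check might look like, here is a minimal sketch. The run_build and notify_failure functions are hypothetical names, and writing to standard error is only an assumed notification method; substitute mail or another notifier as needed.

```shell
#!/bin/bash
# Hypothetical helpers; neither function is part of the original build script.

notify_failure() {
  # Assumption: a message on stderr is enough of a notification.
  echo "Build failed for $1 (exit code $2)" >&2
}

run_build() {
  # Usage: run_build SITE COMMAND [ARGS...]
  local site="$1"
  shift
  "$@"
  local rc=$?
  if [ "$rc" -ne 0 ]; then
    notify_failure "$site" "$rc"
    return "$rc"
  fi
  return 0
}

# In tech-blog.sh, the build step could then become something like:
#   run_build "$REPO" jekyll build || exit 1
# so that a failed build stops before the live files are removed.
```

Note that stopping early like this would leave the cloned repository behind, so a real version would also need to clean it up before exiting. Because the cron entry for this job discards only standard output, anything written to standard error would typically be mailed to the job's owner, provided cron's mail delivery is configured.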

Finally, each script needs to run periodically to check for the presence of the semaphore (or TRIGGER, as the script calls it). The following cron definition checks for a change every 4 minutes.

*/4 *   *   *   *    /opt/jobs/jekyll/tech-blog.sh > /dev/null

Overall, we’re pleased with the results. The inter-server communication is light, requiring only one ssh connection initiated from each server, so we won’t run afoul of rate limits. With the work being done on the destination server, the window during which there are no files in the directory (between the rm -r $DEST/* and the time the cp -r * $DEST finishes) is very short; it would have been much longer if the directory were being repopulated across the network, and the process would have been more complex if we had added a staging area on the web server. Each piece can be run separately, and if we’ve committed a post with a future date, we can run the same touch command once that date arrives to make that post appear.
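
For comparison, the staging-area alternative mentioned above might look something like the following. This is a sketch under stated assumptions, not something we run: publish_site and the ".new" and ".old" directory suffixes are hypothetical, and it assumes the staging directory sits on the same filesystem as the destination so that mv is a simple rename.

```shell
#!/bin/bash
# Hypothetical staging-area publish: copy first, then swap directories by rename.

publish_site() {
  # Usage: publish_site SOURCE_DIR DEST_DIR
  local src="$1" dest="$2"
  local staging="${dest}.new" previous="${dest}.old"

  # Start from a clean staging area and copy the generated site into it.
  rm -rf "$staging" "$previous"
  mkdir -p "$staging" || return 1
  cp -r "$src"/. "$staging" || return 1

  # Swap with two renames; the site is missing only between them.
  if [ -d "$dest" ]; then
    mv "$dest" "$previous" || return 1
  fi
  mv "$staging" "$dest" || return 1
  rm -rf "$previous"
}

# Possible use in place of the rm/cp pair, run from the cloned repository:
#   publish_site _site "$DEST"
```

The trade-off is the one noted above: the window with no files shrinks to the gap between two renames, at the cost of extra disk space during the copy and a script that needs write access to the parent of the destination directory.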

Next time, we’ll discuss our experiences converting a non-WordPress site.


Tech Blog v4

From August 2011 until today, this site has been running under WordPress. During this time, we have done many experiments with several other blog platforms, but none of them made it to the “import all the old stuff from this one - this is what we’re looking for!” stage. As you may have already guessed, though, this is no longer the case. WordPress does what it does very well. However, the last post before this one was August… of 2014! That means that, for every page served from this site, a script ran on the server, accessed a database, and dynamically assembled the page. This content did not need to be dynamic - even when there is a new post, there is very little in the way of dynamic content here.

Jekyll is a static site generator; these types of applications generate static sites from a collection of templates and content files, which can then be served with no backing data store. Now, we can use the blazing-fast nginx web server to push these files as quickly as people can request them, and the request does not even have to leave the web server process.

There will be more to come on Jekyll; there are at least two posts to be written, one on automating the build process and another on the migration from WordPress. Until then, though, there are redirects that ensure the RSS feeds for both the main blog and the xine RPMs require no changes, and the category pages have redirects as well. If something does not look right, let us know via either of the social media accounts linked above.


gxine 0.5.908 RPM

Below are the RPMs for gxine version 0.5.908. See About the xine RPMs for information on how these were built.

gxine — The main gxine program
gxineplugin — Browser plugin library for gxine

To use this, you’ll also need xine-lib - as of this release, the most recent release of xine-lib is 1.2.6. The latest xine-lib 1.1 is 1.1.21.

(To save disk space, only the current release and two prior releases will be maintained.)


xine-lib 1.2.6 RPM

Below are the library and development RPMs for xine-lib version 1.2.6. Be sure to check out the About the xine RPMs post for information on how these were built.

xine-lib — The main xine library
xine-lib-dev — The development xine library (needed if you’re building an interface against xine-lib)
xine-lib-doc — Documentation

You’ll also need a user interface - as of this release, the most current release of xine-ui is 0.99.9, and the most current release of gxine is 0.5.908.

(To save disk space, only the current release and two prior releases in the 1.2-series will be maintained.)


xine-ui 0.99.9 RPM

Below is the RPM for xine-ui version 0.99.9. See the About the xine RPMs post for information on how this RPM was built.

xine-ui — The user interface

To use this, you’ll also need xine-lib - as of this release, the most recent release of xine-lib is 1.2.6. The latest xine-lib 1.1 is 1.1.21.

(To save disk space, only the current release and two prior releases will be maintained.)


A Handy C# Async Utility Method

In the course of writing C# code utilizing the new (for 4.5.1) Task-based asynchronous programming, I’ve run across a couple of places where the “await” keyword either is not allowed (a catch block or a property accessor) or the “async” keyword greatly complicates the syntax (lambda expressions). I’ve found myself writing this method for two different projects, and so I thought I would drop this Q&D, more-comments-than-code utility method here for others to use if you see the need.

(UPDATE: This works well in console applications; it can cause deadlocks in desktop and web apps. Test before you rely on it.)

/// <summary>
/// Get the result of a task in contexts where the "await" keyword may be prohibited
/// </summary>
/// <typeparam name="T">The return type for the task</typeparam>
/// <param name="task">The task to be awaited</param>
/// <returns>The result of the task</returns>
public static T TaskResult<T>(Task<T> task)
{
    Task.WaitAll(task);
    return task.Result;
}

And, in places where you can’t do something like this…

/// <summary>
/// A horribly contrived example class
/// </summary>
/// <remarks>Don't ever structure your POCOs this way, unless EF is handling the navigation properties</remarks>
public class ExampleClass
{
    /// <summary>
    /// A contrived ID to a dependent entity
    /// </summary>
    public int ForeignKeyID { get; set; }

    /// <summary>
    /// The contrived dependent entity
    /// </summary>
    public DependentEntity DependentEntity
    {
        get
        {
            // Does not compile; can't use await without async, can't mark a property as async
            return await Data.DependentEntities
                .FirstOrDefaultAsync(entity => entity.ID == ForeignKeyID);
        }
    }
}

…you can instead do this in that “DependentEntity” property…

    /// <summary>
    /// The contrived dependent entity
    /// </summary>
    public DependentEntity DependentEntity
    {
        get
        {
            return UtilClass.TaskResult<DependentEntity>(Data.DependentEntities
                .FirstOrDefaultAsync(entity => entity.ID == ForeignKeyID));
        }
    }