Announcing SFM Version 1.2

The SFM team has overcome the confluence of a series of conferences, vacations, holidays, nasty colds, and other deadlines to get version 1.2 wrapped up. In this post, I’m going to share a few of the highlights of this release.

Your users will start receiving harvest status emails. These emails will summarize the recent harvesting activity for their collections.

Harvest email

By default the emails will come every day; if that is too often, the frequency can be changed with the new user self-management feature.

User self-management

With improvements to the harvesters and the harvest list table on the collection detail page, monitoring harvests from the UI will be easier.

Harvest list table

The table now includes statistics on the harvest and time since the last update. With changes to the harvesters, the updates will be more frequent (when the harvest starts, roughly every 30 minutes thereafter, and when the harvest completes).

In addition to providing statistics on the items collected, the collection set and collection detail pages now include statistics on the WARC files.

WARC statistics

Behind the scenes, there are significant improvements to the harvesters. The improvements include:

  • Graceful shutdown of harvesters (e.g., for upgrades)
  • Harvests resume when interrupted by a shutdown. (This works for exporters as well.)
  • Harvest status is changed to “running” as soon as a harvest begins.
  • Harvesters retry twice after social media client exits with errors.

For more details, see the blog post “Anatomy of a Social Media Harvester.”

Lastly, we continue build out SFM’s documentation. In this release, we added a page on collection types, which describes how SFM harvests from each of the social media APIs. We’ve attempted to provide practical advice as understanding the affordances of the social media APIs can be really tricky. Your suggestions on this documentation is welcome.

Our next development milestone is already underway, with work proceeding on adding support for importing/exporting collections, HTTPS, and improved monitoring of harvesters.