A Better Calibre of Kindling

(Originally posted 2012-01-23.)

You might consider it showing off if I mention I got a Kindle for Xmas. Feel free to. 🙂 But I’d like to share my experience with you – as you might find it useful anyway.

First, I really like the Kindle as it stands. Mine is a Keyboard 3G one. I felt both the “keyboard” and 3G elements were important:

  • I surmised (correctly) I’d want to take notes.
  • I surmised (equally correctly) I’d want to be able to do things wherever I was that would need access to “Kindle Central”. (Actually, access at 35,000 feet will have to wait.)

I’ve found the basic act of reading on the Kindle to be at least as rewarding as reading paper books. I also appreciate putting an end to being engulfed by the rising tide of new books.

(In the house I seem to be the one that wants to keep books once I’ve read them. I’m also the one who doesn’t feel the need to complete a book once I’ve started it. So I have several books on the go at the same time on the Kindle and it’s kept track of where I am with them all. Yes, I know it’s called a bookmark so no distinct advantage there.)

I also appreciate the social aspect:

  • Sharing snippets via the Kindle website and posting links to them on Twitter. Some of you will have seen that – probably most of you given I propagate tweets to Facebook and LinkedIn.
  • I’m re-reading Terry Pratchett’s “The Colour Of Magic” and it’s nice to see “you and 5 people” against key quotes. I don’t know who these people are but already I feel kinship with them. 🙂

Book delivery is pretty swift – which is much more than can be said of ordering paper books. And I’ve used the “try a sample” capability several times, with both positive and negative buying outcomes. I’m using the Amazon “Wish List” as my queue for acquiring books so I don’t necessarily buy immediately.


There isn’t much need for curation but my tool of choice for doing so is Calibre, which is available for Windows, Linux and OS X. (I run it on Linux and OS X, though others in the house have Windows and there’s one other Kindle in the house.) It’s free and it’s very good. One tip: if you’re using it on Linux it’s probably best to install it directly, rather than going through e.g. Debian repositories. I say this because it’s frequently updated and the repositories seem to be way behind.

I used Calibre with my old Sony PRS-700 eBook reader – which I found to be unusably slow and hard to read. (The Kindle is neither of these.)

Calibre does a number of things for me. Most notably it lets me:

  • Convert books from other formats e.g. EPUB.
  • Download RSS / Atom “news” feeds and convert them to MOBI so I can read them on Kindle.
  • Edit metadata for books – such as titles and authors. (Mainly this is worthwhile for books that weren’t from the Kindle Store – as some of them have dubious spellings etc.)
  • (I actually don’t feel the need to have Calibre back up my Kindle – though it will do that as well)

Calibre has a lot of sophistication built into its conversion. I’ve yet to fully explore what it can do, for instance, to tidy up conversion of PDF documents. Page footers, for one, need removing on conversion.

One other thing: You can use Calibre in Batch Mode. That might well help with automation.
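As a rough illustration of what such automation might look like, here is a minimal Python sketch that converts every EPUB in a folder to MOBI. It assumes Calibre's command-line tool ebook-convert is installed and on the PATH; the function names and the folder layout are made up for the example, and no conversion options are passed.

```python
from pathlib import Path
import shutil
import subprocess


def build_convert_commands(source_dir, source_ext=".epub", target_ext=".mobi"):
    """Return one ebook-convert command (as an argument list) per source file."""
    commands = []
    for src in sorted(Path(source_dir).glob(f"*{source_ext}")):
        dst = src.with_suffix(target_ext)  # same folder, new extension
        commands.append(["ebook-convert", str(src), str(dst)])
    return commands


def convert_all(source_dir):
    """Run each command in turn; requires Calibre's ebook-convert on the PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    for cmd in build_convert_commands(source_dir):
        subprocess.run(cmd, check=True)  # stop at the first failed conversion
```

Building the command lists separately from running them makes it easy to print them and check what would happen before converting anything.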

Project Gutenberg

I’ve known for a long time about Project Gutenberg. To quote from their website:

“Project Gutenberg offers over 38,000 free ebooks: choose among free epub books, free kindle books, download them or read them online.”

Two good things to note:

  • Project Gutenberg has a rigorous copyright checking process – so everything is out of copyright or otherwise in the public domain. I’m against ripping off authors, so this is a good thing.
  • The books are well formatted: eBook quality can vary enormously, to the point where books can be frustratingly hard to read (in the worst case).

Without listing the catalog I’d say you can find many classics there. The “usual suspects” like Chaucer, Shakespeare and Oscar Wilde are represented (all of which I have on my Kindle), along with many others. (I wish Raymond Chandler were there but the absence of his works probably means they’re still under copyright protection.)

Distributed Proofreaders

So, where do Project Gutenberg books come from? I can’t say this is true of all of them but many come from Distributed Proofreaders. The idea of this is that people sign up to proofread OCR’ed pages – one page at a time. I signed up to do this and worked on the first proofreading of two books. I’d never heard of the books before and the actual process was good as I found the books interesting in their own right.

The OCR process was pretty accurate but the proofreading was absolutely necessary. I think it might be possible to codify many of the errors in the OCR process as they were repeated.
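To make the "codify the errors" idea concrete, here is a small Python sketch that flags suspicious strings on a page for a human to check. The three rules are purely illustrative guesses at the kind of repeated OCR slip one might see, not a list taken from Distributed Proofreaders, and the code only flags; it never changes the text.

```python
import re

# Hypothetical rules: each pairs a pattern with a note for the proofreader.
SUSPECT_PATTERNS = [
    (re.compile(r"\btbe\b"), "possibly 'the'"),
    (re.compile(r"\barid\b"), "possibly 'and'"),
    (re.compile(r"[A-Za-z]+[01][A-Za-z]+"), "digit inside a word"),
]


def flag_suspects(page_text):
    """Return (line_number, matched_text, note) for each suspicious match."""
    findings = []
    for line_no, line in enumerate(page_text.splitlines(), start=1):
        for pattern, note in SUSPECT_PATTERNS:
            for match in pattern.finditer(line):
                findings.append((line_no, match.group(0), note))
    return findings
```

Flagging rather than auto-correcting fits the emphasis on leaving the original spelling and punctuation alone: the proofreader still makes the call.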

There are several rounds of proofreading and so the results – books in Project Gutenberg – are very good. There’s a lot of emphasis on not correcting the spelling or punctuation, and on not editorialising.

More volunteers are needed. As I say I’ve enjoyed doing it.


If you connect a Kindle to a PC or Mac (and I’ve done both) the Kindle shows up as a removable drive. The most useful thing you can do with it is to extract the ‘My Clippings.txt’ file. This contains all your bookmarks and annotations. It’s reasonably hackable: While it’s not XML (and I really wish it were) it has a simple-to-understand and easy-to-parse format in plain text.
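As a sketch of how easy the parsing can be, here is some Python that splits the file into records. It rests on assumptions about the layout that are not spelled out above: that entries are separated by a line of ten equals signs, that the first line of each entry is the title (usually with the author in brackets), and that the second line describes the kind of clipping, its location and the date. The Clipping class and parse_clippings function are names invented for this example.

```python
from dataclasses import dataclass

SEPARATOR = "=========="  # assumed delimiter between entries


@dataclass
class Clipping:
    title: str    # first line of the entry, typically "Title (Author)"
    details: str  # second line: kind of clipping, location, date
    text: str     # highlighted passage or note; empty for a bookmark


def parse_clippings(raw):
    """Split the raw file contents into a list of Clipping records."""
    clippings = []
    for block in raw.split(SEPARATOR):
        lines = [line.strip() for line in block.strip().splitlines()]
        if len(lines) < 2:
            continue  # skip the empty trailing block and malformed entries
        title = lines[0].lstrip("\ufeff")  # drop a byte-order mark if present
        details = lines[1].lstrip("- ").strip()
        text = "\n".join(line for line in lines[2:] if line)
        clippings.append(Clipping(title, details, text))
    return clippings
```

To try it, read the file with something like open(path, encoding="utf-8-sig").read() and pass the result to parse_clippings. The details line is left as one string here; pulling the location and date out of it would be the next step.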

One challenge I’d like to see someone meet is processing this file and creating Evernote notes. True, you can get at your annotations etc. from Amazon, but I think there’s value in making it easier to get marked-up passages into Evernote. Indeed I’d be pleased if Amazon and Evernote worked together to provide a slick “clip to Evernote” function for Kindles other than the Fire.

I have other hacking challenges; otherwise I’d work on this one – processing the file – myself. I know that doing it for Windows (and Linux under Wine) and for OS X would mean two separate pieces of code.

So Why Am I Still Carrying Around Paper Books?

It turns out I still have a few books to get through in paper format before I go “all electronic”. I also expect there to be occasions where someone gives me a book. I consider those to be “beyond my control”. 🙂

One final thing: For another view (although a corroborative one) see Susan Visser’s blog posts on the subject.

