Announcing htmlbib, a tool for rendering BibTeX files as interactive HTML

For some time now, I’ve been working on an annotated bibliography of articles on various topics in transportation (particularly the history of automatic fare collection from 1960 to the present, as well as the SelTrac train control system and its origins in Germany). I’ve been compiling the information using BibDesk, and I’d like to be able to share it with a wider audience, in the hope that it might be useful to someone.

At a bare minimum, posting the BibTeX file online somewhere would fulfill my desire to get the information out there. But not everyone who might benefit from the bibliography uses BibTeX. For many people, I fear a .bib file would be nothing more than unintelligible gibberish; outside academic circles (and even within them, outside the hard sciences), TeX is not particularly well known.

The next alternative would be to post the bibliography online as a PDF or HTML file. This is considerably more accessible to non-BibTeX users, but it makes life harder for people who would like to copy references (as BibTeX source) into their own BibTeX files, a common practice in communities of TeX users. Merely rendering the entire contents of the file also loses some of the metadata: the comments associated with entries, the groups and keywords, etc.

There are also specialized tools (like bibtex2html) for converting a BibTeX file to HTML. But even there the results fall short; the output is mostly static text. I wanted a tool that would make good use of the keywords entered in BibDesk, and which would provide links between publications and authors. I also wanted a tool which would be equally useful for BibTeX users, who would be helped by having access to the BibTeX source for each entry, and non-BibTeX users, who would be helped by having formatted bibliography entries. I therefore set out to build a tool that would meet my needs; the result is htmlbib.

One of my concerns was that the bibliography entries be formatted properly; having taken care to enter the information in BibDesk so that it would render well, I did not want some generic template used to create the HTML for each entry. So, I ended up cobbling together an arrangement that actually uses BibTeX and tex4ht to produce HTML for each entry using the desired BibTeX style (in my case, IEEEtran), so that the entries look the same in the preview as they would in an actual publication. This is slow, but the preview results are cached, so subsequent runs are faster.
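To give a rough idea of how this kind of caching can work, here is a minimal sketch in Python. It is not the actual htmlbib code: the function name cached_preview, the cache layout (one file per entry, named by a hash), and the render callable standing in for the slow BibTeX and tex4ht step are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_preview(
    bibtex_source: str,
    style: str,
    cache_dir: Path,
    render: Callable[[str, str], str],
) -> str:
    """Return the HTML preview for one entry, rendering only on a cache miss.

    `render` is a hypothetical stand-in for the slow step (running BibTeX
    and tex4ht); it takes the entry source and the style name.
    """
    # Key on both the entry source and the style, so that editing the entry
    # or switching styles produces a new cache file instead of a stale hit.
    key = hashlib.sha256(f"{style}\0{bibtex_source}".encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.html"

    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")

    html = render(bibtex_source, style)  # slow path, taken once per entry
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(html, encoding="utf-8")
    return html
```

Keying the cache on the content rather than on the cite key means that only entries which have actually changed since the last run need to go through TeX again.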

As for parsing the BibTeX file, since I’m already familiar with scripting BibDesk, I decided to use appscript to call BibDesk from Python. The result is therefore not portable beyond OS X, but it suits my needs. There are BibTeX parsing libraries for Python, so porting to another platform would only require substituting one of those libraries for the calls to BibDesk; the rest is pure Python, with the exception of lxml and the aforementioned preview code, which expects a functioning TeX installation on the system.
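One way to keep such a port simple is to have the rest of the program see only plain dictionaries, whichever backend produced them. The sketch below is an illustration under that assumption, not htmlbib's real data model: the field names (cite_key, author, keywords), the comma-separated keyword format, and the function names are all hypothetical. It builds the author and keyword indexes that the cross-links would be generated from.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_authors(author_field: str) -> List[str]:
    """Split a BibTeX author field on its 'and' separator.

    Simplification: an 'and' inside braces is not treated specially.
    """
    return [a.strip() for a in re.split(r"\s+and\s+", author_field) if a.strip()]


def split_keywords(keyword_field: str) -> List[str]:
    """Split a keywords field, assuming comma-separated keywords."""
    return [k.strip() for k in keyword_field.split(",") if k.strip()]


def build_indexes(
    entries: Iterable[Dict[str, str]],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Map each author and each keyword to the cite keys that mention it."""
    by_author: Dict[str, List[str]] = defaultdict(list)
    by_keyword: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        key = entry["cite_key"]
        for author in split_authors(entry.get("author", "")):
            by_author[author].append(key)
        for keyword in split_keywords(entry.get("keywords", "")):
            by_keyword[keyword].append(key)
    return dict(by_author), dict(by_keyword)
```

With this split, the BibDesk calls (or a Python BibTeX parser on another platform) only need to produce the list of dictionaries; nothing downstream has to know where the entries came from.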

The HTML is produced using Jinja2 templates, which for now are stored in the application egg. The default, built-in template is built on Blueprint CSS and jQuery along with jQuery Tools. It wouldn’t be too hard to provide an option for using user-specified templates instead of the built-in template.
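As a sketch of what that option might look like, Jinja2's ChoiceLoader can try a user-supplied directory first and fall back to the built-in template. The code below is an assumption about how this could be wired up, not part of htmlbib: the template name bibliography.html, its contents, and the make_environment function are hypothetical, and a DictLoader stands in for the template shipped inside the egg.

```python
from jinja2 import ChoiceLoader, DictLoader, Environment, FileSystemLoader

# Hypothetical stand-in for the template bundled with the application.
BUILTIN_TEMPLATES = {
    "bibliography.html": (
        "<ul>{% for e in entries %}<li>{{ e.title }}</li>{% endfor %}</ul>"
    ),
}


def make_environment(user_template_dir=None):
    """Build a Jinja2 environment that prefers user templates when given."""
    loaders = []
    if user_template_dir is not None:
        # Searched first, so a user file with the same name overrides
        # the built-in template.
        loaders.append(FileSystemLoader(user_template_dir))
    # Fallback: the built-in template.
    loaders.append(DictLoader(BUILTIN_TEMPLATES))
    return Environment(loader=ChoiceLoader(loaders), autoescape=True)
```

Because the fallback is always present, a user directory only has to contain the templates it wants to override; anything missing is still served from the built-in set.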

I’ve uploaded some sample output to demonstrate what htmlbib does.