Automatically generated bibliographies with Jekyll, Liquid and YAML

07 Mar 2018 | tags: science, code, jekyll, liquid, yaml

I recently updated how this site is built behind the scenes, including more auto-generated code for bibliographies such as those found on the publications page or on the home page. There are other ways of doing this (e.g. Jekyll-Scholar), but these usually rely on CSL styles, which don’t do what I wanted and are relatively long and complex to write yourself. Here is how I get my bibliographies to auto-generate using Jekyll, Liquid and YAML. As always, the code is available on GitHub.

Why would I (or you) want to auto-generate bibliographies?

Before this update, my publications list was hand-coded in HTML. If I wanted to repeat part of the list on another page (e.g. the home page), I had to copy that code over manually. I wanted a way to generate my publications list automatically that would also update the other bibliographies around the site. For example, when I now add a new paper to my database and rebuild the site, it is added to the main publications page and the recent papers list on the home page is updated. The same data can also be used in other places throughout the site (see below).

Process and Requirements

As Jekyll handles Liquid and YAML natively, it is the only software requirement. Find out how to install Jekyll.

The process for producing the bibliography is pretty simple. The information for each paper is stored in a YAML file. Wherever a bibliography should appear, there is some Liquid code. When Jekyll is called to build the site, it parses the Liquid code, retrieves the appropriate fields from the YAML file, and produces the corresponding HTML ready for use.

YAML file structure

The complete file is available on GitHub.

For each paper there is an entry as shown below. Most of this should be reasonably self-explanatory; some fields are less obvious, but I will come back to those.

Liquid Code

The simplest example of the Liquid code is below.

The first line starts a for loop that goes through each entry in the YAML file papers.yml, located in the _data folder in the Jekyll root (hence site.data.papers). Each entry in turn is assigned to the variable paper.

The second line is where the output starts: it mixes Liquid code with standard HTML to combine data from papers.yml with your formatting. For example, {{ paper.title }} gets the content of the title field and puts it into the code, and so on. On line 5 there is also a conditional (if) statement that allows for journals that don’t have volumes (e.g. old Chem. Commun. articles). If the volume field is present, its value is output (with additional HTML attributes); otherwise the code moves on to the page numbers.
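The loop described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions rather than the exact code used on this site: title and volume are the fields named in the text, while authors, journal, year, pages and doi are assumed field names, and authors is assumed to be a single pre-formatted string.

```liquid
{% for paper in site.data.papers %}
  <p>{{ paper.authors }} "{{ paper.title }}",
  <i>{{ paper.journal }}</i>,
  {{ paper.year }},
  {% if paper.volume %}<b>{{ paper.volume }}</b>, {% endif %}
  {{ paper.pages }}.
  [<a href="https://doi.org/{{ paper.doi }}">DOI</a>]</p>
{% endfor %}
{% comment %}
  Assumed (hypothetical) field names: authors, journal, year, pages, doi.
  Line 1 loops over every entry in _data/papers.yml.
  Line 5 only emits the volume when the entry has a volume field.
  The DOI link assumes doi holds the bare identifier, without the resolver URL.
{% endcomment %}
```

An entry with no volume field simply skips line 5, so the journal and year are followed directly by whatever is stored in pages.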

For a single-entry YAML file, Jekyll then produces the following HTML code.

Which your browser then interprets as:

Jasmine Lord, Hugh Britton, Sebastian G. Spain and Andrew L. Lewis "Advancements in the Development on New Liquid Embolic Agents for Use in Therapeutic Embolisation", J. Mater. Chem. B., 2020, Accepted Manuscript. [DOI]

For a single citation this is more effort than it is worth, but with suitable use of Liquid and a few additional fields in the database it becomes very powerful. Some examples follow.

Latest Publications

A simple modification to the code can be used to retrieve the first 5 (for example) entries and list them (this time without authors).
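One way to do this is with the limit parameter of the Liquid for tag. The sketch below assumes that papers.yml is ordered newest first, so that the first five entries are the latest ones, and it reuses the assumed field names journal, year, pages and doi.

```liquid
<ul>
{% for paper in site.data.papers limit:5 %}
  {% comment %} limit:5 stops the loop after the first five entries {% endcomment %}
  <li>"{{ paper.title }}", <i>{{ paper.journal }}</i>, {{ paper.year }},
  {% if paper.volume %}<b>{{ paper.volume }}</b>, {% endif %}
  {{ paper.pages }}.
  [<a href="https://doi.org/{{ paper.doi }}">DOI</a>]</li>
{% endfor %}
</ul>
{% comment %}
  Assumed (hypothetical) field names: journal, year, pages, doi.
  The authors field is deliberately left out of this shorter list.
{% endcomment %}
```

Changing the number after limit is all that is needed to show more or fewer papers.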

Output:

• "Advancements in the Development on New Liquid Embolic Agents for Use in Therapeutic Embolisation", J. Mater. Chem. B., 2020, Accepted Manuscript. [DOI]
• "Mucoadhesive Electrospun Fibre‐Based Technologies for Oral Medicine", Pharmaceutics, 2020, 12, 504. [DOI]
• "Medium-chain fatty acids released from polymeric electrospun patches inhibit Candida albicans growth and reduce biofilm viability", ACS Biomater. Sci. Eng., 2020, 6, 4087-4095. [DOI]
• "Incorporation of Lysozyme into a Mucoadhesive Electrospun Patch for Rapid Protein Delivery to the Oral Mucosa", Mat. Sci. Eng. C., 2020, 112, 110917. [DOI]
• "Mucoadhesive electrospun patch delivery of lidocaine to the oral mucosa and investigation of spatial distribution in tissue using MALDI mass spectrometry imaging", Mol. Pharmaceutics, 2019, 16, 3948-3956. [DOI]

Publications by topic

In my database, some entries have a field called tags. In this example, only papers that have the tag “DNA based vehicles” are picked. A similar approach can be used to get author-specific bibliographies.
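A filter like this can be sketched with the Liquid contains operator. The example assumes that tags is stored as a YAML list of strings and reuses the assumed field names authors, journal, year, pages and doi.

```liquid
<ul>
{% for paper in site.data.papers %}
  {% comment %} contains is false when an entry has no tags field, so untagged papers are skipped {% endcomment %}
  {% if paper.tags contains "DNA based vehicles" %}
  <li>{{ paper.authors }} "{{ paper.title }}", <i>{{ paper.journal }}</i>, {{ paper.year }},
  {% if paper.volume %}<b>{{ paper.volume }}</b>, {% endif %}
  {{ paper.pages }}.
  [<a href="https://doi.org/{{ paper.doi }}">DOI</a>]</li>
  {% endif %}
{% endfor %}
</ul>
{% comment %}
  Assumed (hypothetical): tags is a list; authors, journal, year, pages, doi are field names.
{% endcomment %}
```

If authors is a single string, the same operator does a substring match, so swapping the condition for something like paper.authors contains "Johannes P. Magnusson" would give an author-specific list.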

Output:

• Laura Purdie, Cameron Alexander, Sebastian G. Spain* and Johannes P. Magnusson "Alkyl-modified oligonucleotides as intercalating vehicles for doxorubicin uptake via albumin binding", Mol. Pharmaceutics, 2018, 15, 437-446. [DOI]
• Laura Purdie, Cameron Alexander, Sebastian G. Spain* and Johannes P. Magnusson "Influence of polymer size on uptake and cytotoxicity of doxorubicin-loaded DNA–PEG conjugates", Bioconjugate Chem., 2016, 27, 1244-1252. [DOI]

More detailed options

In my database, I have entries for toc, pdf and openaccess. These are true/false fields and are used to determine what content is displayed in the main publication list, in a similar manner to the volume conditional earlier. For example, pdf: true signifies that the copyright agreement allows me to put the PDF on my website. A link to the file is then produced (I still have to put the file in the correct place!).
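These flags can be tested in the same way as volume. The sketch below is illustrative only: the key field, the /assets/pdf/ folder and the "Open access" label are hypothetical, since the text does not say how the PDF file is named or where it lives. The toc flag would follow the same pattern.

```liquid
{% for paper in site.data.papers %}
  <p>"{{ paper.title }}", <i>{{ paper.journal }}</i>, {{ paper.year }}.
  {% comment %} Each flag is true/false; a missing flag behaves like false {% endcomment %}
  {% if paper.openaccess %}[Open access]{% endif %}
  {% if paper.pdf %}
    {% comment %} Hypothetical layout: PDFs sit in /assets/pdf/ and are named after a key field {% endcomment %}
    [<a href="{{ '/assets/pdf/' | append: paper.key | append: '.pdf' | relative_url }}">PDF</a>]
  {% endif %}
  </p>
{% endfor %}
```

The relative_url filter is provided by Jekyll and prepends the site's baseurl, so the link keeps working if the site is served from a subdirectory.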