gris: a lightweight parser of reference files in RIS format

I took advantage of the break to go back to a couple of side projects. In particular, one of the goals I had was updating gris, a lightweight Python package that parses reference files in RIS format that I put together a while back.

A detour on RIS data files

I'm not sure which reference managers people use these days but, if you use something like Zotero or have downloaded a reference file from Springer Nature, you have most likely come across a RIS file.

The RIS format is old, so old that I had to use the Wayback Machine to find an original specification. I now keep a copy in the gris documentation for reference.

It is also a format that publishers tend to play with. Many years back, Web of Science used a RIS variant with different tag formats when you downloaded references as "other". More recently, AIP has messed with the format by adding headers that were not part of the original specification.

gris and its updates

The new version of gris no longer breaks when reading files with headers that should not be there. It also has proper documentation and a bit of testing.

The main purpose of gris is to parse these old files into Python objects that can then be manipulated or converted into more universal formats, such as JSON. While it is completely unrelated to my daily work, as it happens it has made its way into some of my papers as well.
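
To give a feel for what this kind of parsing involves, here is a minimal sketch using only the standard library. It is not the gris API: the function name parse_ris, the dict-of-lists layout, and the sample record (including its "Provider:" and "Content:" header lines) are all made up for illustration. The only thing taken from the RIS format itself is the line layout: a two-character tag, two spaces, a hyphen, and the value, with TY opening a record and ER closing it.

```python
import json
import re

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def parse_ris(text):
    """Parse RIS text into a list of dicts mapping each tag to a list of values.

    Hypothetical helper for illustration only; not part of gris.
    """
    records = []
    current = None  # the record being built, or None when outside a record
    for line in text.splitlines():
        match = TAG_LINE.match(line.rstrip())
        if match is None:
            # Not a tagged line: blank lines and stray headers are skipped.
            continue
        tag, value = match.group(1), match.group(2) or ""
        if tag == "TY":
            current = {"TY": [value]}  # TY always opens a new record
        elif current is None:
            continue  # tagged line outside any record: ignore it
        elif tag == "ER":
            records.append(current)  # ER closes the record
            current = None
        else:
            # Tags such as AU can repeat, so every tag maps to a list.
            current.setdefault(tag, []).append(value)
    return records


# Made-up sample with two header lines that are not part of the RIS format.
SAMPLE = "\n".join([
    "Provider: Example Publisher",
    "Content: text/plain",
    "",
    "TY  - JOUR",
    "AU  - Doe, Jane",
    "AU  - Roe, Richard",
    "TI  - A made-up article",
    "PY  - 2020",
    "ER  - ",
])

records = parse_ris(SAMPLE)
print(json.dumps(records, indent=2))
```

Skipping every line that does not look like a tagged line is the simplest way to tolerate unexpected headers, at the cost of also dropping continuation lines of long fields. A real parser would need to handle those.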

To do list

There are a few items on my to-do list:

I need to expand the collection of input files used in tests.
I need to document the JSON output and demonstrate that the RIS -> JSON -> RIS roundtrip works.
I need to document the code.
I need to decide whether I want to add to gris a proper semantic layer.
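
As an illustration of what the roundtrip item could look like in practice, here is a small self-contained sketch. It does not use gris: ris_to_records and records_to_ris are hypothetical helpers, and the reference is invented. Each record is stored as an ordered list of [tag, value] pairs, on the assumption that the roundtrip should preserve the order in which tags appear in the file.

```python
import json
import re

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def ris_to_records(text):
    """Parse RIS text into records, each an ordered list of [tag, value] pairs.

    Hypothetical helper for illustration only; not part of gris.
    """
    records, current = [], None
    for line in text.splitlines():
        match = TAG_LINE.match(line.rstrip())
        if match is None:
            continue  # not a tagged line (blank line, stray header, ...)
        tag, value = match.group(1), match.group(2) or ""
        if tag == "TY":
            current = [[tag, value]]  # TY opens a new record
        elif current is None:
            continue  # tagged line outside any record
        elif tag == "ER":
            records.append(current)  # ER closes the record
            current = None
        else:
            current.append([tag, value])
    return records


def records_to_ris(records):
    """Serialise records back to RIS text, one blank line after each record."""
    lines = []
    for record in records:
        lines.extend(f"{tag}  - {value}" for tag, value in record)
        lines.append("ER  - ")
        lines.append("")
    return "\n".join(lines)


# Made-up reference; the two AU lines are deliberately not adjacent.
ORIGINAL = "\n".join([
    "TY  - JOUR",
    "AU  - Doe, Jane",
    "TI  - A made-up article",
    "AU  - Roe, Richard",
    "PY  - 2020",
    "ER  - ",
    "",
])

as_json = json.dumps(ris_to_records(ORIGINAL))   # RIS -> JSON
restored = records_to_ris(json.loads(as_json))   # JSON -> RIS
assert restored == ORIGINAL
```

A list of pairs survives the trip through JSON unchanged, whereas a layout that groups values by tag would lose the interleaving of the two AU lines around TI. The check only holds byte for byte because the sample already uses the same whitespace conventions as the writer.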