Python – Convert BibTeX file to database entries using Python

bibtex, MySQL, python

Given a BibTeX file, I need to add the respective fields (author, title, journal, etc.) to a table in a MySQL database (with a custom schema).

After some initial research, I found Bibutils, which I could use to convert a .bib file to XML. My initial idea was to do that conversion and then parse the XML in Python to populate a dictionary.

My main questions are:

  1. Is there a better way I could do this conversion?
  2. Is there a library which directly parses a BibTeX file and gives me the fields in Python?

(I did find bibliography.parsing, which uses Bibutils internally, but there is not much documentation on it and I am finding it tough to get it to work.)

Best Answer

Old question, but I am doing the same thing at the moment using the Pybtex library, which has an inbuilt parser:

from pybtex.database.input import bibtex

# open a BibTeX file
parser = bibtex.Parser()
bibdata = parser.parse_file("myrefs.bib")

# loop through the individual references
for bib_id in bibdata.entries:
    entry = bibdata.entries[bib_id]
    b = entry.fields
    try:
        # change these lines to create a SQL insert
        print(b["title"])
        print(b["journal"])
        print(b["year"])
        # deal with multiple authors; first_names/last_names are lists of
        # name parts in recent Pybtex releases (older ones used first()/last())
        for author in entry.persons["author"]:
            print(" ".join(author.first_names), " ".join(author.last_names))
    # a field may not exist for a reference; note that this skips the
    # rest of the entry as soon as one field is missing
    except KeyError:
        continue
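
For the "create a SQL insert" step, something along these lines might work. This is only a sketch under some assumptions: the table name refs, its columns and the helper insert_references are made up for illustration, and the standard-library sqlite3 module stands in for MySQL so the example runs without a server. With a MySQL driver such as mysql-connector-python or PyMySQL the placeholder is %s instead of ?, and the connection comes from that driver's connect() call. The entries are passed in as plain (bib_id, fields, authors) tuples, which you would build inside the loop above.

```python
import sqlite3

# Hypothetical schema; adapt the table and column names to your own.
SCHEMA = """
CREATE TABLE IF NOT EXISTS refs (
    bib_id  TEXT PRIMARY KEY,
    title   TEXT,
    journal TEXT,
    year    TEXT,
    authors TEXT
)
"""


def insert_references(conn, entries):
    """Insert (bib_id, fields, authors) tuples; return the number of rows written."""
    rows = []
    for bib_id, fields, authors in entries:
        rows.append((
            bib_id,
            fields.get("title"),    # a missing field becomes NULL
            fields.get("journal"),
            fields.get("year"),
            "; ".join(authors),     # flatten multiple authors into one column
        ))
    conn.execute(SCHEMA)
    # Parameterized query: values are passed separately from the SQL text.
    conn.executemany(
        "INSERT INTO refs (bib_id, title, journal, year, authors) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up sample data, shaped like what the parsing loop would produce.
    sample = [
        ("key1", {"title": "Example Title", "journal": "Example Journal",
                  "year": "2000"}, ["First Author", "Second Author"]),
        ("key2", {"title": "Another Title", "year": "2001"}, ["Solo Author"]),
    ]
    insert_references(conn, sample)
    for row in conn.execute("SELECT * FROM refs ORDER BY bib_id"):
        print(row)
```

Using fields.get() rather than indexing means an entry with no journal still gets a row, instead of being skipped by the KeyError handler. Joining the authors into a single column is the simplest option; a separate authors table would be the more normalized choice if you need to query by author.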