What’s a good way to retrieve BibTeX entries? This question has been asked on tex.SX, Reddit and other forums. Google Scholar does a reasonable, but not always brilliant, job of generating BibTeX entries (see my recent post on the topic and the discussion). The main drawback is the missing digital object identifier (DOI) or URL linking back to the journal website, and some BibTeX entries are simply faulty. Other services such as Mendeley or CiteULike require registration and provide more extensive services.
So here it is, introducing doi2bib.
doi2bib
We have been working hard to come up with a web service that allows retrieval of citations in BibTeX format from digital object identifiers (DOIs). It is accessible through doi2bib.org. The service is free of charge and no login is required.
Enter a DOI and the web service provides the corresponding BibTeX entry. You can find DOIs printed on most research articles or on the journal website. Copy the DOI and paste it into the search field on doi2bib. The service retrieves BibTeX entries directly from publishers, through public APIs provided by doi.org and crossref.org. You therefore get the most recent and complete citation record available.
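For readers who would rather script the lookup, here is a minimal Python sketch of the underlying idea: ask the DOI resolver for BibTeX by sending an `Accept: application/x-bibtex` header (content negotiation). This is not the doi2bib source code. The function names are made up for illustration, and the sketch assumes the resolver at https://doi.org/ honours that media type for the DOI in question.

```python
import urllib.request
from urllib.parse import quote

# Assumption: the public DOI resolver, which supports content negotiation.
DOI_RESOLVER = "https://doi.org/"


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Build a request that asks for BibTeX instead of the landing page."""
    url = DOI_RESOLVER + quote(doi.strip(), safe="/")
    # The Accept header is what selects the BibTeX representation.
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi: str, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network)."""
    request = build_bibtex_request(doi)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_bibtex("10.1109/TII.2012.2219063")` should return one BibTeX entry as a string, provided the network is reachable and the DOI is registered. Building the request is kept separate from sending it so the header logic can be checked offline.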
If you have any comments or suggestions, please use our Twitter account @doi2bib to get in touch. For bug reports, visit our GitHub repository.
We hope doi2bib.org simplifies the citation process, giving you more time to focus on research.
You can get citations in BibTeX format (and other styles) from CrossRef at http://search.crossref.org . For more information on DOIs, citation formats and Content Negotiation, see http://data.crossref.org .
Something I’ve noticed with this and other automatic BibTeX generators (e.g. Zotero’s export feature) is that they don’t handle special characters at all well. For instance, if an author has an accent in their name, like “Joe Bløggs”, the TeX engine doesn’t understand the special character and spews out garbage, so I have to tidy up my BibTeX file manually by escaping the special character. It would be great to have the option for the exporter to escape the special characters by generating the author’s name as (in this case) “Joe Bl\o ggs” or “Joe Bl\'{o}ggs” or whatever, depending on the character required.
Hi Ian,
Thank you for the feedback. I agree, special characters are a problem in BibTeX-formatted citations. Currently, our service replaces the most common special characters, including Greek letters, mathematical operators and a subset of all possible accents (in total ~120/900 characters). We are already working on a better solution, which will be deployed in the near future. Your suggestion to let the user decide might in fact be the right way to go. I will add it to the list of feature requests.
Thanks again!
Tom
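To illustrate the kind of escaping discussed in this thread, here is a small Python sketch that turns accented letters into LaTeX commands. It is not the replacement table doi2bib uses. The function name and both lookup tables are made up for illustration and cover only a handful of characters.

```python
import unicodedata

# Letters without a Unicode decomposition need explicit entries (illustrative subset).
SPECIAL_LETTERS = {
    "ø": r"{\o}", "Ø": r"{\O}",
    "æ": r"{\ae}", "Æ": r"{\AE}",
    "ß": r"{\ss}",
    "ł": r"{\l}", "Ł": r"{\L}",
}

# Combining marks mapped to the matching LaTeX accent command (illustrative subset).
ACCENT_COMMANDS = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis
    "\u030c": "v",   # caron
    "\u0327": "c",   # cedilla
}


def escape_for_bibtex(text: str) -> str:
    """Replace known accented letters with brace-protected LaTeX commands."""
    out = []
    # Compose first so each accented letter is a single code point.
    for char in unicodedata.normalize("NFC", text):
        if char in SPECIAL_LETTERS:
            out.append(SPECIAL_LETTERS[char])
            continue
        decomposed = unicodedata.normalize("NFD", char)
        if len(decomposed) == 2 and decomposed[1] in ACCENT_COMMANDS:
            base, mark = decomposed
            out.append("{\\%s{%s}}" % (ACCENT_COMMANDS[mark], base))
        else:
            # Unknown characters pass through unchanged.
            out.append(char)
    return "".join(out)
```

With these tables, `escape_for_bibtex("Joe Bløggs")` gives `Joe Bl{\o}ggs`. The outer braces keep BibTeX from splitting or re-casing the command. Characters outside the two tables are left alone, so a real exporter would need a much larger mapping.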
Hi Ian. If you experience any problems with the BibTex generated by CrossRef could you drop us a line at labs@crossref.org and we’ll try to fix it.
I have tested several of my papers, and in all cases conference papers are incorrectly returned as journal articles (@journal) when they should be conference papers (@INPROCEEDINGS).
The examples tested:
10.1109/AFRCON.2011.6072062
10.1109/ICTAI.2009.35
10.1109/IECON.2013.6699120
In contrast, this one was correctly detected and provides the relevant information.
10.1109/TII.2012.2219063
I will file a bug on GitHub.
Hi there,
I saw the issue on GitHub, thanks! I’m still thinking of a smart way to address it.
Thanks again,
Tom
Hi Cruz,
This looks like an error in the CrossRef metadata. I’ll forward this to CrossRef support. Please email support@crossref.org if you spot any more problems and we can try and fix the metadata.
Joe
Karl at CrossRef here. This is a problem with how we are generating bibtex. I’ve been using the Citation Style Language and their bibtex style definition which does not always produce accurate or valid bibtex.
I’m now in the process of rewriting our bibtex output to use a different library to produce the bibtex itself. I’m hoping that this will address encoding issues as well as type accuracy.
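Until the entry type is fixed upstream, one possible client-side workaround is to retag an entry once the record type is known, for example from the `type` field in the Crossref metadata. The Python sketch below is only an illustration. The mapping table and function names are my own assumptions, not what CrossRef or doi2bib actually do.

```python
import re

# Assumed mapping from Crossref work types to BibTeX entry types (illustrative subset).
CROSSREF_TO_BIBTEX = {
    "journal-article": "article",
    "proceedings-article": "inproceedings",
    "book": "book",
    "book-chapter": "incollection",
    "dissertation": "phdthesis",
    "report": "techreport",
}


def bibtex_entry_type(crossref_type: str) -> str:
    """Return the BibTeX entry type for a Crossref type, falling back to 'misc'."""
    return CROSSREF_TO_BIBTEX.get(crossref_type, "misc")


def retag_entry(bibtex: str, crossref_type: str) -> str:
    """Replace the entry type of a single BibTeX entry, leaving the fields untouched."""
    new_tag = "@" + bibtex_entry_type(crossref_type)
    # Only the first @type at the start of the string is rewritten.
    return re.sub(r"^\s*@\w+", new_tag, bibtex, count=1)
```

For a conference paper returned as a journal article, `retag_entry(entry, "proceedings-article")` would change the leading `@article` to `@inproceedings`. Field names such as `journal` versus `booktitle` would still need attention, which this sketch does not attempt.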
Thank you for this wonderful post and helpful comments. I work on knowledge flow and the data I use for my current research is a set of citation data from hundreds of focal academic papers (backward citations), which will eventually end up as thousands of observations.
It obviously can’t be done by hand, so we are writing a program that takes the name of each article from the reference section of each focal paper, searches for it in Google Scholar, and collects the BibTeX file.
However, I am hearing too much about the erroneous/irrelevant information in Google Scholar BibTex codes. Does Mendeley maintain a better collection than Google Scholar, and thus does it make sense to write our program for Mendeley (or maybe another service provider) instead? I mean I don’t really have any loyalty to Google, I’d want to collect the data with the least noise.
Any ideas?
Thank you in advance.
Hi Amir,
My suggestion would be to use the crossref API instead (http://search.crossref.org/help/api). Their data is certainly less noisy, as it is provided by the publishers directly.
Good luck,
Tom
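A rough Python sketch of how a title lookup against Crossref could be scripted is below. It assumes the current public REST endpoint `https://api.crossref.org/works` with its `query.bibliographic` and `rows` parameters, which may differ from the API described on the help page linked above. The function names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the public Crossref REST API endpoint for works.
CROSSREF_WORKS = "https://api.crossref.org/works"


def build_title_query(title: str, rows: int = 1) -> str:
    """Build a search URL that asks Crossref for the best matches to a reference title."""
    params = {"query.bibliographic": title, "rows": rows}
    return CROSSREF_WORKS + "?" + urlencode(params)


def extract_dois(payload: dict) -> list:
    """Pull the DOIs out of a decoded JSON response, skipping items without one."""
    items = payload.get("message", {}).get("items", [])
    return [item["DOI"] for item in items if "DOI" in item]
```

The DOIs returned for each reference title can then be resolved to BibTeX one by one. Matches from a free-text query are not guaranteed to be the intended paper, so it is worth comparing the returned title against the one searched for before trusting the record.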
Can one use this from the commandline? E.g.
wget http…
Hi Heiko,
In the command line you’re better off using a script such as this.
Cheers, Tom
Thanks for pointing out the script. The key is the ‘accept’ header.
You can do this using wget. E.g.
does the trick.
(Sorry for coming back to this only so late.)
Heiko, thanks for your reply! Better late than never :).
Cheers, Tom
wget http://search.bibtexsearch.com/search?q=test
Three BibTeX Tips | Nick Higham
[…] If you happen to know the DOI of a paper and want to obtain a bib entry, go to the doi2bib service and type in your DOI. For further information see this blog post. […]
Hi,
It may be a little late, but I wrote a simple script that does something similar to this. Hopefully someone finds it useful.
All the best,
Felipe
Thanks for the information. Here’s the link: https://github.com/dudektria/doi2bib.
Best, Tom
Thanks Tom!