PDF versus HTML
We asked researchers to tell us which format they prefer
By Dr. IJsbrand Jan Aalbersberg Posted on 1 July 2013
As you may know, over the past few years the Article of the Future team at Elsevier has introduced an array of article content innovations to enhance the online reading experience. More details on the most recent of these can be found in the article How to handle digital content in this edition of Editors' Update.
With so many of these innovations already live and functioning on ScienceDirect, we thought the time was ripe to poll researchers about their thoughts on the future of the scientific article format - in particular the online version (HTML) and the traditional PDF. We posed a series of questions to a 500-strong online group of researchers, whose members represent a variety of disciplines worldwide*. Their answers provided some interesting food for thought.
How we conducted the survey
The survey was divided into two stages. We began by asking participants for their thoughts on both the HTML and the PDF. We asked them to outline the pros and cons of accessing and reading articles in both formats and to say which they preferred. We also asked them to indicate which format they expected to be using in the future. Once they had completed their answers, participants entered stage two. This involved viewing a video outlining Elsevier's Article of the Future project, which focuses on enriching content in a discipline-specific manner. They were then presented with a second set of questions.
During the first stage, key insights gained included:
- Researchers favor the PDF format primarily for in-depth reading - they find the HTML format more convenient and suitable for discovering content and determining its relevance.
- Researchers feel there will be a place for both HTML and PDF formats in the future: PDF for offline access and the storage of content for later reference, HTML for online access and immediate learning and discovery.
- If we want to promote the uptake of HTML, we need to address concerns around author input, printing, storing and sharing content, and make the HTML layout customizable.
In the words of one participant: "If the article contains interactive elements, then (the) HTML version would make more sense; otherwise (the) good old PDF will be preferable."
Other participant comments included:
"There exist so many articles. And it's hard to open or download whenever I find interesting things. So it's more reasonable to read (the) HTML version first."
"I prefer to work with the printed version of the article because I don't like reading from (the) screen."
"I use articles in the HTML format because they may contain links to additional information and tools (missing in PDFs)."
"PDF version is formatted just like the article in print, I can easily navigate to the places I want."
"HTML files are ok on the screen but messy to handle downloaded. But I do sort normally on abstract or relevant info I get in the HTML environment, before I click on the PDF icon for download."
"I prefer to share the article URLs rather than sending big PDF files around."
"(The PDF) looks and feels more like a paper article. If I want to print it, I think it will look better printed from PDF. If I want to save or email it, it is easier with PDF."
Dr. IJsbrand Jan Aalbersberg, PhD, is Senior Vice President Journal & Content Technology and leads Elsevier's Content Innovation team. He was not surprised by the findings of the first stage of the research. He explained: "These replies match the ones received in earlier Article of the Future studies, which led us to develop the three-pane article view now available on ScienceDirect. The PDF-style center pane is ideal for reading the paper while the other two panes offer a series of discipline-specific presentation and content enrichments that add real value to the article. The preference for a PDF format when printing is something that the Article of the Future project is taking into consideration: we are currently looking into how we can make the center pane easy to print, while maintaining its optimized reading format."
In the second stage of the survey, participants were shown the Article of the Future video, which discusses recent improvements to article presentation, content and context and the introduction of the article three-pane view on ScienceDirect.
When asked if the video had changed their perception of the usability of the HTML format, 60% agreed it had, while 25% said it hadn't and 15% were unsure. A sample of participants' comments is shown in this table:
|60% said: 'Yes, it has changed my perception'||25% said: 'No, it has not changed my perception'|
|"I conventionally don't like html articles because of the way they are presented in the screen, nevertheless the Article of the Future is taking the traditional way to a new frontier, beyond hypertext to metacontent management by user or reader."||"This is more or less what I can see on some programming software, but applied to articles, good idea but questions remain: How will it age? How expensive to maintain? How to keep it alive and operational?"|
|"Elsevier has used the power of the internet to make sure the article is a truly dynamic, interactive and well annotated and connected scientific document."||"Logical path forward. The current problem is that not all users are equipped with appropriate technology, nor do they master it."|
|"I like the interaction with the content, and the ease of exploring other links and references without having to go back to search for them later."||"…as long as I cannot download it, it's hard to archive and I prefer reading it offline."|
|"I think there's a great deal of advantages to providing the option to publish in such a format. I would be interested to see how authors can access the tools to represent their data in these new formats."||"It offers greater interactivity - but there again a much more powerful PDF viewer (with interactive tools built in) would preserve PDF's ascendancy."|
|"… much more powerful than pdfs that I currently use."||"I didn't see how the features were relevant to articles, only handbooks and textbooks."|
More than 65% thought there would be a shift towards HTML use in the future.
We also asked participants to let us know whether they expected the way they access and use articles today to change in the future. Their responses varied:
|Don't know / not sure||5.8%|
The digital revolution has radically changed the way in which scientists carry out their research, and process and store their results. It is clear that as long as technology develops, the way scientists access and use articles will develop as well. The important question is, "How and by how much?"
Twenty years ago, the only article format was paper; 10 to 15 years ago the format became PDF; and now a new way of usage has been created by the introduction of tablets. Does that mean that we threw away paper and will now throw away desktop computers? No – we apply the format and way of use that is applicable at the moment of use. And the same will hold for PDF and online HTML; I think there is a future for both. PDF will remain the preferred format for archiving and offline use, while online HTML will increasingly become the standard for online use, as it is so much richer and in tune with the ongoing developments in the regular research process.
* The questions were posed to a community of 500 researchers. Two surveys were conducted during January 2013. The first attracted 159 responses (31.8% response rate); the second attracted 122 responses (24.4% response rate).
A similar version of this article originally appeared in Editors' Update.
After joining Elsevier in 1997, Dr. IJsbrand Jan Aalbersberg, PhD, served as Vice President of Technology at Elsevier Engineering Information (Hoboken, USA) from 1999 to 2002. As Technology Director in Elsevier Science and Technology (2002-2005), he was one of the initiators of Scopus, responsible for its publishing-technology connection, and subsequently focused on product development in Elsevier's Corporate Markets division (2006-2009). He then took on the role of Vice President Content Innovation, which he held until 2012. He is now Senior Vice President, Journal & Content Technology. In both roles, he has striven to help scientists communicate research in ways they weren't able to before. IJsbrand Jan holds a PhD in Theoretical Computer Science.