Field of Science

Some Thoughts on How to Make Electronic Publication Work



"There is a cube of crystal here - though I can no longer tell you where -no larger than the ball of your thumb that contains more books than the library itself does. Though a harlot might dangle it from one ear for an ornament, there are not volumes enough in the world to counterweight the other."
--Gene Wolfe, The Shadow of the Torturer.


The ICZN is currently debating amendments to the Zoological Code that will formally accept the validity of names published in electronic publications (ICZN, 2008; if you're not familiar with the subject, I'd recommend reading the post just linked to before this one). Electronic publication raises a host of issues related to such matters as long-term availability and accessibility, but even an old curmudgeon like myself has to admit that it's gonna happen, whether we like it or not - in fact, it's already happening - and the question of whether or not to accept it has become more or less moot. The question, rather, is how to best respond to it.

I do have to apologise in advance for a few things I'm going to quote here without attribution that I know I've heard someone say somewhere, but I can't remember where or who. This post was partially inspired by Taylor (2009; that's Mike Taylor of SV-POW! fame, not me) and its rather snotty little title, but I wouldn't call it a direct reply. I'll also note that one of the interesting side effects of the debate on how to handle electronic publication is that it provides a much-needed impetus to tackle some of the neglected questions of how we handle printed publications.

Many of the complaints about electronic publication revolve around permanence. Supporters of electronic formats point out (and correctly so) that on-line publications* are both more readily and widely distributable than printed publications. However, for the purpose of taxonomy, we need to be thinking not only about current distribution, but also future distribution. Preservation of taxonomic works, theoretically, needs to be permanent (in the long term, of course, this is a problem for both printed and electronic works). It has been claimed that the large number of electronic copies floating around on people's private computers provides a guard against loss of an electronic publication, but this is not a sufficient guard in the majority of cases, because of the simple fact that not all organisms garner the same degree of attention. Taylor (2009) refers to the public interest around the publication of Darwinius masillae Franzen et al., 2009 - I mean, come on, it's a bloody monkey with its own bloody TV documentary, of course it raises a lot of interest. But taxonomy doesn't only concern fossil monkeys or honking great lizards, which are the other cases Mike cites. Yochelson (1969), working on Palaeozoic molluscs, estimated that the species descriptions he wrote were "read by a worldwide audience that ranges from three to seven persons"**. This is not an audience that guarantees preservation. Also, there is a potential generational issue - when a researcher passes away, her copies of printed publications may be donated to a library and archived, but her hard-drive is likely to be thrown away or wiped.

*Including on-line versions of printed publications, which in a significant proportion of cases have become the effective primary version.

**Yochelson (1969) is a rather interesting publication for the current debate because he wrote it arguing that the ICZN should accept publications on microfilm. The more things change...



However, while permanent storage may be more of an issue for electronic media than printed media, it's one that is already being addressed, with the current proposals already including requirements for archival that, while arguably not perfect, are perhaps good enough for government work. The really major problem is not long-term availability, but long-term accessibility. To read a printed publication, I have simply to use my eyes, and for most of us they come pre-installed*. Reading an electronic publication, however, requires appropriate machinery and software. Claims that there will "always be a way to read PDF" are just hopelessly naïve, as are claims that we will be able to rely on conversion of electronic publications into newer formats as they gain prominence. Media will be converted if there is an immediate demand for their conversion, and the longer a given publication goes without being converted, the lower the chance that it will ever be converted (I'm pretty sure that a significant number of movies once available on VHS have not become available on DVD and probably never will). In the quote at the top of this post from a book set in our world's far-distant future, the librarian Ultan refers to what is obviously some sort of electronic storage device. What he fails to mention is that the technology to read the material stored on it no longer exists.

*Of course, it has to be in a language I can read, but that issue applies equally to electronic and printed media.

So electronic publications are not sufficiently reliable, but electronic publications are already here and only going to become more predominant. Are we doomed to confusion then? No, because I don't think that the means of evading the issues are really that difficult. The current code allows for the effective publication of electronic media if a permanent copy is deposited in a number of libraries on CD. The proposals in front of the ICZN include phasing out CD publication, and rightly so in my opinion - CD publication carries all the issues of accessibility associated with electronic media, without any of the advantages of online publication. But why not request the deposition of print-outs? That both allows for the validity of the electronic publication (with all its advantages of disseminability), while still maintaining the printed counterpart as a back-up.

Taylor (2009) points out that the current requirements for back-up deposition are vague and difficult to comply with. I would definitely agree that, if nothing else, they lead to something of a logical paradox in their current requirement that the publication itself include details of its own depositories. A simple reprint of the current requirements with "printed copy" substituted for CD would not help matters. But what about ZooBank? ZooBank is the proposed register of zoological names currently being developed. The current code does not require registration of names, but it is being considered and the electronic publication proposals include compulsory registration for electronically-published names at least (though as pointed out in the comments to the post linked to at the top of this one, if they're going to make it compulsory for some they might as well make it compulsory for all). A separate proposal that would make registration compulsory for all new names (Polaszek et al., 2005) allows a window of two years between publication and registration for a name to be validated*. Perhaps ZooBank could be expanded to also hold listings of archives of printed copies for electronic publications with, again, a two-year window to allow the authors/publishers to arrange the depositories and submit the info. Heck, the same thing could be done for names published in rare printed publications such as the journal Lansania that I've discussed previously.

*And it was this simple point that made yours truly no longer an avid opponent of registration. In a workable system, registration would have to come after publication, because the potential consequences of a name being published but not registered afterwards are minor compared to those of a name being registered but not published.

One potential complaint is that the printed version may not be identical to the electronic version. This would be particularly unavoidable if the electronic version includes such features as video that are not reproducible on paper. But for the purposes of taxonomy, what does the permanent record actually need? Name, type material, description/diagnosis. If the entire publication cannot be stored in printed form, then maybe allowances could be made for an abridged version containing the vital points to be deposited in its stead. Cantino et al. (2007) provide something of a precedent in botanical nomenclature (which currently excludes electronic publication) - a shorter, printed version of this work on plant phylogenetic nomenclature was published containing the essentials of definitions and such, while a longer electronic version allowed for more expansive discussion. So long as the two are not in direct conflict, the existence of two versions does not actually pose a problem for nomenclatural purposes. For those wishing to guard against potential issues, perhaps a clause could be included in the code giving one of the versions priority over the other - I'd suggest something to the effect of the electronic version having priority so long as it remains available and accessible, because that's the version people are more likely to have access to in such cases.

Discuss.

REFERENCES

Cantino, P. D., J. A. Doyle, S. W. Graham, W. S. Judd, R. G. Olmstead, D. E. Soltis, P. S. Soltis & M. J. Donoghue. 2007. Towards a phylogenetic nomenclature of Tracheophyta. Taxon 56(3): E1-E44.

Franzen, J. L., P. D. Gingerich, J. Habersetzer, J. H. Hurum, W. von Koenigswald & B. H. Smith. 2009. Complete primate skeleton from the Middle Eocene of Messel in Germany: morphology and paleobiology. PLoS One 4 (5): e5723.

International Commission on Zoological Nomenclature. 2008. Proposed amendment of the International Code of Zoological Nomenclature to expand and refine methods of publication. Zootaxa 1908: 57-67.

Polaszek, A., M. Alonso-Zarazaga, P. Bouchet, D. J. Brothers, N. Evenhuis, F.-T. Krell, C. H. C. Lyal, A. Minelli, R. L. Pyle, N. J. Robinson, F. C. Thompson & J. van Tol. 2005. ZooBank: the open-access register for zoological taxonomy: Technical Discussion Paper. Bulletin of Zoological Nomenclature 62 (4).

Taylor, M. P. 2009. Electronic publication of nomenclatural acts is inevitable, and will be accepted by the taxonomic community with or without the endorsement of the Code. Bulletin of Zoological Nomenclature 66 (3): 205-214.

Yochelson, E. L. 1969. Publication, microfilm, microcard, microfiche, and the International Code of Zoological Nomenclature. Systematic Zoology 18 (4): 476-480.

17 comments:

  1. Some really great thoughts in your recent post. Thanks for promoting the idea of electronic publication.

    The Biodiversity Heritage Library has posted a link to your post on
    "Some Thoughts on How to Make Electronic Publication Work".

    http://twitter.com/BioDivLibrary

  2. "Snotty"?

    Explicit, I think, is the word you wanted. I like to title my papers so that truly lazy people can read only the title and skip the rest :-)

    Oh, and: "just hopelessly naive" is perhaps not the very most carefully argued rebuttal I've ever seen. My BZN paper talks specifically about the objections typically raised to PDF and why none of them is valid any more (though some used to be). I might be wrong, but if you want to persuade me of that you're going to have to do better than "just hopelessly naive"

  3. We have had technology turnovers in the past. Many of them could not have been predicted much prior to their occurrence. I very much doubt that, in any of them, the data conversion has been total.

    It's "naive" because precedent argues otherwise.

  4. There IS no precedent for abundant zero-cost copies distributed (also at no cost) globally.

  5. But as I said in the post, the "abundant copies" side of things is simply not an assumption that we can make.

    When I was working on _Pantopsalis_, some of the taxa I was looking at had not been referred to in print for over fifty years. It is not impossible that I was the first person looking for their original descriptions in all that time (and everyone who had previously referred to them was long dead). If those descriptions had been stored in a software-dependent format, and a technology changeover had happened in that time, there would have been a non-negligible possibility that those descriptions would have been lost or rendered inaccessible.

  6. You make a good point. This is an argument for mandatory archiving and registration.

    Or for working on a more interesting group :-)

  7. We can never assume abundant copies, but increasingly libraries (traditional repositories for print) are creating digital repositories. There is still a bit of a way to go before the DR will be as robust and dispersed as print collections, but it is on the horizon.

    The Smithsonian's library is a "deposit" library for print; for the past few years, we've built a DR (http://research.si.edu/) that hosts (currently) Smithsonian staff publications. Working with the Biodiversity Heritage Library project, we hope to expand the ability of natural history, botanical, taxonomic libraries to become abundant, diverse, storage for digital taxonomic work ("lots of copies keeps stuff safe", LOCKSS in the DR parlance).

    The US government has had a similar "depository" system in place for 60 or so years and is struggling with similar issues related to new digital publications (same for copyright depositories in the US and UK - the two areas I'm familiar with).

    Big issues all, but I think the problems outlined above (and in the comments) can be solved through trans-disciplinary work.

  8. Timely and thought-provoking post, Christopher. When reading it, I found myself nodding in agreement most of the time. I have no real objections, only a few comments.

    "CD publication carries all the issues of accessibility associated with electronic media, without any of the advantages of online publication"

    Not to mention that CDs may physically degrade over time (sometimes shockingly fast).

    A more general point: your post is mainly concerned with the long-term electronic availability (or lack thereof) of taxonomically relevant material, but this whole issue pertains to all other electronic-only scientific literature as well. Currently, and increasingly, a lot of science communication and commentary take place on blogs. Do you have any thoughts on the staying power and long-term availability of these? Do you think people will be able to read what's currently posted on, say, Catalogue of Organisms or on SV-POW ten years, twenty years, fifty years from now? Will science-related Internet blog posts be the kind of material people will want to save and convert over to whatever will eventually replace the current Internet?


    And Mike, maybe I’m misunderstanding what you mean but this can’t be right:

    "There IS no precedent for abundant zero-cost copies distributed (also at no cost) globally."

    No cost?? What about your university/institution who pays for the access to most of the journals? What about costs for maintaining the web sites that host the journals, including the open-access ones? What about the electricity it takes to run the computer you're reading the PDF on? What about your computer/laptop itself; surely it wasn’t assembled for free and then donated to you for free? Electronic-only publishing might be cheaper overall than traditional publishing, but it sure as heck isn’t free. In this world, somebody always pays along the way; if it's not you, it's somebody else.

  9. Dartian, you're right that this has implications beyond taxonomy, but I suspect that the issues are more pressing for taxonomy than other fields because taxonomy is actually directly dependent on having that historical record. Other fields, while they may find them useful, are not actually brought to a halt by their absence.

    And I think you have misunderstood Mike - what he was referring to was basically the process of sending a copy from one computer to another alone, for which (while it's not entirely free) the cost is pretty minimal (not to mention the point that, unlike with sending a paper copy, the gift of an electronic copy does not mean the giver no longer has a copy themselves). However, my point above was that, while this can theoretically lead to an endless number of copies, that does not necessarily mean that it will.

  10. "And I think you have misunderstood Mike - what he was referring to was basically the process of sending a copy from one computer to another alone"

    Fair enough (although I do see the file-sharing part as a relatively - emphasis on 'relatively' - trivial aspect of the digital revolution). But I'd still like to point out that free electronic file sharing won't do much good for someone who can't afford the hardware or the software to read those files with. That may not be a pressing issue for most Westerners, but it often is to someone in, say, Africa. (No, in the long term that's not really a valid argument against switching to electronic publication and I know it. But it’s good to keep in mind that the digital revolution does not advance at the same pace all over the world, and that economic poverty often strongly contributes to that state of affairs.)

  11. The technology divide used to be a problem I saw with compulsory registration as well. However, I think it might have been Mike Taylor himself who pointed out to me when I raised the issue on the DML a number of years back* that the current system is equally disadvantageous to researchers in disadvantaged countries.

    *Back when issues like that still got raised on the DML. *Sigh*.

  12. Because they don't have access to the printed resources either, that is.

  13. Dartian,

    Sorry, I didn't make my point clear. Christopher is correct: I wasn't arguing that the whole process of publishing is zero-cost, but that once a PDF exists, the marginal cost of making copies is so close to zero it can't be measured. And that situation -- the making of copies is free, instantaneous and unlimited by geography -- is what's without precedent.

    And, yes, there are plenty of places that can't get electronic publication in electronic form. This is why we have printers. An electronically published paper is trivial to convert to paper; by contrast, something published on paper can only be converted to digital form via the relatively laborious process of scanning, cropping, cleaning, OCRing and manual error correction.

    I bet that PLoS articles are more widely distributed in the third world than articles from Elsevier journals!

  14. This website is a fascinating use of electronic publication itself. Web-based publication is one option of course. Content Management Systems (CMS) like Drupal keep content separate from display, so your data will remain the same, but how it's displayed can be updated over time. I use Drupal as an example because I've seen groups working on standard practices for organizing and storing scientific information within that CMS. The separation of content from display is now standard practice for accessibility. It lets the content be displayed as the reader requires, and makes updates easier.

  15. This comment comes a little late (sorry about that). I was troubled by what Mike wrote:

    "I bet that PLoS articles are more widely distributed in the third world than articles from Elsevier journals!"

    That may or may not be the case (I know of no actual data). But 'more widely distributed' does not necessarily mean that those articles will end up getting any more citations than non-open access articles.

    Davis et al. (2008) found that open access articles were indeed much more frequently downloaded than subscription access articles. However, 'article downloads' do not always equate with 'readership', and still less with 'ending up being cited'. It turned out that open access articles were not more frequently cited than subscription access articles.

    In other words, the literature's open access status does not seem to affect citation practices. (In an earlier review, Craig et al. (2007) reached a similar conclusion.) And the number of actual citations, for better or for worse, is still what truly matters if one tries to quantify scientific impact.

    References:

    Craig, I.D., Plume, A.M., McVeigh, M.E., Pringle, J. & Amin, M. 2007. Do open access articles have greater citation impact? A critical review of the literature. Journal of Informetrics 1, 239-248.

    Davis, P.M., Lewenstein, B.V., Simon, D.H., Booth, J.G. & Connolly, M.J.L. 2008. Open access publishing, article downloads, and citations: randomised controlled trial. British Medical Journal 337:a568, doi:10.1136/bmj.a568

  16. Clarification to my previous message, lest it's thought that I'm against open access in general: I'm not (though I do have some reservations). I'm only pointing out that the existing data, few and preliminary as they are, suggest that one can't argue along the lines of 'If the publication isn't open access, other researchers won't cite it'. Clearly they will, so other arguments are needed.

    (Oh, and an apology for the messed-up lines; I don't know why that happened.)

  17. The download vs. citation discussion is interesting (and controversial).

    Which has more impact? Something that has been read 1,000 times and cited 20 times or cited 40 times and read 100 times?

    The whole field of impact and citation analysis is still more an art than a science.

    Are open access publications cited less because they do not have high impact factors (as measured by conventional methods) or because readers are publishing in non-traditional locations?

    What are the economic factors for commercial publishers to maximize distribution? Marginal costs for distribution of the second copy approach zero in the digital environment; commercial publishers, however, will receive increasingly smaller returns as they expand the subscription base of a title.

    Further Reading
    The Economics and Structure of Scholarly Publishing (Jevin West, University of Washington and eigenfactor.org, April 17, 2009)
    http://www.bioone.org/userimages/ContentEditor/1240257471168/West_Presentation_PPM09.pdf

    Usage Factor vs. Impact Factor: A Case Study with BioOne Journals (Jason Price, Claremont University Consortium, April 17, 2009)
    http://www.bioone.org/userimages/ContentEditor/1240570493132/Price_Presentation_PPM09.pdf

