Wednesday 7 March 2012

Open access and generating senior scientist buy-in


There is a vibrant, intelligent but completely impractical debate happening right now around publishing scientific papers; just search #openaccess to see the volume of Twitter posts. The concepts are great: faster dissemination of information, a cleaner peer review process, greater collaboration.

My issue is the lack of a reality check or any way to bring it into real-world use. Science at its heart is a glacier: cold, unworried, progressing forward in an unstoppable manner. It also has the inertia of literally hundreds of years and, in general, a highly conservative set of guidelines. What needs to be fought against is the idea that peer review requires a third party outside of the scientific community. The internet and the transparency that it brings make a paid third-party watchdog unnecessary; there are plenty of folks on the internet just looking to yell "gotcha".

There is some value to the conservative mindset that works well for society and progress: the burden of proof. The conservative guidelines protect science from making too many mistakes and jumping to conclusions based on the unexplained. It's what prevents Einstein's theory from being torn down by a faulty wire. It also means that the younger generation of scientists must bring ideas to the hallowed halls of old science and prove that they will work by bringing real-world suggestions that will fix the interwoven problems of publication, attribution, grants, jobs, and tenure.

Otherwise it's just pissing in the wind and complaining. That's not my idea of what scientists do. They come up with theories, models and explanations that are then roundly torn down by their peers and rebuilt into a better product.

With that said, let's take a look at how open access and "non-publishing" publishing could work in the real world*.

First, some challenges that I see as the main drag on the process:
  1. Comparative analytics. There need to be transparent metrics to gauge the value of the research to the larger community. This can be an issue since every scientist thinks they do the most important research in the world.
  2. Control over content. This may seem trivial, but if I am the principal investigator I'm not sure I want every post-doc and grad student uploading their crappy, blurry images. The metadata surrounding who gets tagged, what gets tagged and the terminology used is vital to ensuring that the data can be reviewed by everyone who may be interested.
  3. Clear lines of permissions. What role do collaborators play in making this public? Who gets to post it? Where is it hosted? The university still has some co-ownership. This is not something that can be decided afterwards; it has implications for grants, etc.
  4. What about intellectual property? I'm all for open collaboration and sharing data with academics, but what guarantees are there that pharma and biotech will play by these rules? The beauty of public research from the grantor's (read: government's) point of view is that it maximises their investment. That is lost if [insert huge pharma company here] comes along, builds a drug based on that data and charges the public obscene amounts of money.
  5. The content that is made public has to have a finished look. This can't be just thought vomit. It must have some clarity of thought and citations. Put some thought into it! Distill it by putting it into context: why should I care? How does it support or counter the current models? Cite properly through hyperlinks. It should be peer review, not crowd-sourced editing.
  6. Control over comments. Comments can't be removed just because someone says your data quality sucks. This has to be transparent, warts and all. You have to take reviewers' suggestions of more experiments seriously. As someone who has reviewed for a variety of journals (top five to lower tier), I find nothing more frustrating than taking the time to review and having someone completely disregard it.
  7. Buy-in from senior scientists. Science is largely an oral history; the reality is that for most methods, just having the paper is useless for understanding how to do it or where the problems can arise. The insights from senior scientists who have seen it all are required for this to be truly revolutionary.
  8. Buy-in from administrators. I for one do not believe that open publishing will be any less time-consuming or any cheaper for the scientist. Someone will have to maintain the database for the content and ensure that there is enough capacity for the videos, Excel files and whatnot that will be made available. This will have to be maintained for decades with back-ups, etc. Someone will have to know where and what content there is AND when it should be taken down and replaced. Right now the university covers those costs for each department's website; it is unlikely to cover more unless there is a clear benefit, e.g. grant money, donors, clarity on which faculty members are doing well and, in a perfect world, which require more help.

Now the potential solution:

A dedicated website where each lab publishes its own data. Personally I don't believe that all data should be available, but some labs and branches of science believe that is best. Allow the system to have some flexibility; some fields are inherently more competitive and technically nuanced than others. Scientists need to retain the ability to check data quality, accuracy and potentially fundamental flaws in experimental setup as a lab group prior to making it public.

I like figshare and Creative Commons and all of the really great tools that are coming at breakneck speed. I love the idea of posting all of the data, but I truly believe that my job as a scientist is to analyse the data, not just find/acquire it to send to others. If open access publication does not include this, it will set back cancer research and other highly nuanced fields in the biomedical sciences by years. These fields have moved at breakneck speed because such a premium is placed on succinct analysis by the publishers. While I do not believe publishers should be the gatekeepers, I do not want to lose the analysis of data for the sake of speed of publication.

The solution: most labs have university owned/managed websites where the Principal Investigator (aka PI, the professor, i.e. the person whose tuckus is on the line) owns the admin rights. These should become more of a real-world home for the publication and sharing of lab data. The infrastructure already exists, and some labs do a great job of updating it with content as it gets published. Everyone else needs to get on board and bring it into the modern age with appropriate tagging and labels to ensure that it can be found through search engines.

PIs need to retain control; they are the ones who will be held accountable if the "published" result that shows a new cure for cancer turns out to be a contaminant. The pervasive nature of the Internet means that the media have access to the data and need to hype themselves as getting the most interesting story. "Never let the truth get in the way of a good story" is a truism for more and more journalists. Accidents happen, and while I'm alright with being embarrassed by my peers figuring it out, I'm not alright with it spinning into a worldwide story and having to explain to the public that it was an "oops".

Main point: Use the established website, where each PI has admin rights to remove content. Give senior lab members the ability to publish concise analysis with appropriate figures and links to the whole data set with metadata and clear descriptions. As part of their mentorship, new post-docs and graduate students should be trained on what is considered an acceptable level of proof prior to making data public. Laboratories are still training grounds. As someone who has trained students and post-docs, I know the peer review process gives young scientists looking to move up to the majors an idea of what good science is; this cannot be lost by opening up publication. I love the citizen science movement but, at the end of the day, like anything, there has always been a difference between someone who does something as a hobby and someone who has the discipline, passion and willpower to dedicate their life to a subject. Training and the culture of science have to be part of it: science is about justifying your opinion and the quality of your data.

Peer review isn't broken; the publishing models are. Let's not throw the baby out with the bath water. The university-controlled site allows for clear rules of engagement for pharma and media, and allows for the level of control that PIs and chairs need to ensure that crappy data doesn't spiral out of control into a scandal. The departmental chair and grant study groups can look at the metrics (website views, re-links, etc.) to allow flexibility in the systems for review, whether it be for tenure, grants or something else that no one has thought of. Open access will be a failure if it does not give something to everyone involved with the industry (yes, it's an industry! get over it, people). It's not perfect but it can be piloted in a way that senior scientists from the core Cell, Nature and Science author pool can at least talk about. I think that many of the ideas that are being bandied about are much better than this for science as a whole and ultimately will be the long-term solution.

That being said, I have yet to see an idea that any of the top-level cancer, stem cell, etc. scientists will buy into. This is not a group to dismiss; they may not be the majority but they represent the main reason why scientists will not give up on Elsevier or any other for-profit publisher. They also are the presidents or senior leadership of some of the most influential universities in the world (Caltech, Rockefeller U., Memorial Sloan-Kettering, Max-Planck, etc.). As a final reason to get them on board, they are also on the grant study groups and involved in a variety of other activities that affect all levels of, at least, the biomedical sciences.

At the end of the day, the risk posed by publishing incorrect data needs to be balanced with greater access and conversation about what can be done next. Please comment as you see appropriate.


*Disclaimer: I only have experience with a limited number of institutes (eight), so I do not know if any of this is widely applicable. I have no idea if the issues that I am bringing up are universal or limited to the institutes that I have worked at.