Tuesday, 8 April 2014

Clinical data random information

I've become an information hoarder. As I spend more time thinking about Information Management and speeding the move to better technical systems, I am amazed at how general the principles of design are across different industries.

Here is a noob's (i.e. my) "plain spoken" understanding of a key term in managing patient data across hospitals, for predictive analytics and for personal health decision making.

Level setting (i.e. the general definition of clinical data warehousing): a clinical data warehouse (CDW) is an integrated, historically archived collection of data organized by patient identifier.
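
To make that definition concrete, here is a minimal sketch of the idea in Python using the built-in sqlite3 module. Everything in it is an assumption for illustration: the table name `patient_fact`, the source system names `emr_a` and `emr_b`, and the sample values are made up and do not come from any real CDW product. It only shows the three properties in the definition: rows keyed by patient identifier, data from more than one source in one place, and history kept rather than overwritten.

```python
import sqlite3


def build_warehouse(source_rows):
    """Load rows from several source systems into one patient-keyed table.

    source_rows: iterable of
        (patient_id, source_system, recorded_on, fact, value)
    All names are hypothetical, chosen only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE patient_fact (
               patient_id    TEXT NOT NULL,  -- organized by patient identifier
               source_system TEXT NOT NULL,  -- integrated: many systems, one table
               recorded_on   TEXT NOT NULL,  -- historical: every reading is kept
               fact          TEXT NOT NULL,
               value         REAL NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO patient_fact VALUES (?, ?, ?, ?, ?)", source_rows
    )
    conn.commit()
    return conn


def patient_history(conn, patient_id, fact):
    """Return one patient's readings for a fact, oldest first."""
    cur = conn.execute(
        "SELECT recorded_on, source_system, value FROM patient_fact "
        "WHERE patient_id = ? AND fact = ? ORDER BY recorded_on",
        (patient_id, fact),
    )
    return cur.fetchall()


# Made-up sample data from two hypothetical EMR systems.
rows = [
    ("P001", "emr_a", "2012-03-01", "bmi", 29.5),
    ("P001", "emr_b", "2013-06-15", "bmi", 31.2),
    ("P002", "emr_a", "2013-01-10", "bmi", 24.0),
]
warehouse = build_warehouse(rows)
history = patient_history(warehouse, "P001", "bmi")
```

A real warehouse would add patient matching across systems, coding standards and access controls; the sketch leaves all of that out.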

For the most part the purpose of a CDW is to serve as a database that hospitals and healthcare workers can analyze to make informed decisions, both on individual patient care and on forecasting where a hospital’s patient population is going to need greater care (e.g. patients are showing up as obese; therefore specific hospital programs to fight diabetes are a good idea).

Data warehousing in healthcare is also useful in preparing for both full ICD-10 and meaningful use implementation. For example, McKesson, through its Enterprise Intelligence module, probably has plenty of CDW management capabilities for those only interested in meeting the upcoming ICD-10 and meaningful use deadlines. These kinds of worries apply only to US hospitals. However, since Canada requires ICD-10 compliance for all EMR systems, this does present a benefit to Canadian healthcare.

In principle, since data warehousing at its core is about building a relational database, it should be EMR-supplier agnostic. Since McKesson is an ICD-10 and meaningful use-ready supplier, the database itself should conform to standards that would allow general solutions to be used. This article goes through some of the potential benefits and pain points. It is tailored to clinical trials, but the underlying message, that building a CDW is an ongoing procedure, is the same for other uses.

One example of how this may be done is Stanford’s STRIDE; they used the HL7 Reference Information Model to combine their Cerner and Epic databases. This is part of a larger open-source project that may be an option if an organization has some development expertise.

Since the main users of CDWs tend to be the people doing the analysis (current buzzwords to search for analytics include: BI, predictive analytics, enterprise planning, etc.), it is probably useful for Health IT professionals to understand WHO and WHAT the CDW is for within the organization...i.e. have a full-blown Information Governance plan that places a value on information, not just a risk assessment.

Friday, 28 March 2014

Security without usability isn't better healthcare

I spend a lot of my time understanding how information is stored, accessed and protected as part of my role as an IT analyst. I am always astounded at how little of what is standard practice in many industries has filtered over to health care and/or life sciences (Pharma+Biotech+academia).

The recent hubbub about the ACA (AKA Obamacare) has completely yelled over the real transformation opportunity in healthcare. Up until the recent deadlines and political fights regarding the ACA, "everyone" was really concerned about meaningful use. The TL;DR version of the MU legislation is this: make information available to care providers and patients.

So what are we really talking about here? It is really pretty simple: it is information management and the processes that guard against misuse while enabling productivity.

Let's be honest: the EHR/EMR solutions implemented at most organizations do not enable productivity or protect information. Doctors hate them because they do not fit their work patterns (see here), hospitals have significant issues with data protection (see here) and, importantly, they are not mitigating the biggest risk to patient outcomes (and hospital liability) (see here).

It is time to re-think the information silos in healthcare.

So if a single poorly accessed EHR is not the answer, what is?

I would argue that we need to think about this based on information flow and how we expect the value to be delivered. In this case patient care.

An interesting model to think about is the Canadian delivery model. For example, Ontario E-health has determined it is neither cost effective nor timely to build a single system for every hospital. At the moment, 70% of all physician practices and hospitals already have some sort of EHR system in place. So rip and replace is not an option; the reality is we need to make lemonade.

Since Ontario funds the hospitals through direct allocation of tax revenue, it is loath to flush that money down the drain.

Therefore the best approach is to control the data itself (including digital images, prescription history, surgery, etc.) and let the individual hospitals control how they view and use the data.

In other words: make it easier to access information based on who you are and what you need the information for!

Focus on the Information exchange layer

Consolidated Information Management layout for Patient care focus. 
So how do we do this without moving to brand new systems and shiny new toys?

The same way every other industry is doing it, especially low-margin, high-risk industries such as oil and gas, insurance and manufacturing. Keep the clunky but very secure system and take advantage of the new technologies that enable information sharing. Instead of an all-in-one solution, add an ECM or portal to manage rights, search and presentation. It will be more cost-effective than doing nothing or rip and replace.

This structure controls movement and access to patient data, allowing for quick access to the appropriate information based on job and location.  It provides a structure that takes advantage of the current investment in a secure database yet provides a flexible layer that is designed to convey information in context for end users. 
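
As a rough illustration of "access based on job", here is a small Python sketch of a sharing layer that sits in front of the existing records and decides what each role gets to see. The role names, record kinds and the `ROLE_VIEWS` mapping are all hypothetical; a real ECM or portal would pull these rules from its own rights management, and would also factor in location, which this sketch leaves out.

```python
from dataclasses import dataclass

# Hypothetical mapping of job role to the record kinds that role may view.
ROLE_VIEWS = {
    "physician": {"prescription", "imaging", "surgery"},
    "pharmacist": {"prescription"},
    "billing_clerk": set(),
}


@dataclass(frozen=True)
class Record:
    patient_id: str
    kind: str      # e.g. "prescription", "imaging", "surgery"
    summary: str


def visible_records(records, role):
    """Return only the records that the given role is allowed to see.

    Unknown roles get nothing, so the layer fails closed.
    """
    allowed = ROLE_VIEWS.get(role, set())
    return [r for r in records if r.kind in allowed]


# Made-up records standing in for what the secure back-end system holds.
records = [
    Record("P001", "prescription", "metformin 500 mg"),
    Record("P001", "imaging", "chest x-ray"),
]
pharmacist_view = visible_records(records, "pharmacist")
physician_view = visible_records(records, "physician")
```

The point of the design is that the underlying store does not change; only the thin layer that answers "who are you and what do you need" does.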

This may not be the best system or the system that you would design from scratch with an unlimited budget, but it gives long-term flexibility AND doesn't require a rip and replace of your current EMR/EHR. It should provide very good, highly usable healthcare at a reasonable cost.

The way they are going about the change may not be splashy but it will work for both patients and doctors- that’s a great thing. The one thing it won’t fix is the doctors who refuse to use it-and that is a bad thing.

There is additional cost involved in this model, but if the doctors and nurses do not use what you have now.....would salvaging that investment be better?

I'd love any comments or critique of the model.

Saturday, 1 February 2014

Big data is just a euphemism for lazy and cheap

Maybe I'm getting cantankerous, but I'm really over all of the talk about big data and how it is going to revolutionize the world. Businesses are going to be so efficient they will only need a CEO and a lowly marketing guy. Governments will be so efficient taxes will be almost unnecessary.

Enough! The reality is that big data isn't new and most organizations are not mature enough or focused enough to take advantage of the new technology. 

Learn the lessons of the past.
I was (am) a scientist. I did my Ph.D. in neuroscience and genetics back when sequencing a single gene took months. For reference, the bleeding edge technologies can deliver a whole genome (about 20 thousand genes) in 15 minutes.

I have already complained about the challenges in knowledge management in science - and the parallels in businesses today - in this blog. I'll summarize: businesses suck at getting the right information to workers because they are cheap and lazy.

No one wants to pay to do it right; everyone thinks that the app should be cheap and should reduce labor cost by reducing the need to hire smart people.

Well folks, organizing and analyzing data/information is hard and takes a deep understanding of the difference between junk and INFORMATION.

The original Big data problem
Scientists have always generated large, complex data sets that are almost too difficult to comprehend.
As we enter the genomics era in science it has gotten worse because most scientists have not taken the time to do quality control on the information that they submit to public databases. The public data is very spotty at best; how many scientists can honestly say that they trust the gene ontology notes?

N.B. For non-scientists the Gene ontology database is a repository of notes, data, or published papers about our combined knowledge of each gene's function, interactions and chemical inhibitors. It contains links across species and across several databases.

The problem is that it is incomplete. NLM/NIH does not have the money to maintain it - nor do any of the primary owners. The pace of growth is too much for the curators to keep up with. The number of different sources has also grown: you have images, gene expression studies, drug testing, protein interaction maps.

Science has had a big data problem since before computers. How has the scientific community moved forward and had success even in the face of such poor data stewardship?

People.

Anyone who gets through a Ph.D. has a great analytical mind. They can see through poor quality data to those nuggets of truth. How do they do this? They focus on finding an answer to a question, and then they build out from that question until they have built a complex multifaceted answer.

You want to know why science is becoming stagnant and has serious ethical and just plain stupid errors of reproducibility?

We do not train scientists to be critical and form questions. We teach them to get a whole lot of data and mold it into a beautiful story. The logic being that if you look at enough data the truth will come out. It never does; if you start out with biased data you will get a biased answer. The data sets are inherently flawed.

There is no big data, only poorly framed questions. If you have a big data problem it is because you have been a poor data steward and you don't have a question, so you have no ability to start sifting through information.

There has always been a lot of information; it is just that we trained people to work with it, understand it, analyze it and make decisions. More importantly we understood that failure was a good thing: it is a chance to define the question and focus on things that will work.

A lesson not learned
There is no such thing as big data, just better storage of the vast amounts of information that life generates. Nothing has really changed; it is just that the problem is more visible - and we downsized all of the keepers of the knowledge. Most organizations - healthcare and Pharma being the key culprits - refuse to train people to think critically and scrutinize the veracity and quality of information/data.

You want to fix the big data problem? Train people to ask questions and let them answer the question. Or hire someone well trained already, such as the overstocked "bioinformatics Ph.D" class of scientists. The bottom line is that the new shiny system is still going to give you crap data if the person asking the question can't ask good and insightful questions.

Realize that autocorrect is the state of the art in predictive analytics, right?......let that sink in for a minute. Are you willing to leave your career or company to this?

You don't need more data; you need the right data and the time and confidence to fully vet the quality of the data. We need people who understand today to test how well that information fits with the world today. This is a key element of accurate predictions.

In biomedical sciences this really comes down to how we train graduate students: do we make them learn statistics or just hope that Excel is good enough? Are we willing to mentor students or are they just cheap labor for the gratification of the professor? Do we pay attention to how we store and manage information so that the next student can find it?

For most businesses it comes down to why. Is there a business question that we need to solve? What is the problem that we need to fix? Is there a new source of revenue that we can exploit? What are our past failures and what can we learn from them?

Tuesday, 21 January 2014

Twenty skills that I -or any Ph.D- has that are in demand

A while ago Christopher Buddle posted a blog on SciLogs about what you needed to know before becoming a professor. Many of those skills are the ones in demand outside of academia. 

It got me thinking generally what skills I have amassed over a Ph.D, Post-doc and faculty position. For any other "recovering scientists" reading this please feel free to steal this list, add to it or perfect it. Any comments or critique would be welcome. 
  1. Project management- over my academic career I managed to publish several papers in top journals. Some required precise planning of tasks and experiments on short deadlines against competition. This requires ensuring that each set of experiments finishes with a high quality deliverable.
  2. Human resources- as a professor I had to hire, fire and develop staff. This included students and early career professionals, where you are balancing what they are capable of today with their career goals. I picked projects for them that matched their skills.
  3. Project planning- a PhD is a set of projects, that need to be planned out, with a full timeline, deliverables and costs set out. In addition a key part of a successful PhD or post-doc is knowing when to kill a project.
  4. Stakeholder relationship- each stage of a PhD requires you to set out goals with your faculty advisory committee. These people will provide guidance and advice for where you should spend your time. Part of success is ensuring that you cogently show progress toward each member's idea of your success. The stakes get higher as you move to a post-doc, where you are expected to manage the project and manage the expectations of your boss.
  5. Budget building- as a professor I needed to build RFPs, prioritize purchases based on project needs as well as the long term strategy of the lab, source infrastructure, manage vendors and raise funds.
  6. Publications- part of a scientist's job is to communicate results to the community. This includes typical writing skills but also graphic design, matching the presentation visualizations to the message and audience.
  7. Data management- all aspects of data management, including ensuring high quality data recording and metadata, database design considerations, building database queries and integrating public and owned data into a complete set.
  8. Analytics- a key part of my PhD was defining how to quantitate behavior and images. This requires a clear analytic method that allows reproducibility through a clear, logical rubric for scoring purposes.
  9. Web based research-not just the query but also the decision on good sources and bad ones.
  10. Public speaking- I have given hundreds of lectures to all sizes of groups, both lay groups and expert groups. This gives me a large set of tools to fall back on for presentation design.
  11. Individual drive- to do a PhD you need an internal drive to do what must be done.
  12. Intellectual flexibility- as part of my PhD I learned at least 12 different technical skills at a high enough level to use them in peer reviewed publications and teach them to others. I learned these through reading and just doing; I didn't need to be walked through them multiple times.
  13. Records management- my laboratory worked in a high demand, high competition environment. We needed to have all experiments documented in a way that would stand up to legal review and could be used as part of a patent process.
  14. Understanding of several healthcare related regulations- part of my work was related to drug discovery and some of it was in collaboration with clinicians, meaning that we ensured that all documents and protocols met the required standards.
  15. Graphic design- genetics is a hard area to explain without pictures. I designed many successful visualizations using Photoshop, PowerPoint and old matte photography techniques.
  16. Process design- my laboratory was at the bleeding edge of genetics. This meant that we were constantly building new processes and testing resources that would be best for that process.
  17. Process optimization- due to the unique methods we constantly needed to set production standards and build analytics that allowed us to evaluate and optimize process and make changes that reduced cost and increased reproducibility and accuracy.
  18. Contract negotiations- as part of my job, I have negotiated service contracts and terms of employment.
  19. Fund raising- academic labs are always looking for new sources of funding and interacting with potential investors/funders.
  20. Strategic product planning- a key part of success is understanding where government priorities are now and over the next five years in order to develop a funding strategy. Successful scientists also have an understanding of the competitive landscape and position their employees and infrastructure to keep up.

Friday, 18 October 2013

Let's focus on the actual science, not media fluff

Enough already with all of the articles and blogs about how "epigenetics" is the cause of aggression or socio-economic disparity.

Epigenetic modifications to the genome are not more important than genes; they are not separate from genes. They are how we regulate genes. Genes are not binary - they are not simply on or off. They are used at certain levels for certain tasks ("grow an arm" will use the same genes as "grow a heart" but in much different dosages).

Epigenetics is akin to a thermostat. You wouldn't blame the thermostat for causing winter, would you? No, you use the thermostat to respond to winter.

Epigenetics is the same! It is the control mechanism that the body uses to respond to the environment. In this case the environment is EVERYTHING outside the nucleus of a single cell. Yes, everything: cell signals, hormones, hunger, emotions, temperature, toxins - everything.

Understanding epigenetics is like particle physics: we can be statistically certain but we can NOT be definitive about the role of any single epigenetic modification in a disease state or trait inheritance.

It is mind-boggling how complex the potential role of epigenetics is in any disease. We do not even understand how it works at the single cell level, and we have people suggesting that "epigenetics" explains complex traits just because nothing else has explained that trait?!.....it's frustrating. Epigenetics is not magic; it is at least 20 different types of gene regulation. That is all it is...it's boring fundamental science.

I get it; it's hard to explain epigenetics, but we are reaching Fox News levels of truthiness with some of the blogs and "news" about epigenetics. We, as the educated science community, need to hold ourselves to a higher standard. Epigenetics is part of everyday life: differences in twins, calico cats, zebra stripes. Let's appropriately educate the public using everyday examples and then go deeper. I have found people are more excited by the basics and an honest approach than by being oversold. We get that enough nowadays with the 24 hour news cycle and 365-day political campaigning.

Let's not be part of the problem by "misspeaking" the wonderful nature of epigenetics. We should be exciting the public about the potential rather than selling snake oil.

Saturday, 3 November 2012

Funding research in the new (poorer) world


The world has changed. Money is tight, and the large foundations and governmental granting agencies are risk averse........which means senior scientists will get the $.

This means that innovation is going to die. Once you get to be a senior scientist you don't have time to fail; you have to feed the beast. I think the best analogy is Wall St. - yes, "too big to fail, I need a bail-out Wall St." is exactly what most large labs in the world are!

Rather than launch into some quixotic diatribe about how bad this is for health and science as an endeavor, I'll just talk about how to make it irrelevant.

The answer is small foundations. They have the focus, passion and community to start a real long-term relationship with young scientists. The brand new, too-ignorant-to-know-better scientist who got their own lab by being the most innovative and best prepared post-doc is exactly the one who will take the chance - if there is money involved.

All foundations seem to be focusing on drug discovery and biomarkers. This is a great space for smaller disease focused foundations to occupy. The problem is that most of the organizations do not have the gobs of money to follow it through in a comprehensive manner.


Getting effective novel therapeutics requires engagement of talented, creative scientists


For disease focused research foundations this can be difficult. In the current economic environment foundations must have some method to "hook" scientists whether it be large per year grants, limited restrictions on spending or speed of review. 


For rare diseases this means getting in during a scientist's early phase, when they shape and limit the vision of their lab. Rare disease research requires passion and a way to find general funding that will maintain the lab.


In this day and age, when peer based mentoring is sadly lacking, a foundation that can provide some guidance on where/how its researchers can leverage funding and expertise can gain loyalty and expand by word of mouth. This will then lead to larger "sexy" studies and fund-raising.


This kind of thinking can work hand in hand with maximizing fund raising, as you can tell fund raisers that a large portion of their money goes directly to attempting to cure or ease their disease rather than to the greater good.


Overall: Small foundations should look to maximize the impact of their funds to effect change in their specific disease.

Through targeting 2 areas of research:
1 Cellular characterization of the disease (ie what are properties that are different between normal & disease)
2 Drug/therapeutic design and testing-no matter how speculative. This assumes that the idea or test system can pass peer review 


The 2 areas would have separate competitions, 2 separate funding paradigms:

1 Short funding cycle - a micro-finance model. Short grants with quick turnaround: a 2 year grant with a hard progress report against mutually agreed upon, measurable progress.

2 A prestige grant - larger, "no questions asked" 5 year funding with no reporting for 2 years. Again, mutually agreed upon, defined goals MUST be met to receive the final 3 yrs of $.


The grants would be open to academia and small biotech. There would also be a bonus for academic-led Pharma-academic RFPs. There would have to be a significant and clear partnership, NOT just "in-kind" contributions.


The research and fund-raising would have a high degree of back and forth. The foundation would hold a stakeholders conference where selected funded scientists would come and explain the state of research in layman's terms.


There has to be greater out-reach from the scientists at foundations. The Office of the CSO should engage in various forms of social media to engage and find funds (with the guidance of the Exec board). This can no longer be left to that summer intern who just finished Bio 101. The public is too smart for that and frankly if I was looking for a small foundation for funding I want to know that the scientists are engaged. It should be an expectation not just a hope that scientific merit is judged by scientists. 

Sunday, 16 September 2012

Time for some convergent evolution in knowledge management

As I move from the ivory tower of Neuroscience to the practical, business related advice that Info-Tech gives clients on their IT environment I'm amazed at how many parallels I see in the needs and the solutions in all kinds of human endeavours.

For example, I just finished talking to a vendor about how enterprises can manage and maximize their content (images, documents, blogs, etc.). Much like my own thinking on this, @OpenText is convinced the core issue is about information movement, not what it is stored in (i.e. a Word doc vs. an Excel file).

For me this comes back to a practical problem that I had as a graduate student. My Ph.D. was on how gene expression relates to brain development. The brain develops in a fascinating manner; it starts out as a tube that grows outwards at specific points to build the multi-lobed broccoli-esque structure that allows all vertebrates, but particularly mammals, to have diverse behaviours and lifelong learning. These complex behaviours rely on an immensely diverse set of brain cell types. Not only is there great diversity of cells, but each cell needs to get to the right place at the right time.

Think of a commute on the subway; not only do you need the right line but if you don't get to the station at the right time you won't get to work on time. This could lead to you getting fired. For brain cells this could lead to death. For the organism it could mean sub-normal brain function-and potentially death. The fact that the process works is a testament to the astounding flexibility and exception management built into cells by their epigenetic programming.

There is, however, one big problem with this type of brain development: the skull. The skull limits the number of cells that can be created at any given time. Practically this means that the level of control that must be exerted on the number of any one cell type is very tight. The control comes from coordinating which genes are expressed in each cell type to allow cells to make decisions on the fly. Usually it starts with the brain cells taking off in a random direction that then informs them of what type of brain cell they will end up being when they arrive. The cells then proliferate as they move, based on the contextual information that they receive about how many more cells are needed. This all happens through cell to cell communication and rapidly changing patterns of gene expression.

(Wait for it, I'll get back to the parallel problems, honest....)

As you can imagine this was (and still is) a daunting problem to investigate. My research involved a variety of time staged images; reams of Excel workbooks on cell counts and brain size; Word docs on behaviour; and whole genome expression sets. It was a big data problem before the phrase existed (Business parallel No.1). In reality I had no problem keeping track of all this data, looking at each piece and doing the analysis on each piece. I had very good notes (metadata) and file naming conventions (classification) to ensure that I could easily find the file I needed. I was in effect a content management system (Business parallel No.2). The problem was synthesizing the separate analyses into a cogent piece of information, i.e. something that can be shared with others in a common language that allows others to build their own actionable plan (Business parallel No.3).
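
For anyone wondering what "notes plus naming conventions as a content management system" might look like in practice, here is a small Python sketch. The naming pattern (`<experiment>_<stage>_<datatype>_<YYYYMMDD>.<ext>`) and the file names are invented for this example and are not the convention I actually used; the point is only that a consistent name carries enough metadata to classify and find files without opening them.

```python
import re

# Hypothetical convention: <experiment>_<stage>_<datatype>_<YYYYMMDD>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<experiment>[a-z0-9]+)_(?P<stage>[a-z0-9]+)_"
    r"(?P<datatype>[a-z]+)_(?P<date>\d{8})\.(?P<ext>[a-z0-9]+)$"
)


def classify(filename):
    """Turn a conforming file name into a metadata dict, else None."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


def find(filenames, **criteria):
    """Return the file names whose metadata matches every criterion."""
    hits = []
    for name in filenames:
        meta = classify(name)
        if meta and all(meta.get(k) == v for k, v in criteria.items()):
            hits.append(name)
    return hits


# Made-up file names; the last one ignores the convention on purpose.
files = [
    "exp01_e12_cellcount_19990312.xls",
    "exp01_e14_image_19990401.tif",
    "notes final v2.doc",
]
counts = find(files, datatype="cellcount")
```

Files that ignore the convention simply never show up in a search, which is exactly the problem with habits that are not systematic.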

Any scientist reading my dilemma from 15 years ago can probably relate - and so can anyone else that uses and presents information as part of their job. The reality is that technology can only solve the problem if people recognize the problem and WANT to be systematic in their habits.......the will power to be repetitive in their approach to work is sorely lacking in most knowledge based workers. Ironically a lack of structure kills creativity by allowing the mind too much space to move within. The advent of the online databases by NIH for genomic, chemical and ontological data has given scientists a framework to work within to quickly get up to speed in new areas of investigation. Unfortunately this has not trickled down to individual labs (again more proof that trickle down anything doesn't work effectively - it's just not part of human nature).

This lack of a shared framework across multiple laboratories is becoming a real problem for both Pharma and academia (and everyone else). The lack of a system has led to reams of lost data, and with it the nuggets of insight that could provide real solutions to clinical problems (Business parallel No.4). This also leads to duplication of effort and missed opportunities for revenue (grant) generation (Business parallel No.5).

From a health perspective, if we knew more about what "failed drugs" targeted, what gene patterns they changed and what cell types they had been tested on we could very quickly build a database. From a Rare disease perspective the cost of medical treatment is partially due to the lack of shared knowledge. How many failed drugs could be of use on rare diseases? We will never know.

This is a situation where scientists can learn from the business community about the technical tools that really allow long term shareable frameworks. These technical controls are available at any price point. Conversely, the frameworks and logic that scientists use to classify pieces of content and link them have lessons for any knowledge worker.

It's time for some open-mindedness on both sides; the needs of all kinds of organizations and workers are converging - too much data, too many types of data, not enough analysis. Evolution is about taking those "things" that work and modifying them for the new environment.