Sunday, 16 September 2012

Time for some convergent evolution in knowledge management

As I move from the ivory tower of neuroscience to the practical, business-related advice that Info-Tech gives clients on their IT environments, I'm amazed at how many parallels I see in the needs and the solutions across all kinds of human endeavours.

For example, I just finished talking to a vendor about how enterprises can manage and maximize their content (images, documents, blogs, etc.). Much like my own thinking on this, @OpenText is convinced the core issue is how information moves, not what it is stored in (i.e. a Word doc vs. an Excel workbook).

For me this comes back to a practical problem that I had as a graduate student. My Ph.D. was on how gene expression relates to brain development. The brain develops in a fascinating manner; it starts out as a tube that grows outwards at specific points to build the multi-lobed, broccoli-esque structure that allows all vertebrates, but particularly mammals, to have diverse behaviours and lifelong learning. These complex behaviours rely on an immensely diverse set of brain cell types. Not only is there a great diversity of cells, but each cell needs to get to the right place at the right time.

Think of a commute on the subway: not only do you need the right line, but if you don't get to the station at the right time you won't get to work on time. This could lead to you getting fired. For brain cells this could lead to death. For the organism it could mean sub-normal brain function, and potentially death. The fact that the process works is a testament to the astounding flexibility and exception management built into cells by their epigenetic programming.

There is, however, one big problem with this type of brain development: the skull. The skull limits the number of cells that can be created at any given time. Practically, this means that the number of cells of any one type must be very tightly controlled. The control comes from coordinating which genes are expressed in each cell type to allow cells to make decisions on the fly. Usually it starts with the brain cells taking off in a random direction, which then informs them of what type of brain cell they will end up being when they arrive. The cells then proliferate as they move, based on the contextual information they receive about how many more cells are needed. This all happens through cell-to-cell communication and rapidly changing patterns of gene expression.

(Wait for it, I'll get back to the parallel problems, honest...)

As you can imagine, this was (and still is) a daunting problem to investigate. My research involved a variety of time-staged images; reams of Excel workbooks on cell counts and brain size; Word docs on behaviour; and whole-genome expression sets. It was a big data problem before the phrase existed (Business parallel No. 1). In reality I had no problem keeping track of all this data, looking at each piece and doing the analysis on each piece. I had very good notes (metadata) and file naming conventions (classification) to ensure that I could easily find the file I needed. I was in effect a content management system (Business parallel No. 2). The problem was synthesizing the separate analyses into a cogent piece of information, i.e. something that can be shared with others in a common language that allows them to build their own actionable plan (Business parallel No. 3).

Any scientist reading about my dilemma from 15 years ago can probably relate, and so can anyone else who uses and presents information as part of their job. The reality is that technology can only solve the problem if people recognize the problem and WANT to be systematic in their habits. The willpower to be repetitive in their approach to work is sorely lacking in most knowledge workers. Ironically, a lack of structure kills creativity by allowing the mind too much space to move within. The advent of online databases from the NIH for genomic, chemical and ontological data has given scientists a framework to work within and to quickly get up to speed in new areas of investigation. Unfortunately this has not trickled down to individual labs (again, more proof that trickle-down anything doesn't work effectively; it's just not part of human nature).

This lack of a shared framework across multiple laboratories is becoming a real problem for both Pharma and academia (and everyone else). The lack of a system has led to reams of lost data, and with them the nuggets of insight that could provide real solutions to clinical problems (Business parallel No. 4). It also leads to duplication of effort and missed opportunities for revenue (grant) generation (Business parallel No. 5).

From a health perspective, if we knew more about what "failed drugs" targeted, what gene patterns they changed and what cell types they had been tested on, we could very quickly build a shared database. From a rare disease perspective, the cost of medical treatment is partly due to the lack of shared knowledge. How many failed drugs could be of use on rare diseases? Without that shared knowledge, we will never know.
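To make the idea a little more concrete, here is a minimal sketch of what one record in such a database might look like, and how it could be queried. Everything in it is an assumption for illustration: the `FailedDrugRecord` class, its fields, the `find_candidates` helper, and the placeholder compound, target and gene names are all made up and do not come from any real database.

```python
from dataclasses import dataclass, field


@dataclass
class FailedDrugRecord:
    """Hypothetical record describing one drug that failed its original trial."""
    name: str
    targets: set = field(default_factory=set)            # what the drug targeted
    genes_changed: set = field(default_factory=set)      # gene patterns it changed
    cell_types_tested: set = field(default_factory=set)  # cell types it was tested on


def find_candidates(records, gene):
    """Return the sorted names of failed drugs recorded as changing `gene`."""
    return sorted(r.name for r in records if gene in r.genes_changed)


# Placeholder data only: these names are invented, not real compounds or genes.
records = [
    FailedDrugRecord("compound-A", {"TARGET1"}, {"GENE1", "GENE2"}, {"neuron"}),
    FailedDrugRecord("compound-B", {"TARGET2"}, {"GENE3"}, {"astrocyte"}),
    FailedDrugRecord("compound-C", {"TARGET1"}, {"GENE2"}, {"neuron", "astrocyte"}),
]

print(find_candidates(records, "GENE2"))
```

With records like these shared across labs, someone working on a rare disease tied to a particular gene could ask which shelved compounds are already known to affect it, instead of starting from scratch.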

This is a situation where scientists can learn from the business community about the technical tools that really allow long-term, shareable frameworks. These technical controls are available at any price point. Conversely, the frameworks and logic that scientists use to classify pieces of content and link them hold lessons for any knowledge worker.

It's time for some open-mindedness on both sides. The needs of all kinds of organizations and workers are converging: too much data, too many types of data, not enough analysis. Evolution is about taking those "things" that work and modifying them for the new environment.