Big Data has been a hot topic in brain cancer research recently, with the project on low grade brain tumours using The Cancer Genome Atlas (TCGA), the news of the IBM super computer to be applied to brain cancer, and the work on the integration of genomic data for epidemiology which used brain cancer as its test case, all being announced in the last few weeks.
And data is definitely ‘big’ now. Ninety per cent of all the data in the world has been generated over the last two to three years. Industries from finance to disease monitoring, criminology to genomics, and meteorology to healthcare are awash with data that must be captured, curated, stored, searched, shared, analysed and visualised.
In medical science, recent technology developments such as whole genome sequencing have generated huge amounts of information.
“We can now sequence the whole human genome in three hours - a decade ago it took 10 years and cost $1 billion”
What do we do with all this data? And what does it mean for brain cancer now that the war against cancer has moved into cyberspace?
What is Big Data?
Quite simply, ‘Big Data’ is a buzzword for a collection of data sets so large and complex that it becomes difficult to process and understand using traditional database management tools or data processing applications. Bioinformatics is a term that goes hand in hand with big data—it is the science of collecting and analysing complex biological (big) data such as genetic codes.
How will big data be used in medical research?
Genome sequencing has led to the advent of personalised medicine (also called precision medicine). Personalised medicine uses the data gathered from genome sequencing to predict disease development or to tailor treatment to an individual, and is one of the four pillars of the Cure Brain Cancer Research Funding Strategy.
As genomic sequencing becomes less cost-prohibitive, Big Data potentially allows researchers to combine a patient's genomic information with their clinical history, and with the histories of thousands of other patients.
Researchers can ‘mine’ the data to see what treatments are most effective for particular conditions, identify patterns related to treatments, drug side effects or hospital admissions, and gain other important information that can help patients.
There are moves by governments and other organisations to make decades of stored data usable, searchable and, more importantly, actionable by doctors and researchers.
For example, if a database of all melanoma sufferers was geographically enabled, indicating where they live, and combined with environmental monitoring data showing the intensity of ultraviolet light over a period of time, researchers could then look for a correlation between the two. This would remove the need for expensive and time-consuming clinical studies that identify melanoma sufferers and ask them about their history of ultraviolet light exposure.
There have been some very exciting developments coming out of big data projects. For example, analysis of large amounts of data has changed how ALS, a 100% fatal neurological disorder, is treated. The ALS database led to a crowdsourcing exercise to build a predictive algorithm; it was phenomenally successful, with over 1,000 entrants from 64 countries delivering 37 valid algorithms for a total prize cost of $25,000. The wonderful opportunity of bioinformatics is that, if the data is there, it can be opened up to a whole pool of talent through crowdsourcing.
What are the challenges of Big Data?
Big Data is next to worthless unless we ask the right questions, cautions Roni Zeiger, CEO of Smart Patients, who spoke at the 2013 Faster Cures conference.
“Big data is a bunch of noise. The intersection between smart questions and Big Data is where the juicy stuff is in the next five years.”
- Roni Zeiger, CEO of Smart Patients.
In other words, only when we approach Big Data with a set of questions – hypotheses – can we expect to retrieve anything of value from it.
Another challenge for Big Data is the set of issues that arise around data standardisation and protocols. If we seek to accumulate data (to make it big), how can we make it consistent across multiple sources? After all, surely we want to begin with what we've already got? Part of what characterised some recent big data initiatives in cancer research was the attention paid to protocols and data standards themselves. The exciting I-Spy 2 clinical trial, which was extended to Melbourne, took significant steps towards creating international standards for data storage.
Big Data cannot deliver on its promise without patient data, and for orphan cancers such as brain cancer (where the incidence is low) the accumulation of patient biomarker data and clinical history is critical. This is a core reason for supporting Care Co-ordination in brain cancer; care co-ordinators can be the data collectors and ensure that the maximum amount of clinical and biomarker data from patients is collected.
How is Big Data impacting brain cancer research?
News last week from researchers at The University of Texas MD Anderson Cancer Center highlights how Big Data is impacting brain cancer research. The research team, led by Roel Verhaak, PhD, analysed data from The Cancer Genome Atlas (TCGA) to establish three molecular categories into which low-grade tumours fall. They found that one category bears a molecular resemblance to the most lethal brain tumour, glioblastoma multiforme.
The TCGA is a research program supported by the National Cancer Institute and the National Institutes of Health, and involves a large number of researchers around the world: from the doctors treating the patients who agree to donate tissue, to the groups that process samples and send materials to laboratories, to the individual researchers actively sequencing and analysing the samples. TCGA is mapping the mutations and other changes in more than 20 different types of cancer and, importantly, is listing these changes in a database accessible to researchers around the world. TCGA's findings show the power of teamwork and collaboration, another of the pillars of Cure Brain Cancer's research funding strategy.
In another attempt to deliver on the medical promise of Big Data, IBM is already looking at big data analysis of brain cancer. In March, the computing giant announced that its Watson supercomputer is being used to help solve the mysteries of brain cancer by examining individual genetic mutations.
IBM Watson will soon help clinicians at New York Genome Center search for mutations in patients with glioblastoma that may be referenced in medical literature and genomic databases. Findings will then be presented to the patient’s doctor.
Closer to home in Australia, Brandon Wainwright and his team at the Institute of Molecular Bioscience in Queensland are working extensively in bioinformatics for the discovery of genes specifically relevant to brain cancers.
As part of the overall research strategy, Cure Brain Cancer focuses on precision medicine, and will be keeping a close eye on big data projects in brain cancer research. Part of the attraction of international collaboration is the opportunity for participation in, and access to, big databases; as a small country with a disease with low incidence, we are data hungry and need to be part of the global big data picture in brain cancer.
Read more of our research blogs about brain cancer