Georgetown Database of Cancer (G-DOC):

A Vision of Personalized Medicine

Georgetown Lombardi Comprehensive Cancer Center's new director for Informatics, Subha Madhavan, PhD, has spent her career studying Informatics. She is committed to expanding the G-DOC® data integration platform and integrative knowledge discovery system for the oncology and translational research communities.


In late 2007, Louis M. Weiner, MD, was named Director of the Georgetown Lombardi Comprehensive Cancer Center. His vision for Lombardi included a novel approach to bringing advances in research and treatment into practice for the community. Dr. Weiner's vision has manifested itself as the Georgetown Database of Cancer (G-DOC), a tool under development that promises to deliver on the founding principle of Lombardi and Georgetown University Medical Center, cura personalis, or the care of the whole person.

"Georgetown - and its Lombardi Comprehensive Cancer Center - is committed to reducing the burden of human cancer through the discovery and early adoption of cutting-edge systems biology based tools," said Dr. Weiner.

The Georgetown Database of Cancer is a major step towards personalized medicine, an essential concept when dealing with the unique nature of cancer. The G-DOC is being designed as a tool that combines clinical information from patients with a database containing detailed analyses of the molecular characteristics of each patient's cancer.

To help develop the new tool, Dr. Weiner recruited Subha Madhavan, MS, PhD, to Lombardi from the National Cancer Institute. She was named Lombardi's first Director of Clinical Research Informatics in early October, and her role is to coordinate the integration of the existing databases into the infrastructure that will become G-DOC.

"G-DOC is part of a changing mentality," explained Minetta Liu, MD, attending physician in the Lombardi clinic and director of translational breast cancer research. Her research focuses on correlating a patient's genes with clinical observations in order to develop better, more personalized treatments.

In the past, studies were rarely conducted as collaborations between research groups, she explained. This meant that, until recently, the data compiled by each group served as that group's only resource.

Access to broad data sets and new research tools has become a nationwide initiative advanced by the National Cancer Institute in the form of the Cancer Biomedical Informatics Grid, or caBIG. The mission is to develop a collaborative information network that shares research data across many investigators and institutions.

Georgetown University Medical Center (GUMC) has had a hand in the creation of caBIG through the Protein Information Resource (PIR). This project at GUMC has focused on developing a public resource that supports genomic and proteomic research. In addition, a team of researchers and programmers led by Lombardi's Robert Clarke, PhD, DSc, has contributed to the development of several of the caBIG tools.

G-DOC will take advantage of standards and best practices from caBIG and other large-scale informatics projects to help integrate the wide variety of patient data it will hold. Dr. Madhavan estimates that an average of only 20 percent of available information about a patient's cancer is used in the course of his or her care. Her hope is that implementation of G-DOC will increase the amount of useful information available to physicians, allowing them to tailor treatments more specifically and accurately.

At the same time, the new database has an important research goal. The data collected will allow scientists to identify key genes and proteins that may be responsible for causing the disease, help predict response to treatment, or indicate an increased risk of developing cancer.

"Our collaborators at Oak Ridge National Laboratories have estimated that every patient will generate about 1,000,000,000,000,000,000,000,000,000,000,000,000,000 bytes of data," explained Dr. Weiner. This figure includes molecular data about DNA, proteins, and other markers in the cancer cells, as well as imaging scans and treatment information. "The G-DOC is a mechanism to bring all data for one patient together, and then compare it to the same amount of data from every other patient."

The collection of different types of information about a patient's disease and treatment will allow the researchers to make connections between outcomes and the clinical and molecular characteristics of cancer.

"There are millions of proteins in the body, and we can measure if one is switched on and another off. But the problem is that we can't tell which one is important for an individual patient's cancer," explained Dr. Clarke, who is co-leader of Lombardi's Breast Cancer Research Program, and interim director of GUMC's Biomedical Graduate Research Organization.

One of the major challenges of such data integration efforts is the wide variety of data sources that must be brought together into a single unified database. Dr. Madhavan's team must painstakingly match fields in one database to fields in another, ensuring that units of measurement, timing, and hundreds of other factors are consistent. This is in addition to the obstacles common to any large-scale project, such as limited resources and a steep learning curve.
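
The field-matching work described above can be illustrated with a minimal sketch. It is not G-DOC's actual implementation; the source names (clinic_db, lab_db), the field names, and the harmonize function are all hypothetical, chosen only to show how fields from two databases might be mapped onto one schema with consistent units and date formats.

```python
from datetime import datetime

LB_TO_KG = 0.45359237  # exact definition of the pound in kilograms


def us_date_to_iso(text):
    """Convert a MM/DD/YYYY string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(text, "%m/%d/%Y").date().isoformat()


# Hypothetical mappings: source field -> (unified field, converter).
# Each converter normalizes the value's type, unit, or format.
FIELD_MAPS = {
    "clinic_db": {
        "pt_id": ("patient_id", str),
        "weight_lb": ("weight_kg", lambda lb: round(float(lb) * LB_TO_KG, 1)),
        "dx_date": ("diagnosis_date", us_date_to_iso),
    },
    "lab_db": {
        "PatientID": ("patient_id", str),
        "WeightKg": ("weight_kg", lambda kg: round(float(kg), 1)),
        "DiagnosisDate": ("diagnosis_date", str),
    },
}


def harmonize(source, record):
    """Translate one source record into the unified schema.

    Raises KeyError for any field without a mapping, so that
    unmatched fields are reviewed rather than silently dropped.
    """
    field_map = FIELD_MAPS[source]
    unified = {}
    for field, value in record.items():
        if field not in field_map:
            raise KeyError(f"no mapping for {field!r} in {source!r}")
        target, convert = field_map[field]
        unified[target] = convert(value)
    return unified


clinic_row = {"pt_id": 17, "weight_lb": "150", "dx_date": "03/09/2008"}
lab_row = {"PatientID": "17", "WeightKg": 68.0, "DiagnosisDate": "2008-03-09"}

# After harmonization the two records describe the patient identically.
print(harmonize("clinic_db", clinic_row) == harmonize("lab_db", lab_row))
```

Failing loudly on an unmapped field is a deliberate choice in this sketch: in a real integration effort, a field with no agreed counterpart is exactly the kind of case that needs a person to look at it.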

Dr. Weiner has already taken some important steps. In July 2008, Georgetown University Medical Center announced a collaboration with Indivumed GmbH to support the development of G-DOC. Indivumed, a center for cancer research in Germany, will assist in the effort by providing a wide range of biospecimen data to add to the clinical databases at Georgetown.

From the research standpoint, Dr. Liu is very hopeful. "Researchers at Georgetown are great," she said. "They are very willing to collaborate." A study designed by Drs. Liu and Clarke is one of several slated for inclusion in the pilot run.