Researchers create record-sized, integrated cellular cancer database


A new, integrated database for cancer will help researchers explore in unprecedented detail the relationship between drugs, gene expression, mutations, copy number, methylation and protein expression.
Photo courtesy of Yves Pommier

An international team of researchers has created a powerful new database that consolidates data on a record number of cancer drugs and cell lines. The freely available tool, called CellMinerCDB, can be used to explore connections between drugs and various features of cancer, such as genetic mutations, cell signatures, DNA methylation and more. Already, researchers have used CellMinerCDB to uncover new genes that contribute to cancer. The description of the tool and study results were published December 12, 2018, in iScience.

Databases play a critical role in exposing patterns and underlying causes of cancer; these patterns can be harnessed to create novel therapies. But many institutions host their own databases, which makes cross-referencing data from different platforms labor-intensive.

Yves Pommier, M.D., Ph.D., Chief of CCR’s Developmental Therapeutics Branch, wanted a better approach. “I didn’t want to have to solicit and engage bioinformatic teams to answer all my questions. I wanted to do it in real time,” he explains. “So we created the website, which enables anyone to do this without any assistance or training.”

He began by matching NCI’s database of cancer drugs with its database of cell lines – a daunting task that involved streamlining data on thousands of drugs and 60 different cancer cell lines. These efforts resulted in the database CellMiner, which was first created under the leadership of John N. Weinstein in 2009.

The latest version, called CellMinerCDB, has been expanded to incorporate four other databases, among them the Broad Institute’s Cancer Cell Line Encyclopedia and the Sanger Institute and Massachusetts General Hospital’s Genomics of Drug Sensitivity in Cancer database. The new, consolidated tool contains hundreds of thousands of drugs and more than 1,400 cell lines, making it the most comprehensive integrated cancer cell line database in the world.

“This was a lot of work to make sure the names of the cell lines and drugs were standardized across databases. This is an essential feature of CellMinerCDB,” says Dr. Pommier.
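To illustrate why name standardization matters when merging sources, here is a minimal sketch in Python. It is not CellMinerCDB's actual procedure; the `normalize_name` rule (uppercase, drop punctuation and spaces) and the `match_cell_lines` helper are hypothetical, and real curation would need to handle many more edge cases.

```python
import re


def normalize_name(name: str) -> str:
    """Hypothetical rule: uppercase and drop every non-alphanumeric character."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())


def match_cell_lines(names_a, names_b):
    """Pair names from two sources that share the same normalized key."""
    index_b = {normalize_name(n): n for n in names_b}
    return {
        n: index_b[normalize_name(n)]
        for n in names_a
        if normalize_name(n) in index_b
    }


# The same cell lines are often spelled differently in different databases.
source_a = ["MCF-7", "NCI-H460", "HT29"]
source_b = ["mcf7", "NCI H460", "SK-OV-3"]

matches = match_cell_lines(source_a, source_b)
print(matches)
```

Names without a counterpart in the other source (here "HT29" and "SK-OV-3") are simply left unmatched and would need manual review.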

Anyone can go to the website and search for a drug or feature of cancer they are interested in studying. Through the comprehensive dataset, researchers can identify potential correlations between drugs and biological factors such as gene expression, mutations, copy number, methylation and protein expression.
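As a rough illustration of what such a correlation looks like, the sketch below computes a Pearson correlation between a gene's expression and a drug's activity across cell lines. The numbers are made-up toy values, not data from CellMinerCDB, and Pearson correlation is used here only as one common choice of statistic; the `pearson` helper is written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values, one entry per cell line (illustrative only).
gene_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
drug_activity = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson(gene_expression, drug_activity)
print(f"correlation: {r:.3f}")
```

A value near 1 or -1 suggests that the drug's activity tracks the gene's expression across cell lines; a value near 0 suggests no linear relationship. A correlation alone does not establish that the gene causes the response.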

CellMinerCDB is already yielding valuable insights. Using the new database, Dr. Pommier and his team were able to confirm that a recently discovered gene plays a critical role in killing cancer cells that replicate abnormally. The gene, SLFN11, is disabled in about 50 percent of cancer cell lines and is associated with a patient’s response to chemotherapy. “That led to a set of articles from us and others about how and why it’s related to certain drugs. This is one example where the database led to a new discovery, but there are many more to be made,” says Dr. Pommier.

The investigators plan to continue to expand the database by adding new drugs and additional databases of cancer cell lines, such as those for small cell lung cancer and sarcoma.

Summary Posted: Sat, 12/01/2018