Prediction of drug efficacy for cancer treatment based on comparative analysis of chemosensitivity and gene expression data
Abstract
The NCI60 database is the largest available collection of compounds with measured anti-cancer activity. The strengths and limitations of using the NCI60 database as a source of new anti-cancer agents are explored and discussed in relation to previous studies. We selected a subset of 2333 compounds with reliable experimental half-maximal growth inhibition (GI50) values for 30 cell lines from the NCI60 data set and evaluated their growth inhibitory effect (chemosensitivity) with respect to tissue of origin. This was done by identifying natural clusters in the chemosensitivity data set and in a data set of expression profiles of 1901 genes for the corresponding tumor cell lines. Five clusters were identified based on the gene expression data using self-organizing maps (SOM), comprising leukemia, melanoma, ovarian and prostate, basal breast, and luminal breast cancer cells. The strong difference in gene expression between basal and luminal breast cancer cells was clearly reflected in the chemosensitivity data. Although most compounds in the data set were of low potency, high-efficacy compounds that showed specificity with respect to tissue of origin could be found. Furthermore, eight potential topoisomerase II inhibitors were identified using a structural similarity search. Finally, a set of genes with expression profiles that were significantly correlated with anti-cancer drug activity was identified. Our study demonstrates that the combined data sets, which provide comprehensive information on drug activity and gene expression profiles of the tumor cell lines studied, are useful for identifying potential new active compounds.
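As a rough illustration of how gene expression profiles might be related to drug activity across a shared panel of cell lines, the following minimal sketch computes a Pearson correlation for every gene and compound pair. The abstract does not state which correlation measure or significance test was used, so the choice of Pearson correlation, the function name `gene_drug_correlations`, and the randomly generated toy matrices are all assumptions made for illustration; they are not the study's data or method.

```python
import numpy as np

def gene_drug_correlations(expression, activity):
    """Pearson r between every gene and every compound.

    expression: array of shape (n_genes, n_cell_lines)
    activity:   array of shape (n_compounds, n_cell_lines),
                e.g. -log10(GI50) values (assumed encoding)
    Returns an array of shape (n_genes, n_compounds).
    Rows with zero variance yield NaN.
    """
    expr = np.asarray(expression, dtype=float)
    act = np.asarray(activity, dtype=float)
    if expr.shape[1] != act.shape[1]:
        raise ValueError("both matrices must cover the same cell lines")
    # Center each row, then scale it to unit length; the dot product
    # of two such rows is their Pearson correlation coefficient.
    expr_c = expr - expr.mean(axis=1, keepdims=True)
    act_c = act - act.mean(axis=1, keepdims=True)
    expr_c /= np.linalg.norm(expr_c, axis=1, keepdims=True)
    act_c /= np.linalg.norm(act_c, axis=1, keepdims=True)
    return expr_c @ act_c.T

# Toy data only (hypothetical): 5 genes and 2 compounds over 30 cell lines.
rng = np.random.default_rng(0)
expression = rng.normal(size=(5, 30))
activity = rng.normal(size=(2, 30))
# Make compound 0 track gene 3 so that the pair stands out.
activity[0] = 2.0 * expression[3] + 0.1 * rng.normal(size=30)

r = gene_drug_correlations(expression, activity)
best_gene = int(np.argmax(np.abs(r[:, 0])))
print("gene most correlated with compound 0:", best_gene)
```

In a real analysis the correlations would still need a significance assessment with correction for multiple testing, since thousands of gene and compound pairs are compared at once.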