Identification of key protein-coding genes in lung adenocarcinomas based on bioinformatic analysis

Ruixue Yao, Xiaoming Chen, Luyao Wang, Yuanyong Wang, Shaoli Chi, Na Li, Xuejun Tian, Nan Li, Jia Liu


Background: Lung cancer is one of the most common cancers and the leading cause of cancer-related death worldwide. The 5-year survival rate of lung cancer patients is below 15%. Lung adenocarcinoma, a common subtype of lung cancer, still carries high morbidity and mortality despite the available treatment strategies, such as surgery, chemotherapy and targeted therapy. Gene expression microarrays have provided a feasible and effective approach for studying lung cancer. However, the biomarkers and potential therapeutic targets of lung adenocarcinoma have not yet been fully identified. This study aimed to find biomarkers and therapeutic targets of lung adenocarcinoma by identifying key protein-coding genes using bioinformatic approaches.
Methods: We obtained messenger RNA microarray datasets from the Gene Expression Omnibus database to identify differentially expressed genes between lung adenocarcinoma and normal lung tissue. The differentially expressed genes were characterized by Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, protein-protein interaction (PPI) network analysis and statistical analyses. Subsequently, quantitative real-time PCR (qRT-PCR) was used to verify the results of the bioinformatic analysis.
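As an illustration of the first step, the selection of differentially expressed genes from per-gene statistics can be sketched as follows. This is a minimal sketch, not the authors' pipeline: the cutoffs (absolute log2 fold change of at least 1, adjusted p-value below 0.05), the `GeneStat` record, the `select_degs` helper and the toy gene names are all assumptions made for illustration, since the abstract does not state the thresholds used.

```python
from typing import Iterable, NamedTuple, Set


class GeneStat(NamedTuple):
    """Per-gene summary statistics (hypothetical record layout)."""
    gene: str
    log2_fc: float  # log2 fold change, tumor vs. normal
    adj_p: float    # multiple-testing-adjusted p-value


def select_degs(stats: Iterable[GeneStat],
                min_abs_log2_fc: float = 1.0,   # assumed cutoff
                max_adj_p: float = 0.05) -> Set[str]:  # assumed cutoff
    """Return gene symbols that pass both the fold-change and p-value cutoffs."""
    return {
        s.gene
        for s in stats
        if abs(s.log2_fc) >= min_abs_log2_fc and s.adj_p < max_adj_p
    }


# Toy values, not taken from any GEO dataset.
toy_stats = [
    GeneStat("GENE_A", 2.3, 0.001),   # up-regulated, significant
    GeneStat("GENE_B", -1.8, 0.01),   # down-regulated, significant
    GeneStat("GENE_C", 0.4, 0.0005),  # significant but small effect
    GeneStat("GENE_D", 1.5, 0.20),    # large effect but not significant
]

degs = select_degs(toy_stats)
print(sorted(degs))
```

In this toy input only `GENE_A` and `GENE_B` satisfy both criteria; in practice the statistics would come from a differential expression tool applied to each dataset separately.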
Results: We obtained 1,264, 896 and 408 differentially expressed genes from GSE32863, GSE43458 and GSE63459, respectively. According to KEGG analysis, the 242 differentially expressed genes common to all three datasets were related to cell adhesion molecules, ECM-receptor interaction and leukocyte transendothelial migration. GO analysis showed that these common differentially expressed genes were enriched in tumor-related functions. In the protein-protein interaction network, ASPM, CCNB2, CDC20, CDC45, MELK, TOP2A, UBE2T and KIAA0101 had the strongest protein-protein interaction relationships. Survival analysis showed that these genes were closely related to survival in lung adenocarcinoma. Further qRT-PCR assays indicated that seven key genes (ASPM, CCNB2, CDC20, CDC45, MELK, TOP2A and UBE2T) were differentially expressed between clinical lung adenocarcinoma specimens and their matched normal tissues.
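The two set-based steps reported above, taking the genes common to all three datasets and then ranking them within the interaction network, can be sketched as below. This is a hedged illustration only: ranking by node degree is an assumption (the abstract does not define how the "strongest" interactions were scored), and `common_degs`, `rank_by_degree` and the toy gene names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def common_degs(*deg_sets: Iterable[str]) -> Set[str]:
    """Genes present in every per-dataset DEG list."""
    return set.intersection(*(set(s) for s in deg_sets))


def rank_by_degree(edges: Iterable[Tuple[str, str]],
                   genes: Set[str]) -> List[Tuple[str, int]]:
    """Rank genes by number of distinct interaction partners (assumed metric).

    Only edges with both endpoints in `genes` are counted; self-loops and
    duplicate edges (in either orientation) are ignored.
    """
    unique_pairs = {
        frozenset((a, b))
        for a, b in edges
        if a != b and a in genes and b in genes
    }
    degree = Counter()
    for pair in unique_pairs:
        for gene in pair:
            degree[gene] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy DEG lists standing in for the three datasets.
set_1 = {"G1", "G2", "G3", "G4"}
set_2 = {"G1", "G2", "G3", "G5"}
set_3 = {"G1", "G2", "G3", "G6"}
shared = common_degs(set_1, set_2, set_3)

# Toy interaction edges; ("G2", "G1") duplicates ("G1", "G2").
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G1"), ("G3", "G4")]
ranking = rank_by_degree(toy_edges, shared)
print(sorted(shared), ranking)
```

Restricting the edges to the shared gene set keeps the ranking focused on interactions among the common differentially expressed genes; other centrality measures could be substituted for degree without changing the overall structure.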
Conclusions: ASPM, CCNB2, CDC20, CDC45, MELK, TOP2A and UBE2T may be key protein-coding genes in lung adenocarcinoma and deserve further study to verify their feasibility and effectiveness as biomarkers and therapeutic targets for lung adenocarcinoma.