Article Abstract

Bioinformatics screening for key genes and therapeutic drugs of colorectal cancer in GSE74602 microarray data

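Authors: 1LUO Lei, 1GONG Baocheng, 1LIU Funan
1 Department of Gastrointestinal Surgical Oncology, the First Affiliated Hospital of China Medical University, Shenyang, Liaoning 110001, China
Correspondence: LIU Funan Email: lfn540@126.com
DOI: 10.3978/.2018.04.011
Funding: Supported by the Natural Science Foundation of Liaoning Province (20170541008) and the CSCO-Merck Serono Oncology Research Fund (Y-MX2016-013).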

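Abstract

Objective: To identify diagnostic markers, potential therapeutic targets, and candidate therapeutic drugs for colorectal cancer using bioinformatics methods.
Methods: The microarray dataset GSE74602 was downloaded from the Gene Expression Omnibus (GEO), the public data platform of the National Center for Biotechnology Information (NCBI). The dataset comprises 60 samples: 30 colorectal cancer samples and 30 normal colorectal tissue samples. Differential expression between normal and cancer tissue was analyzed with the Limma package in R. The differentially expressed genes were then subjected to Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathway analysis using the DAVID online database. In parallel, a protein-protein interaction network of the differentially expressed genes was constructed with the STRING online database, visualized and edited in Cytoscape, and analyzed for subnetwork modules with the MCODE plugin to screen for core genes in colorectal carcinogenesis. Finally, the Connectivity Map (cMap) database was used to identify small-molecule drugs with potential for treating colorectal cancer.
Results: A total of 231 differentially expressed genes were identified, comprising 122 up-regulated and 109 down-regulated genes. GO analysis showed that the up-regulated genes were mainly enriched in biological processes such as cell cycle and cell division, whereas the down-regulated genes were enriched in biological processes such as immune response, intracellular signaling cascade, and defense response. KEGG pathway analysis showed that the up-regulated genes were mainly involved in signaling pathways for cellular nitrogen and mineral metabolism and for secretion (bile, pancreatic juice), whereas the down-regulated genes were mainly involved in drug metabolism, the cell cycle, and the p53 signaling pathway. Subnetwork module analysis identified key genes that play important roles in regulating the occurrence of colorectal cancer, such as KIF20A, CENPF, NCAPG, PYY, and IQGAP3. The differentially expressed genes in the protein-protein interaction network were mapped to cMap, which yielded several small-molecule drugs with potential for treating colorectal cancer, such as viomycin, harmalol, and ikarugamycin.
Conclusion: The key genes identified may become new tumor markers for the diagnosis of colorectal cancer or new targets for its treatment. In addition, the small-molecule drugs identified may become novel drugs for the treatment of colorectal cancer.
Keywords: Colorectal Neoplasms; Information Systems; Microarray Analysis; Computational Biology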
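
The differential expression screen described in the Methods can be illustrated with a minimal Python sketch. It is not the Limma workflow used in the study: it replaces Limma's moderated t-statistic with a plain Welch t-statistic. The function names, the two cutoffs, and the toy expression values are all assumptions made for illustration. The abstract does not report the thresholds actually applied.

```python
import math
from statistics import mean, variance


def welch_t(tumor, normal):
    """Welch's t-statistic for two independent groups of log2 expression values.

    Assumes at least one group has non-zero variance.
    """
    se = math.sqrt(variance(tumor) / len(tumor) + variance(normal) / len(normal))
    return (mean(tumor) - mean(normal)) / se


def screen_genes(expr, min_abs_log2fc=1.0, min_abs_t=4.0):
    """Label each gene 'up', 'down', or 'ns' (not significant).

    expr maps a gene symbol to (tumor_values, normal_values) on a log2 scale.
    Both cutoffs are illustrative assumptions, not the study's thresholds.
    """
    calls = {}
    for gene, (tumor, normal) in expr.items():
        log2fc = mean(tumor) - mean(normal)  # difference of log2 means
        t = welch_t(tumor, normal)
        if abs(log2fc) >= min_abs_log2fc and abs(t) >= min_abs_t:
            calls[gene] = "up" if log2fc > 0 else "down"
        else:
            calls[gene] = "ns"
    return calls


# Toy, made-up log2 intensities (not values from GSE74602).
toy_expr = {
    "GENE_A": ([9.1, 9.4, 8.9, 9.3], [6.0, 6.2, 5.9, 6.1]),
    "GENE_B": ([4.0, 4.2, 3.9, 4.1], [7.0, 7.3, 6.8, 7.1]),
    "GENE_C": ([5.0, 5.4, 4.7, 5.2], [5.1, 5.0, 5.3, 4.9]),
}
calls = screen_genes(toy_expr)
print(calls)
```

A real analysis would also need p-values adjusted for multiple testing across all probes, which this sketch omits.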

Bioinformatics screening for pivotal genes and therapeutic drugs of colorectal cancer in GSE74602 microarray data

Authors: 1LUO Lei, 1GONG Baocheng, 1LIU Funan
1 Department of Gastrointestinal Surgical Oncology, the First Affiliated Hospital, China Medical University, Shenyang 110001, China

Corresponding Author: LIU Funan Email: lfn540@126.com

Abstract

Objective: To identify diagnostic markers, potential therapeutic targets, and candidate drugs for colorectal cancer (CRC) through a bioinformatics approach.
Methods: The microarray dataset GSE74602, which contains 30 CRC tissue samples and 30 normal colorectal tissue samples, was downloaded from the Gene Expression Omnibus (GEO), the public data platform of the National Center for Biotechnology Information (NCBI). Differentially expressed genes between CRC tissue and normal colorectal tissue were identified with the Limma package in R. The differentially expressed genes were then subjected to Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using the DAVID online tool. In parallel, protein-protein interaction networks of the differentially expressed genes were generated with the STRING server and visualized in Cytoscape. Subnetwork module analysis was performed with the MCODE plugin to screen for core genes in CRC carcinogenesis. Finally, the Connectivity Map (cMap) database was searched for small-molecule compounds with potential activity against CRC.
Results: A total of 231 differentially expressed genes were identified, of which 122 were up-regulated and 109 were down-regulated. GO analysis showed that the up-regulated genes were mainly enriched in biological processes such as cell cycle and cell division, whereas the down-regulated genes were enriched in processes such as immune response, intracellular signaling cascade, and defense response. KEGG pathway analysis showed that the up-regulated genes were mainly involved in signaling pathways associated with intracellular nitrogen and mineral metabolism and with secretion (such as bile and pancreatic juice), whereas the down-regulated genes were mainly involved in signaling pathways associated with drug metabolism, the cell cycle, and p53. Several genes that play critical roles in regulating the occurrence of CRC were identified, such as KIF20A, CENPF, NCAPG, PYY, and IQGAP3. After the differentially expressed genes in the protein-protein interaction networks were submitted to the cMap database, several small-molecule drugs with potential activity against CRC were identified, such as viomycin, harmalol, and ikarugamycin.
Conclusion: The pivotal genes identified may serve as new biomarkers for the diagnosis of CRC or as therapeutic targets for CRC. Moreover, the small-molecule compounds identified may be developed into novel drugs for the treatment of CRC.
Keywords: Colorectal Neoplasms; Information Systems; Microarray Analysis; Computational Biology
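
The GO and KEGG enrichment step in the Methods rests on an over-representation test. The sketch below shows the plain hypergeometric version of that test in Python. It is an assumption-laden illustration rather than DAVID's own calculation: DAVID reports an EASE score, a more conservative modification of the Fisher exact test. The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(overlap, list_size, term_size, background_size):
    """Upper-tail hypergeometric p-value for term over-representation.

    Returns P(X >= overlap), where X is the number of term-annotated genes
    among list_size genes drawn without replacement from a background of
    background_size genes, term_size of which carry the annotation.
    """
    max_k = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy counts (hypothetical): 20 background genes, 5 annotated to the term,
# 4 genes in the differentially expressed list, 3 of them annotated.
p_value = hypergeom_enrichment_p(
    overlap=3, list_size=4, term_size=5, background_size=20
)
print(p_value)
```

In practice the test is repeated for every GO term or KEGG pathway, so the resulting p-values need a multiple-testing correction before terms are reported as enriched.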