Abstract:

Background and Aims: Pancreatic cancer is difficult to treat, and over 90% of patients die within one year of diagnosis. Genes that are differentially expressed between pancreatic cancer tissue and normal pancreatic tissue (differentially expressed genes, DEGs) may be closely associated with the development and progression of the disease. This study screened DEGs in pancreatic cancer using a machine learning approach, to provide a basis for studying the pathogenesis of this disease.

Methods: Pancreatic cancer gene expression profiles were retrieved from the public Gene Expression Omnibus (GEO) database. Normalization and differential expression analysis between the different groups of microarrays were performed in R using the linear-model package limma. The resulting DEGs were further screened with a correlation-based feature selection method to obtain the hub DEGs. Based on the hub DEGs, the AdaBoost and Bagging algorithms were each used to construct a pancreatic cancer prediction model. Gene Ontology (GO) functional analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the hub DEGs were performed through the DAVID website, and the protein-protein interaction (PPI) network of the hub DEGs was analyzed using the STRING database and Cytoscape software. Finally, survival analysis of the relevant hub DEGs was performed through the GEPIA website.

Results: Feature screening yielded 18 key DEGs. A prediction model built with the AdaBoost algorithm on the feature subset containing these 18 DEGs reached a prediction accuracy of 92.3%. GO and KEGG analysis of the DEGs revealed an indirect role for CDK1, CCNA2 and CCNB1 in the formation and development of pancreatic cancer.
Survival analysis showed that the expression levels of CDK1 (P=0.0008), CCNB1 (P=0.012), CSK2 (P=0.023) and CKS1B (P=0.0013) were correlated with the overall survival (OS) of patients, with higher expression associated with shorter OS.

Conclusion: Machine learning methods can be applied efficiently to screen hub genes in pancreatic cancer, and are of value for the diagnosis and treatment of pancreatic cancer and for related drug development.
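To make the model-building step in the Methods concrete, the following is a minimal sketch of the AdaBoost idea: weak classifiers are trained in sequence, and misclassified samples receive more weight in each later round. It is not the authors' implementation. The data are synthetic, the choice of 18 features only mirrors the size of the reported feature subset, decision stumps as the weak learner are an assumption, and all function and variable names are hypothetical.

```python
import math
import random


def stump_predict(row, feature, threshold, polarity):
    """Decision stump: vote `polarity` if the feature exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, weights):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None  # (weighted_error, feature, threshold, polarity)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                error = sum(
                    w for row, label, w in zip(X, y, weights)
                    if stump_predict(row, feature, threshold, polarity) != label
                )
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def adaboost_fit(X, y, n_rounds=10):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        error, feature, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)  # vote strength of this stump
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [
            w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
            for row, label, w in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


def make_samples(rng, n_samples, n_features=18, n_informative=3):
    """Synthetic 'expression' data: only the first few features carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1  # +1 = tumor, -1 = normal (illustrative)
        row = [
            rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
            for j in range(n_features)
        ]
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_samples(rng, 60)
X_test, y_test = make_samples(rng, 40)

model = adaboost_fit(X_train, y_train, n_rounds=10)
correct = sum(adaboost_predict(model, row) == label for row, label in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
```

The printed accuracy describes only this synthetic example and is not comparable to the 92.3% reported above. In practice an established library implementation with cross-validation would normally be preferred over a hand-written loop.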