PubMed İndeksli Yayınlar Koleksiyonu

Permanent URI for this collectionhttps://hdl.handle.net/20.500.12573/397

Browse

Search Results

Now showing 1 - 6 of 6
  • Article
    Citation - WoS: 20
    Citation - Scopus: 24
    miRdisNET: Discovering MicroRNA Biomarkers That Are Associated With Diseases Utilizing Biological Knowledge-Based Machine Learning
    (Frontiers Media S.A., 2023-01-12) Jabeer, Amhar; Temiz, Mustafa; Bakir-Gungor, Burcu; Yousef, Malik
    During recent years, biological experiments and increasing evidence have shown that MicroRNAs play an important role in the diagnosis and treatment of human complex diseases. Therefore, to diagnose and treat human complex diseases, it is necessary to reveal the associations between a specific disease and related miRNAs. Although current computational models based on machine learning attempt to determine miRNA-disease associations, the accuracy of these models need to be improved, and candidate miRNA-disease relations need to be evaluated from a biological perspective. In this paper, we propose a computational model named miRdisNET to predict potential miRNA-disease associations. Specifically, miRdisNET requires two types of data, i.e., miRNA expression profiles and known disease-miRNA associations as input files. First, we generate subsets of specific diseases by applying the grouping component. These subsets contain miRNA expressions with class labels associated with each specific disease. Then, we assign an importance score to each group by using a machine learning method for classification. Finally, we apply a modeling component and obtain outputs. One of the most important outputs of miRdisNET is the performance of miRNA-disease prediction. Compared with the existing methods, miRdisNET obtained the highest AUC value of .9998. Another output of miRdisNET is a list of significant miRNAs for disease under study. The miRNAs identified by miRdisNET are validated via referring to the gold-standard databases which hold information on experimentally verified MicroRNA-disease associations. miRdisNET has been developed to predict candidate miRNAs for new diseases, where miRNA-disease relation is not yet known. In addition, miRdisNET presents candidate disease-disease associations based on shared miRNA knowledge. The miRdisNET tool and other supplementary files are publicly available at: .
  • Article
    Topological Feature Generation for Link Prediction in Biological Networks
    (PeerJ Inc, 2023-05-09) Temiz, Mustafa; Bakir-Gungor, Burcu; Sahan, Pinar Guner; Coskun, Mustafa; Güner Şahan, Pınar
    Graph or network embedding is a powerful method for extracting missing or potential information from interactions between nodes in biological networks. Graph embedding methods learn representations of nodes and interactions in a graph with low-dimensional vectors, which facilitates research to predict potential interactions in networks. However, most graph embedding methods suffer from high computational costs in the form of high computational complexity of the embedding methods and learning times of the classifier, as well as the high dimensionality of complex biological networks. To address these challenges, in this study, we use the Chopper algorithm as an alternative approach to graph embedding, which accelerates the iterative processes and thus reduces the running time of the iterative algorithms for three different (nervous system, blood, heart) undirected protein-protein interaction (PPI) networks. Due to the high dimensionality of the matrix obtained after the embedding process, the data are transformed into a smaller representation by applying feature regularization techniques. We evaluated the performance of the proposed method by comparing it with state-of-the-art methods. Extensive experiments demonstrate that the proposed approach reduces the learning time of the classifier and performs better in link prediction. We have also shown that the proposed embedding method is faster than state-of-the-art methods on three different PPI datasets.
  • Article
    Citation - Scopus: 2
    Prediction of Colorectal Cancer Based on Taxonomic Levels of Microorganisms and Discovery of Taxonomic Biomarkers Using the Grouping-Scoring (G-S-M) Approach
    (Elsevier Ltd, 2025-03) Bakir-Güngör, Burcu; Temiz, Mustafa; Canakcimaksutoglu, Beyza; Yousef, Malik
    Colorectal cancer (CRC) is one of the most prevalent forms of cancer globally. The human gut microbiome plays an important role in the development of CRC and serves as a biomarker for early detection and treatment. This research effort focuses on the identification of potential taxonomic biomarkers of CRC using a grouping-based feature selection method. Additionally, this study investigates the effect of incorporating biological domain knowledge into the feature selection process while identifying CRC-associated microorganisms. Conventional feature selection techniques often fail to leverage existing biological knowledge during metagenomic data analysis. To address this gap, we propose taxonomy-based Grouping Scoring Modeling (G-S-M) method that integrates biological domain knowledge into feature grouping and selection. In this study, using metagenomic data related to CRC, classification is performed at three taxonomic levels (genus, family and order). The MetaPhlAn tool is employed to determine the relative abundance values of species in each sample. Comparative performance analyses involve six feature selection methods and four classification algorithms. When experimented on two CRC associated metagenomics datasets, the highest performance metric, yielding an AUC of 0.90, is observed at the genus taxonomic level. At this level, 7 out of top 10 groups (Parvimonas, Peptostreptococcus, Fusobacterium, Gemella, Streptococcus, Porphyromonas and Solobacterium) were commonly identified for both datasets. Moreover, the identified microorganisms at genus, family, and order levels are thoroughly discussed via refering to CRC-related metagenomic literature. This study not only contributes to our understanding of CRC development, but also highlights the applicability of taxonomy-based G-S-M method in tackling various diseases. © 2025 Elsevier B.V., All rights reserved.
  • Article
    Citation - WoS: 9
    Citation - Scopus: 15
    MicroBiomeGSM: The Identification of Taxonomic Biomarkers From Metagenomic Data Using Grouping, Scoring and Modeling (G-S-M) Approach
    (Frontiers Media S.A., 2023-11-22) Bakir-Gungor, Burcu; Temiz, Mustafa; Jabeer, Amhar; Wu, Di; Yousef, Malik
    Numerous biological environments have been characterized with the advent of metagenomic sequencing using next generation sequencing which lays out the relative abundance values of microbial taxa. Modeling the human microbiome using machine learning models has the potential to identify microbial biomarkers and aid in the diagnosis of a variety of diseases such as inflammatory bowel disease, diabetes, colorectal cancer, and many others. The goal of this study is to develop an effective classification model for the analysis of metagenomic datasets associated with different diseases. In this way, we aim to identify taxonomic biomarkers associated with these diseases and facilitate disease diagnosis. The microBiomeGSM tool presented in this work incorporates the pre-existing taxonomy information into a machine learning approach and challenges to solve the classification problem in metagenomics disease-associated datasets. Based on the G-S-M (Grouping-Scoring-Modeling) approach, species level information is used as features and classified by relating their taxonomic features at different levels, including genus, family, and order. Using four different disease associated metagenomics datasets, the performance of microBiomeGSM is comparatively evaluated with other feature selection methods such as Fast Correlation Based Filter (FCBF), Select K Best (SKB), Extreme Gradient Boosting (XGB), Conditional Mutual Information Maximization (CMIM), Maximum Likelihood and Minimum Redundancy (MRMR) and Information Gain (IG), also with other classifiers such as AdaBoost, Decision Tree, LogitBoost and Random Forest. microBiomeGSM achieved the highest results with an Area under the curve (AUC) value of 0.98% at the order taxonomic level for IBDMD dataset. Another significant output of microBiomeGSM is the list of taxonomic groups that are identified as important for the disease under study and the names of the species within these groups. The association between the detected species and the disease under investigation is confirmed by previous studies in the literature. The microBiomeGSM tool and other supplementary files are publicly available at: https://github.com/malikyousef/microBiomeGSM.
  • Article
    Enlightening the Molecular Mechanisms of Type 2 Diabetes With a Novel Pathway Clustering and Pathway Subnetwork Approach
    (Tubitak Scientific & Technological Research Council Turkey, 2022-01-01) Bakir-Gungor, Burcu; Yazici, Miray Unlu; Goy, Gokhan; Temiz, Mustafa; Ünlü Yazici, Miray
    Type 2 diabetes mellitus (T2D) constitutes 90% of the diabetes cases, and it is a complex multifactorial disease. In the last decade, genome-wide association studies (GWASs) for T2D successfully pinpointed the genetic variants (typically single nucleotide polymorphisms, SNPs) that associate with disease risk. In order to diminish the burden of multiple testing in GWAS, researchers attempted to evaluate the collective effects of interesting variants. In this regard, pathway-based analyses of GWAS became popular to discover novel multigenic functional associations. Still, to reveal the unaccounted 85 to 90% of T2D variation, which lies hidden in GWAS datasets, new post-GWAS strategies need to be developed. In this respect, here we reanalyze three metaanalysis data of GWAS in T2D, using the methodology that we have developed to identify disease-associated pathways by combining nominally significant evidence of genetic association with the known biochemical pathways, protein-protein interaction (PPI) networks, and the functional information of selected SNPs. In this research effort, to enlighten the molecular mechanisms underlying T2D development and progress, we integrated different in silico approaches that proceed in top-down manner and bottom-up manner, and presented a comprehensive analysis at protein subnetwork, pathway, and pathway subnetwork levels. Using the mutual information based on the shared genes, the identified protein subnetworks and the affected pathways of each dataset were compared. While most of the identified pathways recapitulate the pathophysiology of T2D, our results show that incorporating SNP functional properties, PPI networks into GWAS can dissect leading molecular pathways, and it could offer improvement over traditional enrichment strategies.
  • Article
    Citation - Scopus: 4
    CCPred: Global and Population-Specific Colorectal Cancer Prediction and Metagenomic Biomarker Identification at Different Molecular Levels Using Machine Learning Techniques
    (Elsevier Ltd, 2024-11) Bakir-Güngör, Burcu; Temiz, Mustafa; Inal, Yasin; Cicekyurt, Emre; Yousef, Malik
    Colorectal cancer (CRC) ranks as the third most common cancer globally and the second leading cause of cancer-related deaths. Recent research highlights the pivotal role of the gut microbiota in CRC development and progression. Understanding the complex interplay between disease development and metagenomic data is essential for CRC diagnosis and treatment. Current computational models employ machine learning to identify metagenomic biomarkers associated with CRC, yet there is a need to improve their accuracy through a holistic biological knowledge perspective. This study aims to evaluate CRC-associated metagenomic data at species, enzymes, and pathway levels via conducting global and population-specific analyses. These analyses utilize relative abundance values from human gut microbiome sequencing data and robust classification models are built for disease prediction and biomarker identification. For global CRC prediction and biomarker identification, the features that are identified by SelectKBest (SKB), Information Gain (IG), and Extreme Gradient Boosting (XGBoost) methods are combined. Population-based analysis includes within-population, leave-one-dataset-out (LODO) and cross-population approaches. Four classification algorithms are employed for CRC classification. Random Forest achieved an AUC of 0.83 for species data, 0.78 for enzyme data and 0.76 for pathway data globally. On the global scale, potential taxonomic biomarkers include ruthenibacterium lactatiformanas; enzyme biomarkers include RNA 2′ 3′ cyclic 3′ phosphodiesterase; and pathway biomarkers include pyruvate fermentation to acetone pathway. This study underscores the potential of machine learning models trained on metagenomic data for improved disease prediction and biomarker discovery. The proposed model and associated files are available at https://github.com/TemizMus/CCPRED. © 2024 Elsevier B.V., All rights reserved.