PubMed İndeksli Yayınlar Koleksiyonu

Permanent URI for this collectionhttps://hdl.handle.net/20.500.12573/397

Browse

Search Results

Now showing 1 - 2 of 2
  • Article
    Citation - WoS: 35
    Citation - Scopus: 42
    Inflammatory Bowel Disease Biomarkers of Human Gut Microbiota Selected via Different Feature Selection Methods
    (PeerJ Inc, 2022-04-25) Bakir-Gungor, Burcu; Lar, Hilal Hac; Jabeer, Amhar; Nalbantoglu, Ozkan Ufuk; Aran, Oya; Yousef, Malik; Hacilar, Hilal
    The tremendous boost in next generation sequencing and in the "omics" technologies makes it possible to characterize the human gut microbiome-the collective genomes of the microbial community that reside in our gastrointestinal tract. Although some of these microorganisms are considered to be essential regulators of our immune system, the alteration of the complexity and eubiotic state of microbiota might promote autoimmune and inflammatory disorders such as diabetes, rheumatoid arthritis, Inflammatory bowel diseases (IBD), obesity, and carcinogenesis. IBD, comprising Crohn's disease and ulcerative colitis, is a gut-related, multifactorial disease with an unknown etiology. IBD presents defects in the detection and control of the gut microbiota, associated with unbalanced immune reactions, genetic mutations that confer susceptibility to the disease, and complex environmental conditions such as westernized lifestyle. Although some existing studies attempt to unveil the composition and functional capacity of the gut microbiome in relation to IBD diseases, a comprehensive picture of the gut microbiome in IBD patients is far from being complete. Due to the complexity of metagenomic studies, the applications of the state-of-the-art machine learning techniques became popular to address a wide range of questions in the field of metagenomic data analysis. In this regard, using IBD associated metagenomics dataset, this study utilizes both supervised and unsupervised machine learning algorithms, (i) to generate a classification model that aids IBD diagnosis, (ii) to discover IBD-associated biomarkers, (iii) to discover subgroups of IBD patients using k-means and hierarchical clustering approaches. To deal with the high dimensionality of features, we applied robust feature selection algorithms such as Conditional Mutual Information Maximization (CMIM), Fast Correlation Based Filter (FCBF), min redundancy max relevance (mRMR), Select K Best (SKB), Information Gain (IG) and Extreme Gradient Boosting (XGBoost). In our experiments with 100-fold Monte Carlo cross-validation (MCCV), XGBoost, IG, and SKB methods showed a considerable effect in terms of minimizing the microbiota used for the diagnosis of IBD and thus reducing the cost and time. We observed that compared to Decision Tree, Support Vector Machine, Logitboost, Adaboost, and stacking ensemble classifiers, our Random Forest classifier resulted in better performance measures for the classification of IBD. Our findings revealed potential microbiome-mediated mechanisms of IBD and these findings might be useful for the development of microbiome-based diagnostics.
  • Article
    Citation - WoS: 16
    Citation - Scopus: 21
    GeNetOntology: Identifying Affected Gene Ontology Terms via Grouping, Scoring, and Modeling of Gene Expression Data Utilizing Biological Knowledge-Based Machine Learning
    (Frontiers Media S.A., 2023-08-21) Ersoz, Nur Sebnem; Bakir-Gungor, Burcu; Yousef, Malik
    Introduction: Identifying significant sets of genes that are up/downregulated under specific conditions is vital to understand disease development mechanisms at the molecular level. Along this line, in order to analyze transcriptomic data, several computational feature selection (i.e., gene selection) methods have been proposed. On the other hand, uncovering the core functions of the selected genes provides a deep understanding of diseases. In order to address this problem, biological domain knowledge-based feature selection methods have been proposed. Unlike computational gene selection approaches, these domain knowledge-based methods take the underlying biology into account and integrate knowledge from external biological resources. Gene Ontology (GO) is one such biological resource that provides ontology terms for defining the molecular function, cellular component, and biological process of the gene product.Methods: In this study, we developed a tool named GeNetOntology which performs GO-based feature selection for gene expression data analysis. In the proposed approach, the process of Grouping, Scoring, and Modeling (G-S-M) is used to identify significant GO terms. GO information has been used as the grouping information, which has been embedded into a machine learning (ML) algorithm to select informative ontology terms. The genes annotated with the selected ontology terms have been used in the training part to carry out the classification task of the ML model. The output is an important set of ontologies for the two-class classification task applied to gene expression data for a given phenotype.Results: Our approach has been tested on 11 different gene expression datasets, and the results showed that GeNetOntology successfully identified important disease-related ontology terms to be used in the classification model.Discussion: GeNetOntology will assist geneticists and scientists to identify a range of disease-related genes and ontologies in transcriptomic data analysis, and it will also help doctors design diagnosis platforms and improve patient treatment plans.