WoS Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/394

Now showing 1 - 7 of 7
  • Article
    G-S a Prior Biological Knowledge-Based Pattern Detection and Enrichment Framework for Multi-Omics Data Integration
    (MDPI, 2025-11-29) Unlu Yazici, Miray; Bakir-Gungor, Burcu; Yousef, Malik
    The rapid advancements in high-throughput technologies have led to a dramatic increase in diverse -omics data types, enabling comprehensive analyses, especially for complex diseases like cancer. Despite the development of multi-omics approaches, the challenges of scaling integration to massive, heterogeneous -omics datasets suggest that novel computational tools need to be designed. In this study, we propose an approach for integrating microRNA (miRNA) and messenger RNA (mRNA) expression data, incorporating prior biological knowledge (PBK). This approach scores and ranks groups of miRNAs and their associated genes using cross-validation iterations. The proposed method incorporates a Pattern detection (P) component to identify molecular motifs unique to each biological group. The analysis also supports visualization of the groups, which helps identify co-occurring groups and their characteristic features across iterations. Furthermore, the groups are scored using an over-representation analysis through a new Enrichment (E) component in each iteration. The clusters of the groups based on the Enrichment Scores (ESs) are visualized in a heatmap to obtain novel insights into the collective behavior and dependencies of the groups, aiming to understand the molecular mechanisms of complex diseases. The developed G-S-M-E tool not only provides performance metrics and biological scores at the group level but also offers comprehensive insights into intricate multi-omics interactions. In summary, our study emphasizes the importance of mathematical and data science methodologies in elucidating intricate multi-omics integration, yielding a formalized approach that deepens our comprehension of complex diseases.
  • Article
    Citation - WoS: 1
    A Comprehensive Analysis of Acoustic Emission Signals To Distinguish the Different Damage Types for Fiber-Reinforced Polymers: A Review
    (Wiley, 2025-12-03) Yilmaz, Cagatay
    Fiber-reinforced polymers (FRP) attract the attention of key industries, such as aerospace, wind energy, and automotive, as they can reduce the weight of structural components without compromising their mechanical properties. Due to FRP's anisotropic and non-homogeneous structure, their failure under different loading conditions and the corresponding failure mechanisms must be investigated. One method that progressively monitors the failure of FRP under load is Acoustic Emission (AE). AE can register the elastic stress waves in the form of digitized waveforms, released by the discontinuous events that occur in the FRP under load. These discontinuities can be clustered and identified as transverse cracking, fiber/matrix interface debonding, delamination, and fiber failure by analyzing the AE waveforms. Recently, numerous clustering approaches using machine learning algorithms, along with the varying features of AE waveforms, have been developed and are being used. These algorithms include supervised and unsupervised clustering, deep learning algorithms, and neural network methods, among others. While supervised algorithms require a training dataset to classify AE signals, unsupervised algorithms can perform clustering without training datasets. Deep learning and neural network algorithms can train themselves to cluster data, but they may require a significant amount of computer power when the dataset is large. This review paper provides comprehensive information on the clustering algorithms, along with the AE wave features, the range of features for different damage types, and the type of reinforcer.
  • Article
    Forecasting the Consumer Price Index in Türkiye Using Machine Learning Models: A Comparative Analysis
    (Gazi Univ, 2025-09-01) Söylemez, İsmet; Ünlü, Ramazan; Nalici, Mehmet Eren
    This study utilizes machine learning models to forecast Türkiye's Consumer Price Index (CPI), thereby addressing a critical gap in inflation prediction methodologies. The central research problem involves the forecasting of CPI in a volatile economic environment, which is essential for informed policymaking. The primary objective of this study is to evaluate the performance of three machine learning models, namely Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM), in forecasting CPI over periods ranging from one to six months, utilizing data from 2012 to 2024. The study's unique contribution lies in the application of the "SelectKBest" method, which identifies the most relevant indices, thereby enhancing the efficiency of the models. An ensemble method, Averaging Voting, is also employed to combine the strengths of these models, producing more accurate and robust predictions. The findings indicate that while the RF model consistently generates the most accurate forecasts across all shifts, the SVM model demonstrates a particular strength in the domain of short-term predictions. The ensemble model demonstrates a substantial performance improvement, with an R² value of 0.962 for one-month-ahead forecasts and 0.956 for five-month-ahead forecasts. This combined approach has been shown to outperform individual models, offering a more reliable framework for CPI forecasting. The findings offer valuable insights for economic policymakers, enabling more precise and stable inflation predictions in Türkiye.
  • Conference Object
    Enhancing Complex Disease Group Scoring with Mirgedinet: A Multi-Algorithm Machine Learning Framework Based on the GSM Approach
    (IEEE, 2025-06-25) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
    Integrating biological prior knowledge for disease gene associations has shown significant promise in discovering new biomarkers with potential translational applications. This work investigates the application of a multi-algorithm machine learning framework based on the Grouping-Scoring-Modeling (G-S-M) approach for improving the prediction of complex diseases. The study identifies the primary gene and miRNA interactions in various complex diseases with the help of miRGediNET, which is a machine learning-based tool that integrates data from three biological databases. Whereas traditional methods treat features as independent, the G-S-M method focuses on aggregating genes based on biological interactions, scoring the gene groups for a disease, and modeling its predictive capability using advanced machine learning algorithms. In this research paper, seven algorithms, including Support Vector Machine, Decision Tree, and CatBoost, were applied to eight datasets extracted from the GEO database. This framework proved very robust in ranking gene clusters, and thus in predicting critical biomarkers, under 100-fold randomized cross-validation. The results indicate this approach's high potential for refining disease prediction and for supporting research on choosing the algorithm best able to provide biological insights and computational advances.
  • Conference Object
    Exploring Microbiome Signatures in Autism Spectrum Disorder via Grouping-Scoring Based Machine Learning
    (IEEE, 2025-06-25) Temiz, Mustafa; Ersoz, Nur Sebnem; Yousef, Malik; Bakir-Gungor, Burcu
    The rapid increase in omic data production has increased the importance of machine learning (ML) methods to analyze these data. In particular, the use of metagenomic data in the diagnosis, prognosis and treatment of diseases is becoming widespread. Autism Spectrum Disorder (ASD) is a neurodevelopmental disease that occurs in early childhood and continues lifelong. The aim of this study is to increase ML performance, reduce computational costs and achieve successful classification performance using a small number of metagenomic features. In addition, disease prediction is performed and ASD-associated biomarkers are determined using the microBiomeGSM on metagenomic data. Classification is performed at three different taxonomic levels (genus, family and order) using the relative abundance values of species. The best performance metric (0.95 AUC) was obtained at the order taxonomic level using an average of 416 features with microBiomeGSM. The identified ASD-related taxonomic species are presented.
  • Conference Object
    TextNetTopics+: Enhancing Text Classification Through Classifier Diversity and Model Ensembling
    (Springer International Publishing AG, 2025) Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik
    TextNetTopics is an innovative text classification framework that integrates topic modeling with feature selection to improve model accuracy and interpretability. Unlike traditional methods that rely on individual words, TextNetTopics selects cohesive topics extracted via Latent Dirichlet Allocation as features for document representation, effectively reducing dimensionality while preserving the semantic structure of the text. This study evaluates the performance of TextNetTopics utilizing multiple machine learning algorithms in the M (Modeling) component, including Random Forest, Support Vector Machine, Gradient Boosting, eXtreme Gradient Boosting, and Logistic Regression. To further enhance classification performance, we introduce TextNetTopics+, an ensemble-based extension that leverages both hard voting and soft voting mechanisms to combine the strengths of multiple classifiers. Comprehensive experiments on the LitCovid and WOS datasets demonstrate that ensemble learning in TextNetTopics+ significantly outperforms individual classifiers in TextNetTopics, confirming its effectiveness in improving model robustness and generalization.
  • Conference Object
    Leveraging MicroRNA-Gene Associations With Mirgedinet: An Intelligent Approach for Enhanced Classification of Breast Cancer Molecular Subtypes
    (Springer International Publishing AG, 2025) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
    Understanding the molecular subtypes of breast cancer is crucial for advancing targeted therapies and precision medicine. For the BRCA molecular subtype prediction problem, this study employs miRGediNET, a machine learning approach that integrates data from miRTarBase, DisGeNET, and HMDD databases to investigate shared gene associations between microRNA (miRNA) activity and disease mechanisms. Using the BRCA LumAB_Her2Basal dataset, we evaluate miRGediNET's performance against traditional feature selection methods, including CMIM, mRmR, Information Gain (IG), SelectKBest (SKB), Fast Correlation-Based Filter (FCBF), and XGBoost (XGB). These feature selection techniques were assessed using various classification algorithms including Random Forest (RF), Support Vector Machine (SVM), LogitBoost, Decision Tree, and AdaBoost, all executed with default parameters. The feature selection methods were tested using Monte Carlo Cross-Validation, where performance metrics obtained for each iteration were averaged to ensure robustness. Our findings reveal that miRGediNET outperforms traditional methods in accuracy and Area Under the Curve (AUC), emphasizing its superior capability to identify key genes that bridge miRNA interactions and breast cancer mechanisms. Notably, both miRGediNET and Information Gain (IG) feature selection consistently identified ESR1, a critical biomarker frequently reported in recent research associated with breast cancer prognosis and resistance to endocrine therapies. This integrative approach provides deeper biological insights into miRNA-disease interactions, paving the way for enhanced patient stratification, biomarker discovery, and personalized medicine strategies. The miRGediNET tool, developed on the KNIME platform, offers a practical resource for further exploration in the field of bioinformatics and oncology.
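
The last abstract above mentions Monte Carlo Cross-Validation, in which the performance metrics from each iteration are averaged. As a rough illustration of that general idea only, the following Python sketch repeats random train/test splits and averages the accuracy. It is not the authors' KNIME workflow: the nearest-centroid classifier is a stand-in chosen to keep the example dependency-free, and every name, the toy data, the split fraction, and the iteration count are assumptions made for illustration.

```python
import random
from statistics import mean


def nearest_centroid_fit(X, y):
    """Compute one centroid (per-feature mean) for each class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def monte_carlo_cv(X, y, n_iterations=100, test_fraction=0.1, seed=0):
    """Repeat random train/test splits and average the test accuracy.

    Unlike k-fold cross-validation, each iteration draws a fresh random
    split, so test sets from different iterations may overlap.
    """
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, int(len(X) * test_fraction))
    scores = []
    for _ in range(n_iterations):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = nearest_centroid_fit(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        correct = sum(
            nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx
        )
        scores.append(correct / n_test)
    return mean(scores), scores


# Toy data (an assumption, not from any of the listed studies):
# two well-separated classes with 20 samples each.
X = [[i * 0.1, i * 0.1] for i in range(20)] + [
    [10 + i * 0.1, 10 + i * 0.1] for i in range(20)
]
y = [0] * 20 + [1] * 20

avg, scores = monte_carlo_cv(X, y, n_iterations=50, test_fraction=0.25)
print(f"mean accuracy over {len(scores)} random splits: {avg:.3f}")
```

Because every iteration resamples the split independently, the number of iterations and the test fraction can be chosen separately, which is the main practical difference from k-fold cross-validation.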