WoS-Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/394

Search Results

Now showing 1 - 10 of 15
  • Conference Object
    Enhancing Complex Disease Group Scoring with miRGediNET: A Multi-Algorithm Machine Learning Framework Based on the GSM Approach
    (IEEE, 2025-06-25) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
    Integrating biological prior knowledge for disease gene associations has shown significant promise in discovering new biomarkers with potential translational applications. This work investigates the application of a multi-algorithm machine learning framework based on the Grouping-Scoring-Modeling (G-S-M) approach for improving the prediction of complex diseases. The study identifies the primary gene and miRNA interactions in various complex diseases with the help of miRGediNET, which is a machine-learning based tool that integrates data from three biological databases. Traditional methods have only focused on independence between features; the G-S-M method focuses on aggregating genes based on biological interactions, pinpointing the scoring of gene groups for a disease, and modeling its predictive capability using advanced machine learning algorithms. In this research paper, seven algorithms, including Support Vector Machine, Decision Tree, and CatBoost, were applied to eight datasets extracted from the GEO database. This framework proved very robust in ranking gene clusters, thus predicting critical biomarkers while doing 100-fold randomized cross-validation within the evaluation. The results indicate this approach's high potential for refining disease and supporting research for choosing the best algorithm that can provide biological insights and computational advances.
  • Article
    Citation - WoS: 6
    Citation - Scopus: 7
    The Determination of Distinctive Single Nucleotide Polymorphism Sets for the Diagnosis of Behcet's Disease
    (IEEE Computer Soc, 2022-05-01) Isik, Yunus Emre; Gormez, Yasin; Aydin, Zafer; Bakir-Gungor, Burcu
    Behcet's Disease (BD) is a multi-system inflammatory disorder in which the etiology remains unclear. The most probable hypothesis is that genetic tendency and environmental factors play roles in the development of BD. In order to find the essential reasons, genetic changes on thousands of genes should be analyzed. Besides, there is a need for extra analysis to find out which genetic factor affects the disease. Machine learning approaches have high potential for extracting the knowledge from genomics and selecting the representative Single Nucleotide Polymorphisms (SNPs) as the most effective features for the clinical diagnosis process. In this study, we have attempted to identify representative SNPs using feature selection methods, incorporating biological information and aimed to develop a machine-learning model for diagnosing Behcet's disease. By combining biological information and machine learning classifiers, up to 99.64 percent accuracy of disease prediction is achieved using only 13,611 out of 311,459 SNPs. In addition, we revealed the SNPs that are most distinctive by performing repeated feature selection in cross-validation experiments.
  • Conference Object
    TextNetTopics+: Enhancing Text Classification Through Classifier Diversity and Model Ensembling
    (Springer International Publishing AG, 2025) Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik
    TextNetTopics is an innovative text classification framework that integrates topic modeling with feature selection to improve model accuracy and interpretability. Unlike traditional methods that rely on individual words, TextNetTopics selects cohesive topics extracted via Latent Dirichlet Allocation as features for document representation, effectively reducing dimensionality while preserving the semantic structure of the text. This study evaluates the performance of TextNetTopics utilizing multiple machine learning algorithms in the M (Modeling) component, including Random Forest, Support Vector Machine, Gradient Boosting, eXtreme Gradient Boosting, and Logistic Regression. To further enhance classification performance, we introduce TextNetTopics+, an ensemble-based extension that leverages both hard voting and soft voting mechanisms to combine the strengths of multiple classifiers. Comprehensive experiments on the LitCovid and WOS datasets demonstrate that ensemble learning in TextNetTopics+ significantly outperforms individual classifiers in TextNetTopics, confirming its effectiveness in improving model robustness and generalization.
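The difference between the hard and soft voting mentioned in this abstract can be illustrated with a minimal sketch. It is not the TextNetTopics+ implementation: the function names, the two labels, and the probability values are all hypothetical, chosen only to show a case where the two schemes disagree.

```python
from collections import Counter

def hard_vote(predictions):
    """Return the majority label among per-classifier predictions.

    Ties are broken in favour of the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

def soft_vote(probabilities):
    """Average per-classifier class probabilities and return the top label.

    `probabilities` is a list of dicts mapping label -> probability.
    """
    labels = probabilities[0].keys()
    averaged = {
        label: sum(p[label] for p in probabilities) / len(probabilities)
        for label in labels
    }
    return max(averaged, key=averaged.get)

# Three hypothetical classifiers scoring one document (made-up numbers).
probs = [
    {"covid": 0.45, "other": 0.55},
    {"covid": 0.40, "other": 0.60},
    {"covid": 0.95, "other": 0.05},
]
hard_labels = [max(p, key=p.get) for p in probs]

print(hard_vote(hard_labels))  # other (two of three classifiers pick it)
print(soft_vote(probs))        # covid (average probability 0.6 vs 0.4)
```

Hard voting counts only each classifier's top label, while soft voting lets one confident classifier outweigh two uncertain ones, which is why the two calls return different labels here.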
  • Conference Object
    Citation - WoS: 16
    Citation - Scopus: 20
    Machine Learning Analysis of Inflammatory Bowel Disease-Associated Metagenomics Dataset
    (Institute of Electrical and Electronics Engineers Inc., 2018-09) Hacilar, Hilal; Nalbantoğlu, Özkan Ufuk; Bakir-Güngör, Burcu
    There is an ongoing interplay between humans and our microbial communities. The microorganisms living in our gut produce energy from our food, strengthen our immune system, break down foreign products, and release metabolites and hormones, which are significant for regulating our physiology. Shifts away from this 'healthy' gut microbiome are considered to be associated with many diseases. Inflammatory bowel diseases (IBD), including Crohn's disease and ulcerative colitis, are gut-related disorders affecting the intestinal tract. Although some metagenomics studies have recently been conducted on IBD, our current understanding of the precise relationships between the human gut microbiome and IBD remains limited. In this regard, the use of state-of-the-art machine learning approaches has become popular for addressing a variety of questions, such as the early diagnosis of certain diseases using human microbiota. In this study, we investigate which subset of gut microbiota is most associated with IBD and whether disease-associated biomarkers can be detected by applying state-of-the-art machine learning algorithms and proper feature selection methods.
  • Article
    Citation - WoS: 29
    Citation - Scopus: 32
    Liver Fibrosis Staging Using CT Image Texture Analysis and Soft Computing
    (Elsevier, 2014-12) Kayaalti, Omer; Aksebzeci, Bekir Hakan; Karahan, Ibrahim Okkes; Deniz, Kemal; Ozturk, Mehmet; Yilmaz, Bulent; Asyali, Musa Hakan
    Liver biopsy is considered to be the gold standard for analyzing chronic hepatitis and fibrosis; however, it is an invasive and expensive approach, which is also difficult to standardize. Medical imaging techniques such as ultrasonography, computed tomography (CT), and magnetic resonance imaging are non-invasive and helpful methods to interpret liver texture, and may be good alternatives to needle biopsy. Recently, instead of visual inspection of these images, computer-aided image analysis based approaches have become more popular. In this study, a non-invasive, low-cost and relatively accurate method was developed to determine liver fibrosis stage by analyzing some texture features of liver CT images. In this approach, some suitable regions of interest were selected on CT images and a comprehensive set of texture features was obtained from these regions using different methods, such as Gray Level Co-occurrence Matrix (GLCM), Laws' method, Discrete Wavelet Transform (DWT), and Gabor filters. Afterwards, sequential floating forward selection and exhaustive search methods were used in various combinations for the selection of the most discriminating features. Finally, the selected texture features were classified using two methods, namely, Support Vector Machines (SVM) and k-nearest neighbors (k-NN). The mean classification accuracy in pairwise group comparisons was approximately 95% for both classification methods using only 5 features. Also, the performance of our approach in classifying the liver fibrosis stage of subjects in the test set into 7 possible stages was investigated. In this case, both SVM and k-NN methods returned relatively low classification accuracies. Our pairwise group classification results showed that DWT, Gabor, GLCM, and Laws' texture features were more successful than the others; as such, features extracted from these methods were used in the feature fusion process. Fusing features from these better performing families further improved the classification performance. The results show that our approach can be used as a decision support system, especially in pairwise fibrosis stage comparisons. (C) 2014 Elsevier B.V. All rights reserved.
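As a rough illustration of one texture family named in this abstract, the sketch below builds a gray-level co-occurrence matrix for a tiny image and derives the contrast feature from it. The 3x3 image, the three gray levels, the horizontal one-pixel offset, and the non-symmetric counting are assumptions made for the example; the abstract does not specify the study's actual GLCM settings.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    The matrix is not symmetrised: only the forward offset is counted.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """GLCM contrast: sum of (i - j)^2 weighted by normalised pair counts."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total

# Hypothetical 3x3 region of interest with gray levels 0..2.
roi = [
    [0, 0, 1],
    [1, 2, 2],
    [2, 2, 0],
]
m = glcm(roi, levels=3)
print(m)            # [[1, 1, 0], [0, 0, 1], [1, 0, 2]]
print(contrast(m))  # 1.0
```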
  • Conference Object
    Leveraging MicroRNA-Gene Associations With miRGediNET: An Intelligent Approach for Enhanced Classification of Breast Cancer Molecular Subtypes
    (Springer International Publishing AG, 2025) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
    Understanding the molecular subtypes of breast cancer is crucial for advancing targeted therapies and precision medicine. For the BRCA molecular subtype prediction problem, this study employs miRGediNET, a machine-learning approach that integrates data from miRTarBase, DisGeNET, and HMDD databases to investigate shared gene associations between microRNA (miRNA) activity and disease mechanisms. Using the BRCA LumAB_Her2Basal dataset, we evaluate miRGediNET's performance against traditional feature selection methods, including CMIM, mRmR, Information Gain (IG), SelectKBest (SKB), Fast Correlation-Based Filter (FCBF), and XGBoost (XGB). These feature selection techniques were assessed using various classification algorithms including Random Forest (RF), Support Vector Machine (SVM), LogitBoost, Decision Tree, and AdaBoost, all executed with default parameters. The feature selection methods were tested using Monte Carlo Cross-Validation, where performance metrics obtained for each iteration were averaged to ensure robustness. Our findings reveal that miRGediNET outperforms traditional methods in accuracy and Area Under the Curve (AUC), emphasizing its superior capability to identify key genes that bridge miRNA interactions and breast cancer mechanisms. Notably, both miRGediNET and Information Gain (IG) feature selection consistently identified ESR1, a critical biomarker frequently reported in recent research associated with breast cancer prognosis and resistance to endocrine therapies. This integrative approach provides deeper biological insights into miRNA-disease interactions, paving the way for enhanced patient stratification, biomarker discovery, and personalized medicine strategies. The miRGediNET tool, developed on the KNIME platform, offers a practical resource for further exploration in the field of bioinformatics and oncology.
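Monte Carlo cross-validation, as described in this abstract, repeatedly draws a random train/test split and averages the metric over the iterations. The sketch below shows that loop under stated assumptions: the 25% test fraction, the iteration count, the toy labels, and the majority-class baseline scorer are all hypothetical and do not come from the study, which ran its workflow on KNIME.

```python
import random

def monte_carlo_splits(n_samples, test_fraction=0.1, iterations=100, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(iterations):
        shuffled = indices[:]
        rng.shuffle(shuffled)  # a fresh random permutation per iteration
        yield shuffled[n_test:], shuffled[:n_test]

def mean_score(score_fn, n_samples, **split_options):
    """Average score_fn(train, test) over all Monte Carlo iterations."""
    scores = [
        score_fn(train, test)
        for train, test in monte_carlo_splits(n_samples, **split_options)
    ]
    return sum(scores) / len(scores)

# Hypothetical binary labels for 20 samples.
labels = [0] * 12 + [1] * 8

def majority_baseline(train, test):
    """Accuracy of always predicting the most frequent training label."""
    train_labels = [labels[i] for i in train]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test) / len(test)

average = mean_score(
    majority_baseline, len(labels), test_fraction=0.25, iterations=100, seed=42
)
print(round(average, 3))
```

Unlike k-fold cross-validation, the test sets drawn this way may overlap between iterations, so the number of iterations can be chosen independently of the split ratio.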
  • Article
    Citation - WoS: 5
    Citation - Scopus: 4
    Investigating Strain Rate Effects on Damage Mechanisms in Hybrid Laminated Composites Using Acoustic Emission
    (Elsevier Sci Ltd, 2025-12) Gulsen, Abdulkadir; Kolukisa, Burak; Etcil, Mustafa; Caliskan, Umut; Zafar, Hafiz Muhammad Numan; Demirbas, Munise Didem; Bakir-Gungor, Burcu
    Hybrid composites, which combine distinct fiber types such as carbon, basalt, and aramid, provide a synergistic balance of strength, stiffness, impact resistance, and energy dissipation, making them appealing for critical applications in aerospace, automotive, and other high-performance industries. Monitoring damage progression in these composites is vital for ensuring structural integrity and preventing catastrophic failures. Acoustic emission (AE) serves as a powerful, noninvasive technique for real-time structural health monitoring, capturing the transient stress waves generated when damage events occur. This study utilizes AE to examine the influence of strain rate on damage modes in carbon/basalt/aramid hybrid composites under three-point bending. An unsupervised feature selection based on Laplacian scores is employed to identify the most relevant AE features with damage modes, while SHapley Additive Explanations (SHAP) are used to evaluate the correlation between AE features and strain rates. The correlation analysis results indicate that peak frequency (PF) serves as a key indicator, demonstrating significant shifts at higher strain rates. Gaussian Mixture Model (GMM) clustering is used to analyze hybrid composites by examining clustered AE signals based on selected features identified through Laplacian scores, with Silhouette scores employed to determine the optimal number of clusters. This study highlights the role of AE in understanding fiber interactions and damage evolution, offering valuable insights into the mechanical performance and optimization of carbon/basalt/aramid hybrid composite structures.
  • Article
    Handling Incomplete Data Classification Using Imputed Feature Selected Bagging (IFBAG) Method
    (Ios Press, 2021-07-09) Khan, Ahmad Jaffar; Raza, Basit; Shahid, Ahmad Raza; Kumar, Yogan Jaya; Faheem, Muhammad; Alquhayz, Hani
    Almost all real-world datasets contain missing values. Classification of data with missing values can adversely affect the performance of a classifier if not handled correctly. A common approach used for classification with incomplete data is imputation. Imputation transforms incomplete data with missing values into complete data. Single imputation methods are mostly less accurate than multiple imputation methods, which are often computationally much more expensive. This study proposes an imputed feature selected bagging (IFBag) method which uses multiple imputation, feature selection and a bagging ensemble learning approach to construct a number of base classifiers to classify new incomplete instances without any need for imputation in the testing phase. In the bagging ensemble learning approach, data is resampled multiple times with replacement, which can lead to diversity in the data, thus resulting in more accurate classifiers. The experimental results show that the proposed IFBag method is considerably fast and gives 97.26% accuracy for classification with incomplete data as compared to common methods used.
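The bagging step mentioned in this abstract, resampling the training data with replacement and combining the resulting base classifiers by vote, can be sketched as follows. This covers only generic bagging, not the IFBag method: imputation and feature selection are omitted, and the nearest-mean base learner, the one-feature toy data, and all function names are assumptions for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]

def fit_nearest_mean(sample):
    """Toy base learner: predict the label whose mean feature value is closest."""
    grouped = {}
    for value, label in sample:
        grouped.setdefault(label, []).append(value)
    means = {label: sum(v) / len(v) for label, v in grouped.items()}
    return lambda x: min(means, key=lambda label: abs(means[label] - x))

def bagging_predict(train, x, n_estimators=25, seed=0):
    """Train one base learner per bootstrap sample and take a majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_estimators):
        model = fit_nearest_mean(bootstrap_sample(train, rng))
        votes[model(x)] += 1
    return votes.most_common(1)[0][0]

# Hypothetical (feature value, label) pairs.
train = [(1.0, "a"), (1.2, "a"), (0.8, "a"), (3.0, "b"), (3.2, "b"), (2.9, "b")]
print(bagging_predict(train, 1.1))
print(bagging_predict(train, 3.1))
```

Because each bootstrap sample omits some training points and repeats others, the base learners differ slightly from one another, which is the source of the diversity the abstract refers to.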
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 1
    Feature Selection for Protein Dihedral Angle Prediction
    (IEEE, 2017) Aydin, Zafer; Kaynar, Oguz; Gormez, Yasin
    Three-dimensional structure prediction has crucial importance for bioinformatics and theoretical chemistry. One of the main steps of three-dimensional structure prediction is dihedral (torsion) angle prediction. As new feature extraction methods are developed, the dimension of the input space increases considerably, yielding longer model training and less accurate models due to noisy or redundant features. In this study, feature selection is employed for dimensionality reduction on one of the established benchmarks of protein 1D structure prediction. Experimental results show that feature selection improves the accuracy of protein dihedral angle class prediction by 2% and can eliminate up to 82% of the features when a random forest classifier is used. Accurate prediction of dihedral angles will eventually contribute to protein structure prediction.
  • Conference Object
    Citation - WoS: 22
    Citation - Scopus: 52
    Evaluation of Classification Algorithms, Linear Discriminant Analysis and a New Hybrid Feature Selection Methodology for the Diagnosis of Coronary Artery Disease
    (Institute of Electrical and Electronics Engineers Inc., 2018-12) Kolukisa, Burak; Hacilar, Hilal; Göy, Gökhan; Kus, Mustafa; Bakir-Güngör, Burcu; Aral, Atilla; Güngör, Vehbi Çağrı
    According to the World Health Organization (WHO), 31% of the world's total deaths in 2016 (17.9 million) were due to cardiovascular diseases (CVD). With the development of information technologies, it has become possible to predict whether people have heart diseases or not by checking certain physical and biochemical values at a lower cost. In this study, we have evaluated a set of different classification algorithms and linear discriminant analysis, and proposed a new hybrid feature selection methodology for the diagnosis of coronary heart diseases (CHD). Throughout this research effort, using three publicly available Heart Disease diagnosis datasets (UCI Machine Learning Repository), we have conducted comparative performance evaluations in terms of accuracy, sensitivity, specificity, F-measure, AUC and running time.
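Four of the evaluation measures listed in this abstract follow directly from the confusion-matrix counts of a binary classifier. The sketch below computes them under stated assumptions: the function name and the ten true/predicted labels are made up for the example, and AUC and running time are left out because they need ranked scores and timing rather than hard predictions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and F-measure.

    Ratios with an empty denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    denominator = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denominator if denominator else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }

# Hypothetical diagnoses: 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
```

With these labels there are 3 true positives, 1 false negative, 4 true negatives and 2 false positives, so sensitivity is 3/4 and specificity is 4/6.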