Scopus-Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Citation - Scopus: 4
RCE-IFE: Recursive Cluster Elimination with Intra-Cluster Feature Elimination
(PeerJ Inc., 2025-02-07) Kuzudisli, Cihan; Bakir-Gungor, Burcu; Qaqish, Bahjat; Yousef, Malik
Citation - Scopus: 7
Network Anomaly Detection Using Deep Autoencoder and Parallel Artificial Bee Colony Algorithm-Trained Neural Network
(PeerJ Inc., 2024-10-08) Dedeturk, Bilge Kagan; Bakir-Gungor, Burcu; Hacılar, Hilal; Gungor, Vehbi Cagri
Citation - WoS: 2
Citation - Scopus: 4
RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination
(PeerJ Inc, 2025-02-07) Kuzudisli, Cihan; Bakir-Gungor, Burcu; Qaqish, Bahjat; Yousef, Malik
The computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method that utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, microRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 among tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively. RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively, when averaged over seven different metagenomics datasets, with RCE-IFE significantly reducing feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets. The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the selected features by RCE-IFE is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency.
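The abstract above describes iterating grouping and elimination steps. The following is only a minimal sketch of that general idea, not the authors' implementation: the scoring function (gap between class means), the placeholder grouping (consecutive chunks instead of real clustering), and all names such as rce_ife_sketch are assumptions made for illustration.

```python
from statistics import mean


def class_gap(column, labels):
    """Toy relevance score (assumption): gap between the two class means."""
    pos = [v for v, y in zip(column, labels) if y == 1]
    neg = [v for v, y in zip(column, labels) if y == 0]
    return abs(mean(pos) - mean(neg))


def chunk_groups(features, size):
    """Placeholder grouping (assumption): consecutive chunks, not clustering."""
    return [features[i:i + size] for i in range(0, len(features), size)]


def rce_ife_sketch(data, labels, target, group_size=2):
    """Repeat grouping and elimination until at most `target` features remain.

    `data` maps a feature name to its list of values, one per sample.
    """
    target = max(target, 1)  # always keep at least one feature
    features = sorted(data)
    while len(features) > target:
        scores = {f: class_gap(data[f], labels) for f in features}
        groups = chunk_groups(features, group_size)
        # Cluster elimination: drop the group with the lowest mean score.
        if len(groups) > 1:
            groups.sort(key=lambda g: mean([scores[f] for f in g]), reverse=True)
            groups = groups[:-1]
        # Intra-cluster elimination: drop the weakest feature of every
        # surviving group that still has more than one member.
        survivors = []
        for group in groups:
            ranked = sorted(group, key=scores.get, reverse=True)
            survivors.extend(ranked[:-1] if len(ranked) > 1 else ranked)
        features = sorted(survivors)
    return features


# Hypothetical toy data: four samples, six features, binary labels.
labels = [1, 1, 0, 0]
data = {
    "f1": [5, 6, 1, 0],
    "f2": [1, 1, 1, 1],
    "f3": [2, 3, 2, 3],
    "f4": [4, 4, 1, 1],
    "f5": [3, 3, 2, 2],
    "f6": [0, 1, 0, 1],
}
selected = rce_ife_sketch(data, labels, target=2)
```

On this toy input the loop keeps the two features whose class means differ most. A realistic run would replace the chunking with an actual clustering step and the class-mean gap with a classifier-based score.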
Citation - WoS: 24
Citation - Scopus: 27
PANOGA: a Web Server for Identification of SNP-Targeted Pathways From Genome-Wide Association Study Data
(Oxford Univ Press, 2014-01-11) Bakir-Gungor, Burcu; Egemen, Ece; Sezerman, Osman Ugur
Genome-wide association studies (GWAS) have revolutionized the search for the variants underlying human complex diseases. However, in a typical GWAS, only a minority of the single-nucleotide polymorphisms (SNPs) with the strongest evidence of association is explained. One possible cause of complex diseases is altered activity of several biological pathways. Here we present a web server called Pathway and Network-Oriented GWAS Analysis (PANOGA) to devise functionally important pathways through the identification of SNP-targeted genes within these pathways. The strength of our methodology stems from its multidimensional perspective, where we combine evidence from the following five resources: (i) genetic association information obtained through GWAS, (ii) SNP functional information, (iii) protein-protein interaction network, (iv) linkage disequilibrium and (v) biochemical pathways.
Citation - WoS: 4
Citation - Scopus: 7
Network Anomaly Detection Using Deep Autoencoder and Parallel Artificial Bee Colony Algorithm-Trained Neural Network
(PeerJ Inc, 2024-10-08) Hacilar, Hilal; Dedeturk, Bilge Kagan; Bakir-Gungor, Burcu; Gungor, Vehbi Cagri
Cyberattacks are becoming increasingly complex, which makes intrusion detection extremely difficult. Several intrusion detection approaches have been developed in the literature and utilized to tackle computer security intrusions. Implementing machine learning and deep learning models for network intrusion detection has been a topic of active research in cybersecurity. In this study, artificial neural networks (ANNs), a type of machine learning algorithm, are employed to determine optimal network weight sets during the training phase. Conventional training algorithms, such as backpropagation, may encounter challenges in optimization due to being entrapped within local minima during the iterative optimization process; global search strategies can be slow at locating global minima, and they may suffer from a low detection rate. In the ANN training, the Artificial Bee Colony (ABC) algorithm enables the avoidance of local minimum solutions by conducting a high-performance search in the solution space, but it needs some modifications. To address these challenges, this work suggests a Deep Autoencoder (DAE)-based, vectorized, and parallelized ABC algorithm for training feed-forward artificial neural networks, which is tested on the UNSW-NB15 and NF-UNSW-NB15-v2 datasets. Our experimental results demonstrate that the proposed DAE-based parallel ABC-ANN outperforms existing metaheuristics, showing notable improvements in network intrusion detection. The experimental results reveal a notable improvement in network intrusion detection through this proposed approach, exhibiting an increase in detection rate (DR) by 0.76 to 0.81 and a reduction in false alarm rate (FAR) by 0.016 to 0.005 compared to the ANN-BP algorithm on the UNSW-NB15 dataset. Furthermore, there is a reduction in FAR by 0.006 to 0.0003 compared to the ANN-BP algorithm on the NF-UNSW-NB15-v2 dataset. These findings underscore the effectiveness of our proposed approach in enhancing network security against network intrusions.
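The abstract above reports detection rate (DR) and false alarm rate (FAR) without defining them. The sketch below assumes the definitions commonly used in intrusion detection, DR = TP / (TP + FN) and FAR = FP / (FP + TN), with label 1 meaning attack; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return (DR, FAR) for binary labels where 1 = attack, 0 = normal.

    Assumed definitions: DR = TP / (TP + FN), FAR = FP / (FP + TN).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, far


# Hypothetical labels: 4 attacks and 6 normal flows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
dr, far = detection_metrics(y_true, y_pred)
```

Here three of the four attacks are caught and one of the six normal flows is flagged, so DR is 0.75 and FAR is 1/6.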
Citation - WoS: 9
Citation - Scopus: 15
MicroBiomeGSM: The Identification of Taxonomic Biomarkers From Metagenomic Data Using Grouping, Scoring and Modeling (G-S-M) Approach
(Frontiers Media S.A., 2023-11-22) Bakir-Gungor, Burcu; Temiz, Mustafa; Jabeer, Amhar; Wu, Di; Yousef, Malik
Numerous biological environments have been characterized with the advent of metagenomic sequencing using next generation sequencing, which lays out the relative abundance values of microbial taxa. Modeling the human microbiome using machine learning models has the potential to identify microbial biomarkers and aid in the diagnosis of a variety of diseases such as inflammatory bowel disease, diabetes, colorectal cancer, and many others. The goal of this study is to develop an effective classification model for the analysis of metagenomic datasets associated with different diseases. In this way, we aim to identify taxonomic biomarkers associated with these diseases and facilitate disease diagnosis. The microBiomeGSM tool presented in this work incorporates the pre-existing taxonomy information into a machine learning approach and attempts to solve the classification problem in metagenomics disease-associated datasets. Based on the G-S-M (Grouping-Scoring-Modeling) approach, species level information is used as features and classified by relating their taxonomic features at different levels, including genus, family, and order. Using four different disease associated metagenomics datasets, the performance of microBiomeGSM is comparatively evaluated with other feature selection methods such as Fast Correlation Based Filter (FCBF), Select K Best (SKB), Extreme Gradient Boosting (XGB), Conditional Mutual Information Maximization (CMIM), Minimum Redundancy Maximum Relevance (MRMR) and Information Gain (IG), and also with other classifiers such as AdaBoost, Decision Tree, LogitBoost and Random Forest. microBiomeGSM achieved the highest results with an area under the curve (AUC) value of 0.98 at the order taxonomic level for the IBDMD dataset. Another significant output of microBiomeGSM is the list of taxonomic groups that are identified as important for the disease under study and the names of the species within these groups. The association between the detected species and the disease under investigation is confirmed by previous studies in the literature. The microBiomeGSM tool and other supplementary files are publicly available at: https://github.com/malikyousef/microBiomeGSM.
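The grouping step described in the abstract above, where species-level features are related through a higher taxonomic level, might look roughly like the following. This is a sketch under stated assumptions and not the microBiomeGSM code: the lineage table, the placeholder taxon names, and the function group_by_rank are all hypothetical.

```python
from collections import defaultdict


def group_by_rank(lineages, rank):
    """Group species names by the taxon they belong to at `rank`.

    `lineages` maps a species name to a dict of rank -> taxon name.
    """
    groups = defaultdict(list)
    for species, lineage in lineages.items():
        groups[lineage[rank]].append(species)
    return {taxon: sorted(members) for taxon, members in groups.items()}


# Hypothetical lineage table with placeholder taxon names.
lineages = {
    "species_a": {"genus": "genus_1", "family": "family_1", "order": "order_1"},
    "species_b": {"genus": "genus_1", "family": "family_1", "order": "order_1"},
    "species_c": {"genus": "genus_2", "family": "family_1", "order": "order_1"},
    "species_d": {"genus": "genus_3", "family": "family_2", "order": "order_2"},
}
by_genus = group_by_rank(lineages, "genus")
by_order = group_by_rank(lineages, "order")
```

Choosing a higher rank such as order yields fewer, larger groups than genus; each group could then be scored as a unit before modeling.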
Citation - WoS: 7
Citation - Scopus: 7
Investigating Strain Rate Effects on Damage Mechanisms in Hybrid Laminated Composites Using Acoustic Emission
(Elsevier Sci Ltd, 2025-12) Gulsen, Abdulkadir; Kolukisa, Burak; Etcil, Mustafa; Caliskan, Umut; Zafar, Hafiz Muhammad Numan; Demirbas, Munise Didem; Bakir-Gungor, Burcu
Hybrid composites, which combine distinct fiber types such as carbon, basalt, and aramid, provide a synergistic balance of strength, stiffness, impact resistance, and energy dissipation, making them appealing for critical applications in aerospace, automotive, and other high-performance industries. Monitoring damage progression in these composites is vital for ensuring structural integrity and preventing catastrophic failures. Acoustic emission (AE) serves as a powerful, noninvasive technique for real-time structural health monitoring, capturing the transient stress waves generated when damage events occur. This study utilizes AE to examine the influence of strain rate on damage modes in carbon/basalt/aramid hybrid composites under three-point bending. An unsupervised feature selection based on Laplacian scores is employed to identify the most relevant AE features with damage modes, while SHapley Additive Explanations (SHAP) are used to evaluate the correlation between AE features and strain rates. The correlation analysis results indicate that peak frequency (PF) serves as a key indicator, demonstrating significant shifts at higher strain rates. Gaussian Mixture Model (GMM) clustering is used to analyze hybrid composites by examining clustered AE signals based on selected features identified through Laplacian scores, with Silhouette scores employed to determine the optimal number of clusters. This study highlights the role of AE in understanding fiber interactions and damage evolution, offering valuable insights into the mechanical performance and optimization of carbon/basalt/aramid hybrid composite structures.
Citation - WoS: 25
Citation - Scopus: 28
Identification of Possible Pathogenic Pathways in Behcet's Disease Using Genome-Wide Association Study Data From Two Different Populations
(Nature Publishing Group, 2014-09-17) Bakir-Gungor, Burcu; Remmers, Elaine F.; Meguro, Akira; Mizuki, Nobuhisa; Kastner, Daniel L.; Gul, Ahmet; Sezerman, Osman U.
Behcet's disease (BD) is a multi-system inflammatory disorder of unknown etiology. Two recent genome-wide association studies (GWASs) of BD confirmed a strong association with the MHC class I region and identified two non-HLA common genetic variations. In complex diseases, multiple factors may target different sets of genes in the same pathway and thus may cause the same disease phenotype. We therefore hypothesized that identification of disease-associated pathways is critical to elucidate mechanisms underlying BD, and those pathways may be conserved within and across populations. To identify the disease-associated pathways, we developed a novel methodology that combines nominally significant evidence of genetic association with current knowledge of biochemical pathways, protein-protein interaction networks, and functional information of selected SNPs. Using this methodology, we searched for the disease-related pathways in two BD GWASs in Turkish and Japanese case-control groups. We found that 6 of the top 10 identified pathways in both populations were overlapping, even though there were few significantly conserved SNPs/genes within and between populations. The probability of random occurrence of such an event was 2.24E-39. These shared pathways were focal adhesion, MAPK signaling, TGF-beta signaling, ECM-receptor interaction, complement and coagulation cascades, and proteasome pathways. Even though each individual has a unique combination of factors involved in their disease development, the targeted pathways are expected to be mostly the same. Hence, the identification of shared pathways between the Turkish and the Japanese patients using GWAS data may help further elucidate the inflammatory mechanisms in BD pathogenesis.
Citation - WoS: 42
Citation - Scopus: 42
HomSI: A Homozygous Stretch Identifier From Next-Generation Sequencing Data
(Oxford Univ Press, 2013-12-03) Gormez, Zeliha; Bakir-Gungor, Burcu; Sagiroglu, Mahmut Samil
In consanguineous families, as a result of inheriting the same genomic segments through both parents, the individuals have stretches of their genomes that are homozygous. This situation leads to the prevalence of recessive diseases among the members of these families. Homozygosity mapping is based on this observation, and in consanguineous families, several recessive disease genes have been discovered with the help of this technique. Researchers typically use single nucleotide polymorphism arrays to determine the homozygous regions and then search for the disease gene by sequencing the genes within these candidate disease loci. Recently, the advent of next-generation sequencing has enabled the concurrent identification of homozygous regions and the detection of mutations relevant for diagnosis, using data from a single sequencing experiment. In this respect, we have developed a novel tool that identifies homozygous regions using deep sequence data. Using *.vcf (variant call format) files as input, our program identifies the majority of homozygous regions found by microarray single nucleotide polymorphism genotype data.
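To illustrate what finding homozygous stretches from genotype calls can mean, here is a naive run-length scan over VCF-style GT strings. It is a sketch only and does not reproduce HomSI's algorithm: the function names, the min_calls threshold, and the toy calls are assumptions, and a practical tool would likely tolerate occasional heterozygous calls and work per chromosome.

```python
def is_homozygous(gt):
    """True for GT strings whose two alleles match, e.g. '0/0' or '1|1'."""
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and alleles[0] == alleles[1] and alleles[0] != "."


def homozygous_runs(calls, min_calls=3):
    """Return (start, end, n_calls) for each uninterrupted homozygous run.

    `calls` is a position-sorted list of (position, GT) pairs on one
    chromosome. Runs shorter than `min_calls` are discarded.
    """
    runs, current = [], []
    for pos, gt in calls:
        if is_homozygous(gt):
            current.append(pos)
            continue
        # A heterozygous or missing call closes the current run.
        if len(current) >= min_calls:
            runs.append((current[0], current[-1], len(current)))
        current = []
    if len(current) >= min_calls:
        runs.append((current[0], current[-1], len(current)))
    return runs


# Hypothetical genotype calls on a single chromosome.
calls = [
    (100, "1/1"), (250, "0/0"), (400, "1/1"), (520, "0/1"),
    (700, "1/1"), (810, "1/1"), (900, "0/1"),
    (1000, "0/0"), (1100, "1|1"), (1200, "1/1"), (1350, "0/0"),
]
runs = homozygous_runs(calls)
```

With the default threshold, the two-call run at positions 700 to 810 is dropped, and the runs spanning 100 to 400 and 1000 to 1350 are reported.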
'Epistatic Interactions Between Autoimmunity and Genetic Thrombophilia' Reply
(Nature Publishing Group, 2015-01-14) Bakir-Gungor, Burcu; Remmers, Elaine F.; Meguro, Akira; Mizuki, Nobuhisa; Kastner, Daniel L.; Gul, Ahmet; Sezerman, Osman Ugur
