Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Impact of Gene Duplicate Handling Strategies on Classification Performance and Feature Selection in Gene Expression Data
(Institute of Electrical and Electronics Engineers Inc., 2025-09-17) Kuzudisli, Cihan; Qaqish, Bahjat; Gungor, Burcu Bakir; Yousef, Malik
Citation - Scopus: 2
miRcorrNetPro: Unraveling Algorithmic Insights Through Cross-Validation in Multi-Omics Integration for Comprehensive Data Analysis
(Institute of Electrical and Electronics Engineers Inc., 2023-12-05) Ünlü Yazici, Miray; Yousef, Malik; Marron, J. S.; Bakir-Güngör, Burcu; Yazici, Miray Unlu
High-throughput -omics technologies facilitate the investigation of regulatory mechanisms of complex diseases. Along this line, scientists develop promising tools and methods to extend our understanding at the molecular and functional levels. To this end, the miRcorrNet tool performs integrative analysis of microRNA (miRNA) and gene expression profiles via a machine learning (ML) approach to identify significant miRNA groups and their associated target genes. In this study, we propose the miRcorrNetPro tool, which extends miRcorrNet by tracking group scoring, ranking, and other information through the cross-validation iterations. Heatmap visualizations enable novel insights into the collective behavior of clusters of groups in cellular signaling and hence facilitate detection of potential biomarkers for the disease under investigation. Although miRcorrNetPro is designed as a generic tool, here we present our findings and potential miRNA biomarkers for Breast Cancer (BRCA). The miRcorrNetPro tool and all other supplementary files are available at https://github.com/Miray-Unlu/miRcorrNetPro. © 2024 Elsevier B.V., All rights reserved.
Citation - Scopus: 1
Words Speak Louder Than Actions: Decoding Emotions Through NLP
(Institute of Electrical and Electronics Engineers Inc., 2024-10-26) Paksoy, Melda; Bakal, Gokhan
Emotion detection in text remains a significant challenge in Natural Language Processing due to the complexity and subtle nuances of human emotions. This paper presents multiple experimental models for emotion classification using an up-to-date dataset curated to address 13 emotions implied in Twitter posts. We evaluated various machine learning (ML) models, including Logistic Regression, Random Forest, SVM, and XGBoost, alongside deep learning (DL) architectures such as LSTM and CNN. Our results demonstrate the efficacy of deep learning models, particularly the CNN model, which achieved an F1 score of 0.99. This study contributes to emotion detection capabilities, paving the way for more nuanced and accurate sentiment analysis (SA) in various text analysis applications. © 2025 Elsevier B.V., All rights reserved.
Citation - Scopus: 1
The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals became cheaper and easier following developments in genomic technologies, genome-wide association studies (GWAS) emerged. In these studies, large case-control groups for a specific disease are examined to identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although several genetic risk factors have been identified for Behçet's disease by scanning around a million SNPs, these variations explain only up to 20% of the disease's genetic risk. In this study, for Behçet's disease classification, we compare all the SNPs genotyped in GWAS with the SNPs selected using genetic knowledge, gain ratio, and information gain, aiming both to reduce the feature size and to improve classification accuracy. We also investigate how different classification algorithms, such as random forest, k-nearest neighbour, and logistic regression, affect classification accuracy. Our results showed that, compared to other feature selection methods, selecting SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP for the disease) achieves at least an 81% success rate and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and gave the worst performance. © 2019 Elsevier B.V., All rights reserved.
Citation - Scopus: 1
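As a rough illustration of the information gain criterion named in the abstract above, the following Python sketch computes the information gain of a single SNP feature with respect to case/control labels. The genotype coding (0, 1, 2 as minor-allele counts), the function names, and the toy data are assumptions made for illustration; none of them are taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: genotypes coded as minor-allele counts (assumption).
labels = ["case", "case", "control", "control"]
informative_snp = [2, 2, 0, 0]    # separates cases from controls perfectly
uninformative_snp = [1, 0, 1, 0]  # same genotype mix in both classes

ig_informative = information_gain(informative_snp, labels)
ig_uninformative = information_gain(uninformative_snp, labels)
```

Ranking SNPs by this value and keeping the top-scoring ones is one common way to apply the criterion as a feature selection filter.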
TextNetTopics_TIS: Enhancing TextNetTopics With Random Forest-Based Topic Importance Scoring
(Institute of Electrical and Electronics Engineers Inc., 2024-10-16) Voskergian, Daniel; Bakir-Güngör, Burcu; Yousef, Malik
TextNetTopics is an innovative Latent Dirichlet Allocation-based topic selection method for training text classification models. One main limitation is its computationally intensive scoring mechanism, especially when applied to many topics. This scoring mechanism involves training a machine learning model (i.e., Random Forest) on each topic using the Monte-Carlo Cross-Validation approach and assigning a score value based on a specific performance metric (e.g., accuracy or F1-score). Moreover, the measured score does not account for the interactions between all features residing in all topics. This paper presents a new topic-scoring mechanism called Topic Importance Scoring. This computationally efficient approach trains a Random Forest model on all topics simultaneously and leverages the extracted feature importance values to give each topic a score reflecting its classification potential. The experiments on three diverse datasets confirm that the proposed method's performance is superior to the Topic Performance Scoring, which was used in the original TextNetTopics method. © 2024 Elsevier B.V., All rights reserved.
Citation - Scopus: 10
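The topic-scoring idea described in the abstract above (one model trained on all topics, with per-feature importance values turned into per-topic scores) could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the importance values are hard-coded stand-ins for those that would be extracted from a trained Random Forest, summation is an assumed aggregation rule (the abstract does not specify one), and the names `score_topics` and `rank_topics` are hypothetical.

```python
def score_topics(topic_words, feature_importance):
    """Score each topic by summing the importance values of its words.

    Summation is an assumption; the abstract does not state the aggregation rule.
    Words missing from feature_importance contribute 0.
    """
    return {
        topic: sum(feature_importance.get(word, 0.0) for word in words)
        for topic, words in topic_words.items()
    }


def rank_topics(topic_scores):
    """Return topic names ordered from highest to lowest score."""
    return sorted(topic_scores, key=topic_scores.get, reverse=True)


# Hypothetical inputs: topics as word lists, and importance values standing in
# for those extracted from a Random Forest trained on all topic words at once.
topic_words = {
    "topic_a": ["gene", "protein"],
    "topic_b": ["market", "price"],
}
feature_importance = {"gene": 0.5, "protein": 0.25, "market": 0.125, "price": 0.125}

topic_scores = score_topics(topic_words, feature_importance)
ranking = rank_topics(topic_scores)
```

Because the model is fitted once rather than once per topic, the scoring step reduces to a single pass over the importance values.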
On Comparative Classification of Relevant COVID-19 Tweets
(Institute of Electrical and Electronics Engineers Inc., 2021-09-15) Bakal, Gokhan; Abar, Orhan
Due to the impressive information dissemination power of social networks such as Twitter, people tend to check social networks and Web pages more than traditional news sources, including newspapers, TV news programs, and radio channels. In that sense, the information carried by shared social media posts becomes much more important. However, most of the posts are either irrelevant or inaccurate. Moreover, an even more critical issue than the correctness of the information is the speed at which it spreads on Twitter through reply and retweet actions. These activities complicate the situation further because social networks are unregulated and lack an immediate mechanism for verifying the correctness of posts. During the current Covid-19 pandemic, Twitter is one of the most utilized information resources apart from the official health administration institutions. Consequently, examining the correctness of information related to the Covid-19 pandemic with computational techniques (e.g., Data Mining, Machine Learning, and Deep Learning) has been gaining popularity and remains a substantial task. Hence, we mainly focused on analyzing the correctness of the posts related to the current pandemic shared on the Twitter platform. The overall goal of this work is to classify the relevant tweets using linear and non-linear machine learning models. Among all model configurations, we achieved the best F1 score (99%) with the neural network model using unigram features and a threshold value of 50. © 2022 Elsevier B.V., All rights reserved.
Metabolomics Data Analysis to Discover Chronic Granulomatous Disease-Associated Biomarkers Utilizing G-S-M Machine Learning Model via Grouping Metabolites According to Ion Type
(Institute of Electrical and Electronics Engineers Inc., 2024-10-16) Ersöz, Nur Sebnem; Bakir-Güngör, Burcu; Yousef, Malik
Chronic Granulomatous Disease (CGD) is a rare, inherited immunodeficiency disorder in which white blood cells are unable to effectively kill certain bacteria and fungi. This defect results in clusters of immune cells, called granulomas, that form at sites of infection or inflammation. Identification of disease-related biomarkers is therefore a critical step in advancing precision medicine and improving diagnostic accuracy. In this study, we applied the G-S-M machine learning approach to metabolomics data to uncover CGD-associated biomarkers. We obtained a metabolomics dataset from Gene Expression Omnibus under accession number GSE220260. The data include 85 samples (16 healthy controls and 69 CGD samples) with comprehensive metabolic profiles obtained using liquid chromatography-mass spectrometry analysis. The dataset includes metabolite names with their ion types and formulas. To identify CGD-related metabolites and their ion types, we used G-S-M with a grouping function that groups metabolites according to their ion type. In the training part of the G-S-M approach, metabolites annotated with the selected ion types are used in a two-class classification task, which outputs a set of important ion types. We also compared the performance of the G-S-M model with traditional feature selection (TFS) methods (XGB, SKB, IG, FCBF, MRMR, and CMIM) combined with a random forest classifier. Monte-Carlo cross-validation with 100 iterations was used in our experiments. The G-S-M, XGB, SKB, and FCBF methods provided similar performance, the best among the compared methods. Beyond its performance, the G-S-M method, unlike TFS, used groups based on ion types and then identified relevant CGD-associated metabolites. © 2024 Elsevier B.V., All rights reserved.
Citation - Scopus: 1
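The Monte-Carlo cross-validation scheme mentioned in the abstract above (100 repeated random train/test splits) might look roughly like the sketch below. The 10% test fraction, the fixed seed, and the function name `monte_carlo_splits` are assumptions for illustration, and the splits here are not stratified by class, which is a simplification given the 16 versus 69 class imbalance reported for the dataset.

```python
import random


def monte_carlo_splits(n_samples, n_iterations=100, test_fraction=0.1, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits.

    Each iteration draws a fresh random split, so a sample may appear in the
    test set of several iterations or of none.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    for _ in range(n_iterations):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


# 85 samples, matching the dataset size given in the abstract.
splits = list(monte_carlo_splits(n_samples=85))
```

A model would be trained and evaluated once per split, and the reported metric would typically be the average over all iterations.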
Machine Learning Based Beamwidth Adaptation for mmWave Vehicular Communications
(Institute of Electrical and Electronics Engineers Inc., 2023-12-10) Manic, Setinder; Köse, Abdulkadir; Lee, Haeyoung; Leow, Chee Yen; Chatzimisios, Periklis; Suthaputchakun, Chakkaphong; Foh, Chuan Heng
The incorporation of mmWave technology in vehicular networks has unlocked a realm of possibilities, propelling the advancement of autonomous vehicles, enhancing interconnectedness, and facilitating communication for intelligent transportation systems (ITS). Despite these strides in connectivity, challenges such as high path-loss have arisen, impacting existing beam management procedures. This work aims to address this issue by improving beam management techniques, specifically focusing on enhancing the service time between vehicles and base stations through adaptive mmWave beamwidth adjustments, accomplished using a Contextual Multi-Armed Bandit Algorithm. By leveraging various conditions to train the ML agent of the Contextual Multi-Armed Bandit Algorithm, it seeks to learn about vehicle mobility profiles and optimize the usage of different antenna beamwidth settings to maximize seamless connection time. The extensive simulation results showcase the effectiveness of an adaptive beamwidth for mobility profiles, extending the connection time a vehicle experiences with a base station when compared to the existing strategies. © 2024 Elsevier B.V., All rights reserved.
Citation - WoS: 16
Citation - Scopus: 20
Machine Learning Analysis of Inflammatory Bowel Disease-Associated Metagenomics Dataset
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Hacilar, Hilal; Nalbantoğlu, Özkan Ufuk; Bakir-Güngör, Burcu
There is an ongoing interplay between humans and our microbial communities. The microorganisms living in our gut produce energy from our food, strengthen our immune system, break down foreign products, and release metabolites and hormones, which are significant for regulating our physiology. Shifts away from this 'healthy' gut microbiome are considered to be associated with many diseases. Inflammatory bowel diseases (IBD), including Crohn's disease and ulcerative colitis, are gut-related disorders affecting the intestinal tract. Although some metagenomics studies have recently been conducted on IBD, our current understanding of the precise relationships between the human gut microbiome and IBD remains limited. In this regard, state-of-the-art machine learning approaches have become popular for addressing a variety of questions, such as the early diagnosis of certain diseases using human microbiota. In this study, we investigate which subset of gut microbiota is most associated with IBD and whether disease-associated biomarkers can be detected by applying state-of-the-art machine learning algorithms and proper feature selection methods. © 2019 Elsevier B.V., All rights reserved.
Citation - Scopus: 2
Machine Learning Algorithms Against Hacking Attack and Detection Success Comparison
(Institute of Electrical and Electronics Engineers Inc., 2020-09-15) Yavuz, Levent; Soran, Ahmet; Onen, Ahmet; Muyeen, S. M.
Power system protection units have gained enormous importance with the growing risk of cyber-attacks. To create a sustainable and well-protected system, power system data must be healthy. For that purpose, many machine learning applications have been developed and used for bad data detection. However, each method has a different detection and application process, and some methods are superior to others. Although an algorithm can detect some injections easily, the same algorithm can fail when the injection type changes, so methods achieve different success rates across injection types. For that reason, different injection types are applied to the IEEE 14-bus power system via a specially created hacking algorithm. A PSCAD and Python linkage has been used for the simulation and detection parts. Three different injection types were created and applied to the system, and five of the most popular algorithms (SVM, k-NN, LDA, NB, LR) were tested. The performance of each algorithm is compared and evaluated. © 2020 Elsevier B.V., All rights reserved.