WoS Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/394

Text Classification Experiments on Contextual Graphs Built by N-Gram Series
(Springer International Publishing AG, 2025) Sen, Tarik Uveys; Yakit, Mehmet Can; Gumus, Mehmet Semih; Abar, Orhan; Bakal, Gokhan
Traditional n-gram textual features, commonly employed in conventional machine learning models, offer lower performance on high-volume datasets than modern deep learning algorithms, which have been intensively studied for the past decade. The main reason for this performance disparity is that deep learning approaches handle textual data through the word vector space representation, which better captures contextually hidden information. Nonetheless, the potential of the n-gram feature set to reflect context is open to further investigation. In this sense, creating graphs using discriminative n-gram series with high classification power has never been fully exploited by researchers. Hence, the main goal of this study is to contribute to classification power by including the long-range neighborhood relationships for each word in the word embedding representations. To achieve this goal, we transformed the textual data into a graph structure by employing n-gram series and then trained a graph convolution network model. Consequently, we obtained contextually enriched word embeddings and observed F1-score improvements from 0.78 to 0.80 when we integrated those convolution-based word embeddings into an LSTM model. This research contributes to improving classification capabilities by leveraging graph structures derived from discriminative n-gram series.
TextNetTopics+: Enhancing Text Classification Through Classifier Diversity and Model Ensembling
(Springer International Publishing AG, 2025) Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik
TextNetTopics is an innovative text classification framework that integrates topic modeling with feature selection to improve model accuracy and interpretability. Unlike traditional methods that rely on individual words, TextNetTopics selects cohesive topics extracted via Latent Dirichlet Allocation as features for document representation, effectively reducing dimensionality while preserving the semantic structure of the text. This study evaluates the performance of TextNetTopics utilizing multiple machine learning algorithms in the M (Modeling) component, including Random Forest, Support Vector Machine, Gradient Boosting, eXtreme Gradient Boosting, and Logistic Regression. To further enhance classification performance, we introduce TextNetTopics+, an ensemble-based extension that leverages both hard voting and soft voting mechanisms to combine the strengths of multiple classifiers. Comprehensive experiments on the LitCovid and WOS datasets demonstrate that ensemble learning in TextNetTopics+ significantly outperforms individual classifiers in TextNetTopics, confirming its effectiveness in improving model robustness and generalization.
Temporal Logic-Based Intrusion Detection for Securing Connected Vehicles
(Springer International Publishing AG, 2024) Bozdal, Mehmet
Ensuring the security and integrity of in-vehicle communication networks (IVCNs) is paramount. The increasing connectivity of vehicles exposes them to unprecedented security vulnerabilities, necessitating innovative methodologies to safeguard against cyberattacks and unauthorized access. This research presents a novel approach to enhance IVCN security through the deployment of a Signal Temporal Logic (STL)-based Intrusion Detection System (IDS). Considering the limited resources of Electronic Control Units (ECUs), this approach offers an adaptive and lightweight solution that addresses the unique challenges posed by the dynamic nature of vehicular networks. The proposed STL-based IDS effectively detects a broad spectrum of intrusions while maintaining acceptable overhead for resource-constrained ECUs, thanks to its distributed architecture. Comprehensive experimental evaluations demonstrate significant performance improvements in detecting Denial of Service (DoS) attacks, achieving the highest accuracy of 0.996 and recall of 1.000. The system also excels in detecting fuzzy attacks, with the highest accuracy of 0.996.
Citation - WoS: 8
Citation - Scopus: 12
SVM-RCE-R Optimization of Scoring Function for SVM-RCE
(Springer International Publishing AG, 2021) Yousef, Malik; Jabeer, Amhar; Bakir-Gungor, Burcu
Gene expression data classification is challenging because of the data's high dimensionality and relatively small sample size. Different feature selection approaches have been used to overcome this issue, with SVM-RCE being one of the more successful. This study is a continuation of two previous research studies, SVM-RCE and SVM-RCE-R. SVM-RCE-R suggests a new approach to the scoring function for the clusters, showing that performance improved for some combinations of weights. The aim of this study is to find the optimal weights for the scoring function suggested in the SVM-RCE-R study using optimization approaches. We have discovered that finding the optimal weights for the scoring function improves the performance of SVM-RCE in most cases. We have shown that in some cases the performance increases dramatically, by 10% in terms of accuracy and AUC. By increasing the performance of the algorithm, it is more likely that we can extract subsets of genes related to the class association of a microarray sample.
Citation - WoS: 6
Citation - Scopus: 6
Rings With Modules Having a Restricted Injectivity Domain
(Springer International Publishing AG, 2019-09-30) Demirci, Yilmaz Mehmet; Turkmen, Burcu Nisanci; Turkmen, Ergul; Nişancı Türkmen, Burcu
We introduce modules whose injectivity domains are contained in the class of modules with zero radical and call them working-class. This notion gives a generalization of poor modules that have minimal injectivity domain. Semisimple working-class modules always exist for arbitrary rings whereas their predecessors do not. We investigate the rings over which every module is either injective or working-class. Right weakly V-rings are examples of these rings. Moreover, we study the existence of working-class simple modules and show that if there is a projective working-class simple right module, then the ring is a right GV-ring.
Normal Mixture Model-Based Clustering of Data Using Genetic Algorithm
(Springer International Publishing AG, 2020) Gogebakan, Maruf; Erol, Hamza
In this study, a new algorithm was developed for clustering multivariate big data. Normal mixture distributions are used to determine the partitions of variables. Normal mixture models obtained from the partitions of variables are generated using Genetic Algorithms (GA). Each partition in the variables corresponds to a clustering center in the normal mixture model. The best model that fits the data structure from normal mixture models is obtained by using the information criteria obtained from normal mixture distributions.
Leveraging MicroRNA-Gene Associations With miRGediNET: An Intelligent Approach for Enhanced Classification of Breast Cancer Molecular Subtypes
(Springer International Publishing AG, 2025) Qumsiyeh, Emma; Bakir-Gungor, Burcu; Yousef, Malik
Understanding the molecular subtypes of breast cancer is crucial for advancing targeted therapies and precision medicine. For the BRCA molecular subtype prediction problem, this study employs miRGediNET, a machine-learning approach that integrates data from the miRTarBase, DisGeNET, and HMDD databases to investigate shared gene associations between microRNA (miRNA) activity and disease mechanisms. Using the BRCA LumAB_Her2Basal dataset, we evaluate miRGediNET's performance against traditional feature selection methods, including CMIM, mRmR, Information Gain (IG), SelectKBest (SKB), Fast Correlation-Based Filter (FCBF), and XGBoost (XGB). These feature selection techniques were assessed using various classification algorithms, including Random Forest (RF), Support Vector Machine (SVM), LogitBoost, Decision Tree, and AdaBoost, all executed with default parameters. The feature selection methods were tested using Monte Carlo Cross-Validation, where the performance metrics obtained for each iteration were averaged to ensure robustness. Our findings reveal that miRGediNET outperforms traditional methods in accuracy and Area Under the Curve (AUC), emphasizing its superior capability to identify key genes that bridge miRNA interactions and breast cancer mechanisms. Notably, both miRGediNET and Information Gain (IG) feature selection consistently identified ESR1, a critical biomarker frequently reported in recent research associated with breast cancer prognosis and resistance to endocrine therapies. This integrative approach provides deeper biological insights into miRNA-disease interactions, paving the way for enhanced patient stratification, biomarker discovery, and personalized medicine strategies. The miRGediNET tool, developed on the KNIME platform, offers a practical resource for further exploration in the field of bioinformatics and oncology.
Citation - WoS: 13
Citation - Scopus: 13
Leaching of Turkish Oxidized Pb-Zn Flotation Tailings by Inorganic and Organic Acids
(Springer International Publishing AG, 2020) Kaya, Muammer; Kursunoglu, Sait; Hussaini, Shokrullah; Gul, Erkan
An eco-friendly approach to the simultaneous recovery of metals from mine tailings is still a significant challenge. This study investigates the extraction of zinc metal from the Kayseri region oxidized lead-zinc (Pb-Zn) flotation tailings by leaching using three different inorganic acids (HNO3, HCl, and H2SO4) and six different organic acids (citric (CA), oxalic (OA), formic (FA), ascorbic (AA), malic (MA), and tartaric (TA) acids). The effects of acid type and concentration, leaching temperature and time, and solid/liquid (S/L) ratio were studied for maximum Zn dissolution and minimum Pb, Fe, and As co-dissolution at the lowest temperature and leaching time. For inorganic acids at 1/10 S/L ratio, 1.0 M H2SO4 and HCl concentrations achieved 92% Zn + 0% Pb + 12% Fe at 40 °C leaching temperature and 60 min leaching time and 92% Zn + 10% Pb + 0% Fe at 80 °C leaching temperature and 30 min leaching time, respectively. For organic acids, at 1/10 S/L ratio and 1.0 M concentration, 92% Zn + 8.3% Pb with malic acid at 80 °C leaching temperature and 180 min leaching time and 91% Zn + 12% Pb with citric acid at 60 °C leaching temperature and 180 min leaching time were achieved. 1.0 M formic acid dissolved about 83% Zn + 2.8% Pb at 80 °C and 180 min leaching time. More than 90% Zn dissolution can be achieved by using either inorganic acids at 40 °C for 30-60 min leaching time or organic acids at 60-80 °C for 180 min leaching time. Oxalic acid significantly dissolved Fe and As without Zn and Pb dissolution.
Citation - WoS: 12
Citation - Scopus: 17
Integrating Gene Ontology Based Grouping and Ranking Into the Machine Learning Algorithm for Gene Expression Data Analysis
(Springer International Publishing AG, 2021) Yousef, Malik; Sayici, Ahmet; Bakir-Gungor, Burcu
Recent advances in high-throughput technologies have resulted in the production of large gene expression data sets for several phenotypes. By comparing gene expression levels under different conditions, such as disease vs. control, treated vs. not treated, drug A vs. drug B, etc., one can identify biomarkers. As opposed to traditional gene selection approaches, integrative gene selection approaches incorporate domain knowledge from external biological resources during gene selection, which improves interpretability and predictive performance. In this respect, Gene Ontology provides cellular component, molecular function, and biological process terms for the products of each gene. In this study, we present a Gene Ontology-based feature selection approach for gene expression data analysis. In our approach, we used the ontology information as grouping (term) information and embedded this information into a machine learning algorithm for selecting the most significant groups (terms) of the ontology. Those groups are used to build the machine learning model that performs the classification task. The output of the tool is a significant ontology group for the task of 2-class classification applied to the gene expression data. This knowledge allows the researcher to perform more advanced gene expression analyses. We tested our approach on 8 different gene expression datasets. In our experiments, we observed that the tool successfully found the significant Ontology terms that would be used as a classification model. We believe that our tool will help geneticists identify affected genes in transcriptomic data, and this information could enable the design of platforms to assist diagnosis, to assess patients' prognoses, and to create patient treatment plans.
Colorectal Cancer Prediction via Applying Recursive Cluster Elimination With Intra-Cluster Feature Elimination on Metagenomic Pathway Data
(Springer International Publishing AG, 2024) Temiz, Mustafa; Kuzudisli, Cihan; Yousef, Malik; Bakir-Gungor, Burcu
Advances in next-generation sequencing and in "-omics" technologies enable the characterization of the human gut microbiome. Colorectal cancer (CRC), the third most common cancer worldwide, is caused by genetic mutations, environmental influences, and abnormalities in the gut microbiota. The aim of this study is to identify pathways that influence host metabolism in CRC patients. The CRC-related metagenomic dataset used in this study contains the relative abundance values of 551 pathways calculated for 1262 samples. Here, two different approaches based on feature grouping reduce the number of features by considering relevant features as groups, eliminate irrelevant features, and perform classification. The recursive cluster elimination with intra-cluster feature elimination (RCE-IFE) approach achieves an AUC of 0.72 using an average of 66.2 features on the CRC-associated metagenomics dataset. In these experiments, the P163-PWY: L-lysine fermentation to acetate and butanoate and PWY-6151: S-adenosyl-L-methionine cycle I pathways, which are reported by both approaches, are identified as potential biomarkers associated with CRC. This study contributes to the molecular diagnosis and treatment of colorectal cancer by revealing the pathways associated with CRC. Our results are promising for the study of the gut microbiota and its role in CRC.
