Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Citation - Scopus: 1
The Identification of Discriminative Single Nucleotide Polymorphism Sets for the Classification of Behçet's Disease
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Görmez, Yasin; Işik, Yunus Emre; Bakir-Güngör, Burcu
Behçet's disease is a long-term multisystem inflammatory disorder characterized by recurrent attacks affecting several organs. As genotyping individuals has become cheaper and easier following developments in genomic technologies, genome-wide association studies (GWAS) have emerged. In these studies, large case-control groups are examined for a specific disease to identify potential genetic variations, namely single nucleotide polymorphisms (SNPs). Although GWAS scanning around a million SNPs have identified several genetic risk factors for Behçet's disease, these variations explain only up to 20% of the disease's genetic risk. This study compares, for Behçet's disease classification, all the SNPs genotyped in GWAS with the SNPs selected using genetic knowledge, gain ratio, and information gain, aiming both to reduce the feature size and to improve the classification accuracy. The effects of different classification algorithms, such as random forest, k-nearest neighbour, and logistic regression, on the classification accuracy are also investigated. Our results showed that, compared to the other feature selection methods, selecting the SNPs using genetic information (their GWAS p-values, which indicate the significance of the SNP for the disease) achieves a success rate of at least 81% and provides a 15% to 42% improvement in all classification algorithms. This improvement is statistically sound. While the gain ratio and information gain feature selection techniques yield similar classification accuracies, the models using all SNPs could not exceed 50% accuracy and result in the worst performance. © 2019 Elsevier B.V., All rights reserved.
Citation - WoS: 15
Citation - Scopus: 15
PriPath: Identifying Dysregulated Pathways From Differential Gene Expression via Grouping, Scoring, and Modeling With an Embedded Feature Selection Approach
(BMC, 2023-02-23) Yousef, Malik; Ozdemir, Fatma; Jaber, Amhar; Allmer, Jens; Bakir-Gungor, Burcu
Background: Cell homeostasis relies on the concerted actions of genes, and dysregulated genes can lead to diseases. In living organisms, genes or their products do not act alone but within networks. Subsets of these networks can be viewed as modules that provide specific functionality to an organism. The Kyoto encyclopedia of genes and genomes (KEGG) systematically analyzes gene functions, proteins, and molecules and combines them into pathways. Measurements of gene expression (e.g., RNA-seq data) can be mapped to KEGG pathways to determine which modules are affected or dysregulated in the disease. However, genes acting in multiple pathways and other inherent issues complicate such analyses. Many current approaches may only employ gene expression data and need to pay more attention to some of the existing knowledge stored in KEGG pathways for detecting dysregulated pathways. New methods that consider more precompiled information are required for a more holistic association between gene expression and diseases. Results: PriPath is a novel approach that transfers the generic process of grouping and scoring, followed by modeling to analyze gene expression with KEGG pathways. In PriPath, KEGG pathways are utilized as the grouping function as part of a machine learning algorithm for selecting the most significant KEGG pathways. A machine learning model is trained to differentiate between diseases and controls using those groups. We have tested PriPath on 13 gene expression datasets of various cancers and other diseases. Our proposed approach successfully assigned biologically and clinically relevant KEGG terms to the samples based on the differentially expressed genes. We have comparatively evaluated the performance of PriPath against other tools, which are similar in their merit. For each dataset, we manually confirmed the top results of PriPath in the literature and found that most predictions can be supported by previous experimental research. Conclusions: PriPath can thus aid in determining dysregulated pathways, which applies to medical diagnostics. In the future, we aim to advance this approach so that it can perform patient stratification based on gene expression and identify druggable targets. Thereby, we cover two aspects of precision medicine.
Citation - Scopus: 10
On Comparative Classification of Relevant COVID-19 Tweets
(Institute of Electrical and Electronics Engineers Inc., 2021-09-15) Bakal, Gokhan; Abar, Orhan
Due to the impressive information dissemination power of social networks such as Twitter, people tend to check social networks and Web pages more than traditional news sources, including newspapers, TV news programs, or radio channels. In that sense, the information carried by shared social media posts becomes much more significant. However, most posts are either irrelevant or inaccurate. Moreover, even more critical than the correctness of the information is the speed at which it diffuses on Twitter through reply and retweet actions. These activities complicate the situation further because of the unregulated nature of social networks and the lack of an immediate mechanism for verifying the correctness of posts. During the current COVID-19 (coronavirus disease) pandemic, Twitter is one of the most utilized information resources apart from the official health administration institutions. Consequently, examining the correctness of information related to the COVID-19 pandemic with computational techniques (e.g., Data Mining, Machine Learning, and Deep Learning) has been gaining popularity and remains a substantial task. Hence, we mainly focused on analyzing the correctness of the posts related to the current pandemic shared on the Twitter platform. The overall goal of this work is to classify the relevant tweets using linear and non-linear machine learning models. Among all model configurations, we achieved the best F1 score (99%) with the neural network model using unigram features and a threshold value of 50. © 2022 Elsevier B.V., All rights reserved.
Multifunction Optoelectronic Gate
(Wiley-Blackwell, 2015-02-24) Ozharar, Sarper; Ozdur, Ibrahim; Delfyett, Peter J.
A multifunction optoelectronic gate that can perform as any desired logic gate of two variables is theoretically proposed, and a simplified version is experimentally demonstrated. The proposed optoelectronic gate is dynamically configurable, and being wavelength independent, it can act on multiple input optical bits and realize different functions simultaneously. (c) 2015 Wiley Periodicals, Inc. Microwave Opt Technol Lett 57:969-972, 2015
Multi-Method Text Summarization: Evaluating Extractive and BART-Based Approaches on CNN/Daily Mail
(Institute of Electrical and Electronics Engineers Inc., 2025-06-27) Inal, Yasin; Bakal, Gokhan; Esit, Muhammed
With the exponential growth of digital content, efficient text summarization has become increasingly crucial for managing information overload. This paper presents a comprehensive approach to text summarization using both extractive and abstractive methods, implemented on the CNN/Daily Mail dataset. We leverage pre-trained BART (Bidirectional and AutoRegressive Transformers) models and fine-tuning techniques to generate high-quality summaries. Our approach demonstrates significant improvements, with our best model trained on 287k samples achieving ROUGE-1 F1 scores of 0.4174, ROUGE-2 F1 scores of 0.1932, and ROUGE-L F1 scores of 0.2910. We provide detailed comparisons between extractive methods and various BART model configurations, analyzing the impact of training dataset size and model architecture on summarization quality. Additionally, we share our implementation through an open-source NLP toolkit to facilitate further research and practical applications in the field. © 2025 Elsevier B.V., All rights reserved.
Linear Vs. Non-Linear Embedding Methods in Recommendation Systems
(Institute of Electrical and Electronics Engineers Inc., 2022-09-07) Gurler, Kerem; Coşkun, Mustafa; Karagenc, Safak; Orun, Gokhan; Pak, Burcu Kuleli; Güngör, Vehbi Çağrı
Predicting customer interest in items is crucial in direct marketing, as it can potentially boost sales. Data mining techniques have been developed to predict which items a particular user might be interested in based on their purchase history or explicit feedback in the form of ratings or comments. Recently, both non-linear and linear methods have been developed for this purpose. In this study, we applied Neighborhood based Collaborative Filtering (CF), Matrix Factorization (MF), Singular Value Decomposition (SVD), Neural Graph CF (NGCF) and Light Graph Convolutional Network (LightGCN) to explicit user product rating data acquired from the online gaming and mobile entertainment platform HADI. We compared the results of the node embedding methods in terms of Precision@k, Recall@k and NDCG@k values. SVD and LightGCN showed the best test performance, and SVD was significantly superior to LightGCN in terms of training speed. To further increase the predictive performance of SVD, we applied classification with Logistic Regression and Deep Random Forest to the user and item embeddings created by the SVD. © 2022 Elsevier B.V., All rights reserved.
Citation - WoS: 5
Citation - Scopus: 5
Investigating the Carbon Border Adjustment Mechanism Transition Process With Linguistic Summarization Method: A Situational Analysis of Exporting Countries
(Elsevier Sci Ltd, 2024-08) Fidan, Fatma Şener; Aydogan, Sena; Akay, Diyar
The Paris Agreement holds significant importance since it establishes a global framework for addressing the issue of climate change and endeavors to mitigate the release of greenhouse gases. The Carbon Border Adjustment Mechanism was introduced as an integral component of this agreement, aiming to oversee the carbon emissions associated with imported items within the European Union and provide compensation for the emissions from the nations engaged in importation. It is essential to analyze the countries involved in exporting to the European Union within the Carbon Border Adjustment Mechanism context to mitigate carbon leakage and effectively support the objectives outlined in the Paris Agreement. This research investigated 104 nations engaged in exporting activities to 27 European Union member countries. The linguistic summarization method, a descriptive data analytics tool, was employed for the analysis. A total of 42 Combined Nomenclature codes were encompassed within the scope of evaluation throughout the transition phase of the Carbon Border Adjustment Mechanism. This study examines the characteristics of exporting nations based on three variables: the Environmental Performance Index, a sustainability indicator; the Region in which the countries are located as classified by the World Bank; and the quantity of Renewable Energy Consumption. Additionally, the study explores the characteristics of EU countries, focusing on their Environmental Performance Index score and geography. The study employed fuzzy sets and the fuzzy c-means algorithm as parts of the linguistic summarization technique. Polyadic quantifiers were used to extract linguistic summaries, resulting in the acquisition of 124,227 summaries. A total of 1594 summaries have a truth degree exceeding 0.9. The findings were effectively utilized to assess the influence of the linguistic summarization approach and offered a valuable viewpoint for decision-makers needing more expertise in this domain.
Citation - WoS: 5
Citation - Scopus: 5
Integrated Querying and Version Control of Context-Specific Biological Networks
(Oxford Univ Press, 2020) Cowman, Tyler; Coskun, Mustafa; Grama, Ananth; Koyuturk, Mehmet
Motivation: Biomolecular data stored in public databases is increasingly specialized to organisms, context/pathology and tissue type, potentially resulting in significant overhead for analyses. These networks are often specializations of generic interaction sets, presenting opportunities for reducing storage and computational cost. Therefore, it is desirable to develop effective compression and storage techniques, along with efficient algorithms and a flexible query interface capable of operating on compressed data structures. Current graph databases offer varying levels of support for network integration. However, these solutions do not provide efficient methods for the storage and querying of versioned networks. Results: We present VerTIoN, a framework consisting of novel data structures and associated query mechanisms for integrated querying of versioned context-specific biological networks. As a use case for our framework, we study network proximity queries in which the user can select and compose a combination of tissue-specific and generic networks. Using our compressed version tree data structure, in conjunction with state-of-the-art numerical techniques, we demonstrate real-time querying of large network databases. Conclusion: Our results show that it is possible to support flexible queries defined on heterogeneous networks composed at query time while drastically reducing response time for multiple simultaneous queries. The flexibility offered by VerTIoN in composing integrated network versions opens significant new avenues for the utilization of ever increasing volume of context-specific network data in a broad range of biomedical applications. Availability and Implementation: VerTIoN is implemented as a C++ library and is available at http://compbio.case.edu/omics/software/vertion and https://github.com/tjcowman/vertion Contact: tyler.cowman@case.edu
Identify Commonly Affected Pathways in Psychiatric Diseases
(Institute of Electrical and Electronics Engineers Inc., 2018-09) Bulut, Umit; Bakir-Güngör, Burcu
Genome-wide association studies (GWAS) are an extraordinary source of information for revealing the common variations underlying complex human diseases. Until now, the large amount of data generated by these studies has not shown its full potential in identifying the molecular and functional framework needed to understand how a molecular system works. Taking a more specific perspective, this study focuses on identifying the commonly affected pathways of psychiatric diseases. The term pathway, as used in molecular biology, denotes a simplified model of a process within the cell or tissue. Recently, several GWAS datasets have become publicly available for various disease types, such as psychiatric, immune-related, neurodegenerative, and cardiovascular diseases. Studying each disease and comparing diseases pairwise to understand the behavior of the disease and the system would be time-consuming and exhausting. Instead of handling the results of these studies one by one, grouping diseases by target points is more efficient. This work aims to get one step closer to revealing the key points of diseases and targeting these points to develop personalized medicine approaches. Especially for complex diseases, a given drug does not have the same effect in every person. This paper presents the definition of molecular pathways, methods to identify disease-related pathways, and an approach to find pathways shared pairwise among psychiatric diseases. © 2019 Elsevier B.V., All rights reserved.
Generating Linguistic Advice for the Carbon Limit Adjustment Mechanism
(Springer Science and Business Media Deutschland GmbH, 2023-10-02) Fidan, Fatma Şener; Aydogan, Sena; Akay, Diyar
Linguistic summarization, a subfield of data mining, generates summaries in natural language for comprehending big data. This approach simplifies the incorporation of information into decision-making processes since no specialized knowledge is needed to understand the generated language summaries. The present research employs linguistic summarization to examine the circumstances surrounding the Carbon Border Adjustment Mechanism, one of the most significant regulations confronting nations exporting to the European Union, which will be adopted to support sustainable growth. In this paper, countries associated with several attributes and the product flow from exporting countries to European countries were defined as nodes and relations, respectively. Before the modeling phase, fuzzy c-means automatically identified fuzzy sets and membership degrees of attributes. During the modeling phase, summary forms were generated using polyadic quantifiers. A total of 1944 linguistic summaries were produced between exporting countries and European countries. Thirty-five summaries have a truth degree greater than or equal to the threshold value of 0.9, which is considered reasonable. The provision of natural language descriptions of the Carbon Border Adjustment Mechanism is intended to aid decision-makers and policymakers in their deliberations. © 2023 Elsevier B.V., All rights reserved.
