PubMed Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/397
9 results
Search Results
Article
Developing a Label Propagation Approach for Cancer Subtype Classification Problem (Tubitak Scientific & Technological Research Council Turkey, 2022-01-01)
Guner, Pinar; Bakir-Gungor, Burcu; Coskun, Mustafa
Cancer is a disease in which abnormal cells grow uncontrollably and invade other tissues. Several types of cancer have various subtypes with different clinical and biological implications. Based on these differences, treatment methods need to be customized. The identification of distinct cancer subtypes is an important problem in bioinformatics, since it can guide future precision medicine applications. In order to design targeted treatments, bioinformatics methods attempt to discover common molecular pathology of different cancer subtypes. Along this line, several computational methods have been proposed to discover cancer subtypes or to stratify cancer into informative subtypes. However, existing works do not consider the sparseness of data (genes having low degrees) and result in an ill-conditioned solution. To address this shortcoming, in this paper, we propose an alternative unsupervised method to stratify cancer patients into subtypes using applied numerical algebra techniques. More specifically, we applied a label propagation based approach to stratify somatic mutation profiles of colon, head and neck, uterine, bladder, and breast tumors. We evaluated the performance of our method by comparing it to the baseline methods.
Extensive experiments demonstrate that our approach performs well on tumor classification tasks, largely outperforming state-of-the-art unsupervised and supervised approaches.

Correction
Correction: Engineering Novel Features for Diabetes Complication Prediction Using Synthetic Electronic Health Records (Frontiers Media S.A., 2025-08-29)
Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik

Article; Citation - WoS: 2; Citation - Scopus: 4
RCE-IFE: Recursive Cluster Elimination With Intra-Cluster Feature Elimination (PeerJ Inc, 2025-02-07)
Kuzudisli, Cihan; Bakir-Gungor, Burcu; Qaqish, Bahjat; Yousef, Malik
The computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method that utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, microRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 among tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively.
RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively, when averaged over seven different metagenomics datasets, with RCE-IFE significantly reducing feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets. The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the features selected by RCE-IFE is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency.

Article; Citation - WoS: 5; Citation - Scopus: 5
Novel Antimicrobial Peptide Design Using Motif Match Score Representation (IEEE Computer Soc, 2024-11)
Soylemez, Ummu Gulsum; Yousef, Malik; Kesmen, Zulal; Bakir-Gungor, Burcu
Antimicrobial peptides (AMPs) have drawn the interest of researchers since they offer an alternative to traditional antibiotics in the fight against antibiotic resistance and they exhibit additional pharmaceutically significant properties. Recently, computational approaches attempt to reveal how antibacterial activity is determined from a machine learning perspective, and they aim to find the biological cues or characteristics that control antimicrobial activity by incorporating motif match scores.
This study is dedicated to the development of a machine learning framework aimed at devising novel antimicrobial peptide (AMP) sequences potentially effective against Gram-positive/Gram-negative bacteria. To classify newly generated sequences as either AMP or non-AMP, various classification models were trained. These novel sequences underwent validation utilizing the "DBAASP: strain-specific antibacterial prediction based on machine learning approaches and data on AMP sequences" tool. The findings presented herein represent a significant stride in this computational research, streamlining the process of AMP creation or modification within wet lab environments.

Letter
Epistatic Interactions Between Autoimmunity and Genetic Thrombophilia' Reply (Nature Publishing Group, 2015-01-14)
Bakir-Gungor, Burcu; Remmers, Elaine F.; Meguro, Akira; Mizuki, Nobuhisa; Kastner, Daniel L.; Gul, Ahmet; Sezerman, Osman Ugur

Article; Citation - WoS: 1; Citation - Scopus: 3
Engineering Novel Features for Diabetes Complication Prediction Using Synthetic Electronic Health Records (Frontiers Media S.A., 2025-04-14)
Voskergian, Daniel; Bakir-Gungor, Burcu; Yousef, Malik
Diabetes significantly affects millions of people worldwide, leading to substantial morbidity, disability, and mortality rates. Predicting diabetes-related complications from health records is crucial for early prevention and for the development of effective treatment plans. In order to predict four different complications of diabetes mellitus, i.e., retinopathy, chronic kidney disease, ischemic heart disease, and amputations, this study introduces a novel feature engineering approach. While developing the classification models, we utilize the XGBoost feature selection method and various supervised machine learning algorithms, including Random Forest, XGBoost, LogitBoost, AdaBoost, and Decision Tree. These models were trained on synthetic electronic health records (EHR) generated by dual-adversarial autoencoders.
These EHRs represent nearly 1 million synthetic patients derived from an authentic cohort of 979,308 individuals with diabetes. The variables considered in the models were the age range accompanied by chronic diseases that occur during patient visits starting from the onset of diabetes. Throughout the experiments, XGBoost and Random Forest demonstrated the best overall prediction performance. The final models, which are tailored to each complication and trained using our feature engineering approach, achieved an accuracy between 69% and 77% and an AUC between 77% and 84% using cross-validation, while the partitioned validation approach yielded an accuracy between 59% and 78% and an AUC between 66% and 85%. These findings imply that the performance of our method surpasses that of the traditional Bag-of-Features approach, highlighting the effectiveness of our approach in enhancing model accuracy and robustness.

Article; Citation - WoS: 11; Citation - Scopus: 10
Clinical and Molecular Evaluation of MEFV Gene Variants in the Turkish Population: A Study by the National Genetics Consortium (Springer Heidelberg, 2022-01-31)
Dundar, Munis; Fahrioglu, Umut; Yildiz, Saliha Handan; Bakir-Gungor, Burcu; Temel, Sehime Gulsun; Akin, Haluk; Erdem, Levent
Familial Mediterranean fever (FMF) is a monogenic autoinflammatory disorder with recurrent fever, abdominal pain, serositis, articular manifestations, erysipelas-like erythema, and renal complications as its main features. Caused by mutations in the MEditerranean FeVer (MEFV) gene, it mainly affects people of Mediterranean descent, with a higher incidence in the Turkish, Jewish, Arabic, and Armenian populations. As our understanding of FMF improves, it becomes clearer that we are facing a more complex picture of FMF with respect to its pathogenesis, penetrance, variant type (gain-of-function vs. loss-of-function), and inheritance.
In this study, MEFV gene analysis results and clinical findings of 27,504 patients from 35 universities and institutions in Turkey and Northern Cyprus are combined in an effort to provide better insight into the genotype-phenotype correlation and how a specific variant contributes to certain clinical findings in FMF patients. Our results may help better understand this complex disease and how the genotype may sometimes contribute to phenotype. Unlike many studies in the literature, our study investigated a broader symptomatic spectrum and the relationship between the genotype and phenotype data. In this sense, we aimed to guide all clinicians and academicians who work in this field to better establish a comprehensive data set for the patients. One of the main messages of our study is that a lack of uniformity in some clinical and demographic data of participants may become an obstacle in approaching FMF patients and understanding this complex disease.

Article; Citation - Scopus: 4
Aguhyper: A Hyperledger-Based Electronic Health Record Management Framework (PeerJ Inc, 2024-05-22)
Dedeturk, Beyhan Adanur; Bakir-Gungor, Burcu
The increasing importance of healthcare records, particularly given the emergence of new diseases, emphasizes the need for secure electronic storage and dissemination. With these records dispersed across diverse healthcare entities, their physical maintenance proves to be excessively time-consuming. The prevalent management of electronic healthcare records (EHRs) presents inherent security vulnerabilities, including susceptibility to attacks and potential breaches orchestrated by malicious actors. To tackle these challenges, this article introduces AguHyper, a secure storage and sharing solution for EHRs built on a permissioned blockchain framework. AguHyper utilizes Hyperledger Fabric and the InterPlanetary Distributed File System (IPFS).
Hyperledger Fabric establishes the blockchain network, while IPFS manages the off-chain storage of encrypted data, with hash values securely stored within the blockchain. Focusing on security, privacy, scalability, and data integrity, AguHyper's decentralized architecture eliminates single points of failure and ensures transparency for all network participants. The study develops a prototype to address gaps identified in prior research, providing insights into blockchain technology applications in healthcare. Detailed analyses of system architecture, AguHyper's implementation configurations, and performance assessments with diverse datasets are provided. The experimental setup incorporates CouchDB and the Raft consensus mechanism, enabling a thorough comparison of system performance against existing studies in terms of throughput and latency. This contributes significantly to a comprehensive evaluation of the proposed solution and offers a unique perspective on existing literature in the field.

Article
3Mont: A Multi-Omics Integrative Tool for Breast Cancer Subtype Stratification (Public Library Science, 2025-06-27)
Unlu Yazici, Miray; Marron, J. S.; Bakir-Gungor, Burcu; Zou, Fei; Yousef, Malik; Yazici, Miray Unlu
Breast Cancer (BRCA) is a heterogeneous disease, and it is one of the most prevalent cancer types among women. Developing effective treatment strategies that address diverse types of BRCA is crucial. Notably, among different BRCA molecular sub-types, Hormone Receptor negative (HR-) BRCA cases, especially Basal-like BRCA sub-types, lack estrogen and progesterone hormone receptors and they exhibit a higher tumor growth rate compared to HR+ cases. Improving survival time and predicting prognosis for distinct molecular profiles is substantial.
In this study, we propose a novel approach called 3-Multi-Omics Network and Integration Tool (3Mont), which integrates various -omics data by applying a grouping function, detecting pro-groups, and assigning scores to each pro-group using the Feature Importance Scoring (FIS) component. Following that, machine learning (ML) models are constructed based on the prominent pro-groups, which enable the extraction of promising biomarkers for distinguishing BRCA sub-types. Our tool allows users to analyze the collective behavior of features in each pro-group (biological group) utilizing ML algorithms. In addition, by constructing the pro-groups and equalizing the feature numbers in each pro-group using the FIS component, this process achieves a significant 20% speedup over the 3Mint tool. Contrary to conventional methods, 3Mont generates networks that illustrate the interplay of the prominent biomarkers of different -omics data. Accordingly, exploring the concerted actions of features in pro-groups facilitates understanding the dynamics of the biomarkers within the generated networks and developing effective strategies for better cancer sub-type stratification. The 3Mont tool, along with all supporting materials, can be found at https://github.com/malikyousef/3Mont.git.
