Browsing by Author "Coskun, Mustafa"

  • Article
    Citation - WoS: 3
    Citation - Scopus: 3
    Consensus Embedding for Multiple Networks: Computation and Applications
    (Cambridge Univ Press, 2022) Li, Mengzhen; Coskun, Mustafa; Koyuturk, Mehmet
    Machine learning applications on large-scale network-structured data commonly encode network information in the form of node embeddings. Network embedding algorithms map the nodes into a low-dimensional space such that the nodes that are "similar" with respect to network topology are also close to each other in the embedding space. Real-world networks often have multiple versions or can be "multiplex" with multiple types of edges with different semantics. For such networks, computation of consensus embeddings based on the node embeddings of individual versions can be useful for various reasons, including privacy, efficiency, and effectiveness of analyses. Here, we systematically investigate the performance of three dimensionality reduction methods in computing consensus embeddings on networks with multiple versions: singular value decomposition, variational auto-encoders, and canonical correlation analysis (CCA). Our results show that (i) CCA outperforms other dimensionality reduction methods in computing consensus embeddings, (ii) in the context of link prediction, consensus embeddings can be used to make predictions with accuracy close to that provided by embeddings of integrated networks, and (iii) consensus embeddings can be used to improve the efficiency of combinatorial link prediction queries on multiple networks by multiple orders of magnitude.
  • Article
    Developing a Label Propagation Approach for Cancer Subtype Classification Problem
    (Tubitak Scientific & Technological Research Council Turkey, 2022) Guner, Pinar; Bakir-Gungor, Burcu; Coskun, Mustafa
    Cancer is a disease in which abnormal cells grow uncontrollably and invade other tissues. Several types of cancer have various subtypes with different clinical and biological implications. Based on these differences, treatment methods need to be customized. The identification of distinct cancer subtypes is an important problem in bioinformatics, since it can guide future precision medicine applications. In order to design targeted treatments, bioinformatics methods attempt to discover the common molecular pathology of different cancer subtypes. Along this line, several computational methods have been proposed to discover cancer subtypes or to stratify cancer into informative subtypes. However, existing works do not consider the sparseness of the data (genes having low degrees) and result in an ill-conditioned solution. To address this shortcoming, in this paper, we propose an alternative unsupervised method to stratify cancer patients into subtypes using applied numerical algebra techniques. More specifically, we applied a label propagation-based approach to stratify somatic mutation profiles of colon, head and neck, uterine, bladder, and breast tumors. We evaluated the performance of our method by comparing it to the baseline methods. Extensive experiments demonstrate that our approach performs well on tumor classification tasks, largely outperforming the state-of-the-art unsupervised and supervised approaches.
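
For readers new to label propagation, the following is a minimal generic sketch of the idea, not the method from the paper (whose details the abstract does not give). It iterates F <- alpha * S * F + (1 - alpha) * Y on a toy undirected graph, where S is the symmetrically normalized adjacency matrix. The function name `propagate_labels`, the toy graph, and the value of `alpha` are all assumptions made for illustration.

```python
import math

def propagate_labels(adj, seeds, alpha=0.8, iters=200):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y; return the top class per node.

    adj:   dict node -> set of neighbours (undirected, no isolated nodes)
    seeds: dict node -> class index for the few labelled nodes
    """
    nodes = sorted(adj)
    n_classes = max(seeds.values()) + 1
    deg = {u: len(adj[u]) for u in nodes}
    # Y holds one-hot seed labels; unlabelled nodes are all-zero rows.
    Y = {u: [1.0 if seeds.get(u) == c else 0.0 for c in range(n_classes)]
         for u in nodes}
    F = {u: row[:] for u, row in Y.items()}
    for _ in range(iters):
        new_F = {}
        for u in nodes:
            row = []
            for c in range(n_classes):
                # S[u][v] = 1 / sqrt(deg(u) * deg(v)) for each edge (u, v)
                spread = sum(F[v][c] / math.sqrt(deg[u] * deg[v]) for v in adj[u])
                row.append(alpha * spread + (1 - alpha) * Y[u][c])
            new_F[u] = row
        F = new_F
    return {u: max(range(n_classes), key=lambda c: F[u][c]) for u in nodes}

# Toy graph (hypothetical data): two triangles joined by the edge 2-3.
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
labels = propagate_labels(toy, seeds={0: 0, 5: 1})
```

With one seed per triangle, each unlabelled node takes the class of the seed it is more strongly connected to.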
  • Conference Object
    Expanding Label Sets for Graph Convolutional Networks
    (Springer International Publishing AG, 2025) Coskun, Mustafa; Grama, Ananth; Bakir-Gungor, Burcu; Koyuturk, Mehmet
    In recent years, Graph Convolutional Networks (GCNs) and their variants have been widely utilized in learning tasks that involve graphs. These tasks include recommendation systems and node classification, among many others. In the node classification problem, the input is a graph in which the edges represent the association between pairs of nodes, multi-dimensional feature vectors are associated with the nodes, and some of the nodes in the graph have "known" labels. The objective is to predict the labels of the nodes that are not labeled, using the nodes' features in conjunction with graph topology. While GCNs have been successfully applied to this problem, the caveats that they inherit from traditional deep learning models pose significant challenges to broad utilization of GCNs in node classification. One such caveat is that training a GCN requires a large number of labeled training instances, which are often not available in realistic settings. To remedy this requirement, state-of-the-art methods leverage network diffusion-based approaches to propagate labels across the network before training GCNs. However, these approaches ignore the tendency of network diffusion methods to conflate proximity with centrality, resulting in the propagation of labels to the nodes that are well-connected in the graph. To address this problem, here we present an alternate approach, namely LExiCoL, which extrapolates node labels in GCNs in the following three steps: (i) clustering of the network to identify communities, (ii) use of network diffusion algorithms to quantify the proximity of each node to the communities, thereby obtaining a low-dimensional topological profile for each node, (iii) comparing these topological profiles to identify nodes that are most similar to the labeled nodes. Testing on three large-scale real-world networks, we systematically evaluate the performance of the proposed algorithm and show that our approach outperforms existing methods for wide ranges of parameter values.
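
The three steps named in the abstract can be pictured with a small sketch. This is not the LExiCoL implementation: the communities are assumed to be given (standing in for step (i)), random walk with restart is used as one possible diffusion method for step (ii), and cosine similarity is one possible way to compare profiles in step (iii). All function names, the restart probability, and the toy graph are hypothetical.

```python
import math

def rwr(adj, restart_set, alpha=0.15, iters=100):
    """Random walk with restart that restarts uniformly onto `restart_set`."""
    nodes = sorted(adj)
    r = {u: (1.0 / len(restart_set) if u in restart_set else 0.0) for u in nodes}
    p = dict(r)
    for _ in range(iters):
        nxt = {u: alpha * r[u] for u in nodes}
        for u in nodes:
            share = (1 - alpha) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p

def topological_profiles(adj, communities):
    """Step (ii): one diffusion per community gives each node a k-dim profile."""
    scores = [rwr(adj, set(c)) for c in communities]
    return {u: [s[u] for s in scores] for u in adj}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def expand_labels(adj, communities, seeds, per_seed=1):
    """Step (iii): copy each seed label to its most profile-similar unlabelled nodes."""
    prof = topological_profiles(adj, communities)
    new_labels = dict(seeds)
    for s, lab in seeds.items():
        candidates = [u for u in sorted(adj) if u not in seeds]
        candidates.sort(key=lambda u: cosine(prof[u], prof[s]), reverse=True)
        for u in candidates[:per_seed]:
            new_labels.setdefault(u, lab)  # never overwrite an existing label
    return new_labels

# Toy graph (hypothetical data): two triangles joined by the edge 2-3.
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
# Assumed output of a clustering step; any community detection method could supply it.
communities = [[0, 1, 2], [3, 4, 5]]
expanded = expand_labels(toy, communities, seeds={0: 0, 5: 1})
```

Because profiles describe proximity to whole communities and not to individual nodes, a node can match a labelled node without being close to any hub, which is the bias the abstract sets out to avoid.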
  • Article
    Citation - WoS: 3
    Citation - Scopus: 6
    Fast Computation of Katz Index for Efficient Processing of Link Prediction Queries
    (Springer, 2021) Coskun, Mustafa; Baggag, Abdelkader; Koyuturk, Mehmet
    Network proximity computations are among the most common operations in various data mining applications, including link prediction and collaborative filtering. A common measure of network proximity is the Katz index, which has been shown to be among the best-performing path-based link prediction algorithms. With the emergence of very large network databases, such proximity computations become an important part of query processing in these databases. Consequently, significant effort has been devoted to developing algorithms for efficient computation of the Katz index between a given pair of nodes or between a query node and every other node in the network. Here, we present LRC-Katz, an algorithm based on indexing and low-rank correction to accelerate Katz index-based network proximity queries. Using a variety of very large real-world networks, we show that LRC-Katz outperforms the fastest existing method, Conjugate Gradient, for a wide range of parameter values. Taking advantage of the acceleration in the computation of the Katz index, we propose a new link prediction algorithm that exploits the locality of networks that are encountered in practical applications. Our experiments show that the resulting link prediction algorithm drastically outperforms state-of-the-art link prediction methods based on the vanilla and truncated Katz.
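
As background, the Katz index between a query node q and a node u is the sum over walk lengths l >= 1 of alpha^l times the number of walks of length l between them. The sketch below computes it with a plain fixed-point iteration. It illustrates the definition only and is not LRC-Katz or the Conjugate Gradient baseline; the function name, `alpha`, and the toy graph are assumptions.

```python
def katz_scores(adj, query, alpha=0.1, tol=1e-12, max_iter=1000):
    """Katz index between `query` and every node of an undirected graph.

    adj: dict node -> set of neighbours.
    Converges only when alpha < 1 / (largest eigenvalue of the adjacency matrix).
    """
    x = {u: 0.0 for u in adj}
    for _ in range(max_iter):
        # Fixed-point step: x <- alpha * A * (x + e_query)
        nxt = {
            u: alpha * sum(x[v] + (1.0 if v == query else 0.0) for v in adj[u])
            for u in adj
        }
        delta = max(abs(nxt[u] - x[u]) for u in adj)
        x = nxt
        if delta < tol:
            break
    return x

# Toy path graph 0 - 1 - 2 (hypothetical data).
path = {0: {1}, 1: {0, 2}, 2: {1}}
scores = katz_scores(path, query=0)
```

On this path graph the series can be summed by hand: the score of node 1 is alpha / (1 - 2 * alpha^2) and the score of node 2 is alpha^2 / (1 - 2 * alpha^2), which gives a quick sanity check on the iteration.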
  • Article
    A High Order Proximity Measure for Linear Network Embedding
    (2022) Coskun, Mustafa
    Learning network embeddings is of great importance for formulating and solving many problems that require network analysis. In this context, the network embedding learning problem has been studied extensively in recent years in order to uncover the latent relationships among the nodes of a network. Uncovering these latent relationships leads to better solutions for learning problems such as link prediction, clustering, and classification. Although different approaches and algorithms have been developed for learning network embeddings, matrix factorization-based algorithms have attracted great interest from researchers because they are fast. Matrix factorization-based network embedding generally relies on high-order proximity measures, such as random walk with restart (RWR) and Katz. However, the network similarity matrices built from these measures can yield eigenvectors corresponding to zero eigenvalues in the matrix factorization. This causes the learned network embedding to be incorrect. To overcome this problem, in this paper we propose an approach based on shift-and-invert. Taking link prediction as the base problem, we applied the algorithm we developed to three real datasets and observed that it substantially improves the performance of the existing matrix factorization-based algorithm across all evaluation metrics.
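
For readers unfamiliar with shift-and-invert, the standard spectral identity behind the technique is sketched below. Here A stands for a generic square matrix, \sigma for a chosen shift that is not an eigenvalue of A, and (\lambda, v) for an eigenpair of A. These symbols are illustrative assumptions and are not taken from the paper.

```latex
A v = \lambda v
\quad\Longrightarrow\quad
(A - \sigma I)^{-1} v = \frac{1}{\lambda - \sigma}\, v ,
\qquad \sigma \notin \operatorname{spec}(A).
```

The eigenvectors are unchanged, while the eigenvalues of A closest to \sigma become the largest in magnitude for the transformed matrix. How the paper chooses the shift and applies the transformation to proximity matrices is not stated in the abstract.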
  • Article
    Citation - WoS: 4
    Citation - Scopus: 4
    Integrated Querying and Version Control of Context-Specific Biological Networks
    (Oxford Univ Press, 2020) Cowman, Tyler; Coskun, Mustafa; Grama, Ananth; Koyuturk, Mehmet
    Motivation: Biomolecular data stored in public databases is increasingly specialized to organisms, context/pathology and tissue type, potentially resulting in significant overhead for analyses. These networks are often specializations of generic interaction sets, presenting opportunities for reducing storage and computational cost. Therefore, it is desirable to develop effective compression and storage techniques, along with efficient algorithms and a flexible query interface capable of operating on compressed data structures. Current graph databases offer varying levels of support for network integration. However, these solutions do not provide efficient methods for the storage and querying of versioned networks. Results: We present VerTIoN, a framework consisting of novel data structures and associated query mechanisms for integrated querying of versioned context-specific biological networks. As a use case for our framework, we study network proximity queries in which the user can select and compose a combination of tissue-specific and generic networks. Using our compressed version tree data structure, in conjunction with state-of-the-art numerical techniques, we demonstrate real-time querying of large network databases. Conclusion: Our results show that it is possible to support flexible queries defined on heterogeneous networks composed at query time while drastically reducing response time for multiple simultaneous queries. The flexibility offered by VerTIoN in composing integrated network versions opens significant new avenues for the utilization of the ever-increasing volume of context-specific network data in a broad range of biomedical applications. Availability and Implementation: VerTIoN is implemented as a C++ library and is available at http://compbio.case.edu/omics/software/vertion and https://github.com/tjcowman/vertion. Contact: tyler.cowman@case.edu
  • Article
    Citation - WoS: 1
    Citation - Scopus: 1
    Intelligent Traffic Light Systems Using Edge Flow Predictions
    (Elsevier, 2024) Thahir, Adam Rizvi; Coskun, Mustafa; Kilic, Sultan Kubra; Gungor, Vehbi Cagri
    In this paper, we propose a novel graph-based semi-supervised learning approach for traffic light management at multiple intersections. Specifically, the basic premise of our paper is that if we know some of the occupied roads and can predict which roads will become congested, we can dynamically change the traffic lights at the intersections connected to those roads. Comparative performance evaluations show that the proposed approach achieves comparable average vehicle waiting times while learning adequate traffic light configurations for all intersections within a few seconds, whereas a deep learning-based approach needs a few days of training to learn similar configurations.
  • Article
    Citation - WoS: 3
    Citation - Scopus: 5
    Intrinsic Graph Topological Correlation for Graph Convolutional Network Propagation
    (Elsevier, 2023) Coskun, Mustafa
    Recently, Graph Convolutional Networks (GCNs) and their variants have become popular for learning graph-related tasks. These tasks include link prediction, node classification, and node embedding, among many others. In the node classification problem, the input is a graph with some labeled nodes and the features associated with these nodes, and the objective is to predict the labels of the unlabeled nodes. While GCNs have been successfully applied to this problem, some caveats that are inherited from classical deep learning remain unsolved. One such inherited caveat is that, during classification, GCNs only consider the nodes that are a few neighbors away from the labeled nodes. However, considering only nodes a few steps away cannot effectively exploit the underlying graph topological information. To remedy this problem, the state-of-the-art methods leverage network diffusion approaches, such as personalized PageRank and its variants, to fully account for the graph topology. However, these approaches overlook the fact that network diffusion methods favour high-degree nodes in the graph, resulting in the propagation of the labels to the unlabeled hub nodes. To overcome this bias, in this paper, we propose to utilize a dimensionality reduction technique, which is conjugate with personalized PageRank. Testing on four real-world networks that are commonly used in benchmarking GCNs' performance for the node classification task, we systematically evaluate the performance of the proposed methodology and show that our approach outperforms existing methods for wide ranges of parameter values. Since our method requires only a few training epochs, it relieves the heavy training burden of GCNs. The source code of the proposed method is freely available at https://github.com/mustafaCoskunAgu/ScNP/blob/master/TRJMain.m.
  • Master Thesis
    Developing a Label Propagation Approach for the Cancer Subtype Identification Problem (Kanser Alt Tipi Tanımlama Problemi için Bir Etiket Yayma Yaklaşımı Geliştirme)
    (Tubitak Scientific & Technological Research Council Turkey, 2022) Guner, Pinar; Bakir-Gungor, Burcu; Coskun, Mustafa
    The term cancer is used to describe diseases in which abnormal cells grow out of control and invade other tissues. There are many types of cancer, and many of them have various subtypes with different clinical and biological implications. These differences indicate that different treatment methods are needed for different cancer subtypes. Discovering cancer subtypes is an important problem in bioinformatics because it can support the development of personalized medicine. Knowing the subtype of a cancer is useful for determining treatment steps and prognosis. Computational bioinformatics methods help analyze cancer in order to design targeted treatments by revealing the common molecular pathology of different cancer subtypes. To date, various computational methods have been proposed to discover cancer subtypes or to stratify cancer into informative subtypes. However, existing studies do not take the sparsity of the data into account and result in an ill-conditioned (non-invertible) solution. To address this shortcoming, in this thesis we propose an alternative unsupervised computational method that stratifies cancer into subtypes using applied numerical algebra techniques. More specifically, we applied this label propagation-based approach to classify somatic mutation profiles of colon, head and neck, uterine, bladder, and breast tumors. We then evaluated the performance of our method by comparing it with baseline methods. Extensive experiments show that our approach performs tumor classification tasks with high accuracy, largely outperforming modern unsupervised and supervised approaches.
  • Article
    Citation - WoS: 46
    Citation - Scopus: 59
    Node Similarity-Based Graph Convolution for Link Prediction in Biological Networks
    (Oxford Univ Press, 2021) Coskun, Mustafa; Koyuturk, Mehmet
    Background: Link prediction is an important and well-studied problem in network biology. Recently, graph representation learning methods, including Graph Convolutional Network (GCN)-based node embedding, have drawn increasing attention in link prediction. Motivation: An important component of GCN-based network embedding is the convolution matrix, which is used to propagate features across the network. Existing algorithms use the degree-normalized adjacency matrix for this purpose, as this matrix is closely related to the graph Laplacian, capturing the spectral properties of the network. In parallel, it has been shown that GCNs with a single layer can generate more robust embeddings by reducing the number of parameters. Laplacian-based convolution is not well suited to single-layered GCNs, as it limits the propagation of information to immediate neighbors of a node. Results: Capitalizing on the rich literature on unsupervised link prediction, we propose using node similarity-based convolution matrices in GCNs to compute node embeddings for link prediction. We consider eight representative node-similarity measures (Common Neighbors, Jaccard Index, Adamic-Adar, Resource Allocation, Hub-Depressed Index, Hub-Promoted Index, Sorenson Index and Salton Index) for this purpose. We systematically compare the performance of the resulting algorithms against GCNs that use the degree-normalized adjacency matrix for convolution, as well as other link prediction algorithms. In our experiments, we use three link prediction tasks involving biomedical networks: drug-disease association prediction, drug-drug interaction prediction and protein-protein interaction prediction. Our results show that node similarity-based convolution matrices significantly improve the link prediction performance of GCN-based embeddings. Conclusion: As sophisticated machine-learning frameworks are increasingly employed in biological applications, historically well-established methods can be useful in making a head-start.
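
The eight node-similarity measures listed in the abstract have widely used textbook definitions based on shared neighbours. The sketch below computes them for a single node pair. It shows only the pairwise scores, not how the paper assembles them into a GCN convolution matrix; the function name, dictionary keys, and toy graph are assumptions.

```python
import math

def similarity_scores(adj, u, v):
    """Eight neighbourhood-overlap scores for the distinct node pair (u, v).

    adj: dict node -> set of neighbours (undirected; u and v must not be isolated).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    cn = len(common)
    du, dv = len(nu), len(nv)
    return {
        "common_neighbors": cn,
        "jaccard": cn / len(nu | nv),
        # Adamic-Adar down-weights shared neighbours that have many connections.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
        "hub_depressed": cn / max(du, dv),
        "hub_promoted": cn / min(du, dv),
        "sorensen": 2 * cn / (du + dv),
        "salton": cn / math.sqrt(du * dv),
    }

# Toy graph (hypothetical data) with edges a-c, a-d, b-c, b-d, b-e, d-e.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
s = similarity_scores(graph, "a", "b")
```

Nodes a and b share the neighbours c and d, so all eight scores are non-zero even though a and b are not directly connected.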
  • Conference Object
    Offer : Referees Suggester for the Journal Editors
    (IEEE, 2019) Coskun, Mustafa; Hacilar, Hilal; Gezer, Cengiz; Gungor, Vehbi Cagri
    Assigning appropriate referees to a journal or conference paper is a vital task for many reasons, including enhancing the quality and reliability of the venue and ensuring fair judgement of the papers, among many others. When assigning referees to papers, the editors of a journal need to find suitable referees who are both related to the field of the given paper and have no conflict of interest with the authors of the paper. In editorial practice, this referee assignment process is carried out manually, i.e., the editor needs to find the most suitable referees for the paper via a search engine and manually filter out all candidates who are unrelated to the paper or have a conflict of interest with its authors. Clearly, such a manual referee search is tedious and time-consuming for the editors. In this paper, we present an alternate automated approach to the referee assignment problem using an intrinsic random walk with restart proximity measure. In our experiments based on a comprehensive DBLP network, we show that our approach, called OFFER, significantly outperforms the state-of-the-art random walk with restart-based method.
  • Article
    Topological Feature Generation for Link Prediction in Biological Networks
    (PeerJ Inc, 2023) Temiz, Mustafa; Bakir-Gungor, Burcu; Sahan, Pinar Guner; Coskun, Mustafa
    Graph or network embedding is a powerful method for extracting missing or potential information from interactions between nodes in biological networks. Graph embedding methods learn representations of nodes and interactions in a graph with low-dimensional vectors, which facilitates research to predict potential interactions in networks. However, most graph embedding methods suffer from high computational costs in the form of high computational complexity of the embedding methods and learning times of the classifier, as well as the high dimensionality of complex biological networks. To address these challenges, in this study, we use the Chopper algorithm as an alternative approach to graph embedding, which accelerates the iterative processes and thus reduces the running time of the iterative algorithms for three different (nervous system, blood, heart) undirected protein-protein interaction (PPI) networks. Due to the high dimensionality of the matrix obtained after the embedding process, the data are transformed into a smaller representation by applying feature regularization techniques. We evaluated the performance of the proposed method by comparing it with state-of-the-art methods. Extensive experiments demonstrate that the proposed approach reduces the learning time of the classifier and performs better in link prediction. We have also shown that the proposed embedding method is faster than state-of-the-art methods on three different PPI datasets.