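Multidimensional Representation of RNA Secondary Structures and Its Applications for Pre-miRNA Detection (RNA İkincil Yapılarının Çok Boyutlu Gösterimi ve Pre-miRNA Tespiti için Uygulamaları)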
Date
2021
Publisher
TUBİTAK
Abstract
MicroRNAs (miRNAs) are post-transcriptional regulators of gene expression. A single miRNA can target hundreds of messenger RNAs (mRNAs), an mRNA can be targeted by several different miRNAs, and a single miRNA may have multiple binding sites within one mRNA sequence. Investigating miRNAs experimentally is therefore quite complicated, and machine learning (ML) is frequently used to overcome such challenges. The outcome of an ML analysis depends largely on the quality of the input data and on how well the features describe the data. More than 1000 features have previously been suggested for miRNAs. In this project, we aimed to define new features representing RNA secondary structure, together with a dynamic, multidimensional graphical representation that provides high accuracy values. In this study, a new and easily updateable approach for ML-based miRNA prediction was developed. Thousands of models were created by classifying known human miRNAs and pseudo hairpins with various classifiers such as random forest (RF), support vector machine (SVM), and multilayer perceptron (MLP). Although the method was built on human data, the best model was tested on non-human hairpins from public databases such as miRBase and MirGeneDB and produced high scores. To show its effectiveness on different data, the method was also used in differential expression prediction analysis. At this stage, the results obtained from the analysis of a data set measuring the impact of SARS-CoV-2 infection have been published.
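The abstract does not list the features defined in the project. Purely as an illustration of what "features representing the RNA secondary structure" can look like, the sketch below derives a few simple descriptors from a hairpin written in dot-bracket notation. The function names, the chosen descriptors, and the example hairpin are all assumptions made for this sketch; they are not the project's feature set or its graphical representation.

```python
def parse_pairs(structure):
    """Return sorted (i, j) base-pair indices from a dot-bracket string."""
    stack, pairs = [], []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), idx))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def hairpin_features(sequence, structure):
    """Compute a small, hypothetical feature dictionary for one hairpin."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    n = len(structure)
    pairs = parse_pairs(structure)
    pair_set = set(pairs)

    # The pair with the smallest span encloses only unpaired bases,
    # so its interior is the size of the smallest hairpin loop.
    loop_size = min((j - i - 1 for i, j in pairs), default=0)

    # Longest stem: longest run of directly stacked pairs
    # (i, j), (i + 1, j - 1), ... with no bulge in between.
    longest_stem = 0
    for i, j in pairs:
        if (i - 1, j + 1) in pair_set:
            continue  # this pair is inside a stem that started earlier
        run = 1
        while (i + run, j - run) in pair_set:
            run += 1
        longest_stem = max(longest_stem, run)

    gc = sum(1 for base in sequence.upper() if base in "GC")
    return {
        "length": n,
        "n_pairs": len(pairs),
        "paired_fraction": 2 * len(pairs) / n if n else 0.0,
        "loop_size": loop_size,
        "longest_stem": longest_stem,
        "gc_content": gc / n if n else 0.0,
    }


if __name__ == "__main__":
    # Made-up toy hairpin, not taken from miRBase or MirGeneDB.
    print(hairpin_features("GGGAAAACCC", "(((....)))"))
```

A dictionary like this, computed for each known miRNA hairpin and each pseudo hairpin, is the kind of fixed-length feature vector that classifiers such as RF, SVM, or MLP take as input.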
Keywords
miRNA, prediction, machine learning, model
Start Page
1
End Page
24