Scopus-Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395
Article
Fine-Tuning Large Language Models for Turkish Flutter Code Generation
(Sakarya University, 2025) Uluirmak, B.A.; Kurban, R.
The rapid advancement of large language models (LLMs) for code generation has largely centered on English programming queries. This paper focuses on a low-resource language scenario, specifically Turkish, in the context of Flutter mobile app development. Two representative LLMs (a 4B-parameter multilingual model and a 3B-parameter code-specialized model) are fine-tuned on a new Turkish question-and-answer dataset for Flutter/Dart. Fine-tuning with parameter-efficient techniques yields substantial improvements in code generation quality: Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE-L), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Bidirectional Encoder Representations from Transformers Score (BERTScore), and CodeBLEU scores all increase significantly. The rate of correct solutions rose from ~30–70% for the base models to 80–90% after fine-tuning. An analysis of the performance trade-offs between the models shows that the multilingual model slightly outperforms the code-focused model in accuracy after fine-tuning, while the code-focused model offers faster inference. These results demonstrate that, even with very limited non-English training data, customizing LLMs can bridge the gap in code generation, enabling high-quality assistance for Turkish developers comparable to that available in English. The dataset was released on GitHub to facilitate further research in multilingual code generation. © 2025, Sakarya University. All rights reserved.

Conference Object
Citation - WoS: 2; Citation - Scopus: 2
Fine Tuning DeepSeek and Llama Large Language Models with LoRA
(IEEE, 2025-06-25) Uluirmak, Bugra Alperen; Kurban, Rifat
In this paper, two large language models (DeepSeek R1 Distill 8B and Llama3.1 8B) were fine-tuned with Low-Rank Adaptation (LoRA) on a Turkish dataset. Training was performed on Google Colab using an A100 40 GB GPU, while testing was carried out on Runpod using an L4 24 GB GPU. The 64.6-thousand-row dataset was converted into question-answer pairs from the fields of agriculture, education, law, and sustainability. In the testing phase, each model was asked 40 test questions via the Ollama web UI, and the results were supported with graphs and detailed tables. Fine-tuning was observed to improve the performance of the existing language models.
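For readers unfamiliar with LoRA, the idea named in the abstract above can be sketched in a few lines: the pretrained weight matrix W stays frozen, and only two small matrices A and B are trained, so the layer output becomes W x + (alpha / r) B (A x). The sketch below is illustrative only. The rank, scaling factor, matrix sizes, and the function names lora_forward and lora_param_counts are assumptions chosen for the example, not values or code reported by the authors.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha):
    """Frozen base projection plus the scaled low-rank update.

    W: d_out x d_in frozen weights; A: r x d_in; B: d_out x r.
    alpha is an assumed scaling hyperparameter.
    """
    r = len(A)                        # LoRA rank
    base = matvec(W, x)               # output of the frozen layer
    update = matvec(B, matvec(A, x))  # low-rank path B(Ax)
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, r):
    """Return (full fine-tuning parameters, LoRA trainable parameters)."""
    return d_out * d_in, r * (d_in + d_out)


# Toy sizes, chosen only for illustration.
random.seed(0)
d_out, d_in, r = 6, 8, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so training begins from the base model
x = [random.gauss(0, 1) for _ in range(d_in)]

y = lora_forward(W, A, B, x, alpha=4)
full, lora = lora_param_counts(4096, 4096, 8)
print(f"trainable fraction for one 4096x4096 layer at rank 8: {lora / full:.4%}")
```

Because B is initialized to zero, the adapted layer initially reproduces the base model exactly, and the trainable parameter count grows with r (d_in + d_out) rather than d_in * d_out, which is what makes fine-tuning 8B-parameter models on a single GPU practical.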
