Research Article · Open Access · Orclever Native
Natural Language Processing Based Poet Recognition with Supervised Learning on Turkish Poetry Dataset
1 Harran University
2 Erciyes University
Published: May 31, 2024
DOI: 10.56038/oprd.v4i1.470
Vol. 4, No. 1 · pp. 115–122
Abstract
Studies based on natural language processing have become popular in recent years, and the number of studies on Turkish is increasing. Author classification is the problem of determining whether an anonymous text belongs to one of a group of popular authors. The problem is motivated by the idea that each author's work reflects basic features of that author's intellectual vocabulary, so it should be possible to distinguish between authors. In this study, 50 poems by 5 different poets from Turkish literature were collected to form a dataset. Experiments were performed on the dataset using 9 different classification methods. This is a preliminary study intended to serve as a basis for future work.
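To make the classification task concrete, the following is a minimal sketch of one Bayesian classifier of the kind listed in the keywords: a multinomial Naive Bayes model over word counts with add-one smoothing. It is not the authors' implementation. The class name `NaiveBayesPoetClassifier`, the `tokenize` helper, the labels `poet_a` and `poet_b`, and the training lines are all hypothetical and invented for illustration; none of them come from the paper's dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (assumption: a real pipeline
    would need Turkish-aware casing and morphological processing)."""
    return text.lower().split()


class NaiveBayesPoetClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of texts
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior: share of training texts written by this poet.
            score = math.log(n_docs / self.total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                # Add-one smoothing keeps unseen (word, poet) pairs nonzero.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented placeholder lines, not taken from the paper's dataset.
train_texts = [
    "sea wind salt harbor sea",
    "harbor gulls sea night wind",
    "mountain snow pine road mountain",
    "pine smoke road snow village",
]
train_labels = ["poet_a", "poet_a", "poet_b", "poet_b"]

model = NaiveBayesPoetClassifier().fit(train_texts, train_labels)
prediction = model.predict("wind over the harbor and the sea")
print(prediction)
```

The sketch relies on the same intuition as the abstract: a poet's characteristic vocabulary shifts the word probabilities, so an unseen text is assigned to the poet whose word distribution explains it best. A study on real poems would also need held-out evaluation, which is omitted here.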
Keywords
Natural language processing, text mining, supervised learning, support vector machine, Bayesian classifiers, decision trees, random forest
Cite This Article
Korkmaz, S., Köylü, F. (2024). Natural Language Processing Based Poet Recognition with Supervised Learning on Turkish Poetry Dataset. *Orclever Proceedings of Research and Development*, 4(1), 115-122. https://doi.org/10.56038/oprd.v4i1.470
Bibliographic Info
Journal: Orclever Proceedings of Research and Development
Volume: 4
Issue: 1
Pages: 115–122
Published: May 31, 2024
eISSN: 2980-020X