Language Modeling for Turkish Text and Speech Processing
Date
2018
Authors
Arısoy, Ebru
Publisher
Springer
Abstract
This chapter presents an overview of language modeling, followed by a discussion of the challenges in Turkish language modeling. Sub-lexical units are commonly used to reduce the high out-of-vocabulary (OOV) rates of morphologically rich languages. These units are obtained either by morphological analysis or by unsupervised statistical techniques. For Turkish, morphological analysis yields word segmentations in both lexical and surface forms, either of which can be used as sub-lexical language modeling units. Discriminative language models, which outperform generative models on various tasks, allow for easy integration of morphological and syntactic features into language modeling. The chapter reviews both generative and discriminative approaches to Turkish language modeling.
Keywords
Language modeling
Citation
Arısoy, E. & Saraçlar, M. (2018). Language modeling for Turkish text and speech processing. In Turkish Natural Language Processing (pp. 69-92). Springer.
WoS Q
N/A
Scopus Q
N/A
Source
Turkish Natural Language Processing
Start Page
69
End Page
92