A Survey on Open-Source Turkish Text-to-Speech

Datasets, Models, and the Absence of Emotion

2026

PART 1 OF 3

This survey is the first step toward our Turkish emotional TTS model.

We are building the dataset and model that this research shows is missing.

20 Papers Reviewed

10 Datasets Analyzed

21 Models Verified

0 Emotional Speech Datasets

Abstract

Turkish is an agglutinative language spoken by over 83 million people, yet it remains underserved in text-to-speech synthesis. This survey reviews 20 peer-reviewed papers, 10 community speech datasets on HuggingFace, and 21 open-source Turkish-specific TTS models. We evaluate each resource on three axes: dataset openness, model openness, and emotion support. Our findings show that no peer-reviewed work provides Turkish emotional speech data; the highest-quality system (MOS 4.39) is proprietary; 80% of HuggingFace datasets lack licenses; and community contributors have adopted modern architectures (F5-TTS, Orpheus) that have not appeared in academic Turkish TTS publications. We present these findings without prescriptive bias and let the evidence speak for itself.

Key Findings

No Emotional Speech Data

Across all 20 papers and 10 datasets, not a single resource provides Turkish speech annotated with emotion labels. Two community models (Orpheus) offer non-verbal cue tags (laugh, sigh), but no categorical emotion conditioning exists.

Best System is Closed

The highest reported MOS (4.39) belongs to Turkcell's proprietary system trained on 63 hours from a professional voice actress. No data, code, or weights were released. The best open effort achieves MOS 4.49 but with incomplete artifacts.

Licensing is Unreliable

Only 2 of 10 HuggingFace datasets declare any license. The largest dataset is labeled CC-BY-SA-3.0, although its effective license is CC-BY-NC-SA-3.0. Multiple models carry license metadata that conflicts with the restrictions stated in their model cards.

80% Lack Documentation

Eight of ten datasets have empty READMEs with no methodology, source attribution, or quality metrics. Independent verification of data quality is impossible for most resources.

Academic vs. Community Gap

Academic papers use Tacotron 2 and FastSpeech 2 (2023), while community contributors deploy F5-TTS, Orpheus, LLaSA, and Dia (2025). However, none of the 21 community models report formal MOS scores.

Turkish Lags Behind Tatar

TatarTTS provides 70 hours with MOS 4.54-4.65 under CC-BY-4.0. Turkish, with 15x more speakers, has no equivalent open resource combining dataset, model, and evaluation.
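Several of the findings above are stated as MOS values, and none of the community models report one. As a minimal sketch of how such a score is usually derived, the snippet below averages listener ratings on the standard 1-5 scale and attaches a normal-approximation 95% confidence interval. The ratings and the function name `mean_opinion_score` are hypothetical and are not taken from any of the surveyed papers, whose exact evaluation protocols may differ.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = statistics.mean(ratings)
    # Normal approximation: z * sample standard deviation / sqrt(n)
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from ten listeners (illustrative only, not survey data).
example_ratings = [5, 4, 4, 5, 4, 3, 5, 4, 4, 5]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean matters when comparing systems whose scores differ by only a few hundredths, since small listener panels can produce intervals wider than that difference.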

Open-Source Turkish Speech Datasets

| Dataset | Samples | Hours | kHz | License | Speakers | Emotion | Notes |
|---|---|---|---|---|---|---|---|
| Appenlimited/700h-tr | 2,000 | 5.6 | 16 | Unknown | ~20 | No | Mislabeled (says 700h); commercial sample |
| Anilosan15/Turkish_TTS | 30,606 | ~33 | 48 | None | 1 | No | YouTube scrape; copyright concern |
| mukahraman/orpheus-tr | 1,000 | - | 24 | None | - | No | Pre-tokenized; Orpheus-locked |
| falan42/Bentropi (11 parts) | 2,595 | ~10 | 16 | None | 1 | No | YouTube extraction; copyright |
| falan42/Tunc_M | 4,552 | ~10 | 16 | None | 1 | No | YouTube math lectures |
| falan42/Mert-H | 837 | ~3 | 16 | None | 1 | No | YouTube educational; duplicated |
| afkfatih/combined-raw | 81,513 | ~130 | 24 | CC-BY-SA-3.0* | 100s | No | Largest; license mislabeled |
| afkfatih/snac-tokenized | 81,513 | ~130 | 24 | CC-BY-NC-SA | 100s | No | Orpheus-locked |
| Anilosan15/Synthetic | 13,000 | 29 | 16 | CC-BY-4.0 | 4 | No | Fully synthetic; TTS source undisclosed |
| yuserabv/turkish_tts | 30,915 | - | - | None | - | No | SpeechT5 features only; no raw audio |

*Effective license is CC-BY-NC-SA-3.0 due to Khan Academy source component.
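To make the licensing picture concrete, the following sketch filters the dataset table for entries whose effective license permits commercial reuse. The rows are transcribed from the table, with the asterisked entry recorded under its effective license from the footnote. The rule in `allows_commercial_use` (a license must be declared and must not contain an NC clause) is a simplifying assumption for illustration, not legal guidance, and the function names are hypothetical.

```python
# Rows transcribed from the dataset table:
# (name, approximate hours or None if not listed, effective license or None).
DATASETS = [
    ("Appenlimited/700h-tr", 5.6, None),  # listed as "Unknown"
    ("Anilosan15/Turkish_TTS", 33, None),
    ("mukahraman/orpheus-tr", None, None),
    ("falan42/Bentropi", 10, None),
    ("falan42/Tunc_M", 10, None),
    ("falan42/Mert-H", 3, None),
    ("afkfatih/combined-raw", 130, "CC-BY-NC-SA-3.0"),  # effective license per footnote
    ("afkfatih/snac-tokenized", 130, "CC-BY-NC-SA"),
    ("Anilosan15/Synthetic", 29, "CC-BY-4.0"),
    ("yuserabv/turkish_tts", None, None),
]


def allows_commercial_use(license_id):
    """Assumed rule: a license is declared and carries no NC clause."""
    return license_id is not None and "-NC" not in license_id


def commercially_usable(datasets):
    """Return (name, hours) pairs for datasets passing the assumed rule."""
    return [(name, hours) for name, hours, lic in datasets
            if allows_commercial_use(lic)]


usable = commercially_usable(DATASETS)
total_hours = sum(hours for _, hours in usable if hours is not None)
print(usable, total_hours)
```

Under these assumptions the only entry that passes is the fully synthetic dataset, so the check says nothing about the availability of natural recorded speech.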

Verified Turkish TTS Models

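| Family | Model | Architecture | Size | License | DL/mo | Emotion | Notes |
|---|---|---|---|---|---|---|---|
| F5-TTS (Flow Matching + DiT) | Orkhon-TTS | F5-TTS | 3.4 GB | Apache 2.0 | 77 | No | Alpha; voice cloning |
| F5-TTS (Flow Matching + DiT) | marduk-ra/F5-Turkish | F5-TTS | 4.1 GB | CC-BY-NC | - | No | 3 checkpoints; 24 likes |
| F5-TTS (Flow Matching + DiT) | Karayakar/F5-Turkish | F5-TTS | 4.1 GB | MIT | - | No | Demo Space available |
| Orpheus (Llama 3B + SNAC) | Karayakar/Orpheus-PT-5000 | Orpheus 3B | 13.2 GB | MIT | 204 | Yes | Emotion tags; most popular |
| Orpheus (Llama 3B + SNAC) | Karayakar/Orpheus-GGUF | Orpheus Q5 | 2.4 GB | MIT | 69 | Yes | CPU quantized |
| Orpheus (Llama 3B + SNAC) | Cosmobillian/turkish_orpheus | Orpheus F16 | 6.6 GB | Apache 2.0 | 5 | Yes | PT-5000 fine-tune |
| SpeechT5 (~100-145M params) | Omarrran/speecht5_tts | SpeechT5 | 0.5 GB | MIT | 107 | No | Intern exercise |
| SpeechT5 (~100-145M params) | deryauysal/cv_tr | SpeechT5 | 0.6 GB | MIT | 10 | No | Common Voice |
| Other | facebook/mms-tts-tur | VITS 36M | 145 MB | CC-BY-NC | 4,116 | No | Most downloaded overall |
| Other | Anilosan15/kani-tts-400m | LFM2 370M | 740 MB | Apache 2.0 | 282 | No | Academic only |
| Other | SalihHub/karagoz-hacivat | XTTS-v2 | 3.8 GB | CC-BY-NC-SA | 0 | No | Cultural theme |
| Other | Piper tr_TR-dfki | VITS ONNX | 63 MB | MIT | - | No | Only Piper Turkish voice |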
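As a rough way to read the DL/mo column against the Emotion column, the sketch below totals monthly downloads separately for models marked Yes and No. The figures are transcribed from the table, entries shown as "-" are treated as unknown and skipped, and the function name is hypothetical. Download counts are a snapshot and only a loose proxy for actual usage, so the resulting share is indicative at best.

```python
# Rows transcribed from the model table:
# (model, monthly downloads or None if not listed, marked "Yes" for emotion).
MODELS = [
    ("Orkhon-TTS", 77, False),
    ("marduk-ra/F5-Turkish", None, False),
    ("Karayakar/F5-Turkish", None, False),
    ("Karayakar/Orpheus-PT-5000", 204, True),
    ("Karayakar/Orpheus-GGUF", 69, True),
    ("Cosmobillian/turkish_orpheus", 5, True),
    ("Omarrran/speecht5_tts", 107, False),
    ("deryauysal/cv_tr", 10, False),
    ("facebook/mms-tts-tur", 4116, False),
    ("Anilosan15/kani-tts-400m", 282, False),
    ("SalihHub/karagoz-hacivat", 0, False),
    ("Piper tr_TR-dfki", None, False),
]


def downloads_by_emotion_support(models):
    """Sum known monthly downloads, keyed by the Emotion column value."""
    totals = {True: 0, False: 0}
    for _, downloads, has_emotion_tags in models:
        if downloads is not None:  # skip rows with no listed download count
            totals[has_emotion_tags] += downloads
    return totals


totals = downloads_by_emotion_support(MODELS)
share = totals[True] / (totals[True] + totals[False])
print(f"with emotion tags: {totals[True]}, without: {totals[False]}, "
      f"share: {share:.1%}")
```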

BibTeX

@article{roxas2026turkishtts,
  title={A Literature Survey on Open-Source Turkish Text-to-Speech:
         Datasets, Models, and Emotion},
  author={Roxas, Daniel Quillan},
  year={2026}
}