About Kurdish
Kurdish as we understand it - a cluster of related languages, not a single thing
Kurdish Is Not One Language
What is commonly called "Kurdish" is better understood as a cluster of related Northwestern Iranian languages with substantially different sociolinguistic profiles. These varieties share historical roots and many structural features, but they also differ in vocabulary, grammar, phonology, and orthography - sometimes significantly enough that speakers of different varieties cannot always understand each other.
The boundaries between these varieties are not always clear, and many speakers have complex or mixed repertoires. Classifying someone's speech as "Kurmanji" or "Sorani" can sometimes be a political and identity statement as much as a linguistic description.
The Main Varieties
Kurmanji (Northern Kurdish / Kurmancî)
The most widely spoken variety, dominant in Turkey, northern Syria, parts of Iraq and Iran, and large diaspora communities across Europe (particularly Germany, Sweden, and the Netherlands). Kurmanji is most often written in a Latin-based orthography that developed partly in diaspora contexts. It was subject to severe official suppression in Turkey for much of the 20th century, with long periods during which the language's existence was officially denied. Some recognition has occurred since the early 2000s, but the political situation remains contested.
Sorani (Central Kurdish / سۆرانی)
Dominant in the Kurdistan Region of Iraq (KRI) and in Iranian Kurdistan. Sorani uses an Arabic-script-based orthography and is the de facto official language of the KRI, with state-backed institutions, television channels, and universities teaching in the language. Compared to other Kurdish varieties, Sorani has the most developed institutional infrastructure.
Zazaki / Dimli
Spoken primarily in eastern Turkey. Its classification as a Kurdish variety versus a distinct Iranian language is itself contested - both among scholars and within communities. This is not a trivial academic dispute: it carries identity and political dimensions. Speakers may identify as Kurdish, as Zaza, or hold more complex positions.
Gorani / Hawrami
A prestige variety historically associated with Kurdish religious and literary culture. Now spoken by relatively few people, primarily in Iran and Iraq. Largely oral in everyday use, with a rich literary tradition but limited standardized writing conventions in contemporary use.
Southern Kurdish (Feyli and others)
Spoken in parts of Iraq and Iran. Less standardized and less well-documented than Kurmanji or Sorani. The Feyli Kurds, associated with parts of the Iran-Iraq border region, represent one significant sub-community within Southern Kurdish.
The Geopolitical Dimension
Kurdish varieties are spoken across at least four nation-states - Turkey, Iraq, Iran, and Syria - as well as in large diaspora populations, particularly in Europe. The political status of the language differs radically across these contexts:
- Turkey: Decades of official suppression have left deep marks. While some liberalization has occurred since the early 2000s, language rights remain politically contested. Speakers in Turkey may have legitimate reasons to be cautious about participating in any survey that touches on language identity.
- Iraqi Kurdistan (KRI): Kurdish (Sorani) has official status and strong institutional support. The sociolinguistic situation is closer to that of a regional majority language - a very different context from Turkey or Iran.
- Iran and Syria: Kurdish is suppressed to varying degrees, and speakers face political risks in both countries.
- Diaspora communities (especially in Germany, Sweden, the Netherlands, and the UK): Often represent the most active spaces for language maintenance, Kurdish-language media, and online content. Diaspora varieties may be diverging from homeland varieties in interesting ways.
This political heterogeneity is one reason the survey asks about country of residence and origin, and asks about them only as optional questions: the same question about "language use" means very different things depending on where a speaker lives, and some speakers may have good reason not to disclose where that is.
Why We Do Not Assume a Single "Standard Kurdish"
Some language technology projects approach non-standardized languages by trying to create a single, unified standard - the assumption being that standardization is the goal every language should eventually achieve.
This project does not make that assumption. Kurdish communities have diverse and sometimes divergent views on whether standardization is desirable, and on which variety should serve as the basis for any standard. Research has found that a majority of German dialect speakers, for example, oppose standardized orthography for their dialect. We have no reason to assume Kurdish speakers are uniform on this question either.
One of the goals of this survey is to understand what Kurdish speakers themselves want - including on the question of standardization - rather than to assume that "more like a standardized language" is always better.