Open vs. Closed Questions in Multilingual Fieldwork

Open vs. closed questions in Uzbek surveys: how formats and rating scales behave across three field languages, and when to use each.

AISurvey Methodology | April 30, 2026 | 14 min read

Choosing between an open and a closed question is choosing between depth and scale. But in the Uzbek field, a second trade-off sits on top of that one: whether your format survives translation into three languages. A scale that works flawlessly in the Russian version can quietly break in the Uzbek one, and a pile of open-ended answers arriving in Russian, Uzbek, and Tajik can bury your entire coding budget. This article is about how question type and rating scale actually behave under Uzbekistan's real multilingual field conditions.

The old trade-off: depth versus scale

A closed question offers a ready-made set of options: yes/no, single or multiple choice, a rating scale. Its strength is standardization — answers are easy to aggregate, compare across regions, and track over time. Its weakness is that it imposes your framing on the respondent and misses anything you didn't anticipate.

An open question lets people answer in their own words. It is indispensable when you don't know all the possible answers in advance, or when you want to understand the why behind a number. The cost is labor-intensive coding, the risk of blank and irrelevant answers, and a heavier burden on both respondent and interviewer.

That trade-off is in every textbook. What follows is what the textbooks leave out: what happens to both formats when the field speaks several languages and half the country is rural.

The scale that breaks in translation

A closed question with a rating scale feels like the "safest" format: numbers are universal, counting is easy. In practice, scales produce the most insidious errors in a multilingual project, because they break silently — the data looks clean, yet it isn't comparable across language versions.

The problem is in the anchor labels. Take a classic agreement scale. The Russian "совершенно согласен" (completely agree) is strong but tonally neutral. The direct Uzbek rendering "to'liq qo'shilaman" (fully agree) sounds slightly softer in intensity, while a variant like "mutlaqo qo'shilaman" (absolutely agree) adds a categorical edge the Russian anchor never had. The midpoint has the same problem: the Russian "затрудняюсь ответить" and the Uzbek "javob berishga qiynalaman" both mean roughly "I find it hard to answer," which is a "don't know" option, not a "neutral" or "neither agree nor disagree" midpoint. A respondent in one version and a respondent in another are reacting to different stimuli, and then you compare their proportions as if they shared one scale.

NPS and satisfaction scales have the same issue. The word "recommend" ("tavsiya qilaman") and the concept of "likelihood" translate, but the intensity register — "definitely," "probably," "unlikely" — is idiomatic in Uzbek and does not map one-to-one onto the Russian gradations.

This drives the choice of scale type itself. A fully labeled scale (a verbal anchor on every point) measures more finely, but it is precisely the format that survives translation worst — it breaks on every intermediate word. A numeric scale with only the endpoints labeled ("1 = not at all satisfied … 5 = completely satisfied") translates more reliably, because the native speaker has to calibrate two or three anchors, not seven. So in a multilingual project it pays to deliberately trade some fineness for comparability: a short numeric scale with two well-calibrated ends is almost always more honest than a seven-point labeled grid whose middle anchors have drifted apart across versions.

The same goes for ranking and paired comparisons. "Rank these five options by importance" sounds simple, but in an oral interview in Uzbek or Tajik — and with an elderly respondent — it sharply raises cognitive load and invites a "just leave me alone" answer. Rating each item separately, or picking the "most / least important," carries translation and read-aloud delivery far more robustly than a full ranking.

The practical takeaways:

Translate the anchors, not just the items. Intensifier words ("completely," "somewhat," "unlikely") are the most fragile part of a scale. A native speaker must choose them, not a dictionary.
Back-translate the anchors specifically. An Uzbek-to-Russian back-translation quickly shows whether "fairly agree" and "completely agree" have collapsed into one option in the Uzbek version.
Check that the steps are even. A good scale moves in equal intervals. After translation, two adjacent anchors often turn out to be near-synonyms, with a gap between others.
Keep the number of gradations identical across versions. The temptation to "simplify" the Uzbek version to fewer points destroys comparability.
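
Two of the checks above (identical number of points, no collapsed anchors) can be partly automated once back-translations exist. The following is a minimal sketch, assuming the back-translated anchors have been typed into a dictionary by hand; the name `SCALE_BACK_TRANSLATIONS`, the function `check_scale_versions`, and the sample anchors are hypothetical and not taken from any real instrument. Whether the steps are evenly spaced still needs a native speaker's judgment.

```python
from collections import Counter

# Hypothetical data: language code -> back-translated anchors, ordered from point 1 to point 5.
# The wording is invented for illustration only.
SCALE_BACK_TRANSLATIONS = {
    "ru": ["completely disagree", "somewhat disagree", "neither",
           "somewhat agree", "completely agree"],
    "uz": ["completely disagree", "somewhat disagree", "neither",
           "completely agree", "completely agree"],
}


def check_scale_versions(versions):
    """Return a list of human-readable problems found across language versions."""
    problems = []
    # Check 1: every version must have the same number of scale points.
    lengths = {lang: len(anchors) for lang, anchors in versions.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"unequal number of points: {lengths}")
    # Check 2: no two points may share the same back-translated anchor.
    for lang, anchors in versions.items():
        counts = Counter(a.strip().lower() for a in anchors)
        for anchor, n in counts.items():
            if n > 1:
                problems.append(f"{lang}: '{anchor}' appears on {n} points")
    return problems


problems = check_scale_versions(SCALE_BACK_TRANSLATIONS)
for problem in problems:
    print(problem)
```

On the sample data this reports that the Uzbek version carries "completely agree" on two points, which is the collapsed-anchor case described above.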

We cover translation and back-translation in more depth in our guide to effective questionnaire design. The point here is singular: a scale that hasn't passed linguistic review in every field language gives you a false sense of precision.

The politeness that inflates the top of the scale

There is a cultural pattern you cannot ignore in Uzbekistan: a tendency to agree out of courtesy. The guest in the home, the tea, the respect for someone who is "on official business" asking questions — all of it nudges the respondent toward confirming rather than disagreeing. On agreement scales this produces acquiescence bias: a systematic shift toward the top, "agree" end.

The effect is stronger where social distance between interviewer and respondent is large: a young interviewer with an elder, an urban interviewer in a rural mahalla (the neighborhood community), a man asking about family topics. And it disguises itself as good news: the client sees 85% satisfied and celebrates, even though part of that 85% is simply politeness.

What to do at the format level:

Favor balanced and forced-choice formats. Instead of "Do you agree the service is good?", offer two substantive positions: "Some people think the service is good, others think it needs improving. Which is closer to you?" That way agreement has no default direction.
Word things neutrally. Avoid leading statements that are polite to agree with. Not "How satisfied are you with our excellent service?" but "How would you rate the service?"
Use behaviorally anchored items instead of abstract agreement. "How many times in the past month did you…" is harder to "confirm out of politeness" than "Do you agree you often…?"
Alternate scale direction and mix wording (some statements reverse-keyed) to catch respondents who mechanically mark "agree" on everything.
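
Reverse-keyed items only help if the analysis uses them. Below is a minimal sketch of the two steps involved, assuming a 5-point agreement scale where 5 means "agree"; the item ids, the `REVERSE_KEYED` set, and the threshold of 4 are illustrative assumptions, not a standard.

```python
SCALE_MAX = 5  # assumed 5-point agreement scale: 1 = disagree ... 5 = agree
REVERSE_KEYED = {"q2", "q4"}  # hypothetical ids of items worded in the opposite direction


def rescore(answers):
    """Flip reverse-keyed items so a higher value always means the same attitude."""
    return {
        item: (SCALE_MAX + 1 - value) if item in REVERSE_KEYED else value
        for item, value in answers.items()
    }


def is_yea_sayer(answers, threshold=4):
    """True if the raw answer is on the 'agree' side for every item.

    Because some items are reverse-keyed, agreeing with all of them is
    logically inconsistent and suggests mechanical agreement.
    """
    return all(value >= threshold for value in answers.values())


respondent_a = {"q1": 5, "q2": 5, "q3": 4, "q4": 5}  # agrees with everything
respondent_b = {"q1": 5, "q2": 1, "q3": 4, "q4": 2}  # consistently positive attitude

print(is_yea_sayer(respondent_a), rescore(respondent_a))
print(is_yea_sayer(respondent_b), rescore(respondent_b))
```

Flagged respondents are candidates for review rather than automatic exclusion; the share of flagged cases per language version is itself a useful signal of courtesy agreement.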

So if the Uzbek version of a questionnaire shows ten points more "agreement" than the Russian one, don't celebrate too soon: first check whether it is politeness, and whether the anchors drifted apart in translation.

Open questions and the triple cost of coding

One open question in a single-language survey is coding volume X. The same question in the Uzbek field is X in Russian, plus X in Uzbek, plus occasionally X in Tajik (in Samarkand and Bukhara) or Karakalpak — plus a separate headache from mixed speech.

Because people do not answer "in one language." A respondent in Tashkent switches easily from Russian to Uzbek within a single sentence; in Fergana the answer arrives in Uzbek with Russian inserts; in Samarkand, partly in Tajik. Your coder has to understand all of it and assign answers to the same categories regardless of language.

So the rule is simple: one multilingual codebook, not three parallel ones. Categories are defined by meaning, not language, and applied to all answers at once:

Collect the first hundred or two answers from every language, not just the Russian ones.
Identify themes by meaning and fix each category's definition in the team's working language (usually Russian).
Add anchor examples in each field language to every category — so the coder recognizes the theme in the Uzbek and the Tajik phrasing alike.
Run two coders on a shared subset and check agreement before coding everything.
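
For the last step, one common way to quantify agreement between two coders is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article does not prescribe a particular statistic, so treat this as one reasonable option; the category labels and the two coders' codes below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assigned one category per answer."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of answers")
    n = len(codes_a)
    # Share of answers where both coders chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for the same ten answers (any mix of languages),
# using category names from the shared codebook.
coder_1 = ["price", "price", "quality", "staff", "price",
           "quality", "staff", "staff", "price", "quality"]
coder_2 = ["price", "price", "quality", "staff", "quality",
           "quality", "staff", "price", "price", "quality"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

Computing kappa separately for the Russian, Uzbek, and Tajik answers in the shared subset shows whether disagreement is concentrated in one language, which usually points to missing anchor examples in the codebook.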

Budget separately for inconsistent transcription. Open answers written down by interviewers arrive in both Latin and Cyrillic, with dialect words and abbreviations; under digital collection some are typed on a phone, some dictated by voice. It helps to agree in advance on a transliteration convention and on what to do with mixed-language answers — otherwise coders spend half their time deciphering rather than categorizing. This is another hidden cost multiplier visible only to those who have processed the Uzbek field before.
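
A first triage step that fits such a convention is to tag each written answer by script before it reaches a coder. The sketch below uses only the Python standard library; the function name `detect_script` and the sample phrases are illustrative assumptions. Script is not the same as language (Russian and Tajik are both written in Cyrillic, and Uzbek appears in both scripts), so this only routes answers and does not identify the language.

```python
import unicodedata


def detect_script(text):
    """Classify a written answer as 'latin', 'cyrillic', 'mixed', or 'none'."""
    scripts = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, apostrophes
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("latin")
    if len(scripts) == 2:
        return "mixed"
    return scripts.pop() if scripts else "none"


# Illustrative answers: Latin script, Cyrillic script, and a mixed-script answer.
samples = ["narxi qimmat", "нархи қиммат", "narxi дорого", "12345"]
for answer in samples:
    print(detect_script(answer), "|", answer)
```

Answers tagged "mixed" are the ones the team should have an agreed rule for, for example always routing them to a coder who reads both scripts.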

This work is real and it costs money. It is the main argument for restraint with open questions in a multilingual field — not because they are "bad," but because processing them in Uzbekistan is more expensive than it looks on paper.

Literacy, showcards, and Latin versus Cyrillic

You cannot choose a question format in isolation from how it is presented in the field. And here Uzbekistan has its own specifics.

Self-administered closed formats and showcards (cards listing options or showing the scale visually) assume the respondent reads confidently. But the country is mid-transition from Cyrillic to Latin script: younger people were schooled in Latin, while many older and rural readers are more comfortable in Cyrillic. A card set only in Latin may be unreadable to an elderly respondent in a village, while a Cyrillic one is awkward for a schoolchild.

The practical consequences:

Long scales and complex closed formats are better read aloud by the interviewer rather than self-administered, especially for older and rural respondents.
If you use showcards, prepare them with both alphabets in mind or duplicate them, and reinforce any numeric scale with verbal anchors: almost everyone reads digits, but digits alone do not say which end of the scale is the positive one.
Simple formats are more robust. A binary choice and a short verbal scale survive both hesitant reading and being read aloud better than a seven-point grid with fine gradations.

This is one more argument for interviewer-read closed questions as the workhorse format across much of the field — and against self-administered grids that work in the city but fall apart in a rural mahalla.

Where the closed format wins

Closed questions win whenever you need comparability — across regions, across groups, over time.

Comparing regions. If you measure the same indicator in Tashkent, Fergana, and Karakalpakstan, only a standardized closed format yields numbers you can place side by side.
Tracking over time. Wave-on-wave measurement — market, brand, public opinion in the spirit of the Ijtimoiy Fikr public-opinion centre — rests on unchanged closed wording. Change an anchor between waves and you lose the trend.
Large samples. When you need proportions with an acceptable margin of error across thousands of questionnaires, open questions simply don't scale on processing cost.
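
To make "acceptable margin of error" concrete, here is a minimal sketch of the textbook normal-approximation formula for a proportion. It assumes simple random sampling and a 95% confidence level (z = 1.96); real field designs with clustering or weighting have a design effect that widens the interval, which this sketch ignores.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    if not 0 <= p <= 1 or n <= 0:
        raise ValueError("p must be in [0, 1] and n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Worst case p = 0.5, printed in percentage points for a few sample sizes.
for n in (400, 1000, 2000):
    print(n, round(100 * margin_of_error(0.5, n), 1))
```

Under these assumptions the margin is about 4.9 points at n = 400, 3.1 at n = 1000, and 2.2 at n = 2000. Note that n is the size of the group being reported, so a regional breakdown needs that many interviews per region, not overall.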

The standardization of closed data pairs well with moving to digital collection: one instrument, identical options, no version drift between interviewers. We walk through how this works in practice in the getting-started guide for AISurvey.

Where the open format wins

Open questions are indispensable when you don't yet know the answer space.

Early exploration. Before you freeze the options of a closed question, an open stage — even across a dozen in-depth interviews — shows what words people actually use to describe a topic. In Uzbek these are often not the words you would translate from Russian.
One or two key "whys." In the main survey, a couple of open questions at the most important junctions give context no scale can extract.
Local vocabulary and the unexpected. An open question catches what isn't on your list: local brands, regional practices, reasons you never guessed.

Deciding the size and structure of the sample for that exploration is a separate topic; we cover it in our overview of sampling methods.

Who actually answers — and how that changes the format

A format doesn't live in a vacuum; it lives in a specific home, with a specific person across from the interviewer. In Uzbekistan that person is systematically not the one you planned for.

Because of labor migration — millions of working-age men, especially from the Fergana Valley and the south, work in Russia and Kazakhstan — the people at home during the day are more often women, the elderly, and the young. Any scale you tested on the "average" urban respondent now reaches someone with a different education level, a different language habit, and a different willingness to read a card. A complex seven-point grid that worked beautifully in a Tashkent pilot turns into random selection for an elderly woman in a Fergana village — and that, again, is about the robustness of the format, not only the translation.

Layer on gender norms: in a conservative household a male stranger may not be received, and a woman may not be available without the head of the household present. The format consequence is concrete: sensitive topics and fine-grained scales are better handled by a female interviewer reading aloud than by a self-administered card a respondent won't read in front of relatives.

And a separate decision: which language versions to produce at all. In Karakalpakstan a proper questionnaire needs a Karakalpak version, not "the Uzbek one they'll understand anyway" — it is a distinct Turkic language, and a scale translated only into Uzbek produces the same silent comparability failure there. In Samarkand and Bukhara a meaningful share of respondents are more comfortable in Tajik. Each additional version is another set of anchors to translate and back-translate, and another language in the codebook for open answers. That feeds directly into how many open questions you can afford.

A practical rule for the Uzbek field

Put it all together and a working rule emerges.

Use closed questions as the load-bearing structure: for measurement, regional comparison, and tracking. Build them on balanced, neutral, and where possible behaviorally anchored wording to dampen courtesy agreement. Translate scales at the anchor level and verify them with back-translation in every field language. Have the interviewer read complex closed formats aloud.

Use open questions sparingly and surgically: at the exploration stage and for one or two crucial "whys." Remember that each one multiplies by the number of field languages at coding time.

A good technique for Uzbekistan: a closed question with a balanced scale, immediately followed by a short open "Please explain why." You get a countable, comparable number and the context behind it — without a questionnaire made of ten open questions.
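
One way to keep this pairing honest across versions is to store the closed item and its open follow-up as one unit and check that every field language has reviewed text for both. The sketch below is a hypothetical data structure, not the AISurvey format; the class name, the language codes in `FIELD_LANGUAGES`, and the bracketed placeholder strings (which stand in for translations supplied by a native speaker) are all assumptions.

```python
from dataclasses import dataclass, field

FIELD_LANGUAGES = ("ru", "uz", "tg")  # assumed language codes for this sketch


@dataclass
class ClosedWithFollowUp:
    item_id: str
    points: int  # number of scale points, identical in all versions
    low_anchor: dict = field(default_factory=dict)   # language -> label for point 1
    high_anchor: dict = field(default_factory=dict)  # language -> label for top point
    follow_up: dict = field(default_factory=dict)    # language -> open "why" prompt

    def missing_translations(self):
        """List (field name, language) pairs that still lack a translation."""
        missing = []
        for name in ("low_anchor", "high_anchor", "follow_up"):
            texts = getattr(self, name)
            for lang in FIELD_LANGUAGES:
                if not texts.get(lang, "").strip():
                    missing.append((name, lang))
        return missing


# Placeholder strings stand in for reviewed translations.
item = ClosedWithFollowUp(
    item_id="sat_overall",
    points=5,
    low_anchor={"ru": "[RU] not at all satisfied", "uz": "[UZ] not at all satisfied"},
    high_anchor={"ru": "[RU] completely satisfied", "uz": "[UZ] completely satisfied"},
    follow_up={"ru": "[RU] Please explain why.", "uz": ""},
)
print(item.missing_translations())
```

Because the scale has a single `points` value shared by all languages, the structure itself prevents the Uzbek version from being quietly shortened.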

Don't overuse open questions

A questionnaire made of ten open questions tires the respondent, invites throwaway answers, and buries you in text across three languages that no one has time to code on schedule. One or two well-posed open questions deliver more value than ten perfunctory ones — and in a multilingual field that rule works with triple force.

Good questionnaire design in Uzbekistan isn't "open versus closed" — it's a sober calculation: what you are measuring, in how many languages, who will read it and how, and how many answers you can realistically process. Start designing your instrument in the AISurvey builder, and for the underlying principles come back to our field-research blog.

Frequently asked questions

Why do closed scales 'break' when translated into Uzbek?

What usually breaks is not the items but the anchors: intensifier words like 'completely agree' or 'unlikely.' Their intensity in Uzbek and Tajik doesn't match the Russian one-to-one, so respondents in different versions react to different stimuli while you compare them as one scale. The fix is to have a native speaker translate the anchors and to back-translate the anchors specifically.

What is acquiescence bias and how do I counter it?

It's a systematic shift of answers toward the 'agree' options, amplified in the Uzbek context by a culture of courtesy and hospitality. You counter it with balanced and forced-choice formats instead of agree/disagree, neutral wording, behaviorally anchored items, and alternating scale direction.

How should I code open-ended answers that arrive in three languages?

Build one multilingual codebook: define categories by meaning rather than language, and add anchor examples in Russian, Uzbek, and where needed Tajik or Karakalpak to each one. Account for mixed speech and code-switching. This is a real line item in the budget and the main reason to ration open questions.

Can I give Uzbek respondents showcards with a scale?

You can, but with care, because of the Cyrillic-to-Latin transition and varying literacy: a card set only in Latin may be unreadable for an elderly rural respondent. Long scales are safer read aloud by the interviewer, and any numeric scale should always be reinforced with verbal anchors.

Which format is better for measuring satisfaction in a multi-region project?

For the measurement itself, a closed scale (for example, 1–5 or NPS), because it can be compared over time and across regions like Tashkent, Fergana, and Karakalpakstan. Add one short open question about the reasons behind the score, and verify the translation of the scale anchors in every field language.

Tags: question types, questionnaire design, rating scales, multilingual surveys, Uzbekistan, methodology

About the author

AISurvey Methodology

AISurvey methodologists on sampling, question wording, and data quality in social and market research.