ESRA 6th Annual Conference, Reykjavik, July 17, 2015

Spoken language versus written language:

A challenge for the linguistic validation of data collection instruments for international surveys


Linguistic validation in 3MC

Set of processes that aims to ensure that the same questions are being asked, or the same constructs are being measured, via translated data collection instruments.

Comprises a number of linguistic quality assurance (LQA) and linguistic quality control (LQC) steps implemented both upstream and downstream.



Spoken language adds a new twist…

When CAPI/CATI systems are used, the interviewer follows a script and reads out the questions to the respondent.

This raises questions and challenges for the linguistic validation of materials that respondents will never see in written form, only hear.

cApStAn: 15 years’ experience in LQA/LQC, starting with PISA. Active in CSDI, WAPOR, ESRA, 3MC, ITC… but this is a practitioner’s viewpoint.

Working empirically, we’ll strive to define conceptual and procedural issues, present real examples of difficult situations, propose solutions, and point to open challenges.


What accommodations in the source itself?

It seems appropriate and useful to relax the conventions of standard written English and use, for example, contractions such as “don’t”, “can’t”, “you’re”.

This gives translators an indication of the desired register (to the extent that it applies to their language).

But let’s not go overboard:

Whatcha gonna do?


What accommodations in the source itself?

Idiomatic expressions may be more frequently used in spoken versus written English. This is fine, but it is good practice to provide a translation/adaptation guideline.


What accommodations in the target languages?

In general, should the criterion of “equivalence to source” also cover equivalence of register (spoken versus written)?

Yes, with care: this will mean different things in different languages.

Example 1: less subjunctive in Italian

Example 2: negation in French


What accommodations in the target languages?

In general, can or should (some of) the criteria of “linguistic correctness” in the target language be relaxed in the case of texts which will never be seen in written form?

In general, no – not much more than the accommodations for spoken language register.

Examples: punctuation, spellcheck

What about formatting (e.g. underline)?



What accommodations in the target languages?

Example 3: gender marks

Solution 1

Solution 2

Solution 3


Special cases

Diglossia: ‘everyday’ variety versus ‘high’ variety

Example 1: Arabic

Example 2: Swiss-German

EU-MIDIS-II

Target group: Immigrants from North Africa and their descendants, in BE, ES, FR, IT

Team translation:
- Tunisian linguist
- Algerian linguist
- Moroccan linguist

‘Passe-partout’ Maghrebi Arabic version

SHARE

SOURCE: What was his relationship to you?

GERMAN: Was war seine Beziehung zu Ihnen?

SWISS-GERMAN: Was ist seine Beziehung zu Ihnen gewesen?


Special cases

Leading on from the previous point:

How to check “linguistic correctness” in the target language when the standards for the latter are not clearly defined?

Example 1: PIAAC, Serbo-Croatian-Bosniac version for Austria…

… with some words in German


Special cases

One last example: PIAAC, Spanish version for the USA

Varieties of Spanish spoken in the USA:
- Mexican
- Caribbean
- Central American
- Colonial

Most widespread, and becoming the ‘standardized dialect’ of Spanish in the continental United States

But if more


- interviewer 'help' text

- dynamic text




(Tentative) Conclusions

From Prof. Lyberg’s keynote speech: “A good 3MC design is a mixture of standardization and flexibility”

Accommodations for colloquial register: Yes. But “text will not be seen in written form” is not a valid excuse to generally relax linguistic standards.


A ‘good’ translation is one that is fit for purpose. The purpose here: help the interviewer carry out a successful interview and collect valid, comparable data. What helps? What hinders?

Spoken language is likely to have even more variants than written language.


'Help' texts, dynamic text




