Slide1
Variability of languages in time and space
Anja Nedolužko, Magda Ševčíková, Šárka Zikánová
October 6, 2016
Slide2
Variability of languages in time and space (Variabilita jazyků v čase a prostoru)
NPFL100
• linguistically oriented course for PhD students
• 1/1
• course credit (zápočet)
• 2 points / 3 e-credits
• winter semester
• Thursday, 10:40–12:10
• room S10
Slide3
Teachers, web page
Anja Nedoluzhko, Magda Ševčíková, Šárka Zikánová – {nedoluzko,sevcikova,zikanova}@ufal.mff.cuni.cz
programming tasks: Zdeněk Žabokrtský – zabokrtsky@ufal.mff.cuni.cz
http://ufal.mff.cuni.cz/~nedoluzko/variability/
https://ufal.mff.cuni.cz/courses/npfl100
kod=NPFL100
Slide4
Course schedule
lectures given in two-hour (90-minute) blocks
11 lectures/seminars during the fall term:
Anja Nedoluzhko (October 13; November 3, 10, 24; December 1)
Šárka Zikánová (December 8, 15, 22; January 5, 12)
Discourse Workshop: October 20-21
programming tasks (Zdeněk Žabokrtský):
1st programming task due Nov. 2
2nd programming task due Dec. 16
3rd programming task due Jan. 4
Slide5
Course schedule - Anja
Languages of the world, classification of languages
Writing systems around the world
Linguistic typology: phonology, syntax, word order, etc.
Typology of relative clauses
Typology of grammar: nominal categories, verbs
Typology of possessivity
Typology of word formation
Slide6
Course schedule - Šárka
Typology of language changes
Language from a sociolinguistic perspective
* Written and spoken form
* Language norm, codification
* Temporal varieties
* Influence of foreign languages
Slide7
Course completion requirements
Active participation in the course
Three obligatory programming tasks:
Quantifying the richness of morphology
Task description: Choose at least 20 languages for your experiment. Using any data resource you want, implement a function that assigns a number between 0 and 1 to each language, which roughly reflects the richness of morphology of the given language (the higher the number, the richer the morphological system). For instance, the value for Czech should be higher than the value for German, which should be higher than the value for English.
due November 2, 2016 (to be sent to zabokrtsky@ufal.mff.cuni.cz)
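One possible starting point for this task, offered as a sketch rather than the intended solution, is a type-token ratio computed on text samples. The underlying assumption (not stated in the task) is that a richer inflectional system produces more distinct word forms per running word. The function name `morphological_richness`, the `sample_size` parameter and the toy sentences are hypothetical; a real experiment would need comparable corpora of equal size, for example parallel texts.

```python
import re

def morphological_richness(text: str, sample_size: int = 1000) -> float:
    """Type-token ratio over the first `sample_size` tokens (a value in 0..1)."""
    tokens = re.findall(r"\w+", text.lower())[:sample_size]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Hypothetical toy samples, far too small for a real experiment.
samples = {
    "English": "the dog sees the cat and the cat sees the dog",
    "Czech": "pes vidí kočku a kočka vidí psa",
}
scores = {lang: morphological_richness(text) for lang, text in samples.items()}
for lang, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{lang}: {score:.2f}")
```

The ratio depends strongly on sample length, so all languages should be measured on samples of the same size.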
Language clustering
Task description: Choose at least 20 languages, and implement a function that divides them into several clusters of related languages. You can use any data resource and any clustering algorithm.
due December 7, 2016 (zabokrtsky@ufal.mff.cuni.cz)
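Since the task allows any data resource and any clustering algorithm, the following is only one of many possible sketches. It assumes a short aligned word list per language (the four-word lists below are hypothetical toy data), measures language distance as the mean normalized edit distance, and merges clusters by single linkage below a hand-picked threshold. The function names and the threshold value 0.4 are illustrative assumptions.

```python
from itertools import combinations

def levenshtein(a: str, b: str) -> int:
    """Edit distance computed by dynamic programming, row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def language_distance(words_a, words_b) -> float:
    """Mean normalized edit distance over two aligned word lists."""
    dists = [levenshtein(x, y) / max(len(x), len(y)) for x, y in zip(words_a, words_b)]
    return sum(dists) / len(dists)

def cluster(wordlists, threshold: float):
    """Single linkage: merge two clusters if any cross pair is closer than threshold."""
    clusters = [{lang} for lang in wordlists]
    merged = True
    while merged:
        merged = False
        for c1, c2 in combinations(clusters, 2):
            if any(language_distance(wordlists[a], wordlists[b]) < threshold
                   for a in c1 for b in c2):
                clusters.remove(c1)
                clusters.remove(c2)
                clusters.append(c1 | c2)
                merged = True
                break  # restart with the updated cluster list
    return clusters

# Hypothetical toy data: 'water', 'night', 'three', 'mother'.
wordlists = {
    "Czech": ["voda", "noc", "tři", "matka"],
    "Slovak": ["voda", "noc", "tri", "matka"],
    "Polish": ["woda", "noc", "trzy", "matka"],
    "Spanish": ["agua", "noche", "tres", "madre"],
    "Italian": ["acqua", "notte", "tre", "madre"],
}
clusters = cluster(wordlists, threshold=0.4)
print(clusters)
```

Surface similarity of this kind cannot tell inherited words from loan words, so the resulting clusters are only a rough proxy for genetic relatedness.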
Exploring parallel word formation
Task description: Given a Czech-German dictionary (will be posted here later), try to automatically identify some regular sub-word translation equivalents.
due January 4, 2017 (zabokrtsky@ufal.mff.cuni.cz)
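A minimal sketch of one way to approach this task, restricted to prefixes only: count how often a Czech word-initial string and a German word-initial string occur in the same dictionary entry, and score each pair with the Dice coefficient. The dictionary mentioned in the task is not available here, so the six entries below are hypothetical toy data, and the function name and parameters are illustrative assumptions.

```python
from collections import Counter

def prefix_equivalents(pairs, min_len=2, max_len=4, min_count=2):
    """Score (Czech prefix, German prefix) pairs by the Dice coefficient."""
    cz_counts, de_counts, joint = Counter(), Counter(), Counter()
    for cz, de in pairs:
        cz, de = cz.lower(), de.lower()
        # Candidate prefixes: word-initial strings shorter than the word itself.
        cz_prefixes = {cz[:n] for n in range(min_len, max_len + 1) if len(cz) > n}
        de_prefixes = {de[:n] for n in range(min_len, max_len + 1) if len(de) > n}
        cz_counts.update(cz_prefixes)
        de_counts.update(de_prefixes)
        joint.update((c, d) for c in cz_prefixes for d in de_prefixes)
    scores = {
        (c, d): 2 * n / (cz_counts[c] + de_counts[d])
        for (c, d), n in joint.items()
        if n >= min_count
    }
    return sorted(scores.items(), key=lambda item: -item[1])

# Hypothetical toy dictionary entries (Czech, German).
pairs = [
    ("vyjít", "ausgehen"),
    ("vybrat", "auswählen"),
    ("vydat", "ausgeben"),
    ("předložit", "vorlegen"),
    ("předpověď", "Vorhersage"),
    ("předmluva", "Vorwort"),
]
ranked = prefix_equivalents(pairs)
for (cz_prefix, de_prefix), score in ranked[:5]:
    print(cz_prefix, de_prefix, round(score, 2))
```

On such data, pairs like vy-/aus- and před-/vor- receive the maximal score, but so do their shorter substrings; choosing among overlapping candidates would need an additional criterion.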
Languages of the world
Slide9
Languages of the world
nearly 7 thousand living languages in the world
spoken by 6 billion speakers
a living language has at least one speaker for whom it is a first language
VS. extinct languages and languages spoken only as a second language
Slide10
(b) Language list according to the number of first-language speakers
Ethnologue (millions of first-language speakers):
1/ Chinese 1,213
2/ Spanish 329
3/ English 328
4/ Arabic 221
5/ Hindi 182
6/ Bengali 181
7/ Portuguese 178
8/ Russian 144
9/ Japanese 122
10/ German, Standard 90.3
11/ Javanese 84.6
12/ Lahnda (Pakistan) 78.3
13/ Telugu 69.8
14/ Vietnamese 68.6
15/ Marathi 68.1
16/ French 67.8
17/ Korean (South Korea) 66.3
18/ Tamil 65.7
19/ Italian 61.7
20/ Urdu 60.6
...
81/ Czech 9.5
...
123/ Slovak 5.0
...
172/ Tachelhit (Morocco) 3.0
sevcikova@ufal.mff.cuni.cz, Variability of languages
Slide11
Languages of the world
nearly 7 thousand living languages in the world
spoken by 6 billion speakers
a living language has at least one speaker for whom it is a first language
VS. extinct languages and languages spoken only as a second language
Ethnologue: Languages of the World, 19th edition, 2016
Slide12
Ethnologue
Ethnologue: Languages of the World, 16th edition, 2009
print and web publication, M. Paul Lewis (ed.)
1248 pages, ISBN-13: 978-1-55671-216-6
http://www.ethnologue.com
6,909 descriptions of living languages organized by continent and country
plus languages which are used only as a second language (28 lang.)
plus lang. which have gone out of use since the first edition of Ethnologue in 1951 (421 lang.)
Σ 7358 lang.
no long-extinct or ancient languages
a list (catalog) of languages sorted according to various aspects rather than an encyclopedia describing the languages
coming from numerous sources, confirmed by reliable published sources and a network of field correspondents
Slide13
Information on languages in Ethnologue
a complete entry for a language:
Primary language name [ISO code] (Alternate names). Country speaker population. Population stability comment. Population in all countries. Monolingual population. Population remarks. Ethnic population. Location.
Class: Linguistic affiliation. Macrolanguage membership.
Dialects: Dialect names. Intelligibility and dialect relations. Lexical similarity.
Lg Use: Language function. Bilingualism remarks. Domains of use. User age groups. Language attitudes. Viability remarks.
Lg Dev: Literacy rates. Literacy remarks. Use in elementary or secondary schools. Publications and use in media.
Writing: Scripts used.
Other: General remarks. Linguistic typology. Religion. Status.
Map: Map information.
https://www.ethnologue.com/world
Slide14
Genealogical classification of languages
based on the genetic principle
get rid of loan words, onomatopoeic words and coincidences
search for regular correspondences among certain languages
languages displaying systematic similarities and differences must have descended from a common source language; they are genetically related, i.e. form a language family
reconstruction of proto-languages, and of older proto-languages from those proto-languages
idea: know the roots
Slide15
Genealogical classification of languages
languages displaying systematic similarities and differences must have descended from a common source language; they are genetically related, i.e. form a language family
Slide16
Indo-European language family tree
Czech   Sanskrit  Greek  Latin  Gothic
jsem    asmi      eimi   sum    im
jsi     asi       essi   es     is
je      asti      esti   est    ist
4,500 – 2,500 years ago
Slide17
Indo-European family: Ethnologue vs. WALS
426 vs. 176 languages
Indo-Iranian genus (310 lang.) in Ethnologue vs. Indic and Iranian genera (53 and 26 lang.) in WALS
18 vs. 17 Slavic languages
Ethnologue             WALS
Belorussian            Belorussian
Bosnian                Bosnian
Bulgarian              Bulgarian
Czech                  Czech
Kashubian              Kashubian
Macedonian             Macedonian
-                      Polabian
Polish                 Polish
Russian                Russian
Rusyn                  -
Croatian               -
Serbian                Serbian-Croatian
Silesian               -
Slavonic, Old Church   -
Slovak                 Slovak
Slovene                Slovene
-                      Slovincian
-                      Sorbian
Sorbian (Lower)        Sorbian (Lower)
Sorbian (Upper)        Sorbian (Upper)
Ukrainian              Ukrainian
Slide18
Which two languages are genetically related?
A         B         C          (meaning)
daḳiḳa    dakika    daka       minute
ra’s      kichwa    roš        head
daftar    daftari   maxberet   notebook
šams      jua       šemeš      sun
nafs      moyo      nefeš      soul
ḫabar     habari    xadaša     news
riḡl      mguu      regel      leg
wizāra    wizara    misrad     ministry
Slide19
Nostratic hypothesis
a proposed, but still controversial, language family (12,000 years ago) uniting language families of Eurasia
originates with Holger Pedersen (1903)
Slide23
Glottochronology
time of divergence
Swadesh list (Swadesh, 1955) – 100 basic words
Starostin’s method (get rid of loan words, time dependency)
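The time of divergence can be illustrated with the classical lexicostatistic formula t = ln(c) / (2 ln(r)), where c is the share of list items that are still cognate in both languages and r is the retention rate per millennium. This is a sketch under stated assumptions: the default r = 0.86 is the value commonly quoted for the 100-word list and is treated here as an assumption, the function name is hypothetical, and Starostin's modified method is not implemented.

```python
import math

def divergence_time(shared_cognates: float, retention_rate: float = 0.86) -> float:
    """Classical estimate in millennia: t = ln(c) / (2 * ln(r)).

    shared_cognates: share of list items cognate in both languages, in (0, 1].
    retention_rate: assumed share of items retained per millennium, in (0, 1).
    """
    if not 0 < shared_cognates <= 1:
        raise ValueError("shared_cognates must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_cognates) / (2 * math.log(retention_rate))

# Two languages that each retained 86% of the list independently for one
# millennium are expected to share 0.86 ** 2 of the items.
t = divergence_time(0.86 ** 2)
print(f"estimated divergence: {t:.2f} millennia")
```

The factor 2 reflects that both languages change independently after the split.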
Slide24
Linguistic Typology
classification of FEATURES in the particular language
phonological
morphological
word formation
syntactic
typology
structural
...
method: detailed study in one language and comparison to others
idea: structure, essence of types, way of thinking
Slide25
WALS – The World Atlas of Language Structures
wals.info
database of phonological, grammatical and lexical properties of languages obtained from reference grammars and other descriptive material
55 authors, e.g. Greville G. Corbett, Martin Haspelmath, Bernard Comrie, Matthew S. Dryer
1st version in 2005 (book with CD-ROM, Oxford University Press), 1st online version (WALS Online) in 2008, current version from 2013
WALS Online as a joint effort of the Max Planck Institute for Evolutionary Anthropology (Leipzig) and the Max Planck Digital Library (Munich)
144 features (structural properties of language that describe “one aspect of linguistic diversity”)
concise linguistic description of the feature
2 to 28 values of the feature
distribution of the feature values on the map
an entry for each language
name, geographical info, list of relevant features
Slide26
Areal classification
language contacts
wave model of language change: a new language feature spreads from a central region of origin in continuously weakening concentric circles
idea: sociolinguistics, contacts, history
Slide27
Standard Average European (SAE)
terminology – B. Whorf
idea – there are some features that European languages tend to have in common
Sprachbund includes: Germanic languages; Romance languages; Baltic languages; Slavic languages; Albanian; Greek; Hungarian
Slide28
SAE: a periphrastic perfect formed with 'have' + passive participle
Slide29
References
Swadesh, Morris (1955). Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics, 21, 121–137.
Swadesh, Morris (1972). What is glottochronology? In M. Swadesh, The origin and diversification of languages (pp. 271–284). London: Routledge & Kegan Paul.
Dolgopolsky, Aharon (2008). Nostratic Dictionary.
Starostin, Sergei (2002). Methodology of Long-Range Comparison.
Slide30
Nostratic Dictionary, Aharon Dolgopolsky, 2008