Wednesday, April 2, 2008

The Comparative Method II

Let's say we have languages A, B, C, and D, and we want to know if they're genetically related. We start by taking a sample of basic vocabulary from each of them, and then compare those samples to each other. The best semantic categories from which to draw basic vocabulary include pronouns (I, you, he, etc.), numbers (especially 1-10), human body parts (eye, head, heart, etc.), kinship terms (mother, father, etc.), natural objects and phenomena (sun, moon, water, fire, etc.), and very basic actions and states (see, know, die, etc.). Here's a small selection of possible vocabulary from languages A, B, C, and D.

A         B         C       D        English
ka        ka        ko      zimi     ‘we’
nami      name      nom     soo      ‘two’
paluma    paroma    polm    jai      ‘eye’
litiki    redege    riik    enefo    ‘mother’
kutusu    kodozo    kuts    vaha     ‘water’
panu      pano      pon     haaz     ‘eat’


In the comparative method, we're looking for regular phonological correspondences (which I'll abbreviate to RPCs). That is, given a set of words from each of the languages in question, we want to see if there is a regular pattern of phonemes that recurs across the items in the set. For example, look at the words for 'we' and 'water'. In all three of languages A, B, and C, these words begin with the phoneme k-. Likewise, in the words for 'eye' and 'eat', the words in A, B, and C all begin with the phoneme p-. And in the word for 'two', we find the words in A, B, and C all beginning with n-. These are all examples of RPCs.
The phonemes don't all have to be the same to be regular. In the words for 'eye' and 'mother', language A has l where language B has r. Likewise, in 'we', 'two', 'eye', and 'eat', languages A and B have -a- where language C has -o-. These correspondences are just as regular as the k-, p-, and n- correspondences because they recur in multiple sets of words.
Now, what about language D? A close comparison between the phonemes in language D's words and those of languages A, B, and C reveals no meaningful correspondences. This suggests that D is not genetically related, or at least not closely genetically related, to languages A, B, and C, which, given their numerous correspondences, seem to be very closely related to each other.
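For readers who like to tinker, here is a rough sketch in Python of the counting described above. It only looks at the first sound of each word, it treats every letter as one phoneme, and the function names and the threshold of two word sets are my own assumptions for illustration, not a standard tool. With those caveats, it pairs up the initial sounds of two languages and keeps the pairs that recur.

```python
from collections import Counter

# Toy data copied from the A/B/C/D table; one letter = one phoneme (an assumption).
WORDS = {
    "we":     {"A": "ka",     "B": "ka",     "C": "ko",   "D": "zimi"},
    "two":    {"A": "nami",   "B": "name",   "C": "nom",  "D": "soo"},
    "eye":    {"A": "paluma", "B": "paroma", "C": "polm", "D": "jai"},
    "mother": {"A": "litiki", "B": "redege", "C": "riik", "D": "enefo"},
    "water":  {"A": "kutusu", "B": "kodozo", "C": "kuts", "D": "vaha"},
    "eat":    {"A": "panu",   "B": "pano",   "C": "pon",  "D": "haaz"},
}

def initial_correspondences(words, lang1, lang2):
    """Count how often each pair of word-initial sounds lines up."""
    counts = Counter()
    for forms in words.values():
        counts[(forms[lang1][0], forms[lang2][0])] += 1
    return counts

def recurring(counts, minimum=2):
    """Keep only the pairs seen in at least `minimum` word sets (hypothetical cutoff)."""
    return {pair: n for pair, n in counts.items() if n >= minimum}

if __name__ == "__main__":
    for other in ("B", "C", "D"):
        print("A vs", other, recurring(initial_correspondences(WORDS, "A", other)))
```

Comparing A with B, the k : k and p : p pairs each turn up in two word sets, while the n : n and l : r pairs turn up only once in a sample this small and so fall below the cutoff. Comparing A with D, no pair recurs at all. A real comparison would line up every position in the word, not just the first sound.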
Now, a set of 6 words is not nearly enough to draw conclusions about genetic relationships. Ideally, several hundred word sets should be selected, including not just nouns, verbs, or adjectives, but grammatical items such as plural markers on nouns, person, number, and tense markers on verbs, prepositions, different pronoun forms, etc. The more functional the items examined (those parts of the vocabulary which contribute more structure than meaning), the more solid the conclusions that can be drawn about genetic relationships.
Now, as an exercise for you, the reader, here is a set of data from 5 European languages. Many of you will be familiar with at least some of these languages, but I'm going to change their spelling to make them slightly less recognizable, and also to reflect their true pronunciation a little better. Your task is to examine the word sets, look for regular phonological correspondences, and make a guess as to which languages are genetically related to which others (or which are more closely related to which, if it seems that they are all ultimately related).

A          B          C       D           E         English
ik         io         zhuh    ih          yo        ‘I’
maan       luna       lyn     mont        luna      ‘moon’
zyster     sorella    sur     shvester    ermana    ‘sister’
vyyr       fwoko      fuh     foyer       fwego     ‘fire’
akht       otto       wit     akht        ocho      ‘eight’
vut        piede      pie     fus         pie       ‘foot’
drinken    bere       bwar    trinken     bever     ‘drink’


Bonus question: to which of the above do you think English is most closely related? What phonological correspondences can you name between English and that language (or those languages)? Remember to concentrate on pronunciation, not spelling, since how a word is spelled may not reflect its pronunciation accurately.


4 comments:

Jason said...

Nice posts! You write in an educational style which as a layperson I appreciate.

Regarding the challenge question, I would have to guess either A or C, although I believe it's A.

I am sure you are building up to this question. So do linguists actually break out gigantic spreadsheets of words from different languages to find phonological correspondences? Even if there are recognized similarities, how is this information used to trace back to a meta language? I am also curious, how much are linguists utilizing software algorithms to compare these phonological similarities?

Da Bank said...

Great questions, dude. We do indeed have to break out the gigantic spreadsheets (I have one going for six of the Germanic languages - I'm attempting a reconstruction on my own to test my skills), but we usually don't do it by ourselves of course. As for algorithms, Professor Bill Croft at UNM was talking to me about work in that field a few months ago - he mentioned that some linguists have been trying to compile all of the most common phonological mutations to create a program that could help in comparative reconstruction work. It would definitely be easier than going through thousands of word sets by hand (by eye?).
I'll talk in depth about the goal of the comparative method in the next post, but it's basically to reconstruct the ancestral form of a set of genetically related languages, which gives us insight into the linguistic situation of a particular culture at a particular point in time.
And, you're correct - A is the most closely related to English in the set (it's Dutch), but D is the next closest (German). C is French, which may have looked more similar to you because of areal effects - French has had a huge impact on the English language, which I could talk about in a future post.

Jalal said...

Akh! the knowledge hurts my brain!

Also Jason answered before I could.

I could recognize most of the languages, but not French.

B and E are also very closely related, with C related more to those two than to the other pair.

Da Bank said...

Hey, sorry about that Jalal. Next time I won't give away the answers immediately.
Well, since you and Jason are the only ones reading so far, I will say that you're correct: B and E are very close, and C is closer to them than to A and D. I'll talk about why this is the case and give some more examples in the next post.
I'm sure you recognized E as Spanish; B is Italian.