Wednesday, April 2, 2008

The Comparative Method II

Let's say we have languages A, B, C, and D, and we want to know if they're genetically related. We start by taking a sample of basic vocabulary from each of them, and then compare those samples to each other. The best semantic categories from which to draw basic vocabulary include pronouns (I, you, he, etc.), numbers (especially 1-10), human body parts (eye, head, heart, etc.), kinship terms (mother, father, etc.), natural objects and phenomena (sun, moon, water, fire, etc.), and very basic actions and states (see, know, die, etc.). Here's a small selection of possible vocabulary from languages A, B, C, and D.

A        B        C      D      English
ka       ka       ko     zimi   ‘we’
nami     name     nom    soo    ‘two’
paluma   paroma   polm   jai    ‘eye’
litiki   redege   riik   enefo  ‘mother’
kutusu   kodozo   kuts   vaha   ‘water’
panu     pano     pon    haaz   ‘eat’


In the comparative method, we're looking for regular phonological correspondences (which I'll abbreviate to RPCs). That is, given a set of words from each of the languages in question, we want to see if there is a regular pattern of phonemes that occurs between the items in the set. For example, look at the words for 'we' and 'water'. In all three of languages A, B, and C, these words begin with the phoneme k-. Likewise, the words for 'eye' and 'eat' in A, B, and C all begin with the phoneme p-. And in the word for 'two', we find all the words in A, B, and C beginning in n-. These are all examples of RPCs.
The phonemes don't all have to be the same to be regular. In the words for 'eye' and 'mother', language A has -l- whereas language B has -r-. Likewise, in 'we', 'two', 'eye', and 'eat', languages A and B have -a- where language C has -o-. These correspondences are just as regular as the k-, p-, and n- correspondences because they recur in multiple sets of words.
Now, what about language D? A close comparison of the phonemes in language D's words with those of languages A, B, and C reveals no meaningful correspondences. This suggests that D is not genetically related, or at least not closely genetically related, to languages A, B, and C, which, given their numerous correspondences, seem to be very closely related to one another.
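To make the idea of a recurring correspondence concrete, here is a minimal sketch in Python that tallies word-initial correspondences between pairs of languages in the toy data. It rests on some big simplifying assumptions: each letter is treated as one phoneme, only the first segment of each word is compared, and a correspondence counts as "recurring" if it shows up in at least two word sets. The function names and the threshold are mine, chosen just for illustration; real comparative work aligns whole words and uses far more data.

```python
from collections import Counter

# Toy data copied from the table above: meaning -> form in each language.
WORDS = {
    "we":     {"A": "ka",     "B": "ka",     "C": "ko",   "D": "zimi"},
    "two":    {"A": "nami",   "B": "name",   "C": "nom",  "D": "soo"},
    "eye":    {"A": "paluma", "B": "paroma", "C": "polm", "D": "jai"},
    "mother": {"A": "litiki", "B": "redege", "C": "riik", "D": "enefo"},
    "water":  {"A": "kutusu", "B": "kodozo", "C": "kuts", "D": "vaha"},
    "eat":    {"A": "panu",   "B": "pano",   "C": "pon",  "D": "haaz"},
}


def initial_correspondences(words, lang1, lang2):
    """Count each pairing of word-initial letters between two languages."""
    return Counter(
        (forms[lang1][0], forms[lang2][0]) for forms in words.values()
    )


def recurring(counts, minimum=2):
    """Keep only pairings attested in at least `minimum` word sets (assumed threshold)."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


for lang1, lang2 in [("A", "B"), ("A", "C"), ("A", "D")]:
    found = recurring(initial_correspondences(WORDS, lang1, lang2))
    print(lang1, lang2, found)
```

With this data, the A-B and A-C comparisons each turn up the k- and p- correspondences twice, while the A-D comparison turns up nothing that recurs, which matches what we saw by eye.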
Now, a set of 6 words is not nearly enough to make conclusions about genetic relationships. Ideally, several hundred word sets should be selected, including not just nouns, verbs, or adjectives, but grammatical items such as plural markers on nouns, person, number, and tense markers on verbs, prepositions, different pronoun forms, etc. The more functional the items examined (those parts of the vocabulary which contribute more structure than meaning), the more solid the conclusions that can be drawn about genetic relationships.
Now, as an exercise for you the reader, here is a set of data from 5 European languages. Many of you will be familiar with at least some of these languages, but I'm going to change their spelling to make them slightly less recognizable, and also to reflect their true pronunciation a little better. Your task is to examine the word sets, look for regular phonological correspondences, and make a guess as to which languages are genetically related to which others (or which are more closely related to which, if it seems that they are all ultimately related).

A        B        C     D         E       English
ik       io       zhuh  ih        yo      ‘I’
maan     luna     lyn   mont      luna    ‘moon’
zyster   sorella  sur   shvester  ermana  ‘sister’
vyyr     fwoko    fuh   foyer     fwego   ‘fire’
akht     otto     wit   akht      ocho    ‘eight’
vut      piede    pie   fus       pie     ‘foot’
drinken  bere     bwar  trinken   bever   ‘drink’


Bonus question: to which of the above do you think English is most closely related? What phonological correspondences can you name between English and that (or those) languages? Remember to concentrate on pronunciation, not spelling, since how a word is spelled may not reflect its pronunciation accurately.


The Comparative Method I

The phonological change that languages undergo over time is at present probably the best understood aspect of language change. Phonology is the component of speech that deals with the sounds utilized in a particular language and how they are combined to produce lexical items (words).
All spoken human languages have a set of speech sounds that they utilize, known as a phoneme inventory (each sound in the set is known as a phoneme). Out of the total number of sounds that humans can produce and regularly use in language production, only a subset is utilized by any particular language, though languages differ on which set of sounds they utilize. An important distinction must be made between phonetic differences and phonemic ones. A phonetic difference within a language does not contribute to differences in meaning (though it may mark differences in regional accent or dialect). For example, in English, the pronunciation of the 't' in 'top' is slightly different from the pronunciation of the 't' in 'stop'; the first 't' is pronounced with a slight puff of air (known as aspiration), while the second 't' is not. This difference is a phonetic one, since the two sounds are not identical in an absolute sense, but is not a phonemic difference, since pronouncing 'top' without the puff of air, or pronouncing 'stop' with it, will not change the fact that we're still saying the words 'top' and 'stop' - that is, there'll be no confusion on the part of the listener. However, if we changed the 't' in 'top' to a 'p', it would result in saying a different word, 'pop'; the difference between 't' and 'p' in English is therefore a phonemic one, because it serves to distinguish different words from each other. We say, then, that 't' and 'p' are different phonemes in the phonemic inventory of English, while 't' with aspiration and 't' without aspiration are known as different allophones (essentially different phonetic versions) of the common phoneme 't'.
The phoneme inventory of any given language is unstable over time; some phonemes are lost, some gained, some become differentiated in different contexts. For example, Old English (the variety of English that was spoken in Britain between the 5th and 12th centuries AD) did not utilize the sound 'zh', as in the word 'seizure', in its phoneme inventory. This sound was only added later through the addition of words from Old French and through a phonological process called 'palatalization' (the process that causes 't' to be pronounced 'ch' in combinations such as 'what you'). Similarly, Old English had a phoneme 'kh' in its inventory (the sound of 'ch' in German 'Bach' or Scottish 'loch'), but this phoneme was eventually lost by the Modern English period (roughly 1500 to the present), although it is maintained in the related Scots language.
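The 'top' versus 'pop' test above is usually called a minimal pair test: two words that differ in exactly one sound and mean different things show that those two sounds are separate phonemes. Here is a rough Python sketch of that check. The segment lists are simplified, hypothetical transcriptions based on spelling (an assumption on my part, not real phonemic notation), and the function names are made up for this example.

```python
from itertools import combinations

# Hypothetical, spelling-based transcriptions: one list of segments per word.
LEXICON = {
    "top":  ["t", "o", "p"],
    "pop":  ["p", "o", "p"],
    "stop": ["s", "t", "o", "p"],
}


def single_contrast(segments1, segments2):
    """Return the one differing pair of segments, or None if the words
    differ in length or in anything other than exactly one position."""
    if len(segments1) != len(segments2):
        return None
    diffs = [(a, b) for a, b in zip(segments1, segments2) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def minimal_pairs(lexicon):
    """List every pair of words that differ in exactly one segment."""
    pairs = []
    for (word1, seg1), (word2, seg2) in combinations(lexicon.items(), 2):
        contrast = single_contrast(seg1, seg2)
        if contrast is not None:
            pairs.append((word1, word2, contrast))
    return pairs


print(minimal_pairs(LEXICON))
```

Here 'top' and 'pop' come out as a minimal pair contrasting 't' with 'p', while 'stop' pairs with nothing because it has an extra segment. Note that a check like this can only show that two sounds are distinct phonemes; showing that aspirated and unaspirated 't' are allophones takes the further observation that no pair of English words is told apart by that difference alone.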
So, we know that a language's phoneme inventory changes over time. But how did we figure this out? Through a process called the Comparative Method, we're able to reconstruct earlier stages of a language (most easily earlier phonological stages), even in the absence of any written records from those earlier stages. In the comparative method, we compare the vocabulary between different languages to find regular phonological correspondences between them.
The first step in this process is to figure out if a set of languages is genetically related (i.e. evolved from a common ancestral language). If we know nothing about the languages beforehand, we more or less have to start from scratch, choosing a group of languages that are in close geographical proximity, and comparing very basic vocabulary items in each of them. In part II of this post, I'll illustrate this process first with some imaginary data, then with some data from a group of European languages.

Sunday, March 30, 2008

What is Historical Linguistics?

More than any other subfield of linguistics, historical linguistics seems to get a lot of media coverage. You've probably at one time or another seen a newspaper article talking about how the latest linguistic research has shed new light on some ancient human migration, or on the influence that one culture has exerted over another. Because of this, historical linguistics seems to be the branch of ling that the general public is most familiar with, and (justly, in my opinion) most fascinated by.
Broadly, historical linguistics seeks to examine two things: the history of human language, from its origins in the distant past up to today, as well as the history of particular languages and language families; and human history itself, through the evidence of past migrations and interactions that has been left in today's languages. When historical ling evidence is combined with evidence from archaeology, cultural anthropology, and genetics, a huge amount of information about the dim recesses of pre-history can be made available to us.
So what do historical linguists study, and what do we do with what we find? Well, we have to start with the processes that universally affect human languages over time:
  • Human languages are constantly changing. The mutations that gradually accumulate in human languages can be likened in certain ways to those that accumulate in the DNA of living organisms. The changes are slow and often random (though there are some general patterns), but given enough time they can completely transform the identity of a language.
  • Languages are constantly affecting one another. Unless a community is totally isolated from all other humans, its language is going to be affected by the languages of neighboring peoples, and will affect those languages in turn.
  • Languages are constantly being replaced. If a community shifts from language A to language B, until no one continues to speak language A, then language A is extinct. However, new languages are always coming into existence. If community X splits into communities Y and Z, and these two communities then become isolated from each other, then eventually language X will become language Y in community Y and language Z in community Z.
We then examine the relationships between languages, of which there are two general types:
  • Genetic relationship. Two or more languages are said to be genetically related if they share a common ancestral language. That is, if language X has evolved into languages Y, Z, and W, perhaps due to the original community X separating into isolated communities Y, Z, and W, then languages Y, Z, and W are said to be genetically related, since they share a common ancestor in language X (it's important to remember that once X has become Y, Z, and W in the daughter communities, X has functionally ceased to exist). Note that, though the term genetic is used, no relationship to DNA or molecular genetics is meant to be implied. A group of genetically related languages is known as a language family.
  • Areal relationship. Two or more languages are said to be in an areal relationship if they have affected each other in any way over time. This includes the transfer of vocabulary, grammatical structures, or sounds between any of the languages, and often implies a history of cultural, governmental, trade, or military influence among the peoples in question. The terms Sprachbund (German: "language union") and linguistic union refer to a group of areally related languages.
The two relationships are not mutually exclusive; languages that are genetically related may also be areally related, and it often happens that a language shares a genetic relationship with one set of languages and an areal relationship with a different set.
We now have the basic tools to begin analyzing the linguistic data. In the next post, I'll talk about some common features of linguistic change, and we'll take a look at some real-world examples.

Thursday, March 27, 2008

For those who speak Spanish

As I mentioned below, I speak Spanish/Castilian, so from time to time I'll write posts for all of you who are from Latin America or whose hearts belong there. I'll try to balance the content of this blog between the two languages, but since I use English more, most of the content will probably end up being in that language.
And maybe I'll write the occasional post in Italian, when the desire strikes me... although my Italian is no longer as good as it used to be...

Wednesday, March 26, 2008

Preliminaries

So I guess I'll start off by introducing myself, though the only people who are likely to read this blog are people I already know. My name's Aaron, I'm from San Francisco, California, and I'm currently working on my MA in linguistics at the University of New Mexico in Albuquerque. I'm a Bahá'í by religion, and did my year of service in La Paz, Bolivia, from July 2006 to July 2007.
As far as linguistics goes, my interests are firmly in the anthropological side. I'm primarily interested in historical linguistics, especially language contact, and in endangered language revitalization, which I plan to make a career of. What I am not interested in is theory, though I have adopted the Functionalist/Cognitive approach over the Formalist/Movement-Based/Chomskyan approach. Seriously, though, I find questions of (psycholinguistic) theory generally tedious and uncompelling, especially compared to my real passion for historical reconstruction, classification, and language contact phenomena. If I could get away with it, I'd be content with merely collecting the data, leaving others to interpret how it contributes to our Theory of Language, but apparently that's unacceptable - at least for a master's student.
Anyway, I know a lot of linguists are irked by the public perception that a linguist is someone who "knows lots of languages", but I in fact pride myself on being a polyglot, since it's made so many aspects of my life (not the least of which is my linguistic work) easier. I am a native speaker of English, and speak Spanish at very close to native fluency. In descending order of competence I also speak Italian, German, Arabic, Portuguese, and French. I'm currently learning Navajo, and I'm familiar with Latin, Dutch, and Aymara. I would like to learn Mandarin, Persian, and Quechua as well.
I intend for this blog to focus mostly on linguistics, but I'll have commentary on history, current events, religion, and other topics as well. I'll try to update it at least once a week, maybe twice, so check back if you're at all interested in anything I end up rambling on about. Finally, I'm also an amateur SF (speculative fiction) writer, so if you like soft science fiction, historical fiction, and alternate history fiction, I'll have some of my shorter work on here at some point too.

Sunday, March 23, 2008

Chicken

First post, dedicated to everyone's favorite movement-based grammar villain, Da Feature Chicken. Hop some affixes for his stamp of approval.