
The Indo-European background

Major language families
The Indo-European family

Before looking at the Germanic languages, and English in particular, students should grasp how these relate to the other branches of their parent language family – Indo-European – with which they form a genetic group. The different groups later split up into further subgroups and ultimately into the individual languages we know. In the course of this the speakers often moved from their original homelands, and in many cases the centre of a group was displaced considerably: the Celts, for example, are no longer found in central Europe but on the fringe of north-west Europe. Some groups, such as the Illyrians, disappeared entirely, and some, such as Anatolian (represented by Hittite), have died out but are recorded in historical documents.

Major language families

The names in italics after the Arabic numerals indicate the language families; the names in square brackets are those of the major languages of each group.

EUROPE AND ASIA MINOR 1) Indo-European (see below). 2) Finno-Ugric [Finnish, Estonian, Hungarian and various smaller languages in Russia as well as beyond the Urals, e.g. the Samoyed languages]. 3) Caucasian [languages of the mountainous region between the Black Sea and the Caspian Sea, characterised by many highly differentiated languages in a small area; Georgian (south Caucasian) is the best known of these]. 4) Altaic (Turkic languages) [Turkish and various other languages, such as Azeri, Uzbek, Turkmen, etc., stretching eastward from Turkey as far as the border with China; includes the Mongolian and Tungusic languages]. 5) Independent. The only surviving independent language in Europe is Basque, which has not been proven to be genetically related to any surrounding language. From history we have other examples of language isolates, e.g. Etruscan in ancient Italy.

NORTH-EAST ASIA (SIBERIA AND ALASKA) 1) Paleo-Asiatic [consists of a few small languages spread over a vast area of eastern Siberia]. 2) Eskimo-Aleut [few speakers spread over a large area stretching from Siberia through Alaska and Canada to Greenland].

NORTH AFRICA 1) Afro-Asiatic (Hamito-Semitic) [Branches into Semitic, which includes Arabic proper, Hebrew, Ethiopic and Aramaic, and Berber (in the Atlas mountains), along with Cushitic, Egyptian (Coptic) and Chadic (Hausa); it is the language family with the oldest linguistic records].

SUBSAHARAN AFRICA 1) Niger-Congo [a very large group, grouping into Western Sudanic, with the branches Mande, West Atlantic, Gur and Kwa, and Benue-Congo of which the main branch is Bantu with over 500 languages stretching down to South Africa, includes Xhosa, Zulu and Kiswahili]. 2) Nilo-Saharan [a diverse group stretching across the Sahara to Sudan].

SOUTH AFRICA 1) Khoisan [Bushman, Hottentot and other minor indigenous languages of the South African peninsula, noted for the presence of clicks].

SOUTH ASIA (Indian subcontinent, Pakistan) 1) Dravidian [Telugu, Tamil, Kannada; the second most important family in India]. 2) Munda [consists of a number of languages spoken on the east coast of India]. The remaining languages are Indo-European.

SOUTH-EAST, EAST ASIA 1) Sino-Tibetan [divides into at least three sub-groups: Sinitic, the chief representatives of which are the dialects/languages of Chinese; Tibeto-Burman, including Burmese and Tibetan; and Tai, which contains the two major languages Thai and Lao]. 2) Mon-Khmer [Khmer spoken in Cambodia]. 3) Independent. There are a number of independent languages in South and South-East Asia: Burushaski in Kashmir (northern India) is spoken by approximately 30,000 people. The language Ainu is spoken by even fewer speakers on various islands in northern Japan. Apart from these cases there are the three national languages Vietnamese, Japanese and Korean. A possible link exists between the latter two, but it is tenuous and contested.

AUSTRALIA AND OCEANIA 1) Austronesian [Indonesian, Polynesian; consists of hundreds of island languages spread throughout a large area in the West Pacific]. 2) Papuan [a group in a small area (that of the state of Papua, one half of the island state of Papua New Guinea); it contains very many different languages in a small area and is comparable in diversity with the Caucasus]. 3) Australian [the indigenous language family of Australia, consists of many languages spoken in all by not more than 50,000 aborigines].

THE AMERICAS Very many languages in many families are spoken by the native Americans of both continents. Among the major North American families (Canada, United States, Mexico) are: Algonkian, Wakashan, Salishan, Athapascan, Penutian, Yuman, Iroquoian, Siouan, Muskogean, Uto-Aztecan, Oto-Manguean, Zoquean, Mayan. The major families of Central and South America are: Macro-Chibchan, Ge-Pano-Carib and Andean Equatorial. Some of these languages, such as Quechua [Andean Equatorial], are spoken over a very large area (in Chile, Peru, Bolivia, Colombia and Ecuador), while Guaraní (in Paraguay) has official status alongside Spanish.

The Indo-European family

The proto-language of Indo-European probably originated in the area of present-day Ukraine / Southern Russia and was spoken up until about 3000 BC. From this area the speakers of this language spread out in various directions, eventually yielding separate dialects, the inputs to the major subdivisions of the Indo-European family. Various features (mostly phonological) are used to distinguish these early divisions of the proto-language, such as the treatment of original /k/, which was either shifted to /s/ or preserved as /k/. Languages classified along this axis are said to be either centum languages (from the initial /k/ in the Latin word for 100) or satem languages (from the initial /s/ in the corresponding word in Avestan, an Indo-Iranian language).

Centum languages: Celtic, Italic, Germanic, Greek, Hittite, Tokharian
Satem languages: Indian, Iranian, Baltic, Slavonic, Armenian, Albanian

Individual groups of Indo-European

INDO-IRANIAN This consists of the languages in and around Iran and of those groups who spread into north-west India and later throughout the whole country. Hindi and Urdu (the latter a close relative in Pakistan) are the main languages of the Indic branch whose classical form is Sanskrit.

ANATOLIAN An extinct group consisting in the main of Hittite, the language of the Hittite Empire (1700-1200 BC). Tablets containing remains of this language were discovered and identified in Turkey in the early twentieth century.

ARMENIAN A branch attested from the ninth century AD in a Bible translation. It has continued as East Armenian (in the republic of Armenia) and West Armenian in Eastern Turkey.

TOKHARIAN Remnants of this language (in two versions, A and B) were discovered by a German expedition at the beginning of the twentieth century in western China. It died out towards the end of the first millennium AD.

HELLENIC (GREEK) The set of dialects known as Classical Greek belongs to this branch. Almost unbroken records are available covering over 2500 years. It continues as modern Greek.

ALBANIAN Despite its small number of speakers, Albanian represents a separate branch of the Indo-European family. The first records are available from the 15th century.

ITALIC The term ‘Italic’ is used for those dialects of ancient Italy which include Latin but also Oscan and Umbrian (which strictly speaking form a separate branch). Latin continues as the set of Romance languages.

CELTIC Once spoken over a wide area in central Europe, the Celtic languages were pressed further west by rival Indo-European peoples who began to fill central and western Europe (Germanic tribes and Romans). The branch continues as the languages of the Celtic fringe of the British Isles and as Breton in French Brittany.

GERMANIC This branch probably originated in southern Scandinavia and spread out from there to cover the area of present-day Germany, the regions to the south (Austria and Switzerland), the North Sea coast, England and the entire Scandinavian peninsula along with the Faroes and Iceland.

BALTIC A branch of its own with three representatives: Lithuanian, Lettish and Old Prussian. The last of these has been extinct since the 18th century. Present-day Lithuanian is particularly archaic and of special interest to Indo-Europeanists.

SLAVIC The oldest written form of Slavic is Old Church Slavonic. Nowadays there are three main branches: 1) Southern Slavic [Slovene, Croatian, Serbian, Macedonian, Bulgarian], 2) Western Slavic [Polish, Sorbian, Wendish, Czech, Slovak] and 3) Eastern Slavic [Russian, White Russian, Ukrainian].


