More on Munda Languages Project and the Munda
Languages
The output of the Munda Languages project will include the
digitization of existing legacy materials, a searchable
cross-language database of the Munda languages that will
serve as the basis for all future linguistic research on
this poorly known family of languages, as well as a
searchable database of annotated audio/video materials on
the languages (using ELAN as the basis of the annotations).
Note that only rough estimates are available for numbers of
speakers of many of the endangered languages and smaller
Munda-speaking populations. This is due largely to the fact
that the Indian census does not list language/ethnic groups
numbering under 10,000 persons. There is also considerable
confusion between language names and ethnic group names.
The Munda Languages project has three basic facets:
documentation and archiving of endangered language
materials, digitization and annotation of legacy materials
dating back 45 years, and the compilation of a web-accessible
database of typological features of Munda languages.
Documentation
The documentation project begins with the video and audio
recording of speakers of various ages, levels of competency
and dialects from the endangered languages listed above.
Annotations will minimally have four (or five) tiers: one
rendering the Munda language in IPA transcription, one tier
of interlinearized glossing using the Leipzig glossing
conventions, an English translation and a translation into
Oriya and/or Hindi, whichever is appropriate (or in the case
of the widespread and disparate Turi, both).
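As a rough illustration of the tier layout described above, the following Python sketch models one time-aligned annotation segment and reports which tiers are still empty. The tier names, the millisecond time fields, and the placeholder strings are assumptions made for this example; they are not the project's actual ELAN tier identifiers or data.

```python
from dataclasses import dataclass, field

# Hypothetical tier names; the project's real ELAN tier IDs may differ.
REQUIRED_TIERS = ("ipa", "gloss", "translation_en")
REGIONAL_TIERS = ("translation_or", "translation_hi")  # Oriya and/or Hindi


@dataclass
class Annotation:
    """One time-aligned segment; times are in milliseconds (an assumption)."""
    start_ms: int
    end_ms: int
    tiers: dict = field(default_factory=dict)

    def missing_tiers(self):
        """Return the names of tiers this segment still lacks."""
        missing = [t for t in REQUIRED_TIERS if not self.tiers.get(t)]
        # At least one regional translation (Oriya or Hindi) is expected.
        if not any(self.tiers.get(t) for t in REGIONAL_TIERS):
            missing.append("translation_or|translation_hi")
        return missing


# Placeholder content only; no real language data is shown here.
segment = Annotation(
    start_ms=0,
    end_ms=2500,
    tiers={
        "ipa": "<IPA transcription>",
        "gloss": "<Leipzig-style interlinear gloss>",
        "translation_en": "<English translation>",
    },
)
print(segment.missing_tiers())
```

A check of this kind could be run over exported annotations to find sessions that still need a regional-language translation before archiving.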
The digitization is to be carried out in conjunction with
ELAR, at the School of Oriental and African Studies,
University of London, where the data from the project will
also be archived. A mirror site will be housed on a server
at the local host institution, the Department of Tribal
Languages, University of Ranchi, Jharkhand State, India. Dr.
Ganesh Murmu is our local contact.
Digitization of legacy materials
A number of legacy materials in different media need to be
digitized and annotated to supplement the field data. These
legacy materials are in a variety of formats, ranging from
analog recordings dating back 40-45 years to unpublished text
collections and lexical lists, including the massive Munda
comparative lexical materials described below.
Typological Database
The annotated sessions of the endangered Munda languages are
being entered into a searchable, web-accessible relational
database, linked to audio/video files and text-type
annotations according to a number of typological features,
viz. vocalic (including suprasegmental) and consonantal
features, features of nominal and verbal morphosyntax
(inflectional and derivational categories, auxiliary
structures, etc.), as well as characteristics of simplex and
complex clause structure. Entries consist of values and
commentary, time-linked to video and audio
examples whenever possible.
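One way such a relational layout might look is sketched below with Python's built-in sqlite3 module. The table names, column names, and the sample row (including the media file name and time offsets) are hypothetical and chosen only for illustration; the project's actual schema is not described in this document.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE feature (
    id INTEGER PRIMARY KEY,
    domain TEXT NOT NULL,          -- e.g. phonology, morphosyntax, clause structure
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    language_id INTEGER NOT NULL REFERENCES language(id),
    feature_id INTEGER NOT NULL REFERENCES feature(id),
    value TEXT NOT NULL,
    commentary TEXT
);
CREATE TABLE media_link (          -- time-links an entry to an audio/video example
    id INTEGER PRIMARY KEY,
    entry_id INTEGER NOT NULL REFERENCES entry(id),
    media_file TEXT NOT NULL,
    start_ms INTEGER NOT NULL,
    end_ms INTEGER NOT NULL
);
""")

# Hypothetical sample row: placeholder value, commentary, file name and offsets.
conn.execute("INSERT INTO language (name) VALUES (?)", ("Sora",))
conn.execute("INSERT INTO feature (domain, name) VALUES (?, ?)",
             ("verbal morphosyntax", "noun incorporation"))
conn.execute("INSERT INTO entry (language_id, feature_id, value, commentary) "
             "VALUES (1, 1, ?, ?)", ("present", "<commentary placeholder>"))
conn.execute("INSERT INTO media_link (entry_id, media_file, start_ms, end_ms) "
             "VALUES (1, ?, ?, ?)", ("session_001.wav", 1200, 4800))


def examples_for(conn, feature_name):
    """List (language, value, media file, start, end) rows for one feature."""
    return conn.execute("""
        SELECT l.name, e.value, m.media_file, m.start_ms, m.end_ms
        FROM entry e
        JOIN language l ON l.id = e.language_id
        JOIN feature f ON f.id = e.feature_id
        LEFT JOIN media_link m ON m.entry_id = e.id
        WHERE f.name = ?
        ORDER BY l.name
    """, (feature_name,)).fetchall()


print(examples_for(conn, "noun incorporation"))
```

Keeping media links in their own table lets one entry point to several audio or video examples, and the LEFT JOIN still returns entries that have no linked example yet.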
This typological database of Munda languages, when completed,
will serve as a complement to the large
comparative Munda lexical database already under way (see
below).
Additional Information on the Munda Languages
The Munda language family of eastern and central India
represents one of the most fascinating and theoretically
stimulating language families on the planet. Unfortunately,
very little primary data on the 20-odd members of
the Munda language family is widely known or even available
to the worldwide linguistic community. This is in part
because for some languages the data is quite out
of date, and for others the only materials that exist are
unpublished, in hard-to-find sources and/or in languages
that are not widely known by the linguistic community at
large.
Where are the Munda languages spoken and how long
have they been there?
Although probably immigrants from the east (where most of
their sister languages in the broad Austroasiatic phylum
remain today), the Munda peoples appear to be the tribal
autochthons of eastern India, their ancestors having already
occupied their current areas of habitation well before
the arrival of Aryan- and Dravidian-speaking populations
in the region. This is reflected in the
standard designation applied to all Munda-speaking peoples
(and, strictly speaking, to certain non-Munda peoples as well):
adivasi ‘first’.
Currently Munda-speaking peoples are found in large
concentrations in the Indian states of Orissa, Jharkhand,
and Madhya Pradesh, with further communities in adjacent
parts of the states of Chhattisgarh, Bihar, West Bengal,
Uttar Pradesh, Andhra Pradesh, and Maharashtra, and even
further afield in Bangladesh and Nepal.
How many people speak these languages?
Of the two dozen or so Munda languages still spoken,
at least one quarter appear to exhibit some
degree of language endangerment, ranging from moribund (Gorum)
to severely endangered with a few hundred (Koda/Kora) or a
few thousand speakers (Hill and Plains Gtaʔ, Remo, Turi;
also Bijori, Agariya, Bhumij, Korwa and Mahali, not covered
in the present proposal). For at least one endangered Munda
language, Koraku, no data is available, as it is conflated in
census statistics with Korwa or, previously, Korku. The
non-endangered but threatened languages still have tens to
hundreds of thousands of speakers (Juang, Kharia, Sora,
Gutob, Birhor, Bhumij), while stable languages often number a
million (Mundari) or several million (Santali) speakers.
Turi has perhaps 4,000 Kherwarian Munda speakers scattered
throughout various districts of Jharkhand, West Bengal,
Chhattisgarh, Orissa and Madhya Pradesh. For certain groups,
what little information there is often conflicts with other
reports. For example, Koda (Kora) appears to have but 1-2%
language retention among the heavily Aryanized (Bengali) or
Dravidianized (Kurux) population of 31,000 according to
Parkin (1991: 24), yielding under 500 total speakers,
but has been reported to have as many as 7,000-25,000 in other
sources, a number that assuredly reflects ethno-linguistic
identity rather than linguistic competence per se (a stated
policy of the Indian census).
What languages are Munda-speaking people speaking instead of
their ancestral tongue?
While many Munda-speaking peoples also command one or more
Indo-Aryan or Dravidian languages fluently (e.g. Bengali,
Hindi, Chhattisgarhi, Desia Oriya, Sadani/Sad[a]ri, Marathi,
Kurukh, Telugu), data on the rates of ancestral ‘mother tongue’
preservation among the youngest generation, as well as on the
sociolinguistic dynamics and contexts of its use in the
actual Munda-speaking communities, are generally lacking,
even in the most recent sources (e.g. the LSI Orissa
2002 volume; Ishtiaq (1999); Itagi and Singh (eds.) (2002)).
Who are the Munda-speaking people?
Munda peoples practice a range of traditional indigenous
religions sometimes mixed with locally appropriate
quasi-Hindu practices (as well as Christianity in some
areas), venerating stone megaliths built by their ancestors,
maintaining sacred groves, and in places still practicing an
ancient water-buffalo ritual sacrifice.
Over the past centuries, some Munda-speaking peoples have
been largely discriminated against in India as meat-eating
non-Hindus (and non-Muslims). In terms of traditional
economy, Munda-speaking peoples mainly practice[d] nomadic
hunter-gatherer foraging and/or subsistence agriculture. In
recent times, an urban population has developed, notably in
Ranchi, the capital of the newly constituted Munda-dominant
state of Jharkhand.
What are Munda languages like?
Although the Munda languages are poorly known, what little
is known about them seems to have great relevance to several
otherwise unrelated
fields of inquiry in comparative linguistics, as well as to
the prehistory of the Indian Subcontinent. These include
general theoretical and typological linguistic studies,
South Asian areal studies, and the history of the
Austroasiatic language family more widely.
In general, Munda languages appear to exhibit a typological
profile that is very different from that which is typical of
the Mon-Khmer languages to which they are related (cf.
Donegan and Stampe 1983; Donegan 1993), but these
differences are not always attributable to Dravidian and/or
Indo-Aryan influence. For example, the verb structure of the
North Munda languages is extremely synthetic, indeed
significantly more synthetic than structures typical of
either Dravidian or Indo-Aryan languages. In this way, they
share certain structural affinities with so-called
‘pronominalized’ Tibeto-Burman languages, with which they
may have formed an earlier areal group, prior to the
intrusion of Dravidian- and Indo-Aryan-speaking populations.
A better understanding of the nature and origin of the Munda
languages will help elucidate the complex issues surrounding
the nature and degree of synthesis characteristic of the
ancestral Proto-Austroasiatic [PAA] language as well as the
original clausal syntax and system of nominal categorization
and inflection found in PAA.
With regard to verbal and syntactic phenomena
characteristic of Munda languages (insofar as these can be
gleaned from the attested sources), there appear to be
noun incorporation patterns that are highly
marked or even unique, such as double-argument and even
agent-argument noun incorporation in Sora.
Another characteristic of the verbal systems of particular
Munda languages that is rare or unique among the world’s
languages is the agreement of a verb with both an argument
and a logical possessor of that argument. This differs from
‘possessor raising’, a system found in numerous languages
worldwide in which the possessor of the argument is
preferentially encoded as the argument itself. Such
agreement is attested in
Santali (Neukom 1999) and Santali-like Kherwarian North
Munda varieties, e.g. the Turi language covered in this
proposal, which has been reported to be a Santalized
Mundari-like speech variety. For more see Anderson (2007).
Several Munda languages are reported to have contrastive
creaky voice and/or low pitch, laryngealization or other
voice register phenomena.
What kind of documentation is there of Munda languages?
The majority of the Munda languages could be considered
poorly documented, including some of the larger and
non-endangered ones (e.g. Ho, Sora, Korku). Even basic
demographic information on certain of these groups is
lacking. Although some Munda language
materials exist in the (1904) Linguistic Survey of India [LSI],
these are far from satisfactory. As Emeneau put it in his
1955 work (cited in Mahapatra et al. [eds.] 2002):
"[o]n the Munda languages little need to be said. They have so far
either been badly described or known only as names in the
Survey, which certainly did not succeed in mapping them
all." Many of the Munda varieties in the LSI are represented
simply by translations of the Prodigal Son tale from the Bible.
Selected Publications on Munda languages
Recent advances in Proto-Munda reconstruction
(Mon-Khmer Studies 34: 159-184. 2004)
Dravidian Influence on Munda
(International Journal of Dravidian Linguistics 32 (1):
27-48. 2003).
The Munda Online Comparative Dictionary
The Munda lexical database project has been underway for
several years and in its current draft form holds roughly
50,000 entries from 12 languages. It was begun by Dr.
Manideepa Patnaik in 1999 and joined by Dr. Gregory Anderson
the following year. It currently exists in Word and Excel
formats, but it does not yet have associated sound files
(although many have been recorded, for example for Ho and
Sora and, to a lesser extent, Remo), and its metadata are
insufficient.
This database will include not only the attested forms;
for a number of entries, intermediate proto-language forms
and, where possible, Proto-Munda forms are being added as
well (based on currently ongoing research), thus increasing
its usefulness to researchers in other historical or social
scientific disciplines dealing with India. Living Tongues
Institute for Endangered Languages is currently engaged in
the process of collecting sound and other media files to
populate this resource.
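The layering described above (attested forms, intermediate proto-language forms, Proto-Munda forms, and sound files still being collected) can be pictured with a small Python sketch. The class name, field names, file name, and all placeholder forms are hypothetical; they do not reproduce the project's actual Word or Excel layout or any real lexical data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CognateSet:
    """One comparative entry; every field name here is an assumption."""
    gloss: str
    attested: dict = field(default_factory=dict)      # language -> attested form
    intermediate: dict = field(default_factory=dict)  # subgroup -> reconstructed form
    proto_munda: Optional[str] = None                 # added only where possible
    audio: dict = field(default_factory=dict)         # language -> sound file name

    def languages_without_audio(self):
        """Attested languages that have no sound file attached yet."""
        return sorted(lang for lang in self.attested if lang not in self.audio)


# Placeholder forms only; no real lexical data or reconstructions are shown.
entry = CognateSet(
    gloss="<English gloss>",
    attested={"Ho": "<Ho form>", "Sora": "<Sora form>", "Remo": "<Remo form>"},
    intermediate={"North Munda": "*<reconstructed form>"},
    audio={"Ho": "ho_0001.wav"},  # hypothetical file name
)
print(entry.languages_without_audio())
```

A helper like `languages_without_audio` is one simple way to track which entries still need recordings as media files are gathered.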
Sample Entry From Munda Comparative Lexicon
Sample legacy lexicon sheet
Talking Comparative Dictionary Samples
Ho Warang Chiti Unicode initiative
Researchers from Living Tongues Institute have been working
with representatives of both the Ho community of India and
the Unicode Consortium to facilitate constructive
dialogue between these groups on the proper encoding of the
indigenous Warang Chiti (Varang Kshiti) script so that the
Ho community may communicate over the Internet and have an
Internet presence of their own design. A draft of the
preliminary report submitted to Unicode is listed below.
Sample of Ho writing
Preliminary Report to Unicode