The Economist had a pretty interesting chart last week that ranked India third in terms of language diversity.
Via: The Economist
I was curious to find out how languages are counted and classified in India, and the following is the result of my mini-research:
The constitution lists 22 languages in the 8th Schedule, and these are called ‘scheduled languages’. Originally, only 14 languages were listed in the 8th Schedule; since independence, three amendments have added Sindhi, Konkani, Meiteilon (Manipuri), Nepali, Bodo, Dogri, Maithili and Santali.
The Census of India (since 1971?) reports only those languages that have more than 10,000 native speakers. Hindi is the mother tongue of 41% of Indians, followed by Bengali (8%), Telugu (7%), Marathi (7%), Tamil (6%), Urdu (5%), Gujarati (4%), Kannada (4%), Malayalam (3%), Oriya (3%), Punjabi (3%), Assamese (1%), Maithili (1%), Santali (1%) and Kashmiri (1%). Nepali, Sindhi, Konkani, Dogri, Manipuri, Bodo and Sanskrit are each spoken by less than 1% of Indians.
It is interesting to note that 5 non-scheduled languages have more native speakers than Bodo, which is the scheduled language with the fewest speakers if we do not consider Sanskrit. They are Bhili/Bhilodi (Rajasthan/MP/Maharashtra), Gondi (MP/Chhattisgarh), Khandeshi (Maharashtra), Kurukh/Oraon (Andaman and Nicobar/Chhattisgarh/Jharkhand) and Tulu (Karnataka). Here’s the data:
To open the spreadsheet in a new window/tab, click here.
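For anyone who wants to redo the Bodo comparison on their own copy of the data, here is a rough sketch of the logic in Python. The `Language` record, the function name and every figure in `sample` are my own placeholders for illustration; they are not Census numbers and the language names are deliberately generic.

```python
from typing import NamedTuple


class Language(NamedTuple):
    name: str
    speakers: int
    scheduled: bool


def larger_than_smallest_scheduled(languages, ignore=()):
    """Return names of non-scheduled languages that have more speakers
    than the smallest scheduled language (skipping any names in `ignore`)."""
    scheduled = [l for l in languages if l.scheduled and l.name not in ignore]
    floor = min(scheduled, key=lambda l: l.speakers)
    return [
        l.name
        for l in languages
        if not l.scheduled and l.speakers > floor.speakers
    ]


# Placeholder figures for illustration only; these are NOT Census counts.
sample = [
    Language("Scheduled A", 5_000_000, True),
    Language("Scheduled B", 1_200_000, True),
    Language("Scheduled C", 15_000, True),  # tiny outlier, ignored below
    Language("Non-scheduled X", 9_000_000, False),
    Language("Non-scheduled Y", 800_000, False),
]

print(larger_than_smallest_scheduled(sample, ignore={"Scheduled C"}))
```

The `ignore` argument plays the role of setting Sanskrit aside: without it, the smallest scheduled language would be the outlier and almost every non-scheduled language would clear the bar.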
Chapter 4 of the Report of the National Commission for Religious and Linguistic Minorities also has some interesting data regarding the linguistic profile of states (see table 4.2) and a classification of states into 5 categories based on language diversity/linguistic tensions (see table 4.4).
The report is available here.