Response To Mike Campbell on Chinese Language Classification

An autodidact named Mike Campbell has issued a long critique of my Chinese language classification.
There are problems with his analysis.
First of all, Campbell says we need to defer to the Chinese on what is a dialect and what is a language. But top Sinologists in the West are saying that the Chinese are falling down on the job and not working according to the modern scientific definition of what is a language and what is a dialect.
The Chinese linguists operate, like Chinese medicine, according to a completely different format that is pretty much at odds with the one used in the West and in much of the rest of the world.
One element of this format is the fangyan. A fangyan has many meanings, but in Chinese it tends to mean “dialect,” or better yet, “topolect.” It also tends to mean the speech form of a given county. But the Chinese definition of the word “dialect” differs radically from the definition used by linguists elsewhere in the world. For one thing, questions of intelligibility with other lects are left out of the definition of fangyan.
Chinese linguists also use hua, which means something like “speech.” This tends to be more expansive than fangyan, but at the same time it can occur down to the level of dialect. Examples include Putonghua, Shanghaihua, Beijinghua, etc, but also Pinghua and Tuhua. It tends to be geographically based – the speech of a particular geographical location, however that geographical location can be expansive or very restricted. But this is not the case in Putonghua, which is just “average speech”, and is spoken all over China.
The third category is yu. Yu is probably the category that Western linguists would most commonly associate with “language” or even “language family.” Yu only refers to separate languages within Chinese. Outside Chinese, the word wen tends to be used. Examples are Wuyu, Minyu, Huiyu, etc.
No one seems to quite know exactly what the Chinese classification is at any given time.
According to Campbell, we must not do anything until the Chinese act first, but they only make a new language maybe once every few years, and they are failing even at that.
Campbell states that Scots and Bavarian are dialects, not languages. He says that Scots is a dialect of English and Bavarian is a dialect of German. However, Ethnologue says that Scots is a separate language and so is Bavarian. The intelligibility of Bavarian and German is only 40%. I lack figures for Scots, but clearly intelligibility is lower than 90%.
Ethnologue is run by SIL. SIL has been granted the task of assigning all of the new ISO numbers. An ISO number means that a lect has been officially recognized by the world linguistic community as a separate language. So SIL are the linguistic scientists to whom the world community has given the task of deciding what is a language and what is not. Campbell is saying that SIL does not know what they are talking about.
Campbell states that mutual intelligibility cannot be determined by talking to speakers and simply asking them whether or not they can understand “those people over there.”
According to Campbell, this is inaccurate. He says the only way to determine intelligibility is through scientific testing methods looking for % in phonology, lexicon, morphology, syntax, etc. He also says that tonal differences are irrelevant for Chinese, because differences in tones do not impede communication, but I would beg to differ on that. Chinese speakers have told me that closely related lects with much different tones can be very difficult to understand, at least at first.
Ethnologue’s Mexico page reports extensive testing of lects spoken in small villages to determine intelligibility between one lect and another. Intelligibility testing is commonly done by simply sitting a speaker of Lect A down in front of a recorded corpus of Lect B and seeing how much they can understand.
Campbell says that intelligibility testing on human informants is inherently erroneous because as speakers of Close Lect A hear more and more of Close Lect B, they can understand it over a period of time (the exposure factor). This is the problem of interdialectal learning.
Interdialectal learning (the tendency of speakers of closely related lects to hear each other’s lects and quickly learn to speak them, and hence muddy the waters of intelligibility), trumpeted by Campbell as a reason that intelligibility testing cannot be done on human informants, is regarded by SIL as different from inherent intelligibility. Inherent intelligibility is best regarded as a test of the ability to use the mother tongue.
In other words, when two lects are said to be “inherently unintelligible” this appears to be referring to “virgin” speakers who have not yet had the opportunity to learn each other’s dialects.
Similarly, members of Lect A may simply be bilingual in Lect B, which also invalidates intelligibility testing. However, measures have already been developed to determine bilingualism and the degree of it. A favorite one is SLOPE. SRT is also used in bilingualism testing. Like other intelligibility testing instruments, they have been subjected to tests for reliability and validity over the years.
Further, testing has evolved to the point where we can begin to ferret out bilingualism from inherent intelligibility. In Casad 1974 the author describes testing done on speakers of Mazatec, a Mexican Indian language.
Intelligibility testing was done to see how well they understood Huautla, a related language. Three female speakers had scores in the 50-60% range, and three males had scores in the 90-100% range. Huautla is a local market language that is learned as a second language by many non-Huautla in the surrounding area. I would gather that 55% represents true inherent intelligibility and the 95% speakers represent practiced bilinguals.
At any rate, in the survey, the figures were averaged together so that Mazatec speakers had 76% intelligibility with Huautla and Mazatec and Huautla were said to be separate languages.
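The effect of averaging in a case like this can be illustrated with a short sketch. The six individual scores below are hypothetical, chosen only to fall inside the 50-60% and 90-100% ranges described above; they are not Casad's data, and the function name is an assumption made for illustration.

```python
from statistics import mean

# Hypothetical individual scores (percent), chosen only to fall within the
# ranges described in the text. They are NOT the actual survey data.
low_group = [52, 55, 58]    # assumed to have had little exposure to Huautla
high_group = [92, 95, 98]   # assumed to be practiced bilinguals

def summarize(scores):
    """Return the mean score of a group, rounded to one decimal place."""
    return round(mean(scores), 1)

pooled = summarize(low_group + high_group)

print("low group mean:", summarize(low_group))
print("high group mean:", summarize(high_group))
print("pooled mean:", pooled)
```

With these assumed numbers the pooled mean lands in the mid-70s, close to the survey's reported figure, even though no individual speaker scored anywhere near it. A single average can hide two very different populations.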
Campbell also throws out a red herring in the notion that certain members of a group may simply refuse to hear the language of another group and insist that they do not understand it. Although existent, this problem has little relevance in intelligibility testing. SIL does testing with cross sections of communities.
Furthermore, SIL notes that intelligibility is typically distributed evenly across a community with regard to sex, class and age.
The SD’s for inherent intelligibility in a community are narrow, less than 15%, whereas the SD’s for bilingualism are much higher. This is because in the case of bilingualism, communities differ. Some feel a strong need to learn the other language, others feel no need at all. Further, members differ in their access to an opportunity to learn the other language, even though they may wish to learn it.
This should dispose of the notion that females, the young or the old, the wealthy or the poor will automatically give us false data on intelligibility.
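The contrast in spread between inherent intelligibility and bilingualism can be sketched as a simple check on the standard deviation of a community's scores. This is only an illustration: the 15-point cutoff is taken from the "less than 15%" figure above and treated as a hard threshold for simplicity, the scores are hypothetical, and the function name is an assumption rather than anything SIL publishes.

```python
from statistics import pstdev

# Assumed cutoff based on the "less than 15%" figure in the text. Treating it
# as a hard threshold is a simplification made for this illustration.
SD_CUTOFF = 15.0

def spread_suggests_bilingualism(scores, cutoff=SD_CUTOFF):
    """Return True when the population standard deviation reaches the cutoff."""
    return pstdev(scores) >= cutoff

# Hypothetical score sets (percent), not real survey data.
uniform_community = [78, 82, 80, 85, 79, 81]
mixed_community = [52, 55, 58, 92, 95, 98]

print(spread_suggests_bilingualism(uniform_community))
print(spread_suggests_bilingualism(mixed_community))
```

The population standard deviation (`pstdev`) is used here; a sample standard deviation (`stdev`) would be an equally reasonable choice and gives slightly larger values for small groups.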
Campbell hints that intelligibility is poorly defined. However, SIL has listed a hierarchy of intelligibility. SIL says that intelligibility below 70% is “unintelligible” and intelligibility over 90% is “adequately intelligible” (this usually conforms to our ideas of a dialect). Between 71-89% is what SIL calls “marginally intelligible.” Lately, SIL throws most lects with under 90% intelligibility into separate languages.
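The hierarchy just described can be written as a small lookup. This is a sketch, not an official SIL tool: the function name is hypothetical, and the text does not say how scores of exactly 70% or 90%, or scores between 70% and 71%, are treated, so the boundary handling below is an assumption.

```python
def classify_intelligibility(percent):
    """Map an intelligibility score (0-100) to the categories in the text.

    Boundary handling is an assumption: exactly 90 counts as adequately
    intelligible, exactly 70 or less counts as unintelligible, and anything
    strictly between 70 and 90 counts as marginally intelligible.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 90:
        return "adequately intelligible"
    if percent > 70:
        return "marginally intelligible"
    return "unintelligible"

for score in (40, 76, 95):
    print(score, classify_intelligibility(score))
```

Under the practice described above, where most lects with under 90% intelligibility are treated as separate languages, only the first branch would correspond to dialects of one language.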
Campbell recommends throwing out all intelligibility testing with informants as inherently inaccurate and focusing instead on measures of language similarity.
However, SIL notes that linguistic similarity is not an adequate single predictor of intelligibility. For instance, testing in the Philippines revealed pairs of lects with vocabulary similarity of 52, 66, 72 and 74% which had over 90% intelligibility (were inherently intelligible). Over 80% vocabulary similarity for lect pairs resulted in several cases of inherent intelligibility. So lexical similarity is not an adequate measure at all for measuring intelligibility.
In testing of Polynesian, Siouan and Buang, it was found that the higher the level of lexical similarity up to a certain point, the lower the intelligibility scores were. This is counterintuitive, but it shows once again that lexical similarity is a poor measure.
Morris Swadesh was the founder of lexicostatistics, the study of lexical similarity. Lexicostatistics has its uses, but determining between closely related languages and dialects is apparently not one of them.
This myth seems to be dying a hard death. Robert Longacre and Sarah Gudschinsky were involved in long debates with Swadesh about the validity of lexical similarity measures, and they seem to have been proven right. The latest findings calculate that any study that uses lexical similarity alone to determine intelligibility of lects has a 4.5-1 chance of failing to do so with any reliability.
Word lists still have their uses. Where word lists show similarities between lects below 60%, odds are that we are dealing with two separate languages, and there is no need to do any further intelligibility testing. And they have obvious uses in historical linguistics and in determining genetic relationships between languages.
Vocabulary similarity below 67%, though, typically reveals intelligibility estimates below 60%. Intelligibility below 60% is inadequate for all but the very simplest communication. Before any kind of even slightly complex or revealing messages can be conveyed, intelligibility usually needs to be over 85%. Casad found that 90% intelligibility on a narrative test was necessary before one could move to more complex kinds of communication. Here once again we get into the dialects.
Intelligibility is usually asymmetrical. In other words, speakers of Lect A may understand 80% of Lect B, while speakers of Lect B understand only 70% of Lect A. There are arguments about the reasons for this, but one suggestion is that the higher figures result from some sort of bilingual learning.
Campbell also points out that it is not uncommon that people speaking the same language cannot always understand each other. He asks how often we have heard a fellow English speaker of the same dialect say something and we did not catch what they were saying for some reason or other. The implication is that we need to throw out all testing with informants due to this.
SIL has actually examined this, and they often include a test called “home-town” in which people are presented with narratives within their own dialect and an intelligibility score is given for that. It is true that sometimes this is lower than 100%, but it is typically not much lower. Nevertheless, using the “home-town factors” of Lects A and B as controls in factor analysis helps greatly when moving on to actual intelligibility between Lect A and Lect B.
One thing to do is to throw out all sentences or questions that score less than 100% on home-town, since if the speakers can’t even understand these sentences well when their own people speak them, how can we measure how well they understand them when speakers of other lects speak them?
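The home-town filtering step can be sketched as follows. The item labels, scores, and function names are all hypothetical, and this shows only the simple "drop items below 100% on home-town" rule, not the factor analysis mentioned above.

```python
from statistics import mean

def filter_by_hometown(hometown_scores, cross_scores, required=100.0):
    """Keep only the test items that home-town listeners understood fully.

    hometown_scores: item -> percent understood when hearing their own lect
    cross_scores:    item -> percent understood when hearing the other lect
    """
    return {
        item: cross_scores[item]
        for item, score in hometown_scores.items()
        if score >= required and item in cross_scores
    }

# Hypothetical per-item scores (percent), not real test data.
hometown = {"q1": 100, "q2": 100, "q3": 80, "q4": 100}
cross = {"q1": 90, "q2": 70, "q3": 20, "q4": 80}

kept = filter_by_hometown(hometown, cross)
raw_mean = mean(cross.values())
adjusted_mean = mean(kept.values())

print("items kept:", sorted(kept))
print("raw mean:", raw_mean)
print("adjusted mean:", adjusted_mean)
```

In this made-up example, item q3 is dropped because even home-town listeners missed part of it, so its low cross-lect score no longer drags the estimate down.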
Campbell suggests that there are no tests available to use on human informants that pass the smell test of empiricism. This is not the case.
One test, the Sentence Repetition Test (SRT), has been used for decades, subjected to many papers and studies, and criticized and modified in many ways.
In the case of SRT, testing group members individually has been shown to be superior to testing them in groups. The reason is that when you do intelligibility testing in a group of, say, eight people, you can run into a strong personality or high-ranking male in that group who might say he understands much more than he really does for some reason or another, possibly to show off. The other less dominant group members then follow his lead and give falsely high readings on the intelligibility test.
Many linguists, led by SIL, have been advancing intelligibility testing for decades now. Some of the top figures in this subfield are the couple Joseph and Barbara Grimes of SIL. Joseph Grimes is a retired linguistics professor from Cornell.
In addition, a number of computer programs have been created that help the researcher to test intelligibility.
Another charge, that intelligibility testing lacks adequate controls, has been shown to be false. Bias in both experimenter and subject has been shown to be a problem, as is the case in most or all science, and measures have been undertaken to deal with it.
The notion that this subfield of Linguistics, intelligibility testing, is unscientific should be laid to rest.
Ethnologue seems to place tremendous importance on mutual intelligibility, however defined. Mutually unintelligible lects are assumed to be separate languages by Ethnologue. Their criterion for splitting dialects off into languages seems to be 90%. Below 90%, separate languages. Above 90%, dialects of a single language.
In conclusion, Mr. Campbell’s principal contentions in his critique are all incorrect.
First, he suggests that the very concept of mutual intelligibility between lects is impossible to define or prove. SIL has shown that the concept can be defined and tested by reliable instruments.
Second, he says that the use of human informants in mutual intelligibility testing is so prone to error that it cannot guarantee satisfactory results. This is not the case. SIL has proven, through decades of testing, that mutual intelligibility is best determined, or possibly can only be reliably determined, through intelligibility tests with human informants.
Third, he throws up a number of red herrings that supposedly prove the inherent unreliability of human informants in intelligibility testing. All of these are shown to be the very red herrings that I claim they are. It is true that unrecognized bilingualism is a problem, but it can often be ferreted out.
Fourth, he says that the only way to reliably test for intelligibility is to compare lects via tones, phonology, morphology, syntax and lexicon. This is an extremely complicated process utilizing math and computer programs and can only be undertaken by practiced linguists. In truth, such elaborate testing, while interesting, is entirely unnecessary.
Fifth, he suggests that any Western reformulations of Chinese language classification need to first defer to the Chinese. The problem here is that the Chinese have completely fallen down on the job. We cannot defer to the Chinese without upsetting our entire system of language classification. The Chinese are entitled to their system, but it is at odds with that used by the rest of the world.

References

Casad, Eugene H. 1974. Dialect Intelligibility Testing. Summer Institute of Linguistics Publications in Linguistics and Related Fields, 38. Norman, OK: Summer Institute of Linguistics of the University of Oklahoma.
Casad, Eugene H. 1992. “State of the Art: Dialect Survey Fifteen Years Later.” In Eugene H. Casad (ed.), Windows on Bilingualism, 147-58. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Barbara F. 1992. “Notes on Oral Proficiency Testing (SLOPE).” In Eugene H. Casad (ed.), Windows on Bilingualism, 53-60. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Joseph E. 1992. “Calibrating Sentence Repetition Tests.” In Eugene H. Casad (ed.), Windows on Bilingualism, 73-85. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Joseph E. 1992. “Correlations Between Vocabulary Similarity and Intelligibility.” In Eugene H. Casad (ed.), Windows on Bilingualism, 17-32. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.

19 thoughts on “Response To Mike Campbell on Chinese Language Classification”

  1. Robert,
    I only have one name (first-middle-last), but I am one of the few people that everybody calls by the middle name, Mike. My article was not a long critique of your article–I have written it with a completely different goal in mind, I only made one reference but I can remove it if you wish. I have never had the job or felt the need to go around making people mad and I’m not interested in doing so in any way. I’m offering a friendly update and response to your article here. I hope we can continue communication in a friendly way, if you’re willing.
    It’s true and I agree, the word “fangyan” does not mean dialect and I think we established this already. The words developed with different histories and semantics. I agree that it’s good to stop calling individual locations dialect but rather topolect which is what the Chinese word means. Dialect can instead refer to a region of mutually intelligible topolects, and this is what Chinese linguists and SIL are doing. The Chinese linguists don’t lump all Min as one language–if I could translate some passages for you–they are saying they are indeed different languages. I’m not sure where you read this or was it older material from the time when that was assumed to be the case? (Min once was claimed as a blanket language).
    I think another issue is that frequently Chinese linguists write that they are aware of a particular speech of an area but more analysis is required before they come to conclusions, so there are large areas of the classification with “unclassified” topolects still, or areas like “Tuhua” that more is being discovered on a regular basis. Most of the analysis in China only touches at the county level but everybody knows there is more work to do with different languages spoken within the same county, especially in Guangxi province. More or less innocent until proven guilty approach. Coming out with new classifications every five years is not entirely true, they are merely updating the classification or shifting things as more data becomes available, that’s why it’s good to stay abreast with new information and I will try to find time this or next week to take a look at some of the newest articles coming out. Would you like me to send you any of the data (I’ll translate)? At least you know what is being written, whether you want to use it or not.
    I do understand and have researched the various kinds of informant testing that you have mentioned, my approach was more critical of Zheng Jincheng’s article of DOC-based lexical testing between major topolects (I’m away from my computer and don’t have the correct non-pinyin spelling of his name at the moment). However, I’m under the impression that our two approaches are significantly different: you’re dealing with informant testing, and I’m dealing with classification.
    All linguists know about work of Greenberg, Voegelin, et al, and although Greenberg has advocates like Ruhlen, most linguists are in opposition or use it as a rough guide until more data is collected, again innocent until proven guilty. Most of the linguists I know are not big fans of Greenberg but his work is important because he has used a consistent approach to set up divisions among hitherto great unknown areas of the world. The same can be said today of Karlgren. He produced some very important works, but as a pioneer a lot of what he knew then and what we know now are different and few people rely on Karlgren. But his readings were a primer for me in learning about Chinese dialectology and then I went on to read the developments after Karlgren and how a lot of it has changed.
    I did not suggest there are no tests available to use on human informants, instead I was saying that few if any researchers like aforementioned Zheng Jincheng have taken syntax, grammar and sociolinguistics into consideration. When only working with data, this is vital when you lack informants.
    However, if we do take the approach like you’re saying not everybody in Group A will have the attitudes towards Group B, but how can you know for sure? If it is deeply ingrained in their culture, or if Group B used to be owned as slaves or were at constant war in the past, then who is to say that anybody in Group A will have a differing attitude towards Group B? Group C and D may not have had this situation and their two languages maybe share the exact same difference as A and B, however your testing will produce very different results. I think it’s still important to compare the two languages as a whole, from phonology, grammar, syntax to determine their closeness. A Swadesh list can tell us, yes, these two languages fundamentally sprung from the same source because those choice of words are not usually borrowed but more work is required beyond a Swadesh list and what informants tell us (I think we disagree on this). OR, maybe what informants tell us and what we discover from analysis of the languages on paper should be treated as two different things, then we won’t necessarily be at loggerheads here.
    I find what you mention about the lexical similarity tests failing as interesting, and I believe it too. It reminds me of Sinoxenic borrowings of Chinese words into neighboring Korean, Japanese, and Vietnamese which all now have approximately 60% of their core lexicon borrowed from Chinese. But these languages belong to other families and developed separately. Although they share typological features in neighboring areas, suffice it to say the large number of shared vocabulary does not indicate intelligibility. But I’m sure you’ve already done this analysis on actual related languages like Galego. Your approach and my approach to this probably differ. I think it’s neat to tell laypeople Scots and Bavarian/Tyrolisch are separate languages, and you’re right they’re not mutually intelligible without exposure, but from a scientific approach are the percentages above or below 90%? I don’t know, but my hunch is that they highly likely fall into the range of dialect rather than language (do you think it’s okay to disagree with each other). An American colleague of mine here recalls once while studying in Leiden (and now fluent in Dutch and can speak German), at one time he met at a family function of a friend who all are Limburg speakers. He was confused at first whether it was German and well, it was just so different he could hardly make head or tails of it.
    My claim in my article is simply this:
    Upon first encounter, most people will not understand a single thing or very little. However paying attention closely they may understand a little more, and of course more with more exposure. But the question remains, does one actually have to specifically pick out and learn new phrases on their way to learning or can you pick them up in passing assuming to understand? I thought I could do this when I was learning Taiwanese Southern Min after achieving fluency in Mandarin. However I was wrong. There is a lot of similar vocabulary and structures, but most of the time I have to go out of my way to learn common phrases and names of things. Robert Cheng of the University of Hawai`i has written about the mistakes Mandarin speakers use (Mandarinisms) when trying to speak Taiwanese Southern Min when the correct way is more often than not a completely different syntax.
    I’ve done many oral comprehension tests on students learning English as a foreign language, and even those that have studied for more than ten years fail a lot on these tests. One of the sentences on my test is: “You sure put in a lot of overtime!” (being careful to say it just like we do in normal conversation). They are requested first to translate back into Chinese for me the meaning they heard without repeating any of it first, then second repeat the sentence in English. After the test, over 90% of the students who have studied English to an advanced level feel they understood everything on my test very well. However, their comprehension really fails them! More than 95% of my students give me a translation of “You should spend more time on that” or variations of that idea. Although they claim to understand, what they actually interpret into meaning is completely different than what was said. This doesn’t mean they don’t know English or that they’re not advanced students, it just means they haven’t been exposed to enough discourse in the language and that changes rapidly with daily exposure.
    I would like to thank you for continuing this discussion and I’m sure we can learn from each other, as I have in your article. I hope our discussions are fruitful and we can agree to disagree on certain things without being hostile–I’m not set in stone on everything but I am willing to take a critical look at the arguments at hand. One advantage to you if you wish to continue communication is that I have access to many of the writings coming from China. And I apologize again for my very first reaction from a few weeks ago.
    Regards,
    James Michael Campbell (“Mike”)

  2. Robert,
    Regarding the 4 Mins that Ethnologue lists. Actually there are 5. The name Pu-Xian is an abbreviation for two cities within Fujian that are also Min, but considered by Chinese linguists as a separate language and Ethnologue lists correctly.
    Also, Ethnologue lacks Qiongwen Min (of Hainan island) considered by Chinese linguists as a separate language. Ethnologue also lacks Shaojiang Min spoken in Northwest Fujian province and considered a separate language.
    Glad to be of help.
    Regards,
    James Michael Campbell (“Mike”)

  3. Robert,
    I think you’ve made a good point on fangyan, hua, and yu (there is also wen).
    Chinese is limited in its words and their etymologies to meet the standards of modern science, but I don’t see any misunderstanding going on between the writings of Chinese linguists, unlike what you say western linguists are saying. May I say a few words on this?
    Fangyan we have determined as topolect, but as used many centuries ago could also refer to any language of a different region. Today it has a specific use and currently applies to a “county”, notwithstanding the fangyan of neighboring counties may be the exact same thing. So topolect is still the best translation. Some counties have speakers of several different fangyan and languages and this has already been addressed in the literature.
    Hua is usually tacked on to a place name. The “speech” of a particular place as long as there are no others competing (for example Nanning in Guangxi has several languages). I’ve never seen it used by Chinese writers as Wuhua or Minhua .
    However, Pinghua and Tuhua do exist because these are smaller linguistic areas that linguists have limited information on and their classification has not been conclusive yet. Actually, “ping” itself means “flat” and “tuhua” is used by everybody in China to refer to the uncouth speech of hinterlands, as “tu” means dirt or earth. But Tuhua is an actual name of the speech of particular areas between Hunan and Guangdong, and there are different kinds of Tuhua within that group. There is still a lot of work being done in these areas. I’m sure that their proper naming conventions will change over time as more knowledge becomes available. These are most likely temporary names.
    Putong means universal or ubiquitous. So Putonghua is just universal speech, a lingua franca, not really a mother tongue of anybody.
    Yu refers to a language, much the same way we do in the west.
    Yuyan is the noun for language. Yu is attached to another word, like Yingyu (English), Deyu (German), Eyu (Russian), Riyu (Japanese), Yueyu (Cantonese), Wuyu (Wu language), Huiyu (Hui language), Jinyu (Jin language).
    Wen refers to a language with a proper literature. I’ve never seen anybody say Wuwen or Minwen or Yuewen (even though Cantonese does have a written standard). But common uses are Yingwen (English), Dewen (German), Ewen (Russian), Yidaliwen (Italian), Riwen (Japanese), Hanwen (Korean), Taiwen (Thai), Yinduwen (Hindi).
    Hope that helps.
    Regards,
    James Michael Campbell (“Mike”)

  4. Gentlemen, thanks for this kind debate. I’m learning many things about the Sinic languages I’m so far from. From my personal experience Ethnologue is a good source as it is open to discussion and improvement. I’m sure both of you will help improve this classification!

  5. Hi Mike, I have answered a lot of your concerns in a couple of emails.
    There is now a proposal before Ethnologue to split three more Mins off, I think Teochew, Hainan and Xiamen into three languages. I agree with this. Clearly, Shaojiang Min is a separate language.
    I have been looking into those Tuhua. They seem to be related, but the Chinese seem to be lumping a few of them to the big languages. One I dealt with today seems to be being cleaved off to Hakka. Those Tuhua are incredibly diverse and have vast differences between them and even internally. It’s clear we are dealing with multiple languages here.
    Pinghua and Tuhua are really just junk categories or trash cans where they are tossing stuff temporarily until they can start to figure it out. Ping does look related to Cantonese though.
    I think that Wuhua was just used by some Chinese speakers on Internet postings.
    The Sinologists I am talking to are distressed that the Chinese have only cleaved say 14 languages or so off of Chinese. They insist that there are far more than that.
    The problem you are talking about with L2 learners of English should not be much of a problem with intelligibility testing, although I understand your concerns. The problem is that English and Chinese are about as far apart as one can possibly get in relation. So you are going to get lots of confusion. Where lects share 70-89% cognates, as with Chinese, you are going to get much more intelligibility and the instances you describe are going to be dramatically fewer, if they exist at all.
    SIL has done testing where they just ask people, “Can you understand Lect B speakers?” They say yes, no or sort of. If sort of, they try to quantify it in %. Then SIL gives them intelligibility testing, and the testing usually comes out pretty much how the initial questioning went. If the people say, “We understand them just fine,” there won’t be any testing.
    I’m not aware of cases lately where intelligibility testing is generating false positives or negatives, but all the information is out there. If we say Lect A speakers can (but in truth they can’t) or can’t (but in truth they can) understand Lect B speakers, sooner or later, someone is going to set us straight.
    In terms of Scots and Bavarian, I don’t know the percentages or even if testing has been done. However, Ethnologue has cleaved off in addition to Bavarian about 20 other German languages, and supposedly they are not even done. I don’t know the figures for Bavarian, but I have heard 40% intelligibility tossed around. Tyrolian is just a German dialect with 90%+ intelligibility with German, but there are some little lects up there that got split off like Walser and Cimbrian.
    Intelligibility for Scots has got to be below 90%. I watched Trainspotting and I couldn’t understand a word of it!
    As a general rule, in cases of 90%+ intelligibility, Ethnologue will NOT cleave off a new language. There are many groups demanding new languages out there, like Moldavians and Valencians, but Ethnologue has been denying them since intelligibility is 90% with Romanian and Catalan for each. With Galego, it is 85-90%, so Galego is barely a separate language, but there are lots of Portuguese hopping mad about this.
    Ethnologue is cleaving off new languages in a few cases where there is 90%+ intelligibility, but these often get challenged and removed later. One case is Scanian, now removed. It’s apparently just a Swedish dialect, but the very loud Scanians sneaked it into Ethnologue a while back. You also noted cases of obvious dialects like Croatian, Bosnian and Serbian getting split off into languages simply because they are lects of a separate nation now and also, I would add, because they hate each other’s guts!
    Ethnologue has cleaved off 8 different kinds of Dutch Low Saxon and a lot of Dutch are furious. My understanding is that these lects have poor intelligibility.
    In general, by the 90%+ rule, Ethnologue has not done nearly enough splitting!
    The things we look at are
    1. Swadesh list. This tells us: separate language, dialects or partial intelligibility (lects of unknown status)
    next
    2. We can talk to people. Do you understand those people over there or not? They say yes, no, or sort of.
    next
    3. We move on to actual testing with the instruments such as SIL has been working on for decades. These have problems and they are constantly being refined to give us better validity and reliability.
    Complex testing of phonology, tones, lexicon, syntax, morphology, etc. is cool, but who has the time? They could surely be used to test our informant testing with one more layer.
    Sociolinguistic attitudes are very interesting and I do understand your concern, but in reading over this SIL stuff, it doesn’t appear to be a problem. Even if Lect A speakers used to be enslaved by Lect B, they are still straight up about whether they understand Lect B or not. You test a group and if and when they differ, you just do an average.
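
The averaging step and the 90% rule of thumb discussed in this reply can be sketched in a few lines of Python. This is only an illustration, not SIL's or Ethnologue's actual procedure: the function name `classify`, the constant `DIALECT_THRESHOLD`, and the sample scores are all assumptions made up for the example.

```python
from statistics import mean

# Rule of thumb quoted in this thread: 90%+ intelligibility means "dialect".
# The exact cutoff and how edge cases are handled are assumptions here.
DIALECT_THRESHOLD = 90.0


def classify(scores):
    """Average per-listener intelligibility scores (0-100) and apply the threshold."""
    if not scores:
        raise ValueError("need at least one score")
    avg = mean(scores)  # listeners differ, so just take the average
    label = "dialect" if avg >= DIALECT_THRESHOLD else "separate language"
    return avg, label


# Hypothetical scores, invented purely for illustration.
print(classify([95, 88, 92, 97]))
print(classify([40, 55, 35]))
```

A real study would also report the spread of the scores, since a wide range between listeners is itself informative.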

  6. Linguistics is a relatively useless field. Only so much meaningful work can be done there. Contrast this to engineering or the sciences, where advancements actually have a real world impact on our health, quality of life, etc. I lump linguistics in the same category as philosophy. The same stuff can only be rehashed so many times…

  7. It’s about as useless as any other field that is involved in explaining the world around us, botany, biology, anthropology, history, sociology, literature, etc. When it comes down to it, a lot of academic work is just explanatory. Not all science is about making the world a better place. Humans want to explain, understand and make sense of the world around them. A lot of what the social sciences do is just that.
    It may be useless, but people are very, very, very interested in Linguistics! Language is something all of us use all the time, and speakers of certain tongues are especially fascinated by any kind of work done in that area. After all, we are discussing the language that they live with 24-7.

  8. I am Chinese & I speak Mandarin, Min-nanese, Teochew, Cantonese, Taishanese & Hakka. Let me say this- they are all dialects of Chinese. Once one is accustomed to the nuances of the dialect, one will understand it. Take this from a native.

    1. As a Malaysian native, U conveniently forgot to mention that these languages have “joined forces” in your land. M’sians use the same set of vocabulary and grammar to speak all these languages, borrowing from one into another almost at will. The un-converged, uncut versions of these languages are mostly way different from each other.

  9. I am European and I speak Italian, French, Spanish, Portuguese, and Romansh. Let me say this- they are all dialects of Latin. Once one is accustomed to the nuances of the dialect, one will understand it. Take this from a native.

  10. OK, lots to cover. First, about linguistics being a worthless field: really? Which part of it? Computational linguistics is part of what makes Google search work and how they set algos for it. Speech recognition is indebted to linguistics, as is A.I. Speech pathology comes from linguistics. So this is clearly a rather personal, no-substance attack on the field.
    As for Ethnologue, I know the editor, Paul Lewis. He will tell you that dialects and languages are separated by a few things: 1) Mutual intelligibility (some of the testing was already delineated) 2) Political (it is said a separate language has its own army and navy, dialects do not). SIL is bankrolled to the tune of several million dollars a year; they are the only ones doing this sort of compiling of languages (especially the geo-locked languages). What I would like to commend here is that, unlike the YouTube dilettantes who spout nothing but personal unqualified opinions, the citations here are a strong point.
    Christophe Clugston

  11. Great discussion.
    I think Lindsay is right in using mutual intelligibility as the criterion for determining what’s a language. I also think that intelligibility can be real tough to measure, and that something should be said for the kind of situation where mutual un-intelligibility is only temporary, i.e. where a week of exposure has the speakers off and running.
    As Campbell puts it, “But the question remains, does one actually have to specifically pick out and learn new phrases on their way to learning or can you pick them up in passing assuming to understand?”
    So languages A and B are mutually unintelligible, but speakers become able to understand each other after a week of steady contact. Languages C and D are mutually unintelligible, and speakers still can’t understand each other after months of steady contact, unless they learn each other’s language or use a third language. Do we treat both situations the same and call them different languages? I think that’s worth thinking about.
    Campbell brings up another valid point: attitudes influence intelligibility. Part of this is raw, conscious effort. Part of this is psychological and pretty much subconscious.
    Another point that nobody has brought up yet is topic dependency. Mutual intelligibility usually varies depending on what the speakers are trying to talk about. A “deep” Taiwanese Hokkien speaker and a “deep” Medan (Sumatra) Hokkien speaker could probably understand each other reasonably well across a wide range of household and agricultural topics, but if it came to fixing a car or a motorbike, they’d be speaking different languages, in effect. The task of quantifying intelligibility gets harder if we wanna pin this down. Maybe a “basket of topics” concept could be advanced, kind of like the “basket of goods and services” concept used to measure inflation.
    There’s a video on YouTube where two Siam Thai speakers go up into central Guangxi and try to communicate w/ Zhuang speakers speaking only Siam Thai. First it doesn’t work, then it starts working. They realize that it only works when the topic is one that’s heavy on shared vocabulary.
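
The "basket of topics" idea floated above could be sketched as a weighted average, the same way a price index weights goods and services. This is a rough illustration only: the function name `basket_intelligibility`, the topic names, the weights, and the scores are all invented assumptions, not measured data.

```python
def basket_intelligibility(scores, weights):
    """Weighted average of per-topic intelligibility scores (0-100).

    scores:  topic -> intelligibility score for that topic
    weights: topic -> how much that topic counts in the basket
    """
    total_weight = sum(weights[topic] for topic in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[topic] * weights[topic] for topic in scores)
    return weighted_sum / total_weight


# Hypothetical numbers: high on household and farm talk, low on car repair.
scores = {"household": 90, "agriculture": 85, "mechanics": 30}
weights = {"household": 5, "agriculture": 3, "mechanics": 2}
print(basket_intelligibility(scores, weights))
```

The hard part, as with inflation, is agreeing on the weights: how much everyday talk versus technical talk should count is a judgment call, not something the arithmetic settles.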
    Based on intelligibility criteria, how many languages is Hokkien (what Lindsay calls “Xiamen”)? A lot of Penang Hokkien would go over a Taiwanese Hokkien speaker’s head at first exposure, just b/c of intrinsic linguistic differences. Typically, there would also be a lack of effort on the part of the Taiwanese speaker to understand a non-Taiwanese form of Hokkien. Even beyond this, psychologically, both sides (but esp. the Taiwanese) have a hard time acknowledging an unfamiliar form of their familiar Hokkien tongue. Due to subconscious psychological reasons and a lack of effort, they may honestly not be able to understand each other (assuming the Penang speaker is one of the few with no Taiwanese Hokkien media intake). The shared vocabulary, collocations, idioms, etc., though, are definitely enough for them to understand each other w/ just an attitude adjustment.
    Yet, I don’t think the shared vocabulary and grammar are “good enough” to establish that PngHk and TWHk are dialects of the same language. How do we really know? What strikes me as being much better evidence is having witnessed TWHk and PngHk speakers communicating effectively in their respective dialects w/o having to resort to another language – even though such encounters have typically resulted in a quick switch to Mandarin as of the last 10 or 15 years or so. Intelligibility is tricky to quantify, no doubt; but lexical and syntactic similarity have got to be even trickier to measure in any meaningful way.
    I have to take exception to a couple of Campbell’s minor points. They sound suspiciously like the stuff U read in papers by some (not all) Chinese scholars.
    Campbell says, “Fangyan we have determined as topolect, but as used many centuries ago could also refer to any language of a different region. Today it has a specific use and currently applies to a “county”, notwithstanding the fangyan of neighboring counties may be the exact same thing.”
    I don’t know what Campbell means by “today it has a specific use”. It’s not only common for laypeople to use “fangyan” to refer to the speech of a province or any other region, it’s also pretty common for scholars to spit out collocations like “Yue (~ Cantonese) fangyan”, never mind that “Yue” is a group of languages spoken across two provinces of China and taking in at the very, very least three mutually unintelligible languages.
    Campbell also says, “It reminds me of Sinoxenic borrowings of Chinese words into neighboring Korean, Japanese, and Vietnamese which all now have approximately 60% of their core lexicon borrowed from Chinese. But these languages belong to other families and developed separately…”
    This is kind of begging the question. What if the North Chinese political grip on Vietnam was somehow renewed? Sure enough, Vietnamese would continue to absorb “Chinese” elements deeper and deeper into its lexicon and structures, to the point where a linguist from the “modern” linguistics tradition would say it was a Chinese language. And indeed the evidence seems to reveal that this is exactly how Hokkien, Teochew, Hailamese, Wenzhou, Hoisan (Taishan), etc. “became” Chinese languages. The best paper I’ve seen on this was by a Chinese scholar named Pan Wuyun (潘悟云). What’s Sinoxenic? Who was neighboring what? What’s core lexicon? Who developed separately and who developed together, and where and when? These are unresolved questions, not the open-and-shut case that most linguists in the field (even many non-Chinese) seem to think it is.
    Campbell is probably right in saying, “Hua is usually tacked on to a place name. The “speech” of a particular place as long as there are no others competing (for example Nanning in Guangxi has several languages).” I would add that competing languages w/i counties is the rule rather than the exception throughout tropical and coastal subtropical China. The tendency in each area (not necessarily just one county) with competing languages is for each language to go by a two or three syllable nickname where the last syllable is usually 話 (hua in Mandarin). Cantonese (but not the Hoisan type) is usually known as 白 hua. Hokciu (a.k.a. Foochow) is known locally as 平 hua (exact same name as Tuhua). In the Leizhou area, 海 hua and 黎 hua are two distinct “Min” varieties, reportedly mutually intelligible only w/ each other or at most also w/ some type of Hailamese / Hainanese Min.
    Speaking of which, a primer on Hailamese was published about a century ago in Singapore. The author (de Souza) explains in the introduction which dialect of Hailamese the book is based on, and says that dialects of Hailamese from the other side of the island are “perfectly impossible to understand”. So there may actually be more than one language w/i just Hailamese Min.
    Finally, about the Chinese scholars falling down on the job. I would say that, first of all, they generally don’t think this is their job. To them, “Chinese” is basically “assumed” to be one language. U could just call that shoddy academics. Secondly, though, some Chinese scholars are doing a pretty good job, such as Pan Wuyun. In the Anglo tradition, a guy like Pan Wuyun would come out at some point with a “come-on-and-own-up, most-of-all-y’all-is-wrong” paper. But unfortunately that kind of thing is really rare in China. And so it’s left to foreign scholars or guys like Lindsay or myself to say this, w/ the disclaimer (at least in my case) that there are many individual decent scholars in China too.

    1. Quite honestly, Mr. Campbell is not much of a linguist and he is not welcome at all in formal linguistics circles because no one likes him and he has never published. He has such an unpleasant personality that no one wants anything to do with him. He is also a criminal IMHO.
      Thank you very much for this comment!

      1. He has not published and he makes some fundamental errors because he hasn’t had an academic training. However, I am curious: in what regard is he a criminal?

        1. He is a very bad person. He has one of the worst personalities of anyone I have ever met. He would not last one day in academia. He could never even get hired in the first place. He won’t be able to publish in any journal due to his toxic personality, and soon word will get around, and he will be pre-emptively banned everywhere. Also he will not be able to give presentations at conferences. He is pretty much locked out of the field before he even started.
          He ran a Chinese translation business over in Taiwan. The complaints against him were legion. The business was more or less fraudulent and he cheated and ripped off many, many clients. Go to the translation forums and everyone is calling him a thief and a criminal.
          There were many complaints to the Taiwan government to try to shut him down but the Taiwan government is completely corrupt and apparently never shuts down any businesses for being frauds. I think eventually he packed it up and headed off into new scamming territories. Guy would make a great politician. He has the proper moral fiber. Either that or corporate executive. Dude has a bright future.

  12. Yes, I read about his Taiwan exploits. And you are correct, he doesn’t have an undergrad degree, let alone a grad degree, in linguistics. Why he doesn’t at least get an undergrad degree is very strange.
