Splitters Versus Lumpers in Historical Linguistics

Warning: Long, runs to 57 pages. This article is intended at the moment more for a general audience than for specialists, but specialists may also find it of interest. At the moment, it is not properly formatted or edited for publication in an academic journal, but perhaps it could be published in such a format some day.

For background on what Historical Linguistics is, see this Wikipedia article. Basically it involves determining which languages are related to each other via various means and, once that is determined, reconstructing a proto-language that the related languages descended from, along with, hopefully, regular sound correspondences, which supposedly prove the relationship once and for all. The argument in Historical Linguistics now is between conservatives, or splitters, and progressives, or lumpers.

Splitters say that the comparative method – described above as reconstructing a proto-language with regular sound correspondences – is necessary in order to prove that two or more languages are related. However, they also say, probably correctly, that this method is not useful beyond ~6,000 years. Any relationships beyond that time frame would not be provable by the comparative method and hence could never be proven. This effectively shuts down all research into long-range older language families.

Some lumpers say that this method is not necessary and instead relationships can be determined by simply looking at the two or more languages, a process called comparison or mass comparison. I point out below that comparison need not be cursory but could mean deep study of languages over 10, 15, or 20 years.

They tend to focus on core vocabulary, numerals, family terms, pronouns, and deictics, in addition to small morphological particles – all things that are rarely borrowed. Once they find a number of these items that resemble one another at a rate greater than chance, they say that the two languages are related because chance and borrowing are ruled out.

They say that this is the way to prove language relatedness, not the comparative method. The comparative method instead is used to learn interesting things about language families that have already been discovered via comparison, such as reconstructing proto-languages and finding regular sound correspondences.

Splitters say that comparison or mass comparison is not a valid way of proving that languages are related and that only the comparative method can be used to prove this. However, as noted, they set a 6,000-year time limit on the method needed to prove this, and this walls off a lot of potential knowledge about ancient and long-range language relationships as unprovable and hence undiscoverable. In a way, they are shutting the door to new scientific discovery beyond a certain time frame by claiming that the method needed to make these discoveries doesn’t work beyond X thousand years.

Other lumpers disagree that the comparative method has a time limit on it and are attempting to use the comparative method to reconstruct ancient long-range language families and find regular sound correspondences between them. Unfortunately, most of their efforts are in vain as splitters are using increasingly strict criteria for proof of language relationship and hence are shooting down most if not all of these efforts being done “in the proper way.”

So they are saying that proof must be done in a certain way, but when people try to play by the rules and use that way to find proof, they keep moving the goalposts and using increasingly strict, petty, and quibbling methods to in general say that the relationship is not proven.

So they say, “You must use this tool for your proof!” And then people play fair and use the tool, and the splitters almost always say, “Sorry, you didn’t prove it!” It all feels like a game that is rigged to fail in most if not all cases.

Hence, the current trend of extreme conservatism in Historical Linguistics has set up rules that seem to be designed to prevent the discovery of most if not all new language families, in particular long-range families older than 6-8,000 years.

I am quite certain that long-range language families such as Altaic (with either three families or five), Indo-Uralic, Uralic-Yukaghir, Hokan, Penutian, Mosan, Almosan, Japanese-Korean, Gulf, Yuki-Gulf, Elamite-Dravidian, Quechumaran, Austroasiatic-Hmong Mien, Coahuiltecan, North Caucasian, or Na-Dene will never be proven in my lifetime, and that’s not to mention the more extreme proposals such as Eurasiatic, Nostratic, Dene-Caucasian, Austric, and Amerind, although the evidence for the first and last of these is quite powerful.

There are simply too many emotions tied up in any of these proposals. Further, many linguists have spent a good part of their careers arguing against these proposals. It is doubtful that any amount of evidence will cause them to change their minds. Scientists, like any other humans, don’t like to be shown that they’re wrong.

Lyle Campbell, Marianne Mithun, Mauricio Mixco, Sarah Grey Thomason, Johanna Nichols, William Poser, Peter Daniels, Dell Hymes, Larry Trask, Gerrit Dimmendaal, Donald Ringe, Juha Janhunen, William Bright, and Paul Sidwell are among the leaders of this new conservatism.

At first I was very angry at what these people were doing, especially the most egregious cases such as Campbell. Then I realized that people lie and misrepresent things all day long every single day in my life and that this behavior is fairly normal behavior in humans, especially in a mushy area like this one where hard truths are hard to come by and most stated facts are more properly matters of opinion or could be construed that way.

I realized that they are simply defending a scientific paradigm and that unfortunately, this is the rather underhanded and emotion-ridden environment that defending paradigms tends to produce.

Though to be completely honest, I should not be singling these people out because the current conservatism is simply consensus and acts as the current paradigm on the language relatedness question in Historical Linguistics. The people listed above are at the top of the profession and are often considered the best historical linguists. They write books on historical linguistics. A number are considered to be ultimate authorities on questions of language relatedness. They are simply the leading edge of the current conservative consensus and paradigm in the field.

Although granted, of all of them, Campbell seems to be the most extreme conservative. He is also one of the top historical linguists in the world. Mixco, Mithun, and Poser are about on the same level as Campbell.

Campbell, Mithun, Thomason, and Mixco are Americanists whose conservatism was set off by the publication of Joseph Greenberg’s Language in the Americas (LIA) in 1987.

All of the linguists above are noted for their excellent scholarship.

The conservatives who are denying most if not all new families are called splitters. They tend to be very angry if not out and out abusive, engaging in bullying, mockery, ridicule, ostracization, and all of the usual techniques used in science against the proposers of a new paradigm.

The people who propose long-range families are called lumpers. Lumpers are heavily disparaged in the field nowadays such that almost no one wants to be known as a lumper or associated with such. However, many other historical linguists seem to be taking a more moderate fence-sitter stance where they are open to questions of new language families, including long-range families.

Among the long-range families that the moderates are open to considering nowadays are Indo-Uralic, Dene-Yeniseian, and Austro-Tai. Some of the smaller long-range families in the Americas even have supporters among the most hardline of splitters. I’m even dubious about well-argued proposals such as Dene-Yeniseian.

Thomason takes extreme umbrage at the notion that splitters have a bias that will allow few if any new families to be discovered, after Greenberg compared them with Malcolm Guthrie’s objections to Greenberg’s new classification of Bantu. However, after thinking this over for some time, I now believe that Greenberg is correct. The splitters have their minds made up. They are going to allow few if any new families to be discovered. A few of them have caved a bit.

I also work in mental health, and it’s pretty obvious to me when something is not right about a scientific debate. I’ve been getting that vibe about the splitters versus lumpers debate from the very start. When a debate in science has degenerated into bias, ideology and ideologues, propaganda, politics, and in particular extreme emotion, it gives off a certain intuitive feel. This debate has felt this way from Day One. To put it simply, the debate doesn’t smell right. I have a feeling that science left the room a long time ago here.

One thing I noticed was that people who have worked on one particular language or family for much of their careers are especially angry and aggressive about the notion that their family could possibly be related to anything else. Indeed famous linguists were remarking on this tendency as early as 1901. Among the reasons given was that they had their hands full already without new work to take on and a disinclination to see their language family related to anything else as this would deny its specialness.

Trask is forceful that Basque could not possibly have any outside relatives.

I saw a debate on the Net some years ago with Trask and a Spanish assistant holding court over a debate over the external relations of Basque. Those who argued for external relations were pushing a relationship with the Caucasian languages, which is possible though not proven in my opinion. Trask and his assistant were very angry and aggressive in holding down the fort. Apparently everything was a Spanish borrowing. The debate didn’t smell right at all.

With a background in psychology, I wonder what is going on here. One possibility is as Greenberg suggests and as was suggested back in 1901 – simple narcissism. When one specializes in a language family for a long time, it probably becomes blurred with the self such that the self and the family become married to each other, and it’s hard to tell where one ends and the other begins. Yourself and the family you’ve spent your career working on become one and the same thing. If your family is not related to anything else, it’s special.

We all think we are special. This is the essence of human narcissism. To say that their favorite language has relatives is to deny its specialness almost as if to say that our egos were not real but were instead extensions of other people’s egos. Actually if you read Sartre or study modern particle physics, that’s not a bad theory, but most people bristle at the notion.

I met Korean and Japanese people when I was doing my Masters. Both beamed when they told me that their language had no known relatives. Of course that made it special in their eyes and played right into their ethnocentrism.

Another problem may be the trajectory of one’s career. If one has been arguing forcefully for 30 years that there are no known relations to your family, your reputation is going to take a huge hit if you have to agree that you were wrong all those years.

Another reason is politics. We are dealing here with a Paradigm. For a good description of a Scientific Paradigm, see Thomas Kuhn’s The Structure of Scientific Revolutions. Kuhn holds that science is by its nature very conservative, some sciences being more conservative than others. A Paradigm is set up when the field reaches a satisfactory consensus that a particular theory is correct. After a while, serious barriers go up to any challenges to overthrow the proven theory.

The challenges are first ignored, then ridiculed (often severely), then attacked (often ferociously) and then, if the challenge is successful, it is accepted (often slowly and grudgingly). Kuhn pointed out that defenders of the old theory are usually so reluctant to see the paradigm overthrown that we often must wait literally until their deaths to finally overthrow the paradigm. They defend it to their deathbeds. I suggest we are dealing with something more than pure empiricism here.

It is quite risky to challenge a paradigm in science. People’s careers have suffered from it. A supporter of Keynesian economics, then challenging the current paradigm in economics, could not get hired at any university in the US during the 1930s.

In the splitters versus lumpers debate, we have been in the Anger phase for some time now. We seem to be settling out of it, as many are taking a fence-sitting position and arguing for attempts to resolve the debate to make it less heated.

The Paradigm here involves extreme skepticism about any new language families to the point that any new families are simply going to be rejected on all sorts of grounds. Paradigms involve politics at the academic level. When a Paradigm is set up in science, almost all scientists write and do research within the paradigm. Anything outside of the paradigm is derided as pseudoscience or worse.

The problem is that when a Paradigm is in effect, all scholars are supposed to publish within the Paradigm. Publishing outside the paradigm is regarded as evidence that one is a kook, a crank, is practicing pseudoscience, or that one is crazy or a fool. It is instructive in this debate to note that most of the prominent lumpers are independent scholars operating outside of the politics of academia.

I have had them tell me that the only reason they can take the lumper position that they do is because they are independent and don’t have a university job, so there are no repercussions if they are wrong. They told me that if they had a professorship, they would not be able to do this work. They have also told me that they know for a fact that certain splitters might jeopardize their jobs, careers, and especially their funding if they took a lumper position. This was given as one of the reasons for their dogmatic splitterism.

In addition, science works according to fads, or more properly, standard beliefs. The trends for these beliefs are set by the biggest names in the field. The biggest names in Linguistics are all splitters now. They are the trendsetters, especially in whatever specialty of Historical Linguistics you are working in. Everyone else in the field is dutifully following in their footsteps. As an up and coming young scholar, you are supposed to follow the proper trends and hypotheses of your field to uphold the consensus of scholars in your area of specialty. As you can see there is a lot more than simple empiricism going on here.

With my background, I look for psychological motivations anywhere I can find them. And science is no stranger to bias and emotional psychological motivations driving, or more usually distorting, it. We are human and humans have emotions. Emotion is the enemy of logic. Logic is the basis of empiricism. Hence, emotions are the enemy of science.

Scientists are supposed to remain objective, but alas, they are humans themselves and subject to all of the emotional psychological motivations that the rest of us are. Scientists are supposed to police themselves for bias, but that’s probably hard to do, especially if the bias is rooted in psychological processes or in particular if it is unconscious, as many such processes are.

Campbell’s case is an extreme one, but I believe it is simply motivated by psychological processes inside the man himself.

Campbell is driven by psychological complexes. His entire turn towards extreme conservatism in this debate was set off by the huge feud he had with Greenberg, and everything since has flowed from that. He took a very angry position that LIA was completely false and did his best to trash its reputation far and wide. This disparagement is still the order of the day, and Greenberg’s name is as good as mud in the field.

Then Campbell generalized his extreme splitterist reaction to LIA out to all of the language families in the world because if he allowed any new families elsewhere in the world, he might have to allow them in the Americas, and he could not countenance that. Note also that Campbell has gone out of his way to specifically attack Greenberg’s four-family split in his proposal for language families in Africa.

This proposal, done with Greenberg’s derided method of mass comparison, has had a successful result in Africa and has been proven with the test of time. Campbell cannot allow this because if he admits that Greenberg was right in Africa, he might have to accept that he might be right in the Americas too, and that’s beyond the pale. So in his recent works he has specifically set out to state that Afroasiatic, Nilo-Saharan, Niger-Kordofanian, and Khoisan – the four families of Greenberg’s classification – have not been proven to exist yet. The truth is exactly the opposite, but the psychological process here is bald and naked for all to see.

Here he specifically trashes these language families because they were discovered by Joseph Greenberg, Campbell’s bête noire. Campbell’s agenda is to show that Greenberg is a preposterous kook and crank, although he was one of the greatest linguists of the 20th century. Greenberg’s African work is regarded as true, and this poses a problem if Campbell is to characterize Greenberg as a charlatan.

If Greenberg was right about one thing, could he not be right about another? In order to lay the foundation for the theory that Greenberg’s method doesn’t work and that it cannot discover any language relationships, Campbell will have to deny the method ever had any successes. So he sets about to deny that Greenberg’s four African families are proven.

Splitters have come up with a repertoire of reasons to shoot down proposed language relations and most are pretty poor.

They rely on overuse of the borrowing, chance, sound symbolism, nursery word, and onomatopoeia explanations for non-relatedness. There is also an overuse of the comparative method with excessively strict standards being set up for etymologies and sound correspondences. In a number of cases, linguists are going back to the etymologies of their proto-languages and reducing them by up to half.

In the last 20 years, Uralicists have gone back over the original Proto-Uralic etymologies and gotten rid of fully half of them (from 2,000 down to 1,000) for a variety of very poor reasons, mostly irregular sound correspondences. It appears to me that while there were some obvious bad etymologies in there, most of the ones that were thrown out were perfectly good.

Irregular sound correspondences are a bad reason to throw out an etymology. Keep in mind that 50% of Indo-European etymologies have irregular correspondences. By the logic of the Uralicists, we should then throw out half of the IE etymologies. If Campbell finds any irregular sound correspondences in any new proposal, he automatically rejects it on those grounds alone. What the Uralicists have done is vandalism.

This is not just conservatism. It is out and out Reaction. Worse, it is nearly a Conservative Revolution, which I won’t define further. It is akin to a city council declaring that all of the old, beautiful buildings in the city are going to be torn down because they were not constructed properly. Will they be rebuilt? Well, of course not. Most of the top Uralicists are involved in this silly and destructive project.

In a recent paper, George Starostin warned that the splitters were not just conservatives determined to stop all progress. He pointed out that there was actually a trend towards rejection and going backwards in time to dismantle families that have already been set up on the grounds that they were not done perfectly enough. As we can see, his warning was prescient.

There are statements being made by moderates that both sides, the splitters and the lumpers, are being equally unreasonable. As one linguist said, the debate is between lazy lumpers (Just believe us, don’t demand that we prove it!) and angry splitters (Not only is this new family false, but all new families proposed from now on will also be shot down!). He suggested that they are both wrong and that the solution lies in a point in the middle. I don’t have a problem with this moderate centrist belief.

The splitter notion itself rests on an obvious falsehood, that there are hundreds of language families in the world that have no possible relationship with each other.

According to Campbell, there are 160 language families and isolates in the Americas. The question is where did all of these entities come from. Keep in mind, in Linguistics, the standard view is that these 160 entities are not related to each other in any way, shape, or form. Thinking back, this means that language would have had to have developed in humans 160 times among the Amerindians alone.

The truth is that there was no polygenesis of language.

Sit back and think for a moment. How could language possibly have been independently developed more than one time? Obviously it arose in one group. How could it have arisen in other groups too? It couldn’t and it didn’t. Did some of the original speakers go deaf, become mutes, forget all their language, and then have children, raising them without language, in which case the children devised language for themselves?

Children need comprehensible input to develop language. No language to hear in the environment, no language for the children to acquire on their own. With cochlear implants, formerly deaf people are now able to hear for the first time. A woman got hers at age 32. Since she missed the Critical Period for language development, the window of which closes at age 8, she has not, even at this late date, been able to acquire language satisfactorily. She missed the boat. No input, no language.

Obviously language arose only once among humans. It had to. And hence, all human languages are related to each other de facto whether we can “prove” it by our fancy methods or not. In other words, all human languages are related. Those 160 language families and isolates in the Americas? All related. Now we may not be able to prove which languages they are related to specifically and most closely, but we know they are all related to each other.

In the physical sciences, including Evolutionary Psychology, many things are simply assumed because the alternate theories could not have happened. But we have no evidence of much of anything in Evolutionary Psychology or Evolutionary Anthropology. We know our ancestors lived in X place at Y times, but we have no idea what they were doing there. We can’t go back in time to prove that this or that happened.

Using the logic of linguists, since we cannot make time machines to go back in time and test theories about the Evolutionary Anthropology and Evolutionary Psychology of these peoples, we can make no statements about this matter, as the only way to prove it would be to see it. In physics, there are particles that we have never seen. We have simply posited their existence because according to our theories, they have to exist. According to linguists, we could not posit the existence of these particles unless we see them.

Contrary to popular rumor, everything in science does not have to be “proven” by this or that rigorous method. Many things are simply posited, as no real evidence for their existence exists, either because we were not there or because we can’t see them, or in the case of pure physics, we can’t even test out our theories. They exist simply because they have to according to our existing theories, and all competing theories fall down flat.

Well, the Americanists beg to disagree. Greenberg’s theory was so extreme and radical that the entire field erupted in outrage. None of their alternate theories, not even one of them, make the slightest bit of sense.

Despite the fact that these languages are obviously related to each other, in order to “officially prove it” we have to use a method called the comparative method whereby proto-languages and families are reconstructed and regular sound correspondences are shown between the languages being studied.

This is the only way that we can prove one language is related to another. That’s simply absurd for a few reasons.

First of all, I concur with Johanna Nichols that the comparative method does not really work on language families older than 6-8,000 years. Beyond that time, so many sound changes have taken place, semantics have been distorted, and terms fallen out of use that there’s not much of anything left to reconstruct. Furthermore, time has washed away any evidence of sound correspondences.

Although Nichols is a splitter, I have to commend her. First, she’s right about the time limit described above.

Second, realizing this, she says that the comparative method will always fail beyond this time frame. I believe she thinks then that we need to use new methods if we are to prove that long-range families exist. The method she suggests is “individual-identifying evidence,” which seems to be another way of saying odd morpheme paradigms that were probably not borrowed and are hardly existent outside of that family.

This harkens back to Edward Sapir’s “submerged features,” where he says we can prove the existence of language families by these small morphemic resemblances alone.

The rest of the field remain sticks in the mud. They say that we must use the comparative method to discover that languages are related because no other method exists. The problem is that, as splitters themselves note, if the comparative method fails beyond 6,000 years back, all attempts to prove language families that old or older are bound to fail.

The splitters seem positively gleeful that according to their paradigm, few if any new language families will be discovered. This delight in nihilism seems odd and disturbing. What sort of science is gleeful that no new knowledge will be found? Even in the event that this is true, it’s depressing. Why get excited about something so negative?

Many language families in the world were discovered by Greenberg’s “mass comparison” or simply comparing one language to another, which should be called “comparison.” And in fact, many of the smaller language families in the world are still being posited by the means of comparison or mass comparison. Comparison need not be the broad, sweeping, forest for the trees, holistic method Greenberg employs. I argue that it means lining up languages and looking for common features. We could be lining up one language against another and that would also be “comparison.”

It need not be a shallow examination. One could examine a possible language family for five, ten, fifteen, or twenty years.

After studying a pair or group of languages for some time, if one finds a group of core vocabulary items that resemble one another and are above the rate found by chance (7%), and after which borrowing has been ruled out (core vocabulary is rarely borrowed), then you have proof positive of a language family.
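The chance threshold above can be turned into a rough calculation. The sketch below is only an illustration in Python, not a method taken from any of the linguists discussed here. It assumes that each core vocabulary item resembles its counterpart by chance independently at the 7% rate cited above, and it computes the binomial probability of seeing at least a given number of resemblances by chance alone. The function name, the 100-item word list, and the counts of 7 and 20 resemblances are all hypothetical.

```python
from math import comb


def chance_tail_probability(n_items, n_matches, chance_rate=0.07):
    """Probability of at least n_matches chance resemblances among n_items.

    Assumes (for illustration only) that each compared item matches by
    chance independently with probability chance_rate.
    """
    return sum(
        comb(n_items, k) * chance_rate**k * (1 - chance_rate) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )


if __name__ == "__main__":
    # Hypothetical 100-item core vocabulary list.
    # 7 resemblances is about what the 7% chance rate alone predicts.
    print("P(at least 7 of 100 by chance): ", chance_tail_probability(100, 7))
    # 20 resemblances is far above the chance expectation.
    print("P(at least 20 of 100 by chance):", chance_tail_probability(100, 20))
```

Under these assumptions, the further the number of resemblances rises above what the 7% rate predicts (about 7 in a 100-item list), the smaller the probability that chance alone explains them. Borrowing still has to be ruled out separately, as described above.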

I fail to understand why examining a language or group of languages for a long period of time to find resemblances and try to rule out chance or borrowings is a ridiculous method. What’s so ridiculous about that? Sure, it’s nice to reconstruct and get nice sound correspondences going, but it’s not always necessary, especially in long-range comparisons when such methods are doomed to failure.

One more thing: if splitters say that the comparative method fails beyond 6,000 years, why do they keep putting long-range families to the test using the comparative method? After all, the result will always come up negative, right? What’s the point of doing a study you know will come up negative? Just to get your punches in?

There are a number of folks who have bought into the splitters’ arguments and are trying to discover long-range families by the comparative method of reconstructing the proto-language and finding regular sound correspondences between them. A number of them claim to have been successful. There have been attempts to reconstruct proto-languages and find regular sound correspondences with Altaic, Nostratic, Dene-Caucasian, Dene-Yeniseian, Austro-Tai, Totozoquean, and Uralo-Yukaghir.

Altaic, Nostratic, and Dene-Caucasian all have proto-languages reconstructed with good sound correspondences running through them. Altaic and Nostratic have etymological dictionaries containing many words, 2,300 proto-forms in the case of Altaic in a 1,000-page volume. Further, a considerable Nostratic proto-language was reconstructed by Dolgopolsky and Illich-Svitych.

All of these efforts claim that they have proven their hypotheses. However, the splitters such as Campbell have rejected all of them. So you see, even when people follow the mandated method and play it by the book the way they are supposed to, the splitters will nearly always say that the efforts come up short. It’s a rigged game.

How about another question? If the comparative method is doomed beyond 6,000 years, why don’t we use another method to discover these relationships? The splitter rejoinder is that there is no other method. It’s the comparative method or nothing. But how do they know this? Can they prove that other methods can never be used to successfully discover a language relationship?

The following quotes are from a general text on Historical Linguistics by Lyle Campbell and Mauricio Mixco, A Glossary of Historical Linguistics. The purpose of this paper will be misrepresented by critics, who will say that I am a lumper criticizing splitters for their opposition to known language families.

There is some of that here, but more than anything else, what I am trying to do is show how Campbell and Mixco have been untruthful about linguistic specialist consensus regarding these families. In most cases, they are openly misrepresenting the state of consensus in the field.

As will be shown, Campbell and Mixco repeatedly seriously distort the state of consensus regarding many language families, particularly long-range ones. They usually favor a more negative and conservative view, saying that a family has little support when it has significant support and saying it is controversial when the consensus in the field is that the family is real. Campbell and Mixco engage in serious distortions of fact all through this text:

Campbell and Mixco:

Afroasiatic: Enjoys wide support among linguists, but it is not uncontroversial, especially with regard to which of the groups assumed to be genetically related to one another are to be considered true members of the phylum.

There is disagreement concerning Cushitic, and Omotic (formerly called Sidama or West Cushitic) is disputed; the great linguistic diversity within Omotic makes it a questionable entity for some. Chadic is held to be uncertain by others. Typological and areal problems contribute to these doubts. For example, some treat Cushitic and Omotic together as a linguistic area (Sprachbund) of seven families within Afroasiatic.

Campbell and Mixco are wrong. Afroasiatic is not controversial at all. There is widespread consensus that the family exists and that all of the subfamilies are correct.

The “we can’t reconstruct the numerals” argument is much in evidence here too. See the Altaic debate below for more on this. One argument against Altaic is “We can’t reconstruct the numerals.” However, Afroasiatic is a recognized family and not only has reconstruction itself proved difficult, but the numerals in particular are a gigantic mess. It seems that one does not need to have a fully reconstructed numeral set after all to have a proven language family.

There is consensus that Cushitic is a valid entity. Granted, there has been some question about Omotic, but in the last 10-15 years, consensus has settled on an agreement that Omotic is part of Afroasiatic.

The great diversity of Omotic is no surprise. Omotic is probably 13,000 years old! It’s amazing that there’s anything left at all after all that time.

Where do we get the idea that a language family cannot possibly be highly diverse? Chadic is also uncontroversial by consensus. I am not aware of any serious proposals to see Cushitic and Omotic as an Altaic-like Sprachbund of mass borrowings. Campbell and Mixco’s comments above are simply not correct. The only people questioning the validity of Afroasiatic or any of its components are Campbell and Mixco, and they are not experts on the family.

Campbell and Mixco:

Berber is usually believed to be one of the branches of Afroasiatic.

This is far too pessimistic. Berber is recognized by consensus as being one of the branches of Afroasiatic.

Campbell and Mixco:

Niger-Kordofanian (now often just called Niger-Congo): A hypothesis of distant genetic relationship proposed by Joseph H. Greenberg in his classification of African languages. Estimated counts of Niger-Kordofanian languages vary from around 900 to 1,500 languages. Greenberg grouped ‘West Sudanic’ and Bantu into a single large family, which he called Niger-Congo, after the two major rivers, the Niger and the Congo ‘in whose basins these languages predominate’ (Greenberg 1963: 7).

This included the subfamilies already recognized earlier: (1) West Atlantic (to which Greenberg joined Fulani, in a Serer-Wolof-Fulani [Fulfulde] group), (2) Mande (Mandingo) (thirty-five to forty languages), (3) Gur (or Voltaic), (4) Kwa (with Togo Remnant) and (5) Benue-Congo (Benue-Cross), with the addition of (6) Adamawa-Eastern, which had not previously been classified with these languages and whose classification remains controversial.

For Greenberg, Bantu was but a subgroup of Benue-Congo, not a separate subfamily on its own. In 1963 he joined Niger-Congo and the ‘Kordofanian’ languages into a larger postulated phylum, which he called Niger-Kordofanian.

Niger-Kordofanian has numerous supporters but is not well established; the classification of several of the language groups Greenberg assigned to Niger-Kordofanian is rejected or revised, though most scholars accept some form of Niger-Congo as a valid grouping.

As Nurse (1997: 368) points out, it is on the basis of general similarities and the noun-class system that most scholars have accepted Niger-Congo, but ‘the fact remains that no one has yet attempted a rigorous demonstration of the genetic unity of Niger-Congo by means of the Comparative Method.’

There is consensus among scholars that Niger-Kordofanian is a real thing.

Campbell and Mixco:

Nilo-Saharan: One of Greenberg’s four large phyla in his classification of African languages. In dismantling the inaccurate and racially biased ‘Hamitic,’ of which Nilo-Hamitic was held to be part, Greenberg demonstrated the inadequacy of those former classifications and argued for the connection between Nilotic and Eastern Sudanic.

He noted that ‘the Nilotic languages seem to be predominantly isolating, tend to monosyllabism, and employ tonal distinctions’ (Greenberg 1963: 92). To the extent that this classification is based on commonplace shared typology and perhaps areally diffused traits, it does not have a firm foundation. Nilo-Saharan is disputed, and many are not convinced of the proposed genetic relationships. It is generally seen as Greenberg’s wastebasket phylum, into which he placed all the otherwise unaffiliated languages of Africa.

First of all, Nilo-Saharan is not classified based on typological traits that were perhaps areally diffused. There is also a great deal of the more typical evidence in favor of this language family. Second, it is not true that it lacks a firm foundation and that many are not convinced of its reality. The consensus among experts is that this family exists and that the overwhelming majority of the subfamilies and isolates Greenberg put in it are correct.

Saying that it is a wastebasket phylum does not make sense because the Nilo-Saharan languages are only found in a certain part of Africa. If it were truly such a phylum, there would be languages from all over Africa placed in this family.

According to Roger Blench, a moderate, consensus has formed over the last 10-15 years that Nilo-Saharan is a real thing.

Consensus has formed that 75% of the languages and families Greenberg put in Nilo-Saharan form a valid family. Controversy remains about the other 25%, including Songhay, the Gumuz family, and a few isolates. Some say these are part of Nilo-Saharan but others say they are not. Nilo-Saharan probably has a great time depth of ~13,000 years at least, such that little probably remains to reconstruct. Reconstruction of Nilo-Saharan has proved difficult.

Yes, Campbell and Mixco say that Nilo-Saharan is not real, but they are not specialists.

Campbell and Mixco:

Khoisan: A proposed distant genetic relationship associated with Greenberg’s (1963) classification of African languages, which holds some thirty non-Bantu click languages of southern and eastern Africa to be genetically related to one another. Greenberg originally called his Khoisan grouping ‘the Click Languages’ but later changed this to a name based on a created compound of the Hottentots’ name for themselves, Khoi, and their name for the Bushmen, San.

Khoisan is the least accepted of Greenberg’s four African phyla. Several scholars agree in using the term ‘Khoisan’ not to reflect a genetic relationship among the languages but, rather, as a cover term for all the non-Bantu and non-Cushitic click languages.

Although it is probably true that Khoisan is the least accepted of Greenberg’s families, that’s not saying much, as it only means that 80% of experts accept its reality instead of 100%. I do not know who these several scholars are who feel that Khoisan is a typological area for click languages, but they do not seem to be specialists. Overall, Campbell and Mixco seriously distort consensus on Khoisan in this passage.

According to George Starostin, consensus has formed over the last 5-10 years that Khoisan exists. There are five major Khoisan scholars, and four of them agree that Khoisan is real, with all four including Sandawe and most including Hadza. There is one, Traill, who says it’s not real, but he is also a notorious Africanist splitter.

Campbell and Mixco:

Eurasiatic: Greenberg’s hypothesis of a distant genetic relationship that would group Indo-European, Uralic–Yukaghir, Altaic, Korean–Japanese–Ainu, Nivkh, Chukotian and Eskimo–Aleut as members of a very large ‘linguistic stock’. While there is considerable overlap in the putative members of Eurasiatic and Nostratic there are also significant differences. Eurasiatic has been sharply criticized and is largely rejected by specialists.

I have no doubt that Eurasiatic has been sharply criticized, but apart from a negative review in Language by Peter Daniels, the controversy seems quite muted compared to the furor over Amerind. I am also not sure that it is largely rejected by specialists. It probably is, but most of them have not even bothered to comment on it. I believe that this family is one of the best long-range proposals out there.

Based on the data from the pronouns alone, it’s obviously a real entity, though I would include Indo-European, Uralic-Yukaghir, Altaic including Japanese and Korean, Chukotian, and Eskimo-Aleut, leaving out Nivkhi for the time being and certainly leaving out Ainu. Nivkhi does seem to be a Eurasiatic language, but it’s not a separate node. Instead it may be a part of the Chukotian family. Or even better yet, it seems to be part of a family connected to the New World via the Almosan family in the Americas.

I feel that Eurasiatic is a much more solid entity than Nostratic. Not that I am against Nostratic, but it’s more that Eurasiatic is a simple hypothesis to prove and with Nostratic, I’m much less sure of that. On the other hand, to the extent that Nostratic overlaps with Eurasiatic, it is surely correct.

Campbell and Mixco:

Indo-Anatolian: The hypothesis, associated with Edgar Sturtevant, that Hittite (or better said, the Anatolian languages, of which Hittite is the best known member) was the earliest Indo-European language to split off from the others. That is, this hypothesis would have Anatolian and Indo-European as sisters, two branches of a Proto-Indo-Hittite.

The more accepted view is that Anatolian is just one subgroup of Indo-European, albeit perhaps the first to have branched off, hence not ‘Indo-Hittite’ but just ‘Indo-European’ with Anatolian as one of its branches. In fact the two views differ very little in substance, since, in either case, Anatolian ends up being a subfamily distinct from the other branches and in the view of many the first to branch off the family.

The view that Anatolian is just another subgroup of IE is not the more accepted view. In fact, it has been rejected by specialists. Indo-Europeanists have told me that Indo-Anatolian is now the consensus among Indo-Europeanists, so Campbell and Mixco’s statement that Indo-Anatolian is a minority view is false.

Campbell and Mixco:

Nostratic (< Latin nostra ‘our’): A proposed distant genetic relationship that, as formulated in the 1960s by Illich-Svitych, would group Indo-European, Uralic, Altaic, Kartvelian, Dravidian and Hamito-Semitic (later Afroasiatic), though other versions of the hypothesis would include various other languages. Nostratic has a number of supporters, mostly associated with the Moscow school of Nostratic, though a majority of historical linguists do not accept the claims.

There are many problems with the evidence presented on behalf of the Nostratic hypothesis. In several instances the proposed reconstructions do not comply with typological expectations; numerous proposed cognates are lax in semantic associations, involve onomatopoeia, are forms too short to deny chance, include nursery forms and do not follow the sound correspondences formulated by supporters of Nostratic.

A large number of the putative cognate sets are considered problematic or doubtful even by its adherents. More than one-third of the sets are represented in only two of the putative Nostratic branches, though by its founder’s criteria, acceptable cases need to appear in at least three of the Nostratic language families. Numerous sets appear to involve borrowing. (See Campbell 1998, 1999.) It is for reasons of this sort that most historical linguists reject Nostratic.

It is probably correct that consensus among specialists is to reject Nostratic, but serious papers taking apart the proposal seem to be lacking. Nevertheless, most dismiss it, and it is beginning to enter into the emotionally charged terrain of Altaic and Amerind, particularly the former, and belief in it is becoming a thing of ridicule as it is for Altaic. Still, there have been a few excellent linguists doing work on this very long-range family for decades now.

Campbell and Mixco:

Indo-Uralic: The hypothesis that the Indo-European and Uralic language families are genetically related to one another. While there is some suggestive evidence for the hypothesis, it has not yet been possible to confirm the proposed relationship.

This summary seems too negative. Indo-Uralic is probably one of the most promising long-range proposals out there. I regard the relationship between the two as obvious, but to me it is only a smaller part of the larger Eurasiatic family. Frederik Kortlandt has done a lot of good work on this idea. Even some hardline splitters are open to this hypothesis.

Campbell and Mixco:

Altaic: While ‘Altaic’ is repeated in encyclopedias and handbooks most specialists in these languages no longer believe that the three traditional supposed Altaic groups, Turkic, Mongolian and Tungusic, are related. In spite of this, Altaic does have a few dedicated followers.

The most serious problems for the Altaic proposal are the extensive lexical borrowing across inner Asia and among the ‘Altaic’ languages, lack of significant numbers of convincing cognates, extensive areal diffusion and typologically commonplace traits presented as evidence of relationship.

The shared ‘Altaic’ traits typically cited include vowel harmony, relatively simple phoneme inventories, agglutination, their exclusively suffixing nature, (S)OV ([Subject]-Object-Verb) word order and the fact that their non-main clauses are mostly non-finite (participial) constructions.

These shared features are not only commonplace typological traits that occur with frequency in unrelated languages of the world and therefore could easily have developed independently, but they are also areal traits shared by a number of languages in surrounding regions the structural properties of which were not well-known when the hypothesis was first framed.

This one is still up in the air, but Campbell and Mixco are lying when they say that the idea has been abandoned. Most US linguists regard it as a laughingstock, and if you say you believe in it you will experience intense bullying and taunting from them. Oddly enough, outside the US, in Europe in particular, Altaic is regarded as obviously true. However, notorious anti-Altaicist Alexander Vovin has camped out in Paris and is now spreading his nihilistic doctrine to Europeans there.

The problem is that almost all of the US linguists who will laugh in your face and call you an idiot if you believe in Altaic are not specialists in these languages. However, I did a study of Altaic specialists, and 73% of them believe in some form of Altaic.

So the anti-Altaicists are pushing a massive lie, that critical consensus has completely abandoned Altaic and regards it as a laughingstock, but their project is more Politics and Propaganda than Science. In particular, it’s a fad. So Altaic is in the preposterous position where almost all of the people who know nothing about it will laugh in your face and call you an idiot if you believe in it, while the overwhelming majority of specialists will say it’s real.

Altaic must be the only nonexistent family that has an incredibly elaborate 1,000 page etymological dictionary, full reconstructions of the proto-languages, etymologies of over 2,000 Altaic terms, and elaborate sound correspondences running through it. The anti-Altaicists use the silly “we can’t reconstruct the numerals so it’s not real” line here.

Altaic is obviously true based on the 1st and 2nd person pronoun paradigms at an absolute minimum. The anti-Altaic argument, of course, is preposterous. As noted, they dismiss a vast 1,000 page Etymological Dictionary with 2,300 reconstructed etymologies as a hallucinated work.

There are vast parallels in all three families at all levels, in particular in the Mongolic-Tungusic family, which gets a 100% with computer programs. The go-to argument here has always been that these parallels are all due to borrowings, but for this to have occurred, borrowing would have had to occur between large, far removed language families on a scale the likes of which has never been seen anywhere on Earth.

The argument that entire 1st and 2nd person pronoun paradigms have been borrowed is particularly preposterous because 1st and 2nd person pronouns are almost never borrowed anyway, and there has never been a single case on Earth of the borrowing of a 1st and 2nd person pronoun paradigm, much less the borrowing of one at the proto-language level. So the anti-Altaicists are arguing that something that has never happened anywhere on Earth not only happened, but happened more than once among different proto-languages. So the anti-Altaic argument is that something that could not possibly have happened actually occurred.

This is the conclusion of every paper the splitters write. Something that has never occurred on Earth and probably could not possibly happen not only occurred, but occurred many times around the globe for thousands of years.

Many regard including Japonic and Koreanic in Altaic as dubious, although having looked over the data, I am certain that they are part of Altaic. But they seem to be further away from the traditional tripartite system than the traditional three families are to each other. If we follow the theory that Japanese and Korean have been split from Proto-Altaic for 8,000 years, this starts to make a lot more sense.

The ridiculous massive borrowings argument specifically fails for geographical reasons. Proto-Turkic was never next door to Proto-Mongolic and Proto-Tungusic. The Proto-Altaic homeland is in the Khingan Mountains in Western Manchuria and Eastern Mongolia. Tungusic split off from Altaic 5,300 years ago, leaving Proto-Turkic-Mongolic in the Khingans. 3,400 years ago, Proto-Turkic broke from Proto-Turkic-Mongolic and headed west to Northern Kazakhstan and the southern part of the Western Siberian Plain, leaving Mongolic alone in the Khingans.

Proto-Transeurasian – Khingans 9,000 YBP.

Proto-Korean – Liaojiang on the north shore of the Bohai Sea 8,000 YBP.

Proto-Japanese – Northern coast of the Shandong Peninsula on the southern shore of the Bohai Sea 8,000 YBP.

Proto-Tungusic – Amur Peninsula 5,300 YBP. Breaks apart 2,000 YBP.

Proto-Turkic – Northern Kazakhstan 3,400 YBP.

Proto-Mongolic – Khingans 3,400 YBP.

Can someone explain to me how Mongolic and Tungusic borrow from Turkic 3,000 miles away in a different place at a different time in this scenario? Can someone explain to me how any of these proto-languages borrowed from each other at all, especially as they were in different places at different times?

Not only that but supposedly both Proto-Mongolic and Proto-Tungusic each borrowed from Proto-Turkic separately. These borrowings included massive amounts of core vocabulary in addition to an entire 1st and 2nd person pronoun paradigm.

Keep in mind that the borrowing of this paradigm, something that has never happened anywhere, supposedly occurred not just once but twice: once into Proto-Tungusic, which formed on the Amur 5,300 YBP, from Proto-Turkic in Northern Kazakhstan, 3,000 miles away and roughly 2,000 years later, and at the same time into Proto-Mongolic in the Khingans from Proto-Turkic in Northern Kazakhstan, also 3,000 miles away. How exactly did this occur?

And can someone explain to me how Proto-Korean and Proto-Japanese borrow from either of the others under this scenario?
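The chronology above can be laid out as a small table to check the gap between the dates. This is only a sketch: the dates and places are the ones given in this article, while the dictionary layout and the `gap_in_years` helper are hypothetical names introduced for illustration.

```python
# Formation dates in years before present (YBP), as listed in this article.
# The dictionary layout is an illustrative assumption, not a standard format.
homelands = {
    "Proto-Tungusic": {"place": "Amur", "formed_ybp": 5300},
    "Proto-Turkic": {"place": "Northern Kazakhstan", "formed_ybp": 3400},
    "Proto-Mongolic": {"place": "Khingans", "formed_ybp": 3400},
}


def gap_in_years(older, younger):
    """Years between the formation dates of two proto-languages.

    Positive when `older` formed before `younger`.
    """
    return homelands[older]["formed_ybp"] - homelands[younger]["formed_ybp"]


# Gap between the formation of Proto-Tungusic and of Proto-Turkic.
tungusic_turkic_gap = gap_in_years("Proto-Tungusic", "Proto-Turkic")
print(tungusic_turkic_gap)  # 1900
```

The 1,900-year difference is the "roughly 2,000 years" referred to in the borrowing scenario. The sketch only compares formation dates; it says nothing about how long each proto-language lasted afterward.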

Campbell and Mixco:

Turkic: A family of about thirty languages, spoken across central Asia from China to Lithuania. The family has two branches: Chuvash (of the Volga region) and the non-Chuvash Turkic branch of relatively closely related languages. Some of the Turkic languages are Azeri, Kyrgyz, Tatar, Crimean Tatar, Uighur, Uzbek, Yakut, Tuvan, and Tofa. Turkic is often assigned to the ‘Altaic’ hypothesis, though specialists have largely abandoned Altaic.

As noted above, it is simply incorrect that specialists have largely abandoned Altaic. This is carefully crafted propaganda on the part of Campbell and Mixco. In fact, my own study showed that 73% of experts in these families felt that Altaic existed at least in some form, if only as a relationship between two of the three to five families.

Campbell and Mixco:

Some scholars classify Korean in a single family with Japanese; however, this is a controversial hypothesis. Korean is often said to belong with the Altaic hypothesis, often also with Japanese, though this is not widely supported.

Japonic-Koreanic has considerable support among specialists in these languages, although it is not universally accepted. Campbell and Mixco are excessively negative about the level of support for an expanded Altaic. In fact, an expanded Altaic which includes Japanese and Korean in some part of it has significant though probably not majority support. Perhaps 30-40% of specialists support it.

Shandong Peninsula with Tianjin and Liaojiang across the Bohai Sea, location of the Proto-Japonic and Proto-Korean homelands.

Proto-Japonic and Proto-Koreanic were both spoken in Northeastern China 8,000 YBP. Proto-Japonic was spoken on the north of the Shandong Peninsula and Proto-Koreanic was spoken across the Bohai Sea in Tianjin and especially across the Bohai Strait on the Liaodong Peninsula. They may have stayed here next to each other for 3,000 years until the Proto-Koreanics moved to the Korean Peninsula 5,000 YBP, displacing the Ainuid types there. Proto-Japonics probably stayed in Shandong until 2,300 YBP when they left to populate Japan and the Ryukyus, displacing the Ainu who were already there.

Campbell and Mixco:

Yeniseian, Yenisseian: Small language family of southern Siberia of which Ket (Khet) is the only surviving member. Yeniseian has no known broader relatives, though some have been hypothesized (see the Dené-Caucasian hypothesis).

Campbell and Mixco state a serious untruth here, including some weasel words. By discussing Dene-Caucasian in the same breath as relatives of Yeniseian, they are able to deflect away from the more widely accepted proposal of a link between Yeniseian in the Old World and Na-Dene in the New World. This is Edward Vajda’s Dene-Yeniseian proposal.

The problem is that this long-range proposal has the support of many people, including splitter Johanna Nichols. Of the 17 experts who weighed in on Dene-Yeniseian, 15 had a positive view of the hypothesis. Campbell and Mixco are the only two who are negative, but neither is an expert on either family. All specialists in either or both families support the proposal. When 15 out of 17 is not enough, one wonders at what point the field reaches a consensus. Must we hold out for Campbell and Mixco’s approval for everything?

Campbell and Mixco:

Nivkh (also called Gilyak): A language isolate spoken in the northern part of Sakhalin Island and along the Amur River of Manchuria, in China. There have been various unsuccessful attempts to link Nivkh genetically with various other language groupings, including Eurasiatic and Nostratic.

Granted, there is no consensus on the affiliation of Nivkhi. However, a recent paper by Sergei Nikolaev proved to me that Nivkhi is related to Algonquian-Wakashan, a family of languages in the Americas. One of its branches is Wakashan, and there has been talk of links between Wakashan and the Old World for some time.

Michael Fortescue places Nivkhi in Chukotko-Kamchatkan. Greenberg places it in Eurasiatic as a separate node. But as Chukotko-Kamchatkan is part of Eurasiatic, they are both saying the same thing in a way. My theory is that Nivkhi is Eurasiatic, possibly related to Chukotko-Kamchatkan, and like Yeniseian, is also connected to languages in North America, as some of the Nivkhi probably migrated to North America and became the American Indians. In this way, we can reconcile both hypotheses.

There are three specialist views on Nivkhi. One says it is Eurasiatic, another that it is Chukotian, and a third that it is part of the Algonquian-Wakashan or Almosan family in the New World. Consensus is that Nivkhi is related to one of two other entities: other languages in Northeastern Asia or a New World Amerindian family. So expert consensus seems to have moved away from the view of Nivkhi as an isolate.

Campbell and Mixco:

Paleosiberian languages (also sometimes called Paleoasiatic, Hyperborean languages): A geographical (not genetic) designation for several otherwise unaffiliated languages (isolates) and small language families of Siberia.

Perhaps the main thing that unites these languages is that they are not Turkic, Russian or Tungusic, the better known languages of Siberia. Languages often listed as Paleosiberian are: Chukchi, Koryak, Kamchadal (Itelmen), Yukaghir, Yeniseian (Ket) and Nivkh (Gilyak). These have no known genetic relationship to one other.

Taken as a broad statement, of course this is true. However, Chukchi, Koryak, and Kamchadal or Itelmen are part of a family called Chukotko-Kamchatkan. This family has even been reconstructed. Campbell and Mixco’s statement that these languages have no known genetic relationship with each other is false.

Campbell and Mixco:

Austroasiatic: A proposed genetic relationship between Mon-Khmer and Munda, accepted as valid by many scholars but not by all.

The fact is that Austroasiatic is not a “proposed genetic relationship.” Instead it is now accepted by consensus. That there may be a few outliers who don’t believe in it is not important. I’m not aware of any linguists who doubt Austroasiatic other than Campbell and Mixco, and neither is a specialist. Austroasiatic-Hmong-Mien is the best long-range proposal for Austroasiatic, but it has probably not yet been proven. Austroasiatic is also part of the expanded version of the Austric hypothesis.

Campbell and Mixco:

Miao-Yao (also called Hmong-Mien): A language family spoken by the Miao and Yao peoples of southern China and Southeast Asia. Some proposals would classify Miao-Yao with Sino-Tibetan, others with Tai or Austronesian; none of these has much support.

This seems to be more weasel wording on the part of the authors. By listing Tai or Austronesian and Sino-Tibetan as possible relatives of Miao-Yao and then correctly dismissing them, they leave out a much better proposal linking Hmong-Mien to Austroasiatic.

This shows some promise, but the relationship is hard to see amidst all of the Chinese borrowing. As noted, the relationship between Hmong-Mien and Sino-Tibetan is one of borrowing. The relationship with Tai or Austronesian is part of Paul Benedict’s original Austric proposal. He later turned against this proposal and supported a more watered down Austric with Austronesian and Tai-Kadai, which seems to be nearing consensus support now.

Campbell and Mixco:

Austric: A mostly discounted hypothesis of distant genetic relationship proposed by Paul Benedict that would group together the Austronesian, Tai-Kadai and Miao-Yao.

More weasel wording. Although it is correct that Benedict’s original Austric (which also included Austroasiatic) was abandoned even by Benedict himself, a more watered down Austric that he later supported, consisting of Austronesian and Tai-Kadai and called Austro-Tai, has much more support. They get around discussing the watered down Austro-Tai with good support by limiting Austric to Benedict’s own theory, which even he rejected later in life. In this sense, they misrepresent the debate, probably deliberately.

In fact, evidence is building towards acceptance of Austro-Tai after papers by Weera Ostapirat and Laurent Sagart seem to have proved the case using the comparative method. Roger Blench also supports the concept. In addition to Benedict, it is also supported by Lawrence Reid and Hui Li. It is opposed by Graham Thurgood, who is a specialist (he was my main academic advisor on my Master’s Degree in Linguistics). It is also opposed by Campbell and Mixco, but they are not specialists. Looking at expert opinion, we have the six named here arguing for the theory and one arguing against it. Specialist consensus then is that Austro-Tai is a real language family.

Even the larger version of Austric, including all of Benedict’s families plus Ainu and the South Indian isolate Nihali, has some supporters and some suggestive evidence that it may be correct.

Campbell and Mixco:

Tai-Kadai: A large language family, generally but not universally accepted, of languages located in Southeast Asia and southern China. The family includes Tai, Kam-Sui, Kadai and various other languages. The genetic relatedness of several proposed Tai-Kadai languages is not yet settled.

Tai-Kadai is not “generally but not universally accepted.” It is accepted by consensus as an existent language family. Perhaps whether some languages belong there is in doubt, but the proposal itself is not controversial. Campbell and Mixco’s implication that Tai-Kadai remains controversial is a serious distortion of fact.

Campbell and Mixco:

Na-Dene: A disputed proposal of distant genetic relationship, put forward by Sapir, that would group Haida, Tlingit and Eyak-Athabaskan. There is considerable disagreement about whether Haida is related to the others. The relationship between Tlingit and Eyak-Athabaskan seems more likely, and some scholars misleadingly use the name ‘Na-Dené’ to mean a grouping of these two without Haida.

Levine and Michael Krauss, two top Na-Dene experts, have been on record for 40 years as opposing the addition of Haida to Na-Dene. A recent conference about Edward Vajda’s Dene-Yeniseian concluded that there was no evidence to include Haida in Na-Dene. However, a recent paper by Alexander Manaster-Ramer made the case that Haida is part of Na-Dene. This paper was enough to convince me. Further, the scholar with the most expertise on Haida has said that Haida is part of Na-Dene. So Campbell and Mixco are correct here that the subject is up in the air with both supporters and opponents.

The statement that a relationship between Tlingit and Eyak-Athabaskan seems “more likely” is an understatement. I believe it is now linguistic consensus that Tlingit is part of Na-Dene, so Campbell and Mixco’s statement is not quite true.

Campbell and Mixco:

Tonkawa: An extinct language isolate of Texas. Proposals to link Tonkawa with the languages of the Coahuiltecan or Hokan-Coahuiltecan hypotheses have not generally been accepted.

I’m sure it is the case that Coahuiltecan and Hokan-Coahuiltecan affiliations of Tonkawa have been rejected. A Coahuiltecan connection was even denied by Manaster-Ramer, who recently proved that the Coahuiltecan family existed. That said, there are interesting parallels between Tonkawa and Coahuiltecan that I cannot explain. However, a recent paper by Manaster-Ramer made the much better case that Tonkawa was in fact Na-Dene.

Campbell and Mixco:

Amerind: The Amerind hypothesis is rejected by nearly all practicing American Indianists and by most historical linguists. Specialists maintain that valid methods do not at present permit classification of Native American languages into fewer than about 180 independent language families and isolates. Amerind has been highly criticized on various grounds. There is an excessive number of errors in Greenberg’s data.

Where Greenberg stops – after assembling superficial similarities and declaring them due to common ancestry – is where other linguists begin. Since such similarities can be due to chance similarity, borrowing, onomatopoeia, sound symbolism, nursery words (the mama, papa, nana, dada, caca sort), misanalysis, and much more, for a plausible proposal of remote linguistic relationship one must attempt to eliminate all other possible explanations, leaving a shared common ancestor as the most likely.

Greenberg made no attempt to eliminate these other explanations, and the similarities he amassed appear to be due mostly to accident and a combination of these other factors.

In various instances, Greenberg compared arbitrary segments of words, equated words with very different meanings (for example, ‘excrement/night/grass’), misidentified many languages, failed to analyze the morphology of some words and falsely analyzed that of others, neglected regular sound correspondences, failed to eliminate loanwords and misinterpreted well-established findings.

The Amerind ‘etymologies’ proposed are often limited to a very few languages of the many involved. Finnish, Japanese, Basque and other randomly chosen languages fit Greenberg’s Amerind data as well as or better than do any of the American Indian languages in his ‘etymologies’; Greenberg’s method has proven incapable of distinguishing implausible relationships from Amerind generally. In short, it is with good reason Amerind has been rejected.

The movement into the Americas came in three waves.

The first wave brought the Amerinds. This is where the 160 language families reside. According to the reigning theory in Linguistics, this group of Amerindians came in one wave that spoke not only 160 different languages but languages that came from 160 different language families, none of which were related to each other. These are language families of which, by the way, we can find scarcely a trace in the Old World.

The second wave was the Na-Dene people who came along the west coast and then went inland.

The last wave was the Inuit.

Greenberg simply lumped all of the 600 languages of the Americas into a single family. The argument was good, though I’m not sure he proved that every single one of those languages is part of Amerind. But a lot of them are. The n- m- 1st and 2nd person pronouns are found in 450 of those languages. The ablauted words t’ana, t’una, and t’ina, meaning respectively human child of either sex, all females including family terms, and all males including family terms, are extremely common in Amerind.

So t’ana just means child. T’una means girl, woman, and includes various names for all sorts of female relatives – grandmother, cousin, aunt, niece, etc. T’ina means boy, man, and includes the family terms grandfather, brother-in-law, uncle, cousin, and nephew. This ablauted paradigm is found across a vast number of these Amerind languages, and it is nonexistent in the rest of the world.

Quite probably most or all of the languages having those terms are part of a single family. What are the other arguments? That 300 languages independently innovated these terms, in this precise ablauted paradigm, on their own? What is the likelihood of that?

That the occurrence of these items across such vast swathes of languages is due to chance? But this paradigm does not exist anywhere else, so how could it be due to chance? That these core vocabulary items were borrowed massively all across the Americas, when family terms like that are rarely borrowed? That’s not possible. None of the alternate theories make the slightest bit of sense.
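
The chance explanation can be given a rough number. The sketch below, in Python, computes the binomial tail probability that at least k of n languages show a feature if each language had it independently with probability p. The function names and the 10% per-language rate are made up for illustration; only the 600 and 450 figures come from the text above. Treating languages as independent trials is also a strong simplifying assumption, since languages inside an accepted family are not independent of each other.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of the binomial probability of exactly k hits in n trials."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def log10_tail_probability(n, k, p):
    """Base-10 log of P(at least k hits in n independent trials).

    Works in log space because the raw probability can be too small
    for an ordinary float. Requires 0 < p < 1.
    """
    logs = [log_binom_pmf(n, i, p) for i in range(k, n + 1)]
    biggest = max(logs)
    # log-sum-exp: factor out the largest term before exponentiating
    total = biggest + math.log(sum(math.exp(x - biggest) for x in logs))
    return total / math.log(10)


if __name__ == "__main__":
    n_languages = 600       # figure from the text
    n_with_pronouns = 450   # figure from the text
    assumed_rate = 0.10     # HYPOTHETICAL chance rate per language
    result = log10_tail_probability(n_languages, n_with_pronouns, assumed_rate)
    print(f"log10 of chance probability: {result:.1f}")
```

The result is a base-10 logarithm, so a large negative value means the pattern is very unlikely under the independence assumption. A different assumed rate will give a different answer, so the number is only as good as that assumption.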

Hence, the Amerind languages that have the n- m- pronoun paradigm and the t’ana, t’una, t’ina ablauted names for the sexes and the terms of family relations by sex are quite probably part of a huge language family. I’m well aware that a few of the languages having those terms could be due to chance. I’m pretty sure that about zero of those pronouns and few, if any, of those family terms were borrowed.

However, not all Amerind languages have either the pronoun paradigm or the ablauted sex term. In those cases, I’m unsure if those languages are all part of the same family. But if you can put those languages in families and reconstruct to the proto-languages and end up with the pronoun paradigm or the ablauted family term reconstructed in the proto-language of that family, I’m sure that family would be part of Amerind. That’s about all you have to do to prove relationship in Amerind.

Campbell and Mixco:

Penutian: A very large proposed distant genetic relationship in western North America, suggested originally by Dixon and Kroeber for the Californian language families Wintuan, Maiduan, Yokutsan, and Miwok-Costanoan. The name is based on words for ‘two’, something like pen in Wintuan, Maiduan, and Yokutsan, and uti in Miwok-Costanoan, joined to form Penutian.

Sapir, impressed with the hypothesis, attempted to add an Oregon Penutian (Takelma, Coos, Siuslaw, and ‘Yakonan’), Chinook, Tsimshian, a Plateau Penutian (Sahaptian, ‘Molala-Cayuse,’ and Klamath-Modoc) and a Mexican Penutian (Mixe-Zoquean and Huave).

The Penutian grouping has been influential, and later proposals have attempted to unite various languages from Alaska to Bolivia with it. Nevertheless, it had a shaky foundation based on extremely limited evidence, and, in spite of extensive later research, it did not prove possible to demonstrate any version of the Penutian hypothesis and several prominent Penutian specialists abandoned it. Today it remains controversial and unconfirmed, with some supporters but with many who doubt it.

The statement that today it “remains controversial and unconfirmed, with some supporters but with many who doubt it” has no basis in fact. It is surely controversial and it is probably unconfirmed by linguistic consensus. Yes, it has a number of supporters, and there are quite a few who doubt it. However, among those who doubt it, none of them are specialists in these languages. Hence, we are dealing with an Altaic situation here, where the specialists believe in it but the non-specialists insist it’s nonsense.

In fact, the consensus among the specialists on these languages is that Penutian exists. A Penutian family comprising Maiduan, Utian (Miwok-Costanoan), Wintuan, Yokutsan, Coosan, Siuslaw, Takelma, Kalapuyan, Alsean (Yakonan), Chinookan, Tsimshianic, Klamath-Modoc (Lutuami), Cayuse and Molala (Waiilatpuan), and Sahaptian has been proven to my satisfaction. I am uncertain of the Penutian status of Mixe-Zoque and Huave (Mexican Penutian), although I believe that Huave and Mixe-Zoque are related to each other, albeit at a very deep time depth of 9,000 years.

Anti-Penutianists have not published a paper in a long time. The last one I remember was published by William Shipley, and he’s been gone for a while. I am not aware of one expert on these languages who says Penutian does not exist.

Campbell and Mixco:

Cayuse-Molala: A genetic classification no longer believed that linked Cayuse (of Oregon and Washington) and Molala (of Oregon) in a single assumed family. The evidence for this was later shown to be wrong and the hypothesis was abandoned.

According to Campbell and Mixco, Cayuse is an isolate. I assume they see Molala as an isolate too. There probably is no Cayuse-Molala family, but Molala is part of Plateau Penutian, and Cayuse may be part of the same group. Plateau Penutian is part of the Penutian hypothesis, which appears to be true. Because it does not mention these facts, Campbell and Mixco’s statement is quite misleading.

Campbell and Mixco:

Mosan: A now abandoned proposal of distant genetic relationship that would group Salishan, Wakashan and Chimakuan together.

Another part of this proposal was that Mosan was part of a larger family with Algonquian called Almosan. An excellent series of papers was published recently by Sergei Nikolaev that validated Almosan and proved to me that it was related to Nivkhi in the Old World.

Michael Fortescue argued a few years before that Mosan was a valid entity and that it was related to the Old World language Nivkhi. Recently, Murray Gell-Mann, Ilia Peiros, and George Starostin also supported Almosan and grouped it with Chukotko-Kamchatkan and Nivkhi. David Beck recently argued that Mosan is a language area or Sprachbund instead of a genetic family.

So far we have four specialists arguing that Mosan exists, and one saying it does not. The consensus among specialists seems to be that Mosan is a valid language family. At any rate, Campbell and Mixco’s statement that this proposal is “now abandoned” is false.

For Almosan, we have four specialists saying it exists and two apparently saying it does not. Expert consensus on Almosan is optimistic.

Campbell and Mixco:

Hokan: A controversial hypothesis of distant genetic relationship proposed by Dixon and Kroeber among certain languages of California; the original list included Shastan, Chimariko, Pomoan, Karok, and Yana, to which they soon added Esselen, Yuman, and later Chumashan, Salinan, Seri, and Tequistlatecan. Later scholars, especially Edward Sapir, proposed various additions to Hokan. Many ‘Hokan’ specialists doubt the validity of the hypothesis.

It is not true that many Hokan specialists “doubt the validity of the hypothesis.” I can’t remember the last time I saw an anti-Hokan paper. Yes, Campbell, Mixco, and Mithun say Hokan does not exist, but they are not specialists. The consensus among specialists such as Mikhail Zhivlov, Terence Kaufman, and Marcelo Jolkesky is that Hokan exists. I have only found one specialist who disagrees with the Hokan hypothesis, and she merely doubts the Hokan affiliation of Ch’imáriko.

I believe that a Hokan family consisting of Karuk, Shasta-Palaihnihan, Ch’imáriko, Yana, Salinan, Pomoan, Yuman, Seri, and Tequistlatecan exists, although I would leave out Chumashan, Washo, and Jicaquean or Tolan. Chumashan is an isolate, and while Washo and Tolan may be Hokan at a very deep time depth, the few possible cognates are not enough to provide evidence of this. I am agnostic on Esselen, which is only known from a 350-word list collected by friars at a California mission.

I have not seen any evidence that Coahuiltecan is Hokan. There is some evidence, though not probative enough for me, that Lencan and Misumalpan may be Hokan. Nevertheless, Lencan and Misumalpan form a language family that has even been accepted by Campbell himself. This is the only long-range family proposal he has supported since the publication of LIA.

Although Campbell’s opinion on many hypotheses may be waved away as he is not an expert on that family or language, Lencan and Misumalpan are right up his alley as he is an expert in languages in Central America. He has focused mostly on Mayan, but he also knows the other languages of the region well.

Campbell and Mixco:

Cochimí–Yuman: A family of languages from Arizona, California and Baja California, with two branches, extinct Cochimí (of Baja California) and the Yuman subfamily (members of which are Kiliwa, Diegueño, Cocopa, Mojave, Maricopa, Paipai, and Walapai–Havasupai–Yavapai, among others). Cochimí–Yuman is often associated with the controversial Hokan hypothesis, though evidence is insufficient to embrace the proposed relationship.

The consensus among experts in the Cochimí–Yuman family, including Mikhail Zhivlov and Terence Kaufman, is that it is part of the Hokan family. Campbell does not believe in the association, but he is not an expert. However, Mixco opposes the Hokan affinity of Cochimí–Yuman, and granted, he is actually a specialist on these languages. So among specialists, we have two who support the Hokan association and one who opposes it. The specialist consensus then would be that this association is a promising hypothesis, but it is not yet proven. This is different from Campbell and Mixco’s wording, which is more negative.

Campbell and Mixco:

Coahuiltecan: A hypothesis of distant genetic relationship that proposed to group some languages of south Texas and northern Mexico: Coahuilteco, Comecrudo and Cotoname, and sometimes also Tonkawa, Karankawa, Atakapa and Maratino (with Aranama and Solano assumed to be varieties of Coahuilteco).

Sapir proposed a broader classification of Hokan–Coahuiltecan, joining the Coahuiltecan proposal with the broader Hokan hypothesis, and placed this in his even larger Hokan–Siouan super-stock. None of these proposals has proven sufficiently robust to be accepted generally.

I am not aware of any specialists who have recently argued against the existence of Coahuiltecan. Yes, Campbell and Mixco do not accept it, but they are not specialists. A recent paper by Alexander Manaster-Ramer proved the existence of Coahuiltecan to my satisfaction. I believe that a Coahuiltecan family consisting of Comecrudo, Cotoname, Aranama, Solano, Mamulique, Garza, and Coahuilteco absolutely exists. Karankawa is probably a part of this family.

I do not think there is good evidence for including other postulated members such as Atakapa and Tonkawa. First of all, Tonkawa is probably Na-Dene as per another paper by Manaster-Ramer. Atakapa is part of the Gulf family. However, I am not yet convinced that Coahuiltecan is a member of the Hokan language family.

Campbell and Mixco:

Gulf: Hypothesis of a distant genetic relationship proposed by Mary R. Haas that would group Muskogean, Natchez, Tunica, Atakapa and Chitimacha, no longer supported by most linguists.

The notion that Gulf is no longer supported by most linguists is simply incorrect. There have only been four linguists who studied this family.

The first was Mary Haas, who also proposed a relationship with Yuki as Yuki-Gulf. Haas was always dubious about Chitimacha’s addition to Gulf.

Greenberg resurrected Yuki-Gulf in LIA.

Pam Munro is an expert on these languages. A while back she published a paper on Yuki-Gulf. I read that paper. The resemblances are so stunning between Muskogean, Natchez, Tunica, Atakapa and Chitimacha that I was shocked that anyone doubted the relationship. Furthermore, the relationship with Yuki and Wappo, a full 2,500 miles away in Northern California, was shocking.

The fourth was Geoffrey Kimball, who concluded that Gulf was probably a family but that this could not be proven.

The evidence for Gulf in Munro’s paper was good, and there even appeared to be sound correspondences running through the relationship. What was shocking about it was that Yuki and Wappo could not possibly have borrowed from Gulf because Gulf is in Louisiana 2,500 miles away. So how did all these resemblances come in? Chance is ruled out. Borrowing could not have happened. Therefore a relationship at least between Yuki and the Gulf languages is obvious.

Munro’s paper took the position that Greenberg’s Yuki-Gulf hypothesis was correct. However, there are some problems. First, Atakapa as part of Gulf has been controversial, in part because it has also been tied in with Coahuiltecan. Indeed there are resemblances between the two, and they were not spoken next to each other so borrowing can be ruled out.

Perhaps a way of solving the matter is to posit not only Yuki-Gulf but a larger family that includes Coahuiltecan as Greenberg does in LIA. I have no idea how justified this is, but there are certainly surprising resemblances between Atakapa and the Coahuiltecan languages.

Furthermore, whether or not Chitimacha is part of Gulf has been up in the air from the beginning when Haas published her paper. Recent papers have made the case that Chitimacha is related to Mesoamerican language families of Mexico such as Mixe-Zoque and Totonacan. These papers used the comparative method. Campbell has rejected this hypothesis.

That Tunica at the very least shows a close relationship with Muskogean is not even controversial. The idea has a long pedigree and is presently supported by all experts in this family.

Geoffrey Kimball examined the data recently and concluded that from the evidence, it appears that Gulf exists, but we will never be able to prove it, as he puts it. However, he stated that Tunica is almost certainly related to Muskogean. At this point, I would think that Tunica-Muskogean at the very least should be considered consensus among specialists.

Kimball’s paper had a number of problems, mostly that he was operating with a negative stance towards the existence of the family. Further, there were issues with his notions of sound symbolism and borrowing in the paper where his explanations made no sense at all.

Let’s evaluate Campbell and Mixco’s statement that Gulf is no longer supported by most linguists.

We have four specialists on record about whether or not a Gulf family exists.

Mary Haas: Positive, minus Chitimacha

Joseph Greenberg: Positive

Pamela Munro: Positive

Geoffrey Kimball: Probably exists but it’s not possible to prove it.

Brown et al: Chitimacha is a part of the Totonozoquean family, not the Gulf family. The other members of Gulf are not members of this family.

Three out of the four specialists on the Gulf family say that the Gulf family is a reality. The other feels it exists but cannot be proven. And there is uncertainty about whether Chitimacha is part of Gulf. The consensus among experts is that Gulf is a real language family.

Campbell and Mixco’s statement that Gulf is no longer supported by most linguists is simply false.

Furthermore, I would like to point out that a good case can be made for the existence of a Totonozoquean family consisting of the Mixe-Zoque and Totonacan languages. Whether this is consensus among experts is somewhat up in the air.

Campbell and Mixco:

Macro-Gê: A proposed distant genetic relationship composed of several language families and isolates, many now extinct, along the Atlantic coast (primarily of Brazil). These include Chiquitano, Bororoan, Botocudoan, Rikbaktsa, the Gê family proper, Jeikó, Kamakanan, Maxakalían, Purian, Fulnío, Ofayé and Guató. Many are sympathetic to the hypothesis and several of these languages will very probably be demonstrated to be related to one another eventually, though others will probably need to be separated out.

This is much too pessimistic. Macro-Gê is not a proposed long-range family; it is a large language family in South America accepted by consensus. It is not true that many are sympathetic to it; instead, the consensus is that it is correct. Nor is it correct to say that it will probably be demonstrated eventually. In fact, it is already an accepted reality.

Campbell and Mixco:

Quechumaran: Proposed distant genetic relationship that would join Quechuan and Aymaran. While considerable evidence has been gathered in support of the hypothesis, it is extremely difficult in this case to distinguish what may be inherited (and therefore evidence of a genetic relationship) from what may be diffused (and therefore not reliable evidence of a genetic connection).

It is true that there is no consensus on the existence of Quechumaran. The consensus seems to be as above that it is not yet proven. Those opposed to the idea throw out the usual borrowing scenario, but they have had to push the large number of borrowings in core vocabulary all the way back to Proto-Aymara and Proto-Quechua. In my opinion, “massive borrowing of core vocabulary at the proto-language level” is simply another word for genetics.

Gerald Clauson, the famous Turkologist opponent of Altaic, had to keep pushing his massive borrowings of core vocabulary further and further back until he eventually had the scenario taking place at the Proto-Turkic, Proto-Tungusic, and Proto-Mongolic levels. See above for my analysis on why these three proto-languages could not possibly have borrowed from each other, as they were in different places at different times.

A similar problem exists with opponents of the Uralo-Yukaghir theory, in which they are also forced to deal with a large amount of core vocabulary dating back a long time. Hakkinen tried to solve this problem by pushing the borrowing all the way back to not just Proto-Uralic but Pre-Proto-Uralic. Pre-Proto-Uralic at 8,000 years to me means nothing less than Uralo-Yukaghir. What else could it mean? He has heavy borrowing of core vocabulary between Pre-Proto-Uralic and Proto-Yukaghir. That’s another way of saying genetics.

Campbell and Mixco:

Macro-Guaicuruan (also spelled Macro-Waykuruan, Macro-Waikuruan): A proposed distant genetic relationship that would join the Guaicuruan and Matacoan families of the Gran Chaco in South America in a larger-scale genetic classification. Grammatical similarities, for example in the pronominal systems, have suggested the relationship to some scholars, but the extremely limited lexical evidence raises doubts for others. Some would also add Charruan and Mascoyan to these in an even larger ‘Macro-Waikuruan cluster.’

It is not true that this is a proposed long-range family suggested by some but doubted by others. In fact, Macro-Guaicuruan is accepted by consensus and is as uncontroversial as Macro-Gê, Pama-Nyungan, and other such families. There is however debate about which families are members outside of the Guaicuruan and Mataguayo language families that make up the essence of the family. There have been suggestions to add Lule-Vilela and the Zamucoan, Charruan, and Mascoyan families to this family. I do not feel that these additions are yet warranted.

Campbell and Mixco:

Pama-Nyungan: A very large, widely spread language family of Australia, some 175 languages. The name comes from Kenneth Hale, based on the words pama ‘man’ in the far northeast and nyunga ‘man’ in the southwest. Languages assigned to Pama-Nyungan extend over four-fifths of Australia, most of the continent except northern areas.

Pama-Nyungan is accepted by most Australianists as a legitimate language family, but not uncritically and not universally. It is rejected by Dixon; it is held by others to be plausible but inconclusive based on current evidence. Some Pama-Nyungan languages are Lardil, Kayardild, Yukulta, Yidiny, Dyirbal, Pitta-Pitta, Arrernte, Warlpiri, Western Desert language(s), and there are many more.

Actually, consensus now is that this family of Australian languages does indeed exist. True, Dixon challenged the existence of Pama-Nyungan recently, but his opposition was so outrageous that it prompted a quick surge of papers from Australianists defending the existence of Pama-Nyungan. The notion that other Australianists feel that Pama-Nyungan is possible but presently inconclusive is not correct. I am not aware of a single Australianist other than Dixon who feels this way. Instead, Pama-Nyungan is about as uncontroversial as Macro-Gê, Afroasiatic, or Austroasiatic.

Campbell and Mixco:

‘Papuan’ languages: A term of convenience used to refer to the languages of the western Pacific, most in New Guinea (Papua New Guinea and the Indonesian provinces of Papua and West Irian Jaya), that are neither Austronesian nor Australian. Papuan definitely does not refer to a genetic relationship among these languages for no such relationship can at present be shown.

That is, the term is defined negatively and does not imply a linguistic relationship. While most are spoken on the island of New Guinea, some are found in the Bismarck Archipelago, Bougainville Island and the Solomon Islands to the east, and in Halmahera, Timor and the Alor Archipelago to the west.

There are some 800 Papuan languages divided into a large number of mostly small language families and isolates not demonstrably related to one another.

For what it’s worth, this statement by Campbell and Mixco is correct.

Campbell and Mixco:

One large genetic grouping that has been posited for a number of Papuan languages is the Trans-New Guinea phylum, which is promising but not yet confirmed.

Trans-New Guinea is not “promising but not yet confirmed.” Instead it is an uncontroversial language family accepted by the consensus of all specialists.


Beck, David (1997). Mosan III: A Problem of Remote Common Proximity. International Conference on Salish (and Neighbo(u)ring) Languages.
Benedict, Paul K. (1942). “Thai, Kadai, and Indonesian: A New Alignment in Southeastern Asia.” American Anthropologist 44, 4: 576–601.
Benedict, Paul K. (1975). Austro-Thai Language and Culture, with a Glossary of Roots. New Haven: HRAF Press.
Blench, Roger (2008). The Prehistory of the Daic (Tai-Kadai) Speaking Peoples. Presented at the 12th EURASEAA Meeting in Leiden, the Netherlands, 1-5 September 2008.
Blench, Roger (2018). Tai-Kadai and Austronesian Are Related at Multiple Levels and Their Archaeological Interpretation (draft).
Blust, Robert (2014). “The Higher Phylogeny of Austronesian and the Position of Tai-Kadai: Another Look,” in The 14th International Symposium on Chinese Languages and Linguistics (IsCLL-14).
Campbell, Lyle and Marianne Mithun (Eds.) (1979). The Languages of Native America: An Historical and Comparative Assessment.
Campbell, Lyle and Mauricio J. Mixco (2007). A Glossary of Historical Linguistics. Edinburgh University Press.
Campbell, Lyle and William J. Poser (2008). Language Classification: History and Method. Cambridge: Cambridge University Press.
Fortescue, M. (1998). Language Relations across Bering Strait: Reappraising the Archaeological and Linguistic Evidence. (Nivkhi is Mosan.)
Fortescue, Michael (2011). “The Relationship of Nivkh to Chukotko-Kamchatkan Revisited.” Lingua 121, 8: 1359-1376. (Nivkhi is Chukotko-Kamchatkan.)
Gell-Mann, Murray; Ilia Peiros, and George Starostin (2009). “Distant Language Relationship: The Current Perspective.” Journal of Language Relationship.
Greenberg, Joseph H. (2000). Indo-European and Its Closest Relatives: The Eurasiatic Language Family. Volume 1, Grammar. Stanford: Stanford University Press.
Greenberg, Joseph H. (2002). Indo-European and Its Closest Relatives: The Eurasiatic Language Family. Volume 2, Lexicon. Stanford: Stanford University Press.
Heine, Bernd (1992). African Languages. International Encyclopedia of Linguistics, ed. by William Bright, Vol. 1, pp. 31-36. Oxford: Oxford University Press. (No such thing as Nilo-Saharan.)
Krauss, Michael E. (1979). Na-Dene and Eskimo-Aleut. The Languages of Native America: Historical and comparative assessment, ed. by Lyle Campbell and Marianne Mithun, pp. 803-901. Austin: University of Texas Press. (Haida not part of Na-Dene.)
Levine, Robert D. (1979). Haida and Na-Dene: A New Look at the Evidence. IJAL 45: 157-70. (Haida not part of Na-Dene.)
Li, Hui (李辉) (2005). Genetic Structure of Austro-Tai Populations (Doctoral Dissertation). Fudan University.
Mixco, Mauricio J. (1976). “Kiliwa Texts.” International Journal of American Linguistics Native American Text Series 1: 92-101.
Mixco, Mauricio J. (1977). “The Linguistic Affiliation of the Ñakipa and Yakakwal of Lower California.” International Journal of American Linguistics 43: 189-200.
Nicolaï, Robert (1990). Parentés Linguistiques (À Propos du Songhay). Paris: CNRS. (Dimmendaal says Songhay is Nilo-Saharan.)
Nikolaev, S. (2015). Toward the Reconstruction of Proto-Algonquian-Wakashan. Part 1: Proof of the Algonquian-Wakashan Relationship.
Nikolaev, S. (2016). Toward the Reconstruction of Proto-Algonquian-Wakashan. Part 2: Algonquian-Wakashan Sound Correspondences.
Ostapirat, Weera (2005). “Kra-Dai and Austronesian: Notes on Phonological Correspondences and Vocabulary Distribution,”  in Laurent Sagart, Roger Blench and Alicia Sanchez-Mazas, eds. The Peopling of East Asia: Putting Together Archaeology, Linguistics, and Genetics, pp. 107-131. London: Routledge Curzon.
Ostapirat, Weera (2013). Austro-Tai Revisited. Paper Presented at the 23rd Annual Meeting of the Southeast Asian Linguistics Society, 29-31 May 2013, Chulalongkorn University.
Reid, Lawrence A. (2006). “Austro-Tai Hypotheses.” In Keith Brown (Ed.), The Encyclopedia of Language and Linguistics, 2nd Edition, pp. 609–610.
Sagart, Laurent (2005b). “Tai-Kadai as a Subgroup of Austronesian,” in L. Sagart, R. Blench, and A. Sanchez-Mazas (Eds.), The Peopling of East Asia: Putting Together Archaeology, Linguistics, and Genetics, pp. 177-181.
Sagart, Laurent (2019). “A Model of the Origin of Kra-Dai Tones.” Cahiers de Linguistique Asie Orientale. 48, 1: 1–29.
Thurgood, Graham (1994). “Tai-Kadai and Austronesian: The Nature of the Relationship.” Oceanic Linguistics 33: 345-368.

Repost: Genes and Language Match Well

This post will look into whether or not genes and language line up well. The question may seem academic, but it is important for linguists in the battle for whether or not there is anything to the large macro-families that the “lumpers” are creating.

It’s yet another skirmish in the lumpers versus splitters battle in Historical Linguistics. Historical Linguistics is the branch that deals with language families, language relationships, and reconstruction of old languages that are no longer spoken.

The debate has heated up in recent years due to the prominence of lumper theories publicized by the late Joseph Greenberg and his disciples, notably Merritt Ruhlen at Stanford University. Ruhlen and Greenberg use a technique called mass comparison which has come under a lot of wild and irrational abuse but seems to be a valid scientific method in the hands of an expert.

Greenberg used it to come up with the four major language families of Africa a long time ago, and his classification there has remained pretty solid ever since.

He later published a book called Language in the Americas, which broke down all Amerindian languages into three large families – Amerind, Na-Dene and Eskimo-Aleut. I have read that book many times, and I concur with its analysis. Unfortunately, a detailed examination of the evidence goes beyond the scope of this post.

Na-Dene and Eskimo-Aleut are not very controversial, though the position of Haida within Na-Dene is regarded as unproven. However, looking at evidence mustered by Alexander Manaster-Ramer, I believe that Haida is definitely Na-Dene, though possibly a sister to the entire group as it is so distant.

In the same way, the ancient Anatolian languages are now regarded as a sister to the rest of Indo-European rather than an ordinary branch of it, a grouping called Indo-Hittite or Indo-Anatolian. My Indo-Europeanist sources told me that Indo-Hittite or Indo-Anatolian is now regarded as consensus in the field.

Bengtson promotes a family called Dene-Caucasian that involves the North Caucasian languages of the Caucasus, Basque, Na-Dene, Sino-Tibetan, Burushaski in northern Pakistan and the Ket family in Siberia. I can’t speak for the whole family, but the evidence is definitely interesting. I think that Bengtson has proven a case for Ket, Basque, and the Caucasian languages being related, as I read a book on that subject.

Recently, Edward Vajda conclusively proved that the Ket language is related to the Na-Dene languages.

A Ket man in Siberia. His phenotype looks a bit Japanese. He doesn’t look like an Amerindian. The situation of the Ket is deplorable, as most live in serious poverty and do not see any hope for improving themselves. The Ket language is also in bad shape, as hardly anyone under 35 can speak it well, and 30% of the population regard speaking Ket as useless.
The USSR did a better job with minority tongues than Putin.
There is good evidence of a link between the Ket and the Amerindians. The Selkup are a Samoyedic people who live near the Ket. There is also good evidence linking the peoples of the Altai with Amerindians. This doesn’t make a lot of sense, as the Selkup and Ket now live a long way from the Altai region, but the Ket and Selkup are thought to have lived in the Altai long ago and come north later on.
Relating to the Ket, along with the Selkup nearby, the theory linking these groups to the Amerindians supports a single migration to the Americas 16,000 years ago, but it’s not at all definitive. According to a paper linking the Ket with Amerindians, Proto-Caucasians are thought to have evolved in Central Asia. I would place them nearer the Caucasus.

I believe that the latest evidence is showing that all of the various Altai peoples (the Altai, the Tofalar, the Khakass and the Shor) are related to the Amerindians. These groups are often referred to as Northern Turkics. They aren’t really Turks per se as in people from Turkey, but even the Turks from Turkey are thought to be partly related to these Northern Turkic tribes.

Northern Turkics are right on the border between Asians and Caucasians on gene charts, and some Amerinds are not so far genetically from that border either. If you look at the Cavalli-Sforza gene chart below, you can see that next to the Eskimo-Aleuts, the Chukchi, and the Northern Turkics are the people most closely related to the Amerindians.

It also looks like the Ket and Selkup came from what is now the Northern Turkic Altai region. Anthropologically, these various groups are either Uralics, South Siberian, Central Asian or North Asian Asiatics. The Altai region is where Russia, China and Mongolia all come together.

This is the first connection of a New World language family with an Old World language family.

Here is a Nenets woman from Siberia. She definitely looks Northern Chinese or Korean. They have a population of 44,000, and there are 31,000 speakers of the language. It’s really two languages – Forest Nenets and Tundra Nenets – but both are said to be endangered. I think at least Tundra Nenets will be around for a while though, as most kids are still learning it. The Nenets are Samoyedics like the Selkup, discussed above. The Selkup are related to the Amerindians.

It’s interesting that the Ket have also been linked genetically with the New World.

Here is a rare photo of Ed Vajda with two Ket women in Siberia described as “experts in the Ket language.” I’m not good at judging ages, but these women look to be about 40-60. If so, that is good, as I thought all of the speakers were elderly, and hardly anyone spoke the language well anymore. Ket has anywhere from 537-1,000 speakers. A related language, Yugh, is thought to have recently gone extinct. The rest of the Yeniseian languages went extinct about 150-250 years ago.

Greenberg and Ruhlen are the most vilified of the lumpers, but there are others who are following more orthodox methods of reconstruction to prove the existence of ancient language families, such as the late Sergey Starostin, his son George Starostin, John Bengtson, the late Vladislav Markovich Illich-Svitych (a prodigy, dead at the young age of only 32), Aharon Dolgopolsky and Vitaly Victorovich Shevoroshkin.

The Starostins, Illich-Svitych, Dolgopolsky, and Shevoroshkin all worked on Nostratic, a vast family consisting variously of Indo-European, Uralic, Altaic, Kartvelian, Nivkh, Chukotko-Kamchatkan, Afro-Asiatic, Dravidian, and Eskimo-Aleut. I now think that Afroasiatic and Dravidian are sisters to Nostratic instead of part of the family per se because they are so far removed from the rest of the family.

I would accept IE, Uralic, Altaic, Chukotko-Kamchatkan and Eskimo-Aleut in Nostratic. The Altaic family is itself controversial, but I regard it as fact, having studied it. Altaic also includes Japanese and Korean. I would toss Yukaghir in with Uralic.

Nostratic has a lot more going for it than some of the other long-range proposals, and since these scholars are using classic reconstruction, it gets respect from splitters. Starostin’s webpage is a great resource for looking into long-range theories, especially Nostratic and Altaic.

Bengtson, Shevoroshkin, and the Starostins all worked on Dene-Caucasian. This hypothesis seems a lot more controversial.

Here is a tree of Luigi Cavalli-Sforza’s human genetic families on the left and various human language families on the right, including some big families. The only one that is seriously out of place is Tibetan. This is because the Tibetans are a genetically North Chinese people who have moved down into Southern China in recent years. They cluster with South Chinese linguistically but NE Asians genetically.
All the rest lines up pretty well, including super-families like Nostratic and Eurasiatic (a Nostratic-like family created by Greenberg).
The hypothesized Austric family is interesting. I’m not sure if I buy this super-family or not, but I have not really looked into it.
With recent genetic evidence linking Indonesians and Vietnamese to Daic peoples of South China and SE Asia, it seems worth looking into. At the very least Austro-Thai, a language family consisting of the Austronesian and Tai-Kadai families, seems to have been proven in the last 10 years with the publication of a couple of important articles. Laurent Sagart is doing good work in this area.


Campbell, Lyle & Mithun, Marianne (Eds.) 1979. The Languages of Native America: An Historical and Comparative Assessment. Austin: University of Texas Press.
Campbell, Lyle. 1988. “Review of Language in the Americas, by Joseph Greenberg.” Language 64: 591-615.

Campbell, Lyle. 1997. American Indian Languages: The Historical Linguistics of Native America. New York: Oxford University Press.

Greenberg, Joseph. 1987. Language in the Americas. Stanford: Stanford University Press.

Greenberg, Joseph. 1989. “Classification of American Indian Languages: A Reply to Campbell.” Language 65(1): 107-114.

Ethnic Nationalists and Language Classification Mix Like Oil and Water

Mithridates: What’s for damn sure is that ethnic nationalists (Oh, the myriad varieties of them!!) are the #1 threat to any sane and sensible discussion on topics like… and language classifications…

I am not sure if you have read any of my linguistic work, but some of it has already been published. I had to deal with ethnic nationalists a lot (Turkish ethnic nationalists – some of the worst of them all), and it was definitely not pleasant. For instance, they insist that the (IMHO – 53) Turkic languages are all just dialects of Turkish! And good luck trying to disabuse them of that notion. They’re very aggressive and they’re even violent (check out recent videos), and that makes them even more scary.

Right now I am dealing with a Macedonian ethnic nationalist (all Balkan varieties are very unpleasant to say the least), and he is extremely unpleasant. He is trying to get me fired from my professor job LOL. I’m flattered that he thinks I’m obviously a university professor, but nope, I’m not. So I wish him luck getting me fired from a job I don’t have.

Beyond that, ethnic nationalists are the bane of language classification. There are so many “dialects” that are so obviously separate languages but we can’t split them because ethnic nationalists run the discourse in those countries. Idiotically, my field utterly unscientifically states that there is no way to tell a language from a dialect.

Oh yeah? We can put a man on the moon but we can’t develop successful definitions of language and dialect? How absurd is that?

So we stupidly throw up our hands and say this is not a linguistic question (though obviously it is) and say the distinction between the two is a political matter (!), so we throw it over to the most dishonest reprobates on Earth next to out and out criminals, namely, politicians! Of course politicians never lie or anything like that!

So really we should take all of our scientific questions over to politics and let politics answer these questions! Hell, politics won’t even give you a straight answer if you ask it what time or day it is. If a politician’s mouth is moving, he’s lying. It’s practically a requirement to score high on the psychopathy scale to be a politician. So let’s let these pathological lying sociopaths called politicians answer our scientific questions in Linguistics!

Ethnic nationalists have infiltrated language classification by petitioning to get languages removed from their countries, as they wish to believe that the only language in say Ruritania is Ruritanian, and all of the other languages, no matter how different, are dialects of Ruritanian!

So Basque is just a dialect of Spanish, right? And Sami or Lappish is a dialect of Swedish. And Sorbian is a dialect of German. And Breton and Basque are dialects of French. As you can see, we could go on and on here.

There are probably 2,000 languages within the scope of “Chinese,” yet the Chinese government lies and says there is only one Chinese language. We linguists have to go along with this insanity because…why?

Ethnic nationalists dishonestly removed several Occitan languages and several North Germanic languages in Sweden, among other places. I can’t believe that SIL (the publishers of Ethnologue who are now in charge of handing out ISO codes for new languages) fell for this.

A Reworking of Chinese Language Classification

This is a huge work that I have lost track of. Really need some Chinese informants to work on this one some more. I look at this work and get a headache just looking at it. It’s 211 pages. This is one of the most extensive overviews of the Chinese languages ever published in English though, I will say that. Work in progress for ten years now. Download as pdf for best experience.

A Reworking of Chinese Language Classification

A Look at the Altaic Question, a Current Controversy in Linguistics

                  Turkic    Written Mongolian    Tungusic*
1P sing.:
  nominative      ban       bi                   bi
  oblique stem    man-      min-                 min-
2P sing.:
  nominative      san       chi    (<*ti)        si
  oblique stem    san-      chiin- (<*tin)       sin-
* e.g. Evenki and Manchu

The Altaic argument is one of the biggest controversies in current linguistics. It is said that Linguistics has decided that Altaic does not exist. Actually, the field has not decided that at all. The consensus in the field is that Altaic is still an open question. In other words, they are fighting about it.
The field is split up into Pro-Altaicists and Anti-Altaicists. It’s not true that the field has decided in favor of the Anti-Altaicists. The Antis say that there is no such thing as Altaic. The Pros say that Altaic exists, and here is the evidence. The consensus instead rejects both positions and says we don’t know if Altaic exists or not. There is a big difference between we don’t know if it exists (maybe it does and maybe it doesn’t) and it doesn’t exist. One statement is uncertainty and the other statement is negative.
According to Anti-Altaicists, every time a human can’t make up their mind about something yes or no, they actually are saying no. No they’re not! They’re not saying yes or no. They are rejecting both positions and saying instead that they are undecided. What the Anti-Altaicists are doing is akin to saying everyone who answers undecided on a political candidate poll is actually saying that they want to vote against the person! The entire basis of political polling would change.
The Anti-Altaicists are typically quite vicious, while the other side is not. The safe position is Anti-Altaicism, so a lot of wimpy linguists too scared to stand up and fight have sought refuge in the negative position. Furthermore, Linguistics is like an 8th grade playground. Some positions are openly ridiculed. Pro-Altaicism is openly ridiculed, and taking that position is seen as prima facie evidence that a linguist is a crank, an idiot or a fool. I would imagine that if you told a hiring committee that you believed in Altaic, it would be harder to get hired than if you took the negative stand. And I could imagine that being pro-Altaic might keep you from getting tenure.
Not only are the Antis vicious (all of them are vicious, bar none), but many of them are complete idiots and fools, as seen in the preposterous conflation of uncertain opinions with negative opinions above. The fools on Bad Linguistics Reddit are evidence of this. They all hate Altaic because they are wimps who are too afraid of a fight, so they take a safe position. They bashed me for saying Altaic was real, saying it was evidence of what a kook and crank I am, when in fact, “Altaic exists” is a completely acceptable position to take. Many famous linguists have supported Altaic in the past, and a number of top linguists currently support it.
Anti-Altaic papers are often vicious from an academic paper standpoint. In academic papers, you are supposed to be restrained and keep your strong opinions to yourself. Not so with anti-Altaicists. They are over the top insulting and ridiculing towards Altaicists.
Altaicists have accumulated quite a bit of evidence in support of their position. The pronouns above prove Altaic for me. All I have to do is look at those pronoun sets (and there are other pronouns that also line up precisely like above) and I know it’s real.
This is what Joseph Greenberg means when he says that proving whether language families exist and reconstructing proto-languages are two different things.
You figure out a language family by simple inspection. Greenberg uses the mass comparison method, and it has worked very well for him for African languages. His Amerindian languages proposals have not been well accepted, but it’s clear that there is a large family called Amerind. There is 1st person m and second person n all through the family, occurring ~450 times. Personal pronouns are rarely borrowed, and entire personal pronoun sets are almost never borrowed (Piraha did borrow all of its pronouns, but Piraha is bizarre in many ways).
Johanna Nichols, a current spokesperson for the conservative Linguistics Establishment as good as any other (and a fine linguist to boot) states that the current consensus is that there is no such thing as Amerind and that those 450 similar pronouns are all cases of borrowing. Wow! Personal pronoun sets (not just one pronoun but an entire paradigm) were borrowed 450 times in the Americas! That’s one of the most idiotic statements that one could make, but this is the current consensus of linguistic “science.” Dumb or what?
A much better position would be to say that Amerind is uncertain (maybe it exists, maybe it doesn’t), as the negative position is preposterous and idiotic right on its face. Nichols has also stated that all of the Altaic pronouns were borrowed.
That’s even more idiotic because unlike in the Americas, entire large pronoun paradigms exist in Altaic where they do not exist in Amerind. Paradigms, especially pronoun paradigms, are almost never borrowed, and paradigm evidence is considered excellent evidence of genetic relationship. English good, better, best is the same paradigm as German gut, besser, am besten. That’s an odd way to set up comparatives, and a comparative set that lines up perfectly like that is what is known as a shared paradigm. That one paradigm right there ought to be enough to prove the relatedness of English and German, even leaving out all other massive evidence for relatedness.
Greenberg says that after you decide that languages form a family, then you set about using the comparative method of reconstructing proto-languages, finding sound correspondences and whatnot. The current conservative or reactionary position of the field is that first you reconstruct the proto-languages and then and only then can you prove a language family. That’s absurd. They’re in effect doing everything ass backwards. Incidentally, long ago Edward Sapir agreed with Greenberg that language families were proven first by inspection and only later did reconstruction take place. Sapir also came up with the Amerind hypothesis decades before Greenberg. Sapir is quoted as saying:

Getting down to brass tacks, how are you going to prove Amerind 1st person m and second person n other than genetic relatedness?
– Edward Sapir, 1917?

Who was Edward Sapir? Only one of the greatest linguists in history.
I can look right there at that pronoun paradigm set and tell you flat out that those three language families are related. It’s not possible that all of those languages borrowed all of those pronouns. It didn’t happen. It didn’t happen because it couldn’t happen. It’s beyond the realm of statistical probability. A statement that is outside the realm of statistical probability is considered to be for all intents and purposes nonfactual. Ask any Statistics major.
Not only has Proto-Altaic been reconstructed at least in a tentative and initial form, but there are regular sound correspondences running through all of the comparative lexicon of the three proto-languages: Proto-Turkic, Proto-Tungusic and Proto-Mongolian.
Regular sound correspondences are another thing we look for. It would mean that every time you have VlV in Language A, you have VnV in Language B (V = vowel). We then say that Language A l -> Language B n. Regular sound correspondences are considered to be excellent evidence of genetic relatedness.
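To make the VlV / VnV idea concrete, here is a minimal sketch in Python of what checking one such correspondence might look like. Everything in it is an assumption for illustration only: the function names are hypothetical, the word pairs are invented toy forms rather than data from any real language, and the cognates are assumed to be already aligned sound for sound (real comparative work has to establish that alignment first).

```python
import re

VOWELS = "aeiou"

def intervocalic(word, consonant):
    """Return the positions where `consonant` sits between two vowels."""
    pattern = f"(?<=[{VOWELS}]){re.escape(consonant)}(?=[{VOWELS}])"
    return [m.start() for m in re.finditer(pattern, word)]

def correspondence_is_regular(pairs, a_sound="l", b_sound="n"):
    """True if every intervocalic a_sound in Language A lines up with
    b_sound at the same position in the Language B cognate."""
    for word_a, word_b in pairs:
        for pos in intervocalic(word_a, a_sound):
            if pos >= len(word_b) or word_b[pos] != b_sound:
                return False
    return True

# Invented toy cognate pairs (Language A, Language B); not real data.
regular_pairs = [("ala", "ana"), ("tulo", "tuno"), ("kili", "kini")]
# Adding one pair where A's l matches B's r breaks the regularity.
irregular_pairs = regular_pairs + [("mola", "mora")]

print(correspondence_is_regular(regular_pairs))
print(correspondence_is_regular(irregular_pairs))
```

The point of the sketch is only that "regular" means exceptionless across the cognate set: a single pair that fails the pattern has to be explained (as a loan, an analogy, or a conditioned change) or the correspondence does not hold.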
In fact, an entire etymological dictionary of Altaic has been produced, reconstructing a lot of Proto-Altaic lexicon along with the cognates in the daughter languages. This dictionary runs to over 1,000 pages, and it is a true work of art in the social sciences. The entire etymological dictionary has been rejected out of hand by the Anti-Altaicists. However, they have not directly attacked or tried to prove many of the etymologies wrong. They simply looked at it, said it’s junk, laughed at it and ridiculed it, and moved on.
This conservative or even reactionary mood has been the norm in Historical Linguistics for decades now. The field has become very stick in the mud about this.
However, in much of the rest of Linguistics, especially Sociolinguistics, Language Acquisition, and Applied Linguistics, the field has reached consensus on many a silly thing that makes little to no sense at all other than that it sounds very Politically Correct. Linguistics being a social science, PC and SJW Cultural Left culture has infected the field in an awful way.
You must understand that Cultural Left views did not just appear in a few select social sciences. Instead this ideology swept through the entire social sciences, sparing not a one. In terms of a March Through the Institutions for this ideology, it was akin to a rapid hostile takeover. Cultural Left and SJW views are now mandatory in Linguistics. If you refuse to go along, you will not get hired or get tenured. If your reputation is too bad, you may not be able to publish in academic journals or books.
Alas, my field has been poisoned with this Cultural Left toxin or venom like all the rest of them!

How I Determined Intelligibility For Turkic Lects

Steve: This is amazing. Well done. But how can you possibly know the degree of mutual intelligibility between two languages you don’t speak or know if something is a language or dialect when you don’t speak it? That seems strange. How is it worked out?

Linguists don’t speak all these languages we study. We just study languages, we don’t necessarily speak them. This is confused with the archaic use of the word linguist to mean polyglot. Honestly, many linguists do in fact speak more than one language, and quite a few of them have a pretty good knowledge of at least some of the languages that they study. But my mentor speaks only Turkish and English though he studies all Turkic languages. I don’t believe he has ever learned to speak any Turkic lect other than Turkish.
In reference to my paper here.
We are not looking for raw numbers. We just want to know if they can understand each other or not.
A lot of it is from talking to native speakers and also there was a lot of reading papers by other linguists. I also talked to other linguists a lot. Linguists typically simply state if two lects are intelligible or not. Also there is a basic idea among linguists of what the boundary is between a language and a dialect, and I used this knowledge a lot.
Can they understand each other? Yes or no. That’s pretty much about it. Also at some degree of structural difference, we can see the difference between a language and a dialect. It’s a judgement call, but linguists are pretty good at this.
There is a subsection of very loud linguists, mostly on the Internet, who like to screech a lot about how this question cannot be answered because of this or that red herring or some odd conundrums that work their way in. The thing is if you ask around enough, you will be able to get around all of the conundrums and you should be able to eventually reconcile all of the divergent responses to get some sort of a holistic or “big picture” view. You finally “figure it out.” The answer to the question comes to you in a sort of a “seeing the answer as part of a larger picture” sort of thing.
The worst red herring is this notion that speakers from Group A will lie and say they do not understand speakers of Group B simply because they hate them so much. If this was such a concern, you would think I would have run into it at some point. A much worse problem was ethnic nationalists who lie and say that they can understand neighboring tongues when they can’t.
The toxin called Pan-Turkism or Turkish ultranationalism comes into play here. It is almost normal for Turks to believe that there is only one Turkic language, and it is called Turkish. All of the rest of the languages simply do not exist and are dialects of Turkish. I had to deal with regular attacks by extremely aggressive Ataturkists who insisted that any Turk could easily understand any other Turkic language. Actually my adviser told me that my piece would not be popular with the Pan-Turkists at all. I don’t really care as I consider them to be pond scum.
Granted, some of it was quite controversial and I got variable reports on intelligibility for some lects like Siberian Tatar vs. Tatar, the Altai languages, Kazakh vs. Kirghiz, Crimean Tatar vs. Turkish.
Where native speakers differ on such questions, often vociferously, you simply ask enough of them, talk to some experts and try to get a feel for what the best answer to the question is.
Some cases like Gagauz vs. Turkish probably need raw intelligibility testing. That’s the only one that is up in the air right now, but it is up in the air because the lects are so close. Intelligibility between Gagauz and Turkish is somewhere between 70-100%. In other words, they have marginal intelligibility at worst. My Gagauz expert, who knows this language better than anyone, feels though that Turkish intelligibility of Gagauz is less than 90%, which is where I drew the line between language and dialect.
It is also starting to look like Nogay is simply a dialect of Kazakh instead of a separate language, but that might be a hard sell.
Some of these are seen as separate languages simply because they are spoken by different ethnies who do not want to be seen as part of the same group. Also they have different literary norms. Karakalpak is just a Kazakh dialect, but the speakers want to say they speak a separate language. Same with Bashkir, which is simply a dialect of Tatar. The case of Kazakh and Kirghiz is more controversial, but even here, we seem to be dealing with one language, yet the two dialects are spoken by different ethnies that have actually differentiated into two separate states, each with their own literary norm. Kazakhs wish to say they speak a language called Kazakh and Kirghiz wish to say they speak a language called Kirghiz although they are probably really just one language.
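The 90% cutoff described above can be written down as a tiny decision rule. The following Python sketch is only an illustration under stated assumptions: the function name is hypothetical, intelligibility is treated as an estimated range rather than a measured score, and counting exactly 90% as "dialect" is my assumption, since the text only says that less than 90% means a separate language.

```python
DIALECT_THRESHOLD = 90.0  # per cent; the cutoff named in the text

def classify_estimate(low, high, threshold=DIALECT_THRESHOLD):
    """Classify a pair of lects from an estimated intelligibility range.

    low, high: lower and upper bounds of the estimate, in per cent.
    """
    if not 0 <= low <= high <= 100:
        raise ValueError("expected 0 <= low <= high <= 100")
    if low >= threshold:
        # The whole range sits at or above the cutoff.
        return "dialects of one language"
    if high < threshold:
        # The whole range sits below the cutoff.
        return "separate languages"
    # The range straddles the cutoff, so informant reports cannot settle it.
    return "undetermined: needs intelligibility testing"

# The Gagauz vs. Turkish estimate from the text: somewhere between 70 and 100%.
print(classify_estimate(70, 100))
```

Fed the 70-100% range, the rule returns the undetermined case, because the estimate straddles the cutoff. That is one way to see why this pair, unlike the others, would need raw testing to settle.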
We see a similar thing with Czech and Slovak. My recent research has proven that Czech and Slovak are actually a single language. But the dialects are spoken by different ethnic groups who claim different cultures and histories and they have actually divided into two different states, and each has its own literary norm.
It is here, where dialects become languages not via science but via politics, culture, history and sociology, that Weinreich’s famous dictum that “a language is a dialect with an army and a navy” comes into play.
Scientifically, these are all simply dialects of a single tongue but we call them languages for sociological, cultural and political reasons.

A Few Words on Language Endangerment

Carlos Lam: Congrats! However, isn’t language death a rather standard occurrence among societies?

It is, but we linguists don’t really like it. There is quite a debate going on, but the bottom line seems to be that ethnic groups and speaker groups have the right to ownership of their languages. We worry that a lot of speaker groups are being pressured into blowing up their languages prematurely. We like to study these languages and we are not real happy about seeing them vanish into the horizon. On the other hand, is cultural death a natural thing too? Both cultural death and language death are occurring at rates far beyond the normal background rates. English and some of the other major languages are like weapons of mass destruction in taking out languages. You really want a world with one language and one culture? I don’t.
The best position seems to be that speakers have the right to decide the fate of their languages. If speakers wish to continue speaking their languages, then governments and linguists should help them to preserve and continue to develop their languages. Quite a few groups do not seem to care that their languages are going extinct or they are even driving or drove their languages extinct, and they have the full right to do so. In these cases, we will simply do salvage linguistics. There are many salvage linguistics projects going on in the world today.
You won’t get very far with linguists arguing that language death is a good thing. Most people don’t think so.
Occurring at the same time as language death is a lot of language revitalization. Even fully dead languages are being resurrected from the grave. Also in addition to language death, we are creating new languages all the time. In this piece, I created a net total of 13 new languages. And new languages are occurring on their own.
To give you an example, a group of Crimean Tatars moved from Crimea to Turkey about 200 years ago in the course of the Crimean War. They have been speaking Crimean Tatar in Turkey ever since, for 200 years now. But in that time, Crimean Tatar in Turkey and Crimean Tatar in Ukraine have diverged so much that Turkish Crimean Tatar is now, in my opinion, a fully separate tongue from the Crimean Tatar spoken in Ukraine. This is because in Turkey, a lot of Turkish has gone into Turkish Crimean Tatar which is not well understood in Ukraine. And in Ukraine, a lot of Russian has gone in which is not well understood in Turkey. Hence, Crimean Tatar speakers in Turkey and Ukraine can no longer understand each other well.
To give you another example, there are many Kazakh speakers in China. However, Kazakh speakers in China can no longer understand Standard Kazakh broadcasts from Kazakhstan because so many Russian loans have gone into Standard Kazakh that it is no longer intelligible with Chinese Kazakh speakers. I learned this too late for my paper, otherwise I would have split Chinese Kazakh off as a separate language.
There are many cases like this.
Further, many languages are being discovered. Sonqori, Western Khalaj, Todzhin, Duha, Dukha and Siberian Tatar are just a few of the new languages that I created. Khorosani Turkic was split into three different languages. Dayi was subsumed into one of the Khorosani Turkic languages. Altai was split from one into five separate languages, but the truth is that it is six languages, not five. Salar was split into Western Salar and Eastern Salar. Ili Turki was eliminated because it does not even exist. It is simply a form of Uighur. Karachay and Balkar, Tatar and Bashkir, Kazakh and Kirghiz were some languages that were eliminated and subsumed into single tongues such as Tatar-Bashkir, Kazakh-Kirghiz, and Karachay-Balkar. And on and on.
Languages and of course dialects are dying all the time, but new languages are being created by humans and by linguists as we continue our splitting projects. Many lects referred to as dialects are more properly seen as separate languages. Chinese is at least 450 separate languages, only 14 of which are recognized. German may be up to 130 separate languages, only 20 of which are recognized.
There are quite a few more languages to be created out there, but there is a lot of resistance to splitters like me from more conservative linguists and especially from linguistic nationalists. For while Chinese may well be over 1,000 languages, the Chinese government is anti-scientifically insistent that there is but one Chinese language and maybe 2,000 “dialects,” most of which are probably separate languages. The German government is quite resistant to the idea that there is more than one form of German, though I believe Bavarian and Swiss German have official status in Austria and Switzerland.

I Am Now a Published Author

You can download my first published work above. I was published for the first time this spring in a book called:

Before the Last Voices Are Gone: Endangered Turkic Languages, Volume 1: Theoretical and General Approaches

This is the first volume of a four volume set called:

The Handbook of Endangered Turkic Languages

The first volume alone runs to 512 pages. Articles are in English, Russian and Turkish, variably. It was published out of the International Turkish-Kazakh University in Istanbul, Turkey and the International Turkic Academy in Astana, Kazakhstan. These are two campuses that are part of one joint Turkey-Kazakhstan shared university.
I contributed one chapter that runs from pages 311-384 titled:

Mutual Intelligibility among the Turkic Languages

It’s 83 pages long and has ~100 references. It may have taken me 500 hours to write that chapter. Tell that to my enemies who claim I do not work, ok? When all is said and done, I figure I may make 75 cents an hour on this work. But this is how academic publishing works. There’s just no money in it. It’s all a labor of love. In addition, most work is done by professors who have to publish as part of their professorship (publish or perish), so in effect, their professor salary is covering their publishing.
That document had to go through two rather grueling peer reviews. I had to make many changes in it to get it to publication. The second peer review had to get past the top Turkologists in the world today, and I am amazed that I made it through review to be honest.
Most people publishing in academic books or journals are academics, professors working at universities. There are only a few of us independent scholars out there (I am an independent scholar because I am not at a university). Also most folks have PhD’s, and I only have a Masters, but there are some folks with Masters publishing academically.
In general, this is a rather selective game where everyone is hyperspecializing as is the trend nowadays. Although my mentor at the project calls me a Renaissance Man, I wonder if the autodidact/polymath is an endangered species if not extinct. Everyone has to specialize nowadays.
For instance, common knowledge in this particular field would be that the only folks who could publish in Turkology would be linguists with a PhD in Linguistics, preferably with an emphasis in Turkology. Beyond that, they may prefer say 5-10 years publishing in the field of Turkology in addition to a professorship in Turkic linguistics. You can see where this is headed. I am not knocking it. I am just pointing out that microspecialization is the game now.
What follows is that since I lack the PhD or professorship or any background at all in Turkology, I should not be allowed to be published in this field, or if by some error I am somehow mispublished, all of my work should be promptly ignored as done by a nonspecialist who could not possibly know what he is talking about. Needless to say, I don’t agree with that, and I carry on tilting at windmills like a good deluded Renaissance Man who never got the memo and wouldn’t read it if he did.
The odd thing is that I knew nothing about Turkology until I plunged into this mess. I had written a short piece on mutual intelligibility in Turkic, as MI is one of my pet subjects, and put it up on Academia on my scholarly papers site, and a professor in Turkey happened to read it. He wrote to me telling me he agreed with me, he wanted me to expand it into a document, and they would publish it for me. So off I went, down the Turkic rabbit hole. If you study the very high IQ types (140+), they tend to go on “crazes” like this. They also lose interest after a bit, drop the craze and move on to some new craze. Dilettantism for the win.
I also have an anxiety disorder called OCD which is well controlled. A good side of it though is that you tend to dive down rabbit holes a lot, and the OCD makes you burrow maniacally into the rabbit hole with the notion that you are going to become the world’s leading expert on whatever rabbit hole you are digging in now. So for one or two years, I went absolutely berserk into Turkic, whereas before I scarcely knew a thing about it. The end result can be read above.
The sad result is that either due to the savant stuff or the mental quirk, I also tend to lose interest in my rabbit holes after a bit. I follow them about halfway to China, make several revolutions around the molten core, and after a year or so, come up for air gasping with incipient Black Lung, and next thing you know, I am bored, and it’s onto a new craze. It’s a bit silly, but we all have our crosses to lug, and as eccentricities go, there are many worse things than dabbling, er hobbyism, er dilettantism, er polymathy, er autodidactism, er Renaissance Manism.
Most of you will probably not find this very interesting, as it is pretty specialized stuff that is mostly of interest to people in the specialty, linguists and those interested in the subject. It’s not exactly for the general reader. But if you have any interest in these languages, you might enjoy it.
I expanded Turkic from 41 to 53 languages, eliminated some languages, turned some into dialects, turned some dialects into full languages, combined languages into a single tongue, created some new languages from scratch and did quite a bit of work on the history of the languages.
I also reworked the classification a bit because I thought it could be done better. Even though this work does not pay much, the pay is in fame if it is at all. My work will either be accepted by the field or rejected outright or somewhere in between. I have already earned the praises of some of the world’s top Turkologists, much to my surprise. If I get fame, well, I get quoted in papers, maybe invited to conferences, and maybe even referenced in Wikipedia. There are groupies in all status fields, and what the heck, there may even be linguist groupies. If not, there are always starry eyed coeds dreaming of professor types to mentor them. I am already working that angle as it is. Writer Game, Scholar Game, there’s Game for everything.
Or my work does not go over and maybe the field decides I do not know what I am talking about.
Crap shoot, like most of life’s endeavors. Roll em, and wish upon a star…snake eyes!
PS. The title of the series, Before the Last Voices Are Gone, was created by me. I think it has a nice little song.

Is Dravidian Related to Japanese?

Thirdeye writes:

The Tamil-Japonic connection isn’t quite as off the wall as one might think at first glance. There’s apparently a strong Andaman-Indonesian language connection. The convention of repeat plurals seems to have found its way to Japan. There’s also some similarity between the Finno-Ugric languages, which are Uralic outliers in a sea of Indo-European languages, and Dravidian languages that have a remnant in Pakistan. Contact between proto-Dravidian-Uralic and Altaic languages is a real possibility.

If Uralic is close to anything, it is close to Altaic and Indo-European and probably even closer to Chukotko-Kamchatkan, Eskimo-Aleut, Yukaghir and Nivkhi. Yukaghir may actually be Uralic itself, or maybe the family is called “Uralic-Yukaghir.”
There is no connection between Austronesian (Indonesian) and the Andaman Islanders. Austronesian is indeed related to Thai though (Austro-Tai); in my opinion, this has been proven. If the Andaman languages are related to anything at all, they may be related to some Papuan languages and an isolate in Nepal called Nihali. A good case can be made connecting Nihali with some of the Papuan languages.
Typology is not that great a way to classify. Typology is areal, and it spreads via convergence. What you are looking for in the search for a genetic relationship among languages, more than anything else, is morphology. After that, a nice set of cognates.
There is probably no connection between Dravidian and Uralic in particular. Dravidian is outside of most everything in Eurasia. If it is close to anything, it might be close to Afro-Asiatic. There also looks to be a connection with Elamite.
Dravidian and Afro-Asiatic are probably older than the rest of the Eurasian languages, and they were located further to the south. Afro-Asiatic is very old, probably ~15,000 YBP.

Mutual Intelligibility in the Romance Family (Reading)

Just a personal anecdote. I have been reading a lot of Italian lately (with the help of Google Translate). I already read Spanish fairly well. I have studied French, Portuguese and Italian, and I can read Portuguese and French to some extent, Portuguese better than French.
But I confess that I am quite lost with Italian. This is worse than French and worse than Portuguese. A couple weeks of wading through this stuff hasn’t made me understand it any better.
Portuguese and Galician are said to be so close that they are a single language. I don’t agree with that at all, but they are very close, much closer than Spanish and Portuguese. Intelligibility may be on the order of 80-90%.
Nevertheless, the other day I tried to read a journal article on Galician. It looked like it was written in Portuguese, and who would write in Galician anyway? I copied the whole thing into Google Translate and let it ride. I waded through the whole article, and I must say it was a disaster. I had a very hard time understanding many of the main points of the article.
Then I remembered that Translate works on Galician now, so I decided on an off chance that the guy may have written the piece in Galician for some nutty reason. I ran it through Translate with Galician set as the source language. The article went through perfectly. You could understand the whole thing. It was then that I realized how far apart Portuguese and Galician really are.
You can try some other experiments.
Occitan is said to be nearly intelligible with Spanish or maybe even French, better if you know both. There’s no Google Translate for Occitan yet, but I had to deal with a lot of Occitan texts recently. I couldn’t make heads or tails of them despite my Romance reading background. So I tried using Translate to turn them into Spanish or French. French was a total wreck, and there was no point even bothering with that. Spanish was much better, but even that was a serious mess.
Now we come to the crux. Catalan and Occitan are said to be so close that they are nearly one language. Translate now works in Catalan. So I ran the Occitan texts through Translate using Catalan. The result was a serious mess, but you could at least understand some of what the Occitan texts were about. But no way on Earth were those the same languages.
People keep saying that if you can read Spanish, you can read Portuguese. It’s not true, but you can see why people say it. Try this. Take a Spanish text and run it through Translate using the Portuguese filter. Now take a Portuguese text and run it through Translate using the Spanish filter. See what a mess you end up with!
Despite the fact that I can read Spanish pretty well, I have tried to read texts in Aragonese, Asturian, Extremaduran, Leonese and Mirandese. These are so close that some even say that they are dialects of Spanish. But even if you read Spanish, you can’t really read any of those languages, and they are all separate languages, I assure you. Sure, you get some of it, but not enough, and it’s a very frustrating experience.
There are texts on the Net in something called Churro or Xurro. It’s a Valencian-Aragonese transitional dialect spoken around Teruel in Aragon in Spain. It also has a lot of Old Castilian and a ton of regular Castilian in it. Wikipedia will tell you it’s a Spanish dialect. Running it through both the Spanish and Catalan filters didn’t work and ended up with train wrecks. I doubt if Xurro is a dialect of either Catalan or Spanish. It’s probably a separate language.
There is another odd lect spoken in the same region called Chapurriau. It is spoken in Aguaviva in Teruel in La Franja. The Catalans say these people speak Catalan, but the speakers say that their language is not Catalan. Intelligibility with Catalan is said to be good. So effectively this is a Catalan dialect.
I found some Chapurriau texts on the Net and ran them through Translate with Catalan set as the source language. The result was an unreadable disaster, and I couldn’t really figure out what they were saying. Then I tried the Spanish filter, and that was even worse. I am starting to think that maybe Chapurriau is a separate language as its speakers say and not a Catalan dialect after all.
I conclude that the ability to cross read across the Romance languages is much exaggerated.
Not only that, but many Romance microlanguages, transitional dialects and lects that are supposedly dialects of larger languages may actually be separate languages.

Mutual Intelligibility of Languages in the Slavic Family

A more updated version of this paper with working hyperlinks can be found on Academia.edu here.
There is much nonsense said about the mutual intelligibility of the various languages in the Slavic family. It’s often said that all Slavic languages are mutually intelligible with each other. This is simply not the case.
Method: It is important to note that the percentages are in general only for oral intelligibility and only in the case of a situation of a pure inherent intelligibility test. A pure inherent intelligibility test would involve a speaker of Slavic lect A listening to a tape or video of a speaker of Slavic lect B.
Written intelligibility is often very different from oral intelligibility in that in a number of cases, it tends to be higher, often much higher, than oral intelligibility. Written intelligibility was only calculated for a number of language pairs. Most pairs have no figure for written intelligibility.
A number of native speakers of various Slavic lects were interviewed about mutual intelligibility, language/dialect confusion, the state of their language, its history and so on. In addition, a Net search was done of forums where speakers of Slavic languages were discussing how much of other Slavic languages they understand. These figures were tallied up for each pair of languages to be tabulated and were then all averaged together. Hence the figures are averages taken from statements by native speakers of the languages in question.
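The tallying step described above can be sketched in a few lines of Python. This is only an illustration of the averaging idea, not the actual procedure or data used for the paper: the function name `average_estimates`, the tuple layout, and the sample figures for "Lect A" and "Lect B" are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_estimates(reports):
    """Average self-reported intelligibility estimates per directed pair.

    `reports` is an iterable of (listener_lect, target_lect, percent)
    tuples, one per native-speaker statement. The pair is directed: it
    records how much a speaker of `listener_lect` understands of
    `target_lect`, not the reverse.
    """
    by_pair = defaultdict(list)
    for listener, target, percent in reports:
        if not 0 <= percent <= 100:
            raise ValueError(f"percent out of range: {percent}")
        by_pair[(listener, target)].append(percent)
    # One rounded average per directed pair.
    return {pair: round(mean(values)) for pair, values in by_pair.items()}


# Hypothetical individual estimates, invented purely for illustration.
sample_reports = [
    ("Lect A", "Lect B", 50),
    ("Lect A", "Lect B", 60),
    ("Lect A", "Lect B", 70),
    ("Lect B", "Lect A", 40),
]
averages = average_estimates(sample_reports)
```

Keeping the two directions of a pair as separate keys matters, because a speaker of one lect often understands the other lect better than the reverse.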
Complaints have been made that many of these percentages were simply wild guesses with no science behind them. This is not the case, as all figures were derived from estimates by native speakers themselves, often a number of estimates averaged together.
True science would involve scientific intelligibility testing of Slavic language pairs. The problem is that most linguists are not interested in scientific intelligibility testing of language pairs.
Serbo-Croatian (Shtokavian) has 55% intelligibility of Macedonian (varies from 25-90%), 27% of Slovenian, 25% of Slovak, 20% of Ukrainian, 13% of oral Bulgarian and 25% of written Bulgarian, 10% of oral Russian and 22% of written Russian, 10% of Czech, and 5% of Polish.
Chakavian has 82% intelligibility of Kajkavian.
Kajkavian has 82% intelligibility of Chakavian.
Bulgarian has 80% intelligibility of Macedonian, 41% of Russian, and 5% of Polish and Czech.
Macedonian has 65% oral and written intelligibility of Bulgarian.
Czech has 94% intelligibility of Slovak, 12% of Polish, and 5% of Russian and Bulgarian.
Polish has 22% intelligibility of Silesian, 12% of Czech, 6% of Russian, and 5% of Bulgarian.
Russian has 85% intelligibility of Rusyn, 74% of oral Belarusian and 85% of written Belarusian, 60% of Balachka, 50% of oral Ukrainian and 85% of written Ukrainian, 36% of oral Bulgarian and 80% of written Bulgarian, 38% of Polish, 30% of Slovak and oral Montenegrin and 50% of written Montenegrin, 12% of oral Serbo-Croatian and 25% of written Serbo-Croatian, and 10% of Czech.
Belarusian has 80% intelligibility of Ukrainian and 55% of Polish.
Ukrainian has 82% intelligibility of Belarusian and Rusyn and 55% of Polish.
Slovak has 91% intelligibility of Czech.
Eastern Slovak has 82% intelligibility of Rusyn and 72% of Ukrainian.
Saris Slovak has 85% intelligibility of Polish.
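Because these figures are directional, one way to work with them is as a lookup table keyed by (listener, target). The sketch below copies a handful of the percentages from the list above. The table name `INTELLIGIBILITY` and the helper `asymmetry` are hypothetical names chosen for illustration, and the Macedonian figure is the combined oral and written one given above.

```python
# Directed figures copied from the list above: (listener, target) -> percent.
INTELLIGIBILITY = {
    ("Czech", "Slovak"): 94,
    ("Slovak", "Czech"): 91,
    ("Bulgarian", "Macedonian"): 80,
    ("Macedonian", "Bulgarian"): 65,  # given above as oral and written
    ("Chakavian", "Kajkavian"): 82,
    ("Kajkavian", "Chakavian"): 82,
}


def asymmetry(table, lect_a, lect_b):
    """Return A's figure for B minus B's figure for A.

    Returns None when either direction is missing from the table,
    since many pairs above are only listed one way.
    """
    forward = table.get((lect_a, lect_b))
    backward = table.get((lect_b, lect_a))
    if forward is None or backward is None:
        return None
    return forward - backward


czech_slovak_gap = asymmetry(INTELLIGIBILITY, "Czech", "Slovak")
bulgarian_macedonian_gap = asymmetry(INTELLIGIBILITY, "Bulgarian", "Macedonian")
```

A positive result means the first lect's speakers report understanding the second better than the other way around. A result of zero, as with Chakavian and Kajkavian, means the reported figures are symmetric.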
Reactions: So far there have been few reactions to the paper. However, a Croatian linguist has helped me write part of the Croatian section, and he felt that at least that part of the paper was accurate. A Serbian native speaker felt that the percentages for South Slavic seemed to be accurate.
A professor of Slavic Linguistics at a university in Bulgaria reviewed the paper and felt that the percentages were accurate. He was a member of a group of linguists who met periodically to discuss the field. He printed out the paper and showed it to his colleagues at the next meeting, and they spent some time discussing it.
Now onto the discussion.
There is much nonsense floating around about Serbo-Croatian or Shtokavian. The main Shtokavian dialects of Croatian, Serbian, Montenegrin and Bosnian are mutually intelligible.
However, the Croatian macrolanguage has strange lects that Standard Croatian (Štokavian) cannot understand.
For instance, Čakavian Croatian is not intelligible with Standard Croatian. It consists of at least four major dialects: Ekavian Chakavian, spoken on the Istrian Peninsula; Ikavian Chakavian, spoken in southwestern Istria, on the islands of Brač, Hvar, Vis, Korčula, and Šolta, on the Pelješac Peninsula, on the Dalmatian coast at Zadar, on the outskirts of Split and inland at Gacka; Middle Chakavian, which is Ikavian-Ekavian transitional; and Ijekavian Chakavian, spoken at the far southern end of the Chakavian language area on Lastovo Island, at Janjina on the Pelješac Peninsula, and at Bigova in the far south near the border with Montenegro.
Ekavian Chakavian has two branches – Buzet and Northern Chakavian. Buzet is actually transitional between Slovenian and Kajkavian. It was formerly thought to be a Slovenian dialect, but some now think it is more properly a Kajkavian dialect. There are some dialects around Buzet that seem to be the remains of old Kajkavian-Chakavian transitional dialects (Jembrigh 2014).
Ikavian Chakavian has two branches – Southwestern Istrian and Southern Chakavian. The latter is heavily mixed with Shtokavian.
Some reports say there is difficult intelligibility between Ekavian Chakavian in the north and Ikavian Chakavian in the far south, but speakers of Labin Ekavian in the far north say they can understand the Southeastern Istrian speech of the southern islands very well (Jembrigh 2014).
Čakavian differs from the other nearby Slavic lects spoken in the country due to the presence of many Italian words.
Chakavian actually has a written heritage, but it was mostly written down long ago. Writing in Chakavian started very early in the Middle Ages and began to slow down in the 1500’s when writing in Kajkavian began to rise. However, Chakavian magazines are published even today (Jembrigh 2014).
Although Chakavian is clearly a separate language from Shtokavian Croatian, in Croatia it is said that there is only one Croatian language, and that is Shtokavian Croatian. The idea is that the Kajkavian and Chakavian languages simply do not exist, though obviously they are both separate languages. Recently a Croatian linguist put forward a proposal to formally recognize Chakavian as a separate language, but the famous Croatian Slavicist Radoslav Katičić argued with him about this and rejected the proposal on political, not linguistic, grounds. This debate occurred only in Croatian linguistic circles, and the public knows nothing about it (Jembrigh 2014).
Kajkavian Croatian, spoken in northwest Croatia and similar to Slovenian, is not intelligible with Standard Croatian.
Kajkavian is fairly uniform across its speech area, whereas Chakavian is more diverse (Jembrigh 2014).
In the 1500’s, Kajkavian began to be developed in a standard literary form. From the 1500’s to 1900, a large corpus of Kajkavian literature was written. Kajkavian was removed from public use after 1900, hence writing in the standard Kajkavian literary language was curtailed. Nevertheless, writing continues in various Kajkavian dialects which still retain some connection to the old literary language, although some of the lexicon and grammar are going out (Jembrigh 2014).
Most Croatian linguists recognize Kajkavian as a separate language. However, any suggestions that Kajkavian is a separate language are censored on Croatian TV (Jembrigh 2014).
Nevertheless, the ISO has recently accepted a proposal from the Kajkavian Renaissance Association to list the Kajkavian literary language written from the 1500’s-1900 as a recognized language with an ISO code of kjv. The literary language itself is no longer written, but works written in it are still used in public for instance in dramas and church masses (Jembrigh 2014). This is heartening, although Kajkavian as an existing spoken lect also needs to be recognized as a living language instead of a dialect of “Croatian,” whatever that word means.
Furthermore, there is a dialect continuum between Kajkavian and Chakavian as there is between Kajkavian and Slovenian, and lects with a dialect continuum between them are always separate languages. There is an old Kajkavian-Chakavian dialect continuum of which little remains, although some of the old Kajkavian-Chakavian transitional dialects are still spoken (Jembrigh 2014).
Kajkavian differs from the other Slavic lects spoken in Croatia in that it has many Hungarian and German loans (Jembrigh 2014). Kajkavian is probably closer to Slovenian than it is to Chakavian.
Nevertheless, although intelligibility with Slovenian is high, Kajkavian lacks full intelligibility with Slovenian. Yet there is a dialect continuum between Slovenian and Kajkavian. Kajkavian, especially the Zagorje Kajkavian dialect around Zagreb, is close to the Stajerska dialect of Slovene. However, leaving aside Kajkavian speakers, Croatians have poor intelligibility of Slovenian.
Chakavian and Kajkavian have high, but not full mutual intelligibility. Intelligibility between the two is estimated at 82%.
Molise Croatian is a Croatian language spoken in a few towns in Italy, such as Acquaviva Collecroce and two other towns. A different dialect is spoken in each town. Despite a lot of commonality between the dialects, the differences between them are significant. A koine is currently under development. The Croatians left Croatia and came to Italy from 1400-1500. The base of Molise Croatian was Shtokavian with an Ikavian accent and a heavy Chakavian element similar to what is now spoken as Southern Chakavian Ikavian on the islands of Croatia. Molise Croatian is not intelligible with Standard Croatian.
Burgenland Croatian, spoken in Austria, is intelligible to Croatian speakers in Austria, Czech Republic, Slovakia and Hungary, but it has poor intelligibility with the Croatian spoken in Croatia.
Therefore, for the moment, there are five separate Croatian languages: Shtokavian Croatian, Kajkavian Croatian, Chakavian Croatian, Molise Croatian, and Burgenland Croatian.
Serbian is a macrolanguage made up of two languages: Shtokavian Serbian and Torlak or Gorlak Serbian.
Shtokavian is simply the same Serbo-Croatian language that is also spoken in Croatia, Montenegro and Bosnia. It forms a single tongue and is not several separate languages as many insist. The claim for separate languages is based more on politics than on linguistic science.
Torlak Serbian is spoken in the south and southwest of Serbia and is transitional to Macedonian. It is not intelligible with Shtokavian, although this is controversial.
Torlakians are often said to speak Bulgarian, but this is not exactly the case. More properly, their speech is best seen as closer to Macedonian than to Bulgarian or Serbo-Croatian. The Serbo-Croatian vocabulary in both Macedonian and Torlakian is very similar, stemming from the political changes of 1912, whereas these words have changed more in Bulgarian.
The Torlakian spoken in the southeast is different. It is not really either Bulgarian or Serbo-Croatian, but instead it is best said that they are speaking a mixed Bulgarian-Serbo-Croatian language. In the towns of Pirot and Vranje, it cannot be said that they speak Serbo-Croatian; instead they speak this Bulgarian-Serbo-Croatian mixed speech.
It’s also said that Serbo-Croatian can understand Bulgarian and Macedonian, but this is not true. However, the Torlak Serbians can understand Macedonian well, as Torlak is a Serbo-Croatian dialect transitional to both languages.
Intelligibility in the Slavic languages of the Balkans is much exaggerated.
Slovenian speakers find it hard to understand most of the other Yugoslav lects except for Kajkavian Croatian. Serbo-Croatian intelligibility of Slovenian is 25-30%.
A lect called Čičarija Slovenian is spoken on the Istrian Peninsula in Slovenia just north of Croatia. This is a Chakavian-Slovenian transitional lect that is hard to categorize, but it is usually considered to be a Slovenian dialect.
Bulgarian and Macedonian can understand each other to a great degree (65-80%) but not completely. However, the Ser-Drama-Lagadin-Nevrokop dialect in northeastern Greece and southern Bulgaria and the Maleševo-Pirin dialect in eastern Macedonia and western Bulgaria are transitional between Bulgarian and Macedonian. The Aegean Macedonian dialects mostly spoken in Greece, such as the Lerinsko-Kostursko and Solunsko-Vodenska dialects, sound more Bulgarian than Macedonian.
Russian has a decent intelligibility with Bulgarian, possibly on the order of 50%, but Bulgarian intelligibility of Russian seems lower. Nevertheless, Bulgarian-Russian intelligibility seems much exaggerated. Some Russians and Bulgarians say they understand almost nothing of the other language. Nevertheless, most Bulgarians over the age of 30-35 understand Russian well since studying Russian was mandatory under Communism.
However, Bulgarian-Russian written intelligibility is much higher. Bulgarian and Russian are close because the Ottoman rulers of Bulgaria would not allow printing in Bulgaria. Hence, many religious books were imported from Russia, and these books influenced Bulgarian. Russian influence only ended in 1878.
Serbo-Croatian and Bulgarian have 10-15% oral intelligibility; however, there are Bulgarian dialects that are transitional with Torlak Serbian. Written intelligibility is higher at 25%. Macedonian and Bulgarian would be much closer together except that in recent years, Macedonian has been heavily influenced by Serbo-Croatian, and Bulgarian has been heavily influenced by Russian.
The difference between oral and written intelligibility arises because Bulgarian, unlike Serbo-Croatian, is not spoken the same way it is written. However, Bulgarians claim to be able to understand Serbo-Croatian better than the other way around. There is a group of Bulgarians living in Serbia in the areas of Bosilegrad and Dimitrovgrad who speak a Bulgarian-Serbian transitional dialect, and Serbs are able to understand these Bulgarians well.
Serbo-Croatian has variable intelligibility of Macedonian, averaging ~55%, while Nis Serbians have ~90% intelligibility with Macedonian. Part of the problem between Serbo-Croatian and Macedonian is that so many of the basic words – be, do, this, that, where – are different; however, much of the rest of the vocabulary is the same. Serbo-Croatian speakers can often learn to understand Macedonian well after some exposure.
Most Macedonians already are able to speak Serbo-Croatian well. This gives rise to claims of Macedonians being able to understand Serbo-Croatian very well; however, much of this may be due to bilingual learning. In fact, many Macedonians are switching away from the Macedonian language towards Serbo-Croatian.
The Macedonian spoken near the Serbian border is heavily influenced by Serbo-Croatian and is quite a bit different from the Macedonian spoken towards the center of Macedonia. One way to look at Macedonian is that it is a Serbo-Croatian-Bulgarian transitional lect. The intelligibility of Serbo-Croatian and Macedonian is highly controversial, and intelligibility studies are in order. Croats say Macedonian is a complete mystery to them.
Czech and Polish are incomprehensible to Serbo-Croatian speakers (Czech 10%, Polish 5%), but Serbo-Croatian has some limited comprehension of Slovak, on the order of 25%.
Serbo-Croatian and Russian have 10-15% intelligibility, if that, yet written intelligibility is higher at 25%.
Serbo-Croatian has only 20% intelligibility of Ukrainian.
Slovenians have a very hard time understanding Poles and Czechs and vice versa.
It’s often said that Czechs and Poles can understand each other, but this is not so. Much of the claimed intelligibility is simply bilingual learning. Czechs claim only 10-15% intelligibility of Polish.
The intelligibility of Polish and Russian is very low, on the order of 5-10%. Polish is not intelligible with Kashubian, a language related to Polish spoken in the north of Poland. Kashubian itself is a macrolanguage made up of two different languages, South Kashubian and North Kashubian, as the two have difficult intelligibility.
Silesian or Upper Silesian is also a separate language spoken in Poland, often thought to be halfway between Polish and Czech. It may have been split from Polish for up to 800 years, during which it underwent heavy German influence. Polish lacks full intelligibility of Silesian, although this is controversial (see below). Some Poles say they find Silesian harder to understand than Belarusian or Slovak, which implies intelligibility of 20-25%.
The more German the Silesian dialect is, the harder it is for Poles to understand. In recent years, many of the German words are falling out of use and being replaced by Polish words, especially by young people. Poles who know German and Old Polish can understand Silesian quite well due to the Germanisms and the presence of many older Polish words, but Poles who speak only Polish have a hard time with Silesian.
Many Poles insist that Silesian is a Polish dialect, but this is based more on politics than reality. In fact, people in the north of Poland regard Silesian as incomprehensible. 40% of Silesian vocabulary is different from Polish, mostly Germanisms. The German influence is more prominent in the west; Polish influence is greater in the east. Many Silesian speakers now speak a watered down version of Silesian which is more properly seen as a Polish dialect with some Silesian words. Pure Silesian appears to be a dying language.
Silesian itself appears to be a macrolanguage made up of more than one language, since Opole Silesian speakers cannot understand Katowice Silesian. Hence Opole Silesian and Katowice Silesian are two different languages.
Cieszyn Silesian or Ponaszymu is a language closely related to Silesian spoken in the Czech Republic in the far northeast of the country near the Polish and Slovak borders. It differs from the rest of Silesian in that it has undergone heavy Czech influence. Some say it is a part of Czech, but more likely it is a part of Polish like Silesian.
People observing conversations between Cieszyn Silesian and Upper Silesian speakers report that the two groups have a hard time understanding each other. Cieszyn Silesian speakers strongly reject the notion that they speak the same language as Upper Silesians. Ponaszymu also has many Germanisms which have been falling out of use lately, replaced by their Czech equivalents. Ponaszymu appears to lack full intelligibility with Czech. In fact, some say the intelligibility between the two is near zero.
Lach is a Czech-Polish transitional lect with a close relationship with Cieszyn Silesian. However, it appears to be a separate language, as Lach is not even intelligible within itself. Instead Eastern Lach and Western Lach have difficult intelligibility and are separate languages, so Lach itself is a macrolanguage. Lach is not fully intelligible with Czech; indeed, the differences between Lach and Czech are greater than the differences between Silesian and Polish, despite the fact that Lach has been heavily leveling into Moravian Czech for the last 100 years.
Czechs say Lach is a part of Czech, and Poles say Lach is a part of Polish. The standard view among linguists seems to be that Lach is a part of Czech. However, another view is that Lach is indeed Lechitic, albeit with strong Czech influence.
It is often said that Ukrainian and Russian are intelligible with each other or even that they are the same language (a view perpetuated by Russian nationalists). It is not true at all that Ukrainian and Russian are mutually intelligible, as Russian only has 50% intelligibility of Ukrainian. For example, all Russian shows get subtitles on Ukrainian TV. Yet some say that the subtitles are simply put on as a political move due to Ukraine’s puristic language policy. Ukrainian and Russian only have 60% lexical similarity. Polish and Ukrainian have higher lexical similarity at 72%, and Ukrainian intelligibility of Polish is ~50%+.
However, there are dialects in between Ukrainian and Russian such as the Eastern Polissian and Slobozhan dialects of Ukrainian that are intelligible with both languages. Complicating the picture is the fact that many Ukrainians are bilingual and speak Russian also. Ukrainians can understand Russian much better than the other way around. Nevertheless Ukrainian intelligibility of Russian is hard to calculate because presently there are few Ukrainians in Ukraine who do not speak Russian. Most of the Ukrainian speakers who do not speak Russian are in Canada at the moment.
In addition, the Slobozhan dialects of Ukrainian and Russian (Slobozhan Ukrainian and Slobozhan Russian) spoken in Kantemirov (Voronezhskaya Oblast, Russia) and Kuban Russian or Balachka, spoken in the Kuban area right over the eastern border of Ukraine, are very close to each other. Slobozhan Russian can also be called Kuban Russian or Balachka.
It is best seen as a Ukrainian dialect spoken in Russia – specifically, it is markedly similar to the Poltavian dialect of Ukrainian spoken in Poltava in Central Ukraine. Although the standard view is that Balachka is a Ukrainian dialect, some linguists say that it is actually a separate language closely related to Ukrainian. An academic paper has been published making the case for a separate Balachka language. In addition, Balachka language associations believe it is a separate language. Intelligibility between Balachka and Ukrainian is not known. Russian only has 60% intelligibility of Balachka.
However, Balachka is dying out and is now spoken only by a few old people. Most people in the region speak Russian with a few Ukrainian words.
Slobozhan Russian is very close to Ukrainian, closer to Ukrainian than it is to Russian, and Slobozhan Ukrainian is very close to Russian, closer to Russian than to Ukrainian. Slobozhan Ukrainian speakers in this region find it easier to understand their Russian neighbors than the Upper Dnistrian Ukrainian spoken in the far west in the countryside around Lviv. Upper Dnistrian is influenced by German and Polish.
The Russian language in the Ukraine has been declining recently mostly because since independence, the authorities have striven to make the new Ukrainian as far away from Russian as possible by adopting the Kharkiv Standard of 1927 and jettisoning the 1932 Standard, which brought Ukrainian more in line with Russian. For instance, in 1932, Ukrainian g was eliminated from the alphabet in order to make Ukrainian h correspond perfectly with Russian g. After 1991, the g returned to Ukrainian. Hence, Russians understand the colloquial Ukrainian spoken in the countryside pretty well, but they understand the modern standard heard on TV much less. This is because colloquial Ukrainian is closer to the Ukrainian spoken in the Soviet era, which had huge Russian influence.
The intelligibility of Belarusian with both Ukrainian and Russian is a source of controversy. On the one hand, Belarusian has some dialects that are intelligible with some dialects of both Russian and Ukrainian. For instance, West Palesian is a transitional Belarusian dialect to Ukrainian. Some say that West Palesian is actually a separate language, but the majority of Belarusian linguists say it is a dialect of Belarusian (Mezentseva 2014). Belarusian and Ukrainian have 85% similar vocabulary.
Russian has high intelligibility of Belarusian, on the order of 75%. Belarusian is nonetheless a separate language from both Ukrainian and Russian.
For some reason, the Hutsul, Lemko, and Boiko dialects of the Rusyn language are much more comprehensible to Russians than Standard Ukrainian is. Intelligibility may be 85%.
The Lemko dialect of Rusyn has only marginal intelligibility with Ukrainian. Lemko is spoken heavily in Poland, and it differs from Standard Rusyn in that it has a lot of Polish vocabulary, whereas Standard Rusyn has more influences from Hungarian and Romanian.
The Rusyn language is composed of 50% Slovak roots and 50% Ukrainian roots, so some difficult intelligibility with Ukrainian might be expected. It has also been described as a transitional dialect between Polish and Slovak. Eastern Slovak has ~80% intelligibility of Rusyn.
Pannonian Rusyn is spoken by a group of Rusyns who migrated to northwestern Serbia (the Bachka region in Vojvodina province) and Eastern Croatia from Eastern Slovakia and Western Ukraine 250 years ago. Pannonian Rusyn is actually a part of Slovak, and Rusyn proper is really a part of Ukrainian. Pannonian Rusyn lacks full intelligibility of Rusyn proper. Not only that, but it is not even fully intelligible with the Eastern Slovak that it resembles most.
The intelligibility of Czech and Slovak is much exaggerated. It is true that Western Slovak dialects can understand Czech well, but Central Slovak, Eastern Slovak and Extraslovakian Slovak dialects cannot.
It is also said that West Slovak (Bratislava) cannot understand East Slovak, so Slovak may actually be two different languages, but this is controversial. Western Slovak speakers say Eastern Slovak sounds idiotic and ridiculous, and some words are different, but other than that, they can basically understand it. Other Western Slovak speakers (Bratislava) say that Eastern Slovak (Kosice) is hard to understand. Bratislava speakers say that Kosice speech sounds half Slovak and half Ukrainian and uses many odd and unfamiliar words. Intelligibility testing between East and West Slovak would seem to be in order.
Much of the claimed intelligibility between Czech and Slovak was simply bilingual learning. Since the breakup, young Czechs and Slovaks understand each other worse since they have less contact with each other. In the former Czechoslovakia, everything was 50-50 bilingual – media, literature, etc. Since then, Slovak has been disappearing from the Czech Republic, so the younger people don’t understand Slovak so well.
Intelligibility problems are mostly on the Czech end because they don’t bother to learn Slovak while many Slovaks learn Czech. There is as much Czech literature and media as Slovak literature and media in Slovakia, and many Slovaks study at Czech universities. When there, they have to pass a language test. Czechs hardly ever study at Slovak universities.
Czechs see Slovaks as country bumpkins – backwards and folksy but optimistic, outgoing and friendly. Czechs are more urbane. The written languages differ much more than the spoken ones.
The languages really split about 1,000 years ago, but written Slovak was based on written Czech, and there was a lot of interlingual communication. A Moravian Czech speaker (Eastern Czech) and a Bratislavan Slovak (Western Slovak) speaker understand each other very well. They are essentially speaking the same language.
However, in recent years, there has also been quite a bit of bilingual learning. Young Czechs and Slovaks talk to each other a lot via the Internet. There are also some TV shows that show Czech and Slovak contestants untranslated (like in Sweden where Norwegian comics perform untranslated), and most people seem to understand these shows.
All foreign movies in both the Czech Republic and Slovakia are translated into Czech, not Slovak.
Far Northeastern Slovak (Saris Slovak) near the Polish border is close to Polish and Ukrainian. Intelligibility data for Saris Slovak and Ukrainian is not known. Saris Slovak has high but not complete intelligibility of Polish, possibly 85%. Eastern Slovak may have 72% intelligibility of Ukrainian.
Southern Slovak on the Hungarian border has a harder time understanding Polish because they do not hear it much. This implies that some of the high intelligibility between Slovak and Polish may be due to bilingual learning on the part of Slovaks.
Russian has low intelligibility with Czech and Slovak, maybe 30%.


Jembrigh, Mario. Croatian linguist. December 2014. Personal communication.
Mezentseva, Inna. English professor. Vitebsk State University. Vitebsk, Belarus. December 2014. Personal communication.

A Reclassification of the Dutch Language

Warning! This post is quite long – it runs to 126 pages. Frequently updated – last updated May 24, 2015.

Where the Dutch language begins and where it ends is an important question. Ethnologue splits Low Franconian-Low Saxon (whatever that is) into 15 languages – Flemish, Dutch, Zeelandic, Afrikaans, Achterhoeks, Drents, Gronings, Plautdietsch, Sallands, Low Saxon, Stellingwerfs, Twents, Veluws, Westphalian and East Frisian Low Saxon. Instead of the confusing Low Franconian-Low Saxon, we will henceforth refer to the same as “Macro-Dutch.”

This treatment will lump together many of the Dutch Low Saxon lects as Dutch, put East Frisian Low Saxon into Dutch, put Westphalian and German Low Saxon into German, move Limburgish out of German to Dutch where it belongs, and create a dozen new Macro-Dutch languages.

An important question is the position of Frisian languages in all of this. Currently Ethnologue has them in Anglo-Frisian. Gooskens 2004 makes a good case that Frisian is better analyzed as Macro-Dutch than Anglo-Frisian based on Levenshtein distance. She is probably correct, but I am going to leave Frisian outside of Dutch until I can analyze it better.
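As a rough illustration of the Levenshtein distance mentioned above, here is a minimal Python sketch. It is not Gooskens' actual procedure: dialectometric studies normally compare phonetic transcriptions across long word lists, whereas this sketch compares plain spellings of a single word. The function names and the normalization by the longer word's length are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length (an assumed
    normalization), giving a value between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Orthographic forms of the word for "cheese", used only as an illustration.
words = {"Dutch": "kaas", "Frisian": "tsiis", "English": "cheese"}
for lang in ("Dutch", "English"):
    d = normalized_distance(words["Frisian"], words[lang])
    print(f"Frisian vs {lang}: {d:.2f}")
```

A single word pair proves nothing either way; distances only become meaningful when averaged over a large list of transcribed cognates.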

Anyway, genetically, Frisian is a part of an Anglo-Frisian family (Gooskens 2004). However, Frisian has drifted far away from English due to massive influence from Dutch such that it now is closer to Dutch than the Scandinavian languages are to each other (Gooskens 2004). It depends on whether you wish to analyze Frisian based on its genetic history or on which language it is closest to.

One thing that ought to be dispensed with immediately is the notion that German, Dutch, Flemish and Afrikaans are intelligible with each other. The truth is that Hochdeutsch speakers can at worst barely understand a word of any of them and at best have only limited intelligibility.

Neither is German intelligible to Dutch speakers, even after 3-4 years of studying German. This even holds for Low German, which is often held to be intelligible with Dutch. It’s not, even after 3-4 years of study and even to speakers of Dutch-German border lects in the Netherlands that are presumably closer to Low German than the rest of Dutch. After 3-4 years of German, Dutch speakers have only 55% intelligibility of Low German, and the ones on the border have only 59% intelligibility of Low German (Gooskens in publication).

Nor are Frisian and Dutch mutually intelligible, another common claim. They have combined intelligibility of 61% (Gooskens 2005). Neither are Afrikaans and Dutch mutually intelligible. Combined intelligibility of the two languages is 55%, the same as Spanish and Portuguese (Gooskens 2005).

The Dutch either have a nationalist complex or are possibly simply ignorant or indifferent on the question of what constitutes “Dutch.” They take a very conservative, nationalist view of the language question. To the Dutch, every language spoken in the Netherlands and some spoken outside of it is Dutch. Brabantian, Flemish, Veluws, Afrikaans, Limburgish, Bergish, Guelderish, Kleverlandish and Dutch Low Saxon are often all considered to be dialects of Dutch.

To be fair to the Dutch, I’m making a similar claim here, but instead of calling all of the above dialects of Dutch, I will call them separate languages under an umbrella called Macro-Dutch which subsumes them all.

The Dutch do recognize Limburgs and Low Saxon as minority languages.

Spain, Germany, Italy, France and Sweden do not recognize the languages under the umbrellas of Macro-Spanish, Macro-German, Macro-Italian, Macro-French and Macro-Swedish.

Spain does not recognize Asturian, Aragonese or Extremaduran. France does not recognize the many langues d’oil. Italy does not recognize Piedmontese, Ligurian, Lombard, Venetian, Emigliano, Romano, Neapolitan or Sicilian. Sweden does not recognize Scanian, Gutnish, Jamska or Dalecarlian. Germany does not recognize Bavarian, Swabian, High Franconian, Low German, Westphalian, Upper Saxon, Ripuarian or Pfaelzisch.

These languages probably go unrecognized because of national consolidationist efforts behind a standard language, along with fears of splintering the standard into substandard forms and of the separatism that may ensue. So the Dutch are simply following in standard European modernist tradition.

This has resulted in problems and violations of language rights for speakers of other Low Franconian lects. For instance, Zeelandic is definitely a separate language, not a dialect of Dutch. Zeelandic speakers petitioned to have their language recognized as a minority language nine years ago, but the Dutch government has refused to grant this request.

The truth may disturb many Dutch speakers. For Dutch is not just the 15 languages confusingly listed in Ethnologue; it is actually 30 separate languages, which I will attempt to demonstrate below.

Method: Various “Dutch” and “Low Franconian” lects were analyzed on the basis of mutual intelligibility with Standard Dutch to see if they warranted treatment as separate languages. A rough guide was >90% intelligibility = Dutch dialect and <90% intelligibility = separate “Macro-Dutch” language. There are reasons for choosing 90% as a metric. Below 90%, it gets difficult to discuss complex or technical subjects. Also, 90% seems to be where Ethnologue splits dialects from languages these days, and they are in charge of giving out ISO codes.
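The rule of thumb in the Method paragraph can be written down as a short Python sketch. The function name and labels are hypothetical, and the handling of a score of exactly 90% is an assumption, since the rule as stated covers only values above and below that figure.

```python
# Threshold taken from the Method paragraph above.
DIALECT_THRESHOLD = 90.0


def classify_lect(intelligibility_pct: float) -> str:
    """Return a rough label for a lect, given its percentage
    intelligibility with Standard Dutch."""
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage from 0 to 100")
    # Assumption: exactly 90% is counted as a dialect; the stated rule
    # only covers ">90%" and "<90%".
    if intelligibility_pct >= DIALECT_THRESHOLD:
        return "Dutch dialect"
    return "separate language"


# Combined intelligibility figures quoted earlier (Gooskens 2005).
for name, pct in [("Frisian", 61.0), ("Afrikaans", 55.0)]:
    print(f"{name}: {pct:.0f}% -> {classify_lect(pct)}")
```

Most of the lects discussed below have only impressionistic reports rather than measured percentages, so in practice the threshold serves as a guide for judgment rather than a mechanical test.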

Other lects in Ethnologue’s treatment were analyzed to determine whether they belonged in “Macro-German” or “Macro-Dutch.” Westphalian and German Low Saxon were moved to Macro-German; the rest were moved to Macro-Dutch.

Anecdotal reports and scientific studies were reviewed, and native speaker informants were interviewed. Where intelligibility estimates are controversial, scientific intelligibility studies could always settle the matter. The creole was not counted.

Results: Ethnologue’s Low Franconian-Low Saxon was expanded from 15 into 32 languages based on mutual intelligibility. Below, separate languages are in bold, while dialects are in italics. Dutch, like Arabic, Italian, German, Chinese and so many others, is a macrolanguage.

Discussion: This work is merely a working hypothesis intended to be discussed and criticized by scholars and interested parties. I would be interested in criticism on a peer review basis. Criticism must be both constructive and friendly, otherwise it will be summarily rejected. This is very much a work in progress.

Fig. 1 List of the major dialects and languages of the Low Countries. 1. South Hollands 2. Kennemerlands 3. Waterlands/Waterländisch 4. Zaans 5. West Frisian dialect – North Hollands 6. Utrechts-Alblasserwaards 7. Zeelandic 8. Westhoeks 9. West Flemish and Zeelandic Flanders Flemish 10. Transitional dialect between West and East Flemish 11. East Flemish 12. Transitional dialect between East Flemish and Brabantian 13. South Gelders 14. North Brabantian and North Limburgish 15. Brabantian 16. Transitional dialect between Brabantian and Limburgish 17. Limburgish 18. Veluws 19. Gelders-Overijssels 20. Twents-Graafschaps 21. Twents 22. Stellingwerfs 23. South Drents 24. Middle Drents 25. Kollumerlands 26. Gronings and North Drents 27. Frisian 28. Bildts, Stadsfries, Midlands, Amelands.

Dutch Creoles

In recent years, there were five Dutch creoles spoken in Indonesia, Guyana and the US Virgin Islands. It appears that four of the five are extinct, and one is barely alive.

Berbice Creole Dutch is barely alive, spoken in Guyana by only four speakers. There are another 15 with limited competence. It is spoken in the Berbice River region of the country. About 1/3 of the words and most of the morphology are from the Nigerian Ijoid language Izon, a language with 1 million speakers. The rest of the lexicon is mostly from Dutch. 10% of the words are borrowings from Guyanese Creole English and Arawak, an Indian language still spoken in Guyana.

Extremely detailed map lists all of the major lects of Holland and Belgium, including Low Franconian, Low German, Middle German, West Frisian and langues d’oil.

Low Franconian Languages and Dialects

Standard Dutch, Algemeen Nederlands or AN (henceforth, AN) is a major world language spoken by all 15 million residents of the Netherlands and an additional 7 million speakers elsewhere. Although one might suspect that Dutch goes all the way back to the oldest Old Franconian, actually, the lects closest to Old Franconian are French Flemish, West Flemish and Zeeland Flemish. Dutch proper seems to have broken off sooner.

Dutch has many dialects, but they are all more or less intelligible. There are two forms of Dutch in general – Hollandic and Brabantian. Both are part of AN. Modern Belgian Dutch is much more Brabantian than Hollandic.

There is also Brabantian Netherlands Dutch, a dialect of Netherlands Dutch, and Brabantian Belgian Dutch, a dialect of Belgian Dutch or Vlaams (Grondelaers 2009).

Surinamese Dutch is a Dutch dialect, easily intelligible with AN, that is spoken in Suriname. It has 280,000 speakers, or 60% of the population. It is the official language of Suriname.

Netherlands Dutch is the Dutch dialect spoken in the Netherlands, as distinct from Belgian Dutch. It is widely understood throughout the country, especially the Standard Dutch variety of this dialect that has been popularized in the Netherlands since the 1960’s.

Netherlands Brabantian Dutch is a Dutch dialect spoken in North Brabant Province in the Netherlands (Grondelaers 2009). It is easily intelligible with AN. This dialect has about 2.45 million speakers.

Belgian Brabantian Dutch is the same thing as the Verkavelingsvlaams described below. It is spoken in North Brabant Province and in Antwerp Province by about 3.4 million speakers. It is being replaced by French in Brussels, but it is still widely spoken elsewhere.

Stadsfries is a mixed dialect spoken in certain urban areas of Friesland such as the towns of Leeuwarden, Dokkum, Bolsward, Sneek, Stavoren, Harlingen and Franeker. The townspeople were originally Frisian speakers, but they gave up Frisian for Dutch about 500 years ago. The vocabulary is mostly Dutch with Frisian pronunciation. AN speakers can understand this dialect pretty easily. Lately it has been seriously declining and has low prestige, hence it is becoming a sociolect spoken mostly by low-income people in the cities.

Snekers is a Stadsfries dialect spoken in the Friesland city of Sneek. It traces back to 1600 or so when locals abandoned West Frisian for Hollandic speech as an elite gesture, since Hollandic was not spoken much outside of the Holland Provinces. By 1800, the rest of the city had modeled their elitist behavior after the rich and the whole city spoke Snekers. It continued to be a highly valued speech until 1900. People kept speaking it a lot until WW2.

The disdain towards Frisian, seen as peasant speech, continues in many Snekers speakers to this day. In the 20th Century, many rural people moved to the city, and many foreigners moved there too. Snekers became a speech used only by Sneek natives among themselves. They spoke Dutch or sometimes Frisian to newcomers. Nowadays, Snekers is dying. The youth have taken it up, but they speak a watered down version that is probably intelligible to AN speakers.

Hollandic Dutch is the other large dialect of Dutch besides Brabantian. Hollandic is spoken in the provinces of North Holland and South Holland by about 6 million speakers. This dialect is intelligible with AN. Hollandic Dutch is the variety that is closest to AN. It is divided into two lects, North Hollandic Dutch and South Hollandic Dutch.

IJmuidens is a dialect spoken by the lower classes in IJmuiden, the third largest port in the Netherlands, in North Holland. The dialect is probably readily intelligible with AN.

Haarlems is the dialect spoken in Haarlem in North Holland, especially by the lower classes. It does not differ much from Amsterdams or AN. This area has long had the reputation for being the place where the purest Dutch is spoken, although this is no longer true. Nowadays, the purest Dutch is spoken in places like Dronten on the Dutch polders in the IJsselmeer.

Nijmeegs is a very interesting dialect spoken in the city of Nijmegen in eastern Gelderland. Although strictly speaking it should be a South Guelderish dialect, it has heavy Hollandic features such that it may well be intelligible to AN speakers. Until the late 1800’s, residents of the city were speaking a typical South Guelderish dialect. However, in the late 1800’s, the upper class of the city began speaking a Randstad dialect similar to Amsterdams and Haags.

The lower classes quickly began speaking the same dialect, and the traditional dialect of the city disappeared, as it was poorly valued anyway. Nijmeegs still has some East Brabantian, Limburgish and Achterhoeks features, but it also lacks many characteristic Limburgish and Brabantian features of surrounding dialects.

Amsterdams is the dialect of the city of Amsterdam, spoken by the lower classes, especially in certain neighborhoods. Although the city is located in North Holland, Amsterdams is more of a South Hollands dialect. A book published in 1874 found an astounding 19 different dialects spoken in the city.

Although it is still spoken, Amsterdams is associated with the lower classes, street toughs, etc., such that many Amsterdammers try to unlearn the dialect in order to improve their career chances. Amsterdams has many Yiddish words due to the fact that a large Jewish community has traditionally lived there. Amsterdams is intelligible with AN.

Haags is a South Hollandic dialect spoken by the lower classes in The Hague. It is easily intelligible with AN. The dialect is dying out and undergoing serious leveling, but since the 1980’s there has been a movement to bring back the dialect, and more residents of the city are speaking it, often with intentionally exaggerated features. Its syntax is similar to AN and is quite different from the nearby Rotterdams and Leids dialects.

Gouds is a South Hollandic dialect spoken in the city of Gouda, 20 miles northeast of Rotterdam. In many ways it is similar to AN. With mass immigration and compulsory education in AN, the real Gouds is hardly heard anymore.

Rotterdams is the South Hollandic dialect spoken in the city of Rotterdam. It differs little from AN. This is because the standard for Hollandic dialects, dating back to 1600, was the Rotterdams dialect. Its influence spread throughout the region, first to the upper classes and then to the lower classes as they imitated the speech of the rich.

The Rotterdams dialect does have many unique features, mostly due to the waves of immigrants who have come to the city, each bringing their own language which added to the Rotterdams dialect. In the 1800’s, there was a large influence from Brabantian and Zeelandic speakers. In the 1900’s, the influences became more varied, as speakers of Arabic and the Papiamento or Surinamese creoles added their words to the mix. It is still heard throughout the Rotterdam region and in the cities of Spijkenisse, Hellevoetsluis and Capelle aan den IJssel to the east and southwest.

Bildts is a mixed Frisian-Dutch lect spoken in Het Bildt, a polder region in Friesland northwest of Leeuwarden that dates back to the 1500’s. Many immigrants came from the South Holland area to this part of Friesland to help create the polders. Their South Hollandic lects mixed with the Frisian spoken by the local farm workers to create this interesting mixed dialect.

Intelligibility between Bildts and AN is not known, but a dialect map published in 1974 showed Bildts to be the furthest of all from AN (Berns 1991). On the basis of that study, Bildts may indeed be a separate language, but better intelligibility data would be nice.

Midslands is a North Hollandic dialect, similar to Stadsfries, that is still spoken on Terschelling Island off the coast of Friesland in the village of Midsland. It has Hollandic and Frisian influences. Intelligibility data is lacking.

Amelands is another dialect like Midslands and Stadsfries, spoken on the island of Ameland. It has mostly Hollandic vocabulary with Frisian grammar. There are four villages on the island, each with its own dialect. Nevertheless, all dialects are intelligible with each other.

The dialect developed in the 1700’s when Hollandic migrants moved to the island, probably for trade, and the locals gave up their Frisian speech for Hollandic. The process was not complete, and Amelands was the result. It is still very widely used. 85% of youth continue to speak Amelands. Intelligibility with AN is not known.

Westfries is a highly divergent dialect of Dutch spoken in West Friesland that is not to be confused with the West Frisian language. It is dying out and is only spoken by about 8% of the population. There are many subdialects, often one for every village or town, and they often differ considerably.

There is some confusion about the difference between this Dutch dialect and the West Frisian language proper. It has heavy Frisian influence. A better way to describe it might be to say that it is a mixed language of Dutch and West Frisian, almost a “creole.” It could also be described as Dutch with a heavy Frisian substrate.

Westfries was apparently a Frisian language for centuries until it died out about 200 years ago. It appears to have transformed from a full Frisian language to a form of Dutch. The strong variety is still used in cabaret performances.

Another way to look at it is that Westfries is one of the last of the more pure Hollandic dialects. Most of the rest of Hollandic has undergone serious leveling such that most of the peculiar features, such as the Frisian substrate that characterized all Hollandic, have washed out. AN speakers reportedly have a hard time understanding Westfries, and it is about as distant from AN as Zeelandic. There appears to be more than one language inside Westfries, since it’s not uncommon for speakers of varying Westfries lects to not understand each other.

Westfries consists of two parts. One, the Westfries language, which consists of Island Westfries. And two, Land Westfries, which is part of the North Hollandic language.

Island Westfries or Eland Westfries is a major split in Westfries. This is spoken on the islands and former islands of Texel, Vlieland and Wieringen and on land in the city of Enkhuizen. Island Westfries has poor intelligibility with the more common Land Westfries due to its archaic character, hence it may be a separate language.

Wierings is an Island Westfries dialect spoken on the former island of Wieringen. It is very close to Tessels, the dialect of Texel Island. Wierings is rapidly disappearing and is only spoken by the older generation. Younger people speak a weak Wierings which looks more like Land Westfries. There is a navy base on Wieringen, so many non-islanders have come to live there.

Tessels is an Island Westfries dialect spoken on the island of Texel in North Holland that is so different from the rest of Island Westfries that it must be a separate language. It is still widely spoken, especially in the rural areas, but it is not much spoken in the larger cities. There are different varieties of Tessels spoken in the towns of Oudeschild, De Cocksdorp, Den Hoorn and Oosterend. The dialects differ greatly, and speakers from different towns do not necessarily understand each other fully, hence intelligibility is somewhat marginal among the dialects.

North Hollandic is a language spoken in North Holland Province. It consists of the Land Westfries, Zaans and Waterlands dialects. The situation is confusing, as there is also North Hollandic Dutch, a dialect of AN.

Land Westfries is a dialect of North Hollandic Dutch, a major split in the Westfries language. This variety is less conservative and has been influenced more by Dutch. The more archaic varieties of Island Westfries have poor intelligibility with Land Westfries, hence it may be a separate language.

Kennemerlands is a North Hollandic Dutch lect spoken in Kennemerland around the cities of Haarlem and Beverwijk. It arose in the Middle Ages due to contact between Frisian-speaking fishermen and speakers of North Hollandic Dutch. Towards the north, it looks more like Westfries and the Zaans dialect. It is best analyzed as a transitional dialect between North Hollandic and Westfries. It is unintelligible to AN speakers, and is apparently a separate language.

Durkers or Egmonds is a strange dialect, often analyzed as either Westfries or Kennemerlands, spoken in Egmond aan Zee in the north of North Holland Province. In this treatment, we will analyze it as Kennemerlands. It is not intelligible with AN (Anonymous January 2010).

Zaans-Waterlands is a North Hollandic lect spoken in North Holland Province. It is composed of two dialects, Zaans and Waterlands.

Zaans is an archaic North Hollandic dialect spoken in the Zaan, an old settled and industrial area between Amsterdam and Haarlem. It is spoken in the city of Zaandam and in the towns of Wormerveer, Krommenie and Zaandijk. It apparently arose out of Westfries. Zaans has difficult intelligibility with AN.

Waterlands is a Zaans-Waterlands dialect that is spoken between the Zaan and the IJsselmeer, the inland sea in the Netherlands. This dialect is very archaic, though it is similar to Zaans and Westfries. It has difficult intelligibility with AN.

Volendams is a Waterlands dialect that is extremely divergent. It is unintelligible with AN, and even other Waterlands speakers have a hard time understanding it, so it is probably a separate language.

The city of Volendam was isolated for centuries, and this gave rise to its strange language. This isolation, combined with immigration of speakers of other odd dialects from fishing villages around the Zuiderzee, helped shape Volendams. Volendams received huge immigration in 1859 following the evacuation of the former Zuiderzee island of Schokland due to fierce storms. The Schokland residents spoke a strange dialect called Schokkers which was basically a Low Saxon dialect similar to Urkers.

Markens is a very unusual Waterlands dialect that is spoken on the former island of Marken. It also received large input from the fleeing residents of Schokland. Markens is one of the most unusual dialects in the Netherlands and has been the object of many studies. It has difficult intelligibility with AN, but intelligibility with the rest of Waterlands is not known.

Markens appears to have a heavy base of Frisian or even Old Frisian. It appears to be undergoing dialect leveling under the pressure of the mass media and immigration, and young people typically do not speak pure Markens.

Goois is a North Hollandic dialect spoken in Het Gooi, a region in the far southeast of North Holland. Cities in this region include Naarden, Bussum, Huizen, Blaricum, Laren and Hilversum. Opinions on this dialect are varied. One view is that it is a Dutch-Low Saxon transition dialect, mostly in the far east in Blaricum, Laren and Hilversum. That would be transitional to West Veluws. This view sees the rest of the area as Hollandic. There is also influence from the Utrechts dialects. The dialect is still alive, especially in the three eastern cities discussed above.

South Hollandic is a lect spoken in South Holland Province. A similar situation is going on here as with Brabantian and North Hollandic. As there is Brabantian Dutch and North Hollandic Dutch and Brabantian and North Hollandic languages, so there is South Hollandic Dutch and the South Hollandic language. The South Hollandic language is mostly gone now, as dialect leveling has moved most of the dialects to South Hollandic Dutch. However, it remains alive in the form of the Strandhollands and East IJsselmonds dialects.

Aalsmeers is a dialect spoken in the city of Aalsmeer in southern North Holland near the border with South Holland. Traditionally, it was a Strandhollands dialect, but it has lost most of its Strandhollands features and is probably not a part of that group anymore. It has a similar genesis with the Strandhollands language, in that it was formed by immigrants from the Frisian-speaking north moving down to the area long ago.

However, geographical isolation (the area was bounded on three sides by marshes or lakes and accessible only via a sliver of land) cut its speakers off from the rest of Strandhollands, and the convergent evolution with it ended. There was also a group of Mennonites who came down from Friesland and settled in the area.

Immigrants probably kept speaking Frisian here longer than in other places. In general, this dialect is best seen as transitional between North and South Hollandic. The original Aalsmeers dialect is nearly extinct. Intelligibility data with AN is not known.

Strandhollands is a very conservative dialect of the Hollandic language spoken in the fishing villages in the area of Scheveningen and Katwijk aan Zee in the Holland Provinces. Intelligibility in general is marginal at best and hardly possible at worst between this lect and AN (Anonymous January 2010), hence it is a separate language.

This is a very archaic South Hollandic language that has preserved many old features, while the rest of South Hollandic behind the dunes has trended towards Hollandic Dutch. Strandhollands retains many features of Medieval Dutch. It is interesting that the standard dialect of The Hague is spoken so close by.

It emerged about 400 years ago and its provenance is obscure. Probably fishermen from elsewhere on the coast, such as Friesland and the Zuiderzee, moved into the area to take up fishing. The language has a strong Frisian substrate. Probably the isolation of the villages helped to keep the lect different from surrounding evolving lects.

The Strandhollands dialects become more intelligible with AN, in general, as one moves to the south. The least comprehensible ones are generally in North Holland Province. Intelligibility data between this and the rest of South Hollandic, especially East IJsselmonds, is needed.

Wijk aan Zee is a Strandhollands dialect spoken in the fishing village of Wijk aan Zee that has poor intelligibility with AN (Anonymous January 2010). The town is located west of Beverwijk.

Zandvoort is a Strandhollands dialect that is hardly comprehensible to AN speakers (Anonymous January 2010). It is spoken in Zandvoort on the coast west of Haarlem.

Noordwijks is a Strandhollands dialect spoken in the fishing village of Noordwijk aan Zee in South Holland Province. Intelligibility with AN is somewhat marginal (Anonymous January 2010). Noordwijks is probably the easiest Strandhollands lect for AN speakers to understand.

Katwijks is a Strandhollands lect spoken in the fishing village of Katwijk aan Zee in South Holland Province. It is based on an archaic version of Leids, the dialect of the city of Leiden. Katwijks, like Zandvoort and Wijk aan Zee to the north, is barely comprehensible to AN speakers (Anonymous January 2010).

Schevenings is a Strandhollands dialect spoken in the fishing village of Scheveningen in South Holland Province. It has marginal intelligibility with AN (Anonymous January 2010). This dialect is said to be based on an archaic version of Haags, the dialect of The Hague.

Zoetermeers is a very divergent South Hollandic dialect spoken in the city of Zoetermeer 10 miles east of The Hague. This was always an isolated farming village, so it was not affected much by the trends affecting the Haags dialect a short distance away. In the 1960’s, the population grew from 10,000 to 120,000 as immigrants flooded into The Hague region. Hence, only a few locals speak the dialect anymore.

Westhoeks is spoken in the Westhoek in northwest North Brabant. It’s a Hollandic dialect spoken in Brabant. No one is sure why. They are Protestants, and this may have something to do with it, but it’s more likely a case similar to Bildts, where many Hollandic speaking immigrants moved to the area after the polders were created in the 1600’s and afterward. Intelligibility with the rest of South Hollandic is not known.

Westhoeks is divergent enough from the rest of South Hollandic to be given its own category in many analyses. It has some influence from Dordts, the old dialect of Dordrecht not far to the north.

Fijnaarts is a Westhoeks dialect spoken in the village of Fijnaart in North Brabant.

Dordts is a South Hollandic dialect spoken in the city of Dordrecht that is intelligible with the rest of South Hollandic. It has heavy Zeelandic and Brabantian influences. In the 20th Century, it underwent dialect leveling under the influence of the much less divergent Rotterdams dialect in Rotterdam. The strongest Dordts is now heard in the center of the city.

IJsselmonds is a South Hollandic lect spoken south of Rotterdam on the old island of IJsselmond, now reclaimed from the sea. The former island can now be seen via satellite as #9 on this map. In general, it is south of Rotterdam between the Nieuwe Maas and the Spijkenisse Rivers. The region is now heavily industrial, particularly given over to shipbuilding. The lect is quite a bit different from both AN and Rotterdams. It has two main variants, West and East IJsselmonds.

West IJsselmonds has come under severe Rotterdams influence and can hardly be heard in its pure form anymore. It is only barely alive in the town of Pernis.

East IJsselmonds is extremely divergent from AN and Rotterdams and cannot be understood outside the region. It has mostly undergone dialect leveling and in general is rarely heard. The youth speak a watered-down version that is intelligible with AN. Only in the city of Hendrik-Ido-Ambacht can the true lect be heard on an everyday basis. Given that it’s unintelligible outside the region, it may be a separate language. Intelligibility data between this and the rest of South Hollandic, especially Strandhollands, is needed.

Ambachts is the last remaining holdout of the East IJsselmonds language. This is a deeply conservative dialect, the most conservative of the language, such that the lect of one village may differ greatly from the next. It has striking influences from the Utrechts-Alblasserwaards dialect group to the east.

Baorendrechts is a deeply conservative East IJsselmonds dialect that is spoken in the city of Barendrecht. It has been mostly superseded by AN these days.

Bulessers is another deeply conservative East IJsselmonds dialect spoken in the city of Bolnes. It is almost extinct, under heavy pressure from AN.

Zwijndrechts is an East IJsselmonds dialect spoken in Zwijndrecht. It has undergone serious dialect leveling due to the effects of industrialization but can still be heard, mostly among farmers. It has some Dordts influence.

Rekkarkeks is a South Hollandic lect spoken in the city of Ridderkerk, halfway between Rotterdam and Dordrecht. This is a very unusual lect that is very different from AN. Hence it has poor to marginal intelligibility with AN, and thus, it may well be a separate language.

It is located just to the east of the East IJsselmonds language, hence its unusualness is probably due to its East IJsselmonds features. It is barely alive and has only a few speakers left. A diluted version is still quite alive. Intelligibility data with the East IJsselmonds language is urgently needed.

Hoekschewaards is a South Hollandic dialect spoken on a former island southwest of Dordrecht, between the Spijkenisse River and the Haringvliet Channel. The city of Numansdorp is located in this region. This dialect has strong IJsselmonds and Alblasserwaards tendencies. These are much stronger than the Dordts influences. It has three divisions, West Hoekschewaards, East Hoekschewaards and Gravendeel. It is still very much alive, though it is coming under heavy influence from Rotterdams and AN.

West Alblasserwaards is spoken in the western part of the Alblasserwaard, east of Rotterdam about halfway to the Utrecht border. The dialect is dying out in many areas, and there is little interest in preserving it. However, in many of the rural areas, a strong dialect is still alive.

In the eastern part of the Alblasserwaard, the dialect is like that of Utrecht, but in the west it is quite Hollandic, although it has some Utrecht influences. The dialect differs even from village to village. It is spoken in cities such as Sliedrecht and Papendrecht. The Papendrecht dialect is almost gone due to heavy immigration.

Slierechs is the very divergent West Alblasserwaards dialect spoken in the city of Sliedrecht. People here have taken more interest in their dialect than elsewhere in the region, and CDs and books on it are issued regularly.

Utrechts-Alblasserwaards is a group of Hollandic dialects spoken in Utrecht Province, the far southeast of South Holland and a small part of Gelderland. To the south there are dialects heading into Brabantian, and to the east there are more dialects heading into South Guelderish. The dialect has low prestige, and there is little interest in it, even among speakers. Nevertheless, it is still learned by children, and there are 330,000 speakers of this dialect.

Utrechts is spoken by the lower classes of the city of Utrecht, capital of Utrecht Province. Nowadays it is spoken more in the rural areas around the city than in the city itself, but even in the city, it is still spoken in certain districts. There is a lot of immigration into the city and emigration out of it, so the dialect is dying.

Vijfheerenlands is an Utrechts-Alblasserwaards dialect spoken in the Vijfheerenland region in the southeast of South Holland. This area includes the cities of Vianen, Meerkerk, Leerdam and Lexmond.

Eemlands is a confusing set of dialects spoken in the eastern part of Utrecht and has strong Veluws influence. Some say that they are Utrechts-Alblasserwaards dialects, and others say that they are West Veluws. The best analysis is that they are transitional between the two varieties, in other words, that they are Low Franconian-Low Saxon transitional dialects. They are spoken in Soest, Amersfoort and Bunschoten. Amersfoort and Bunschoten tend to be considered more West Veluws, and Soest tends to be seen as more Utrechts. With the exception of Bunschoten, these dialects are highly endangered.

Geldersevalleis is a set of dialects spoken in the Gelders Valley, 2/3 of which is in Gelderland and 1/3 of which is in Utrecht. The towns of Ede, Wageningen and Veenendaal are located in this region. These dialects are very hard to characterize, as they have West Veluws, Utrechts and South Guelderish tendencies. They are seriously declining and becoming more Hollandized.

West Veluws is a strange dialect usually grouped with Dutch Low Saxon, but which is in fact a Low Franconian dialect. Practically speaking, it is best seen as transitional between Low Franconian and Low Saxon. For the most part it is intelligible with AN, but as one moves to the north and east of the West Veluws area, West Veluws gets harder for AN speakers to understand. This dialect has heavy Dutch influence. In most places, this is a dying dialect, and it is not spoken much by young people anymore.

Even the forms of West Veluws still spoken in the home are coming under increasing AN influence. It is spoken in Amersfoort, Spakenburg, Bunschoten, Nijkerk, Barneveld, Putten, Voorthuizen, Ermelo, Elspeet, Uddel, Leuvenum, Harderwijk, Hierden, Nunspeet, Lunteren, Otterlo and Hoenderloo. In Nijkerk, Amersfoort, Spakenburg and Bunschoten in the west of the West Veluws region, the dialect is nearly dead.

Brabantian is actually a separate language. It is distinct from Netherlands Brabantian Dutch, which is merely a dialect of Dutch (Grondelaers 2009). The real hardcore Brabantian is dying out, but it is highly divergent, and Dutch speakers say it is incomprehensible. Intelligibility is far lower than for Zeelandic. However, Verkavelingsvlaams speakers can understand Brabantian pretty well, since Verkavelingsvlaams is very Brabantian.

Brabantian is dying out in the Netherlands, but it is still spoken in Tilburg and in the rural areas of North Brabant. There is quite a bit of confusion about what is the pure Brabantian and what is Brabantian Dutch, but the key is intelligibility. Brabantian Dutch is easily comprehensible to an AN speaker, and the real Brabantian is not at all. Other than South Brabantian, which is a separate language, all of the Brabantian dialects are mutually intelligible.

North Central Brabantian is a dialect of Brabantian that is spoken in the Netherlands and Belgium in a strip that runs along the border around the towns of Ravels, Tilburg, Loon op Zand, Waalwijk, Vlijmen, Heusden and Drunen.

Tilburgs is a hard North Central Brabantian dialect that is still widely spoken in the city of Tilburg in the southern part of the Netherlands. It is intelligible with the rest of Brabantian (Anonymous January 2010).

East Brabantian is spoken in the eastern part of North Brabant. It is one of the main Brabantian divisions. The various divisions of East Brabantian include Kempenlands, North Meierijs, Peellands, Geldrops and Heeze-en-Leendes.

It includes the towns of Eindhoven, Veldhoven, Vught, Boxtel, Oirschot, Best, Acht, Middelbeers, Eersel, Waalre, Mierlo, Luyksgestel, Bergeijk, Aalst, Heeze, Leende, Son, Helmond, Schijndel, Lieshout, Beek, Gemert, Aarle-Rixtel, Asten, Someren, Liessel, Deurne, Bakel, Mill, Veghel, Volkel, Uden, Nistelrode, Heesch, Zeeland, Boekel and Sint Michielsgestel in the Netherlands and Arendonk and Lommel in Belgium. East Brabantian is intelligible with the rest of Brabantian (Anonymous January 2010).

Northern Kempens is a hard East Brabantian dialect spoken in an area on the border of Belgium and the Netherlands in eastern Antwerp and western Limburg Provinces in Belgium and north into the Netherlands. Major cities and towns in the region include Turnhout, Arendonk, Eersel, Oirschot, Hilvarenbeek, Retie, Oisterwijk, Boxtel, and Eindhoven. It is an area of poor soil with many marshes, bogs and forests. Lately, it is primarily a tourist region. Northern Kempens is intelligible with the rest of Brabantian (Anonymous January 2010).

Arendonk is a very specific, apparently highly diverse and possibly archaic Northern Kempens Brabantian dialect spoken near Turnhout close to the Dutch border. It is said to be unintelligible outside of the nearby area. Hence, it may well be a separate language.

Northwest Brabantian is a Brabantian dialect spoken in the Netherlands and Belgium. It is spoken in Breda and the surrounding region south into Belgium.

Cities in which it is spoken include Breda, Baarle-Hertog, Oosterhout, Steenbergen, Made, Raamsdonksveer, Roosendaal, Putte, Geertruidenberg, Hoogstraten, Brecht, Moerdijk, Oudenbosch, Bergen op Zoom, Huijbergen, Rijsbergen and Woensdrecht in the Netherlands and Wuustwezel, Meer, Ekeren, Merksem, Kapellen, Lillo, Stabroek, Meerle and Rijkevorsel in Belgium.

This dialect arose out of the Eighty Years’ War. After the war, this Brabantian-speaking region was essentially depopulated, and afterward, a large wave of immigration from the Antwerp region occurred, spreading the tendencies of the Antwerps dialect. Northwest Brabantian consists of three major dialects: Antwerps, Baronies and Markiezaats. Antwerps is spoken in Antwerp and north to the Netherlands border. Baronies is spoken in the area around Breda, and Markiezaats is spoken in the west over by Zeeland.

Bredaas is a Northwest Brabantian dialect spoken in the city of Breda that is dying out. It is mostly spoken in certain areas and with the older generation. It tends to re-emerge around Carnival time though.

Markiezaats is spoken in the west of North Brabant around the cities of Bergen op Zoom and Steenbergen. It extends over to the Drimmelen region to the northeast and generally includes everything west of Breda.

Antwerps is a hard Brabantian dialect spoken in Antwerp, Belgium. It is intelligible with all the rest of Brabantian (Anonymous January 2010). This dialect is widely disliked in Belgium because it is neither Flemish nor a Dutch dialect, and hence is poorly understood.

It is often heard in the Belgian media, but it is rarely subtitled, and this is the cause of frustration among non-Antwerps speakers. East Flemish speakers say that they cannot understand it. In a study, 51% of East Flemish speakers said that they wanted subtitles when listening to Antwerps speakers on TV (De Houwer 2008). Antwerps was regularly heard on TV until recently.

This dialect is one of the most influential in terms of inputs towards the creation of Verkavelingsvlaams. Verkavelingsvlaams at the moment is heavily based on the Antwerps dialect. There is some uncertainty regarding the intelligibility of Antwerps with surrounding lects. Students who recently went to school in Antwerp say that they could not understand students who came from villages in the Antwerp area. It is not known what lects the villagers were speaking.

Wase is the name for a group of Brabantian dialects spoken in the Waasland in the far northeast of East Flanders. The capital of this region is the city of Sint-Niklaas. The area was originally wide fields bounded by willow trees. It flooded and was drained a few times. Many turnips are grown here.

Maaslands is a dialect of Brabantian that is spoken in a narrow strip in North Brabant south of the Maas River. It is spoken in the towns of Empel, Maren, Lith, Herpen, Oijen, Megen, Ravenstein, Oss and Grave, all of them along the Maas River.

Bosch is a Maaslands dialect spoken in ’s-Hertogenbosch, a large city a bit south of the Maas River in North Brabant. The dialect is still pretty well alive, but its use varies throughout the city, with some areas speaking a lot of Bosch and other areas in which it is rarely heard. Due to immigration and the fact that it has become a commuter town, the dialect has been declining for some time now.

Nederbetuws is a confusing dialect, usually included in South Guelderish, spoken in the Lower Betuwe in Gelderland. It actually has heavy Brabantian features. The dialects of the river cities of Tiel and Culemborg are quite different. It is spoken in the towns of Tiel, Culemborg, Buren, Geldermalsen, Wadenoijen, Ophemert, Waardenburg, Herwijnen and Gorinchem. This is mostly a rural area, with a lot of livestock, fruit orchards, vegetables and greenhouses.

South Brabantian is a very divergent lect within Brabantian that is very hard for other Brabantian speakers, even those from nearby Antwerp Province, to understand (Anonymous January 2010). Therefore, it may well be a separate language. It is spoken in Brabant Province in Belgium and around the capital of Brussels. This area has retained the most extreme and archaic Brabantian features. It is under heavy pressure from Verkavelingsvlaams, especially in the cities and less so in the countryside.

The least intelligible variety seems to be spoken from Brussels west to the East Flanders border, especially in the rural areas and near the southern and western borders.

Brussels is the name for a group of South Brabantian lects that were traditionally spoken in Brussels, and still are by a small number of old people. In the past 200 years though, the language of the capital shifted to French. The remaining Brabantian speakers shifted to some form of Dutch, and many today speak some Dutch standard, usually VRT. At any rate, the original Brussels South Brabantian lects are now almost extinct, spoken only by the older generation, most of whom are also bilingual in French.

Traditionally, Brussels lects were very diverse and were not intelligible with Antwerps Brabantian or Leuvens South Brabantian from about 1650 on. After the Eighty Years’ War, which ended in 1648, Brussels was closed off to most outside influence, and French influence on the Brussels lects increased. Brussels Brabantian was still the most widely used language in the city until the French occupation around 1800.

It then began to decline as more residents started speaking French. In part this was an urban elitist effect, as the local rural areas all spoke Brabantian dialects, and the city became increasingly French speaking, especially the upper class. To sum up, to speak French meant you sounded like an aristocrat and to speak Brabantian meant you were talking like a farmer.

During the 1800’s there was a big debate in Brussels about which form of Dutch to make the official language – some common Flemish form or something more like Netherlands Dutch? People could not make up their minds, and this gave people one more reason to just speak French instead.

Brussels is almost extinct, and only some older Brusseliers speak it. Apparently no one else, including almost everyone in Brussels, can understand them. As Brussels is barely understood even in the city, clearly it must not be understood outside the city either. Hence, Brussels may be a separate language. But intelligibility data with the rest of South Brabantian would be nice to have.

Marols is a divergent Brussels dialect traditionally spoken in the colorful Marollen district, traditionally a poorer, rundown working class area that was recently full of drug dealers and bums but is now undergoing gentrification. Marols is a strange mixture of Spanish, Yiddish, Walloon and Brabantian. The Yiddish and Spanish elements come from the many Spanish Republicans and Polish Jews who moved to this district just before WW2. Marols is rarely heard these days, and intelligibility with the rest of Brussels is not known.

Liekert is a South Brabantian dialect spoken in Liedekerke, Belgium in Brabant Province on the border with East Flanders. It is unintelligible with the rest of even Flemish Brabantian, including Antwerps.

Leuvens or Leives is a South Brabantian dialect spoken in the city of Leuven in Belgian Brabant. Many immigrants moved to the city after WW2, and use of the dialect reduced dramatically. Intelligibility between Leuvens and the rest of South Brabantian is not known.

Ninove is apparently a South Brabantian dialect spoken in the city of Ninove in the east of East Flanders. It is probably close to Liekert, and hence is very hard for even Flemish to understand.

Elingen is a South Brabantian dialect spoken in the town of Elingen on the border with Hainaut Province. It is not intelligible at all with Brabantian proper (Anonymous January 2010).

Aalsters is a South Brabantian dialect that is very hard for even the Flemish to understand. It is spoken in the city of Aalst in East Flanders, Belgium, on the border of Brabant Province. It is also spoken in Opwijk, Asse and Ternat over the border in Brabant Province.

Tiens is a South Brabantian dialect spoken in Tienen in Eastern Brabant, Belgium. It has Limburgish tendencies. It is dying out and tends to be spoken more by the working classes, but is still pretty widely spoken. Intelligibility with the rest of South Brabantian is not known.

Afrikaans is a separate language, recognized by Ethnologue. It is spoken in South Africa by 13.2 million people, including 6.45 million native speakers and 6.75 million second language speakers. 12-16 million people have basic knowledge of the language.

A study noted that Dutch speakers have 59% intelligibility of Afrikaans (Gooskens 2005), while Afrikaans speakers have 51% intelligibility of Dutch. The combined intelligibility estimate is 55%, close to the distance between Spanish and Portuguese. Afrikaans split off from Dutch in about 1675 when Dutch settlers began settling in South Africa. The first written Afrikaans is dated to 1795.
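The combined figure above is simply the mean of the two directional scores. A minimal sketch of that arithmetic follows; the function name is an illustrative assumption, not something taken from the cited study.

```python
def combined_intelligibility(a_of_b: float, b_of_a: float) -> float:
    """Average two directional intelligibility scores (in percent).

    Intelligibility is often asymmetric, so the two directions are
    measured separately and then averaged into one estimate.
    """
    return (a_of_b + b_of_a) / 2


# Figures quoted in the text: Dutch speakers' score on Afrikaans,
# and Afrikaans speakers' score on Dutch.
dutch_of_afrikaans = 59.0
afrikaans_of_dutch = 51.0

combined = combined_intelligibility(dutch_of_afrikaans, afrikaans_of_dutch)
print(f"Combined estimate: {combined:.0f}%")  # Combined estimate: 55%
```

Averaging hides the asymmetry, so it is worth keeping both directional figures alongside the combined one.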

Zeelandic or Zeêuws is a separate language, recognized by Ethnologue as a different Low Franconian language from Dutch. Zeelandic is not easily understood by AN speakers. It is spoken in Zeeland Province and in South Holland Province on the island of Goeree-Overflakkee. This area is south of Rotterdam. It is best thought of as transitional between Dutch and West Flemish.

There are a variety of dialects, Walcheren, Zuid-Beveland and Goeree-Overflakkee among others. Toward the north, Zeelandic looks more Hollandic or Dutch, and towards the south, it looks more Flemish. The dialects of Zeelandic Flanders are really outside of the definition of Zeelandic and are best described as East and West Flemish instead.

Although it is clearly a separate language from Dutch, Dutch nationalism mandates that it be seen as a dialect and not a separate language, hence the Dutch government refuses to recognize it as a separate language. The language is still in pretty good shape, though it is declining.

It still has 220,000 speakers. In some rural villages, up to 90% of the children still speak Zeelandic. The dialects of the larger cities are going extinct, yet Zeelandic is still in good shape in the rural areas. Surveys conducted in the 1990’s showed that 60% of residents of the area still spoke Zeelandic on an everyday basis. All Zeelandic dialects are intelligible with each other except South Beveland, which is possibly a separate language. Intelligibility between Zeelandic and West Flemish is not known, but may be high.

Along with French Flemish and West Flemish, Zeelandic is part of Southwest Low Franconian. These languages are said to be the remains of the oldest of Old Franconian.

Burgerzeeuws is a Dutch dialect spoken in Zeeland. Though it ought to be part of the Zeelandic language, it is not. The speech of the cities of Zeeland was originally Zeelandic, which was then replaced with Hollandic by status-conscious, upwardly mobile people. Like Stadsfries, this language developed in the 1600’s. It is especially spoken in Middelburg and Vlissingen.

In the 1990’s, only 1/3 of urban Zeelanders spoke Zeelandic, compared to 2/3 in the province as a whole. This dialect is still alive though, even among the youth, especially in conservative Christian circles. In some areas this dialect is scorned, while in others it is valued. Burgerzeeuws has unknown intelligibility with AN, but it is probably easier to understand than Zeelandic proper.

Oostvoorns is a Zeelandic dialect spoken in the far north of the region that is actually spoken outside of Zeeland proper in the area called Oostvoorne just to the north. Some say that this dialect is actually Hollandic and not Zeelandic. It’s probably best seen as a transitional Zeelandic-Hollandic dialect. Intelligibility with AN is not known, but it’s probably better understood by AN speakers than the rest of Zeelandic.

Goerees is a Zeelandic dialect spoken in the Goeree region of Zeeland. The dialect of the fishing village of Ouddorp is quite different, with many unique words. It is quite a bit different from the rest of Zeelandic. This dialect is still widely spoken.

Flakkees is a Zeelandic dialect spoken in the region of Overflakkee, east of Goeree. It is spoken in Ooltgensplaat, Middelharnis and Sommelsdijk. Flakkees is divided into three subdialects – West Flakkees, East Flakkees and Brabants Flakkees. Flakkees is still very widely spoken.

Schouwen-Duivelands is a Zeelandic dialect spoken in the Zeelandic region of Schouwen-Duiveland. In some places such as Bruinisse the dialect is in great shape, with even 90% of youth speaking it. In other places such as Burgh, Haamstede and Zierikzee it is undergoing decline due to tourism.

Thools is a Zeelandic dialect spoken on the former island of Tholen in Zeeland. It is undergoing some decline due to widespread immigration but is still widely spoken. There is a sharp barrier between Thools and the North Brabant area just to the east. The city of Oud-Vossemeer speaks a mixed North Brabantian-Zeelandic dialect.

Walchers is a Zeelandic dialect spoken on the former island of Walcheren in Zeeland. It is spoken in the towns of Domburg, Westkapelle, Koudekerke, Arnemuiden and Oost Souburg. The dialect of the fishing village of Westkapelle is very different, with many unique words. In Westkapelle and Arnemuiden, the dialect is still doing very well. In other places it is under heavy pressure from tourism and immigration.

South Bevelands is a Zeelandic lect spoken in the Zuid-Beveland area of Zeeland. This area is still very rural, so the lect is in great shape. South Bevelands was scarcely touched by Hollandization during the Golden Age of Holland, hence its archaic character.

South Bevelands is extremely diverse, varying wildly from one village and town to the next to the point that communication is so seriously impaired that residents from different towns typically use AN to communicate rather than their town lects. On the face of it, it’s tempting to split off every town as a separate language, but that seems wild and threatens chaos, and until we get more data, it’s thankfully premature.

However, since South Bevelands is not even intelligible within itself, it can’t possibly be intelligible with the rest of Zeelandic, hence it may well be a separate language.

Land of Cadzands is a Zeelandic dialect spoken in the far south of the Netherlands in Zeelandic Flanders. It is properly seen as a Zeelandic dialect transitioning to West Flemish.

Dutch Low Saxon is a group of lects related to Dutch and German that are very hard to classify, especially in terms of their relationship with Low German in Germany and with Low Franconian (Macro-Dutch) in the Netherlands.

I originally put Dutch Low Saxon in with Low German and added it to my German reclassification. However, after thinking this over for a year, I now believe that Dutch Low Saxon belongs much more in Macro-Dutch than in Macro-German. In a scientific paper analyzing Levenshtein distances between Dutch lects, Nerbonne 1996 makes a convincing case that Dutch Low Saxon is more properly seen as Macro-Dutch than as Macro-German.
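For readers unfamiliar with the measure, Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below is a plain textbook version in Python, not a reconstruction of the procedure in the cited paper. The length normalization, the averaging over a word list and the function names are all assumptions made for illustration, and the word pairs are invented toy spellings rather than real dialect data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b (dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Distance divided by the longer word's length (an assumed
    normalization), giving a value between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs: list[tuple[str, str]]) -> float:
    """Average normalized distance over aligned word pairs."""
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


# Invented toy spellings of the same two words in two hypothetical lects.
toy_pairs = [("water", "woater"), ("huis", "hoes")]
print(levenshtein("water", "woater"))  # 1 (one inserted letter)
print(round(mean_distance(toy_pairs), 3))
```

A lower average over a large word list means the two lects are closer by this measure. Real studies compare phonetic transcriptions rather than spellings.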

There is an argument floating around that all of Dutch Low Saxon is intelligible with all of German Low Saxon. This is certainly not true. Looking at Veluws to Schleswigsch, those two languages are not intelligible with each other at all. In fact, even Groningen and Veluws are not intelligible within the Netherlands alone.

Arguing against the notion of Dutch Low Saxon as being a Dutch dialect, many Dutch say that Dutch Low Saxon is not intelligible with Dutch. There is marginal intelligibility of around 90% between Dutch and Dutch Low Saxon (Zweers 2009). And some Dutch Low Saxon lects, for instance Veluws and Groningen, are not fully intelligible with each other either (Smith 2008).

Dutch Low Saxon includes four groups: Friso-Saxon, Westphalian, Gelders-Oaveriessels and Plautdietsch.

Friso-Saxon is a group of Low Saxon lects spoken in Groningen that have all been heavily influenced by the East Frisian language. These lects are Gronings-East Frisian Low Saxon, Stellingwerfs, Westerkwartiers, Kollumerpompsters, Kollumerlands, Middaglands, Middle Westerkwartiers, South Westerkwartiers, Hogelandsters, Stadsgronings, Westerwolds, Veenkoloniaals and Oldambtsters.

It is often stated that Friso-Saxon is intelligible with Low Saxon in general across the border in Germany. This is not true; it is only intelligible with East Frisian Low Saxon, which is not part of the greater German Low Saxon language. For instance, Gronings, Westerwolds and Veenkoloniaals have only 57% intelligibility of Bremen Low Saxon in Germany (Gooskens 2009). Friso-Saxon is broken into four principal groups: Groningen, East Frisian Low Saxon, Westerkwartiers and Stellingwerfs.

What is difficult is dividing up Dutch Low Saxon into different languages. Ethnologue has gone too far, with proper Dutch Low Saxon divided into eight separate languages – Gronings, Veluws, Sallands, Drents, Stellingwerfs, Twents, Achterhoeks and Plautdietsch. We have reduced this complexity quite a bit here by reducing Dutch Low Saxon to Friso-Saxon, Stellingwerfs, Urkers and Plautdietsch – four languages, a reduction of Ethnologue’s classification by four languages.

Gronings-East Frisian Low Saxon is a Friso-Saxon language, consisting of two parts, Gronings in the Netherlands and East Frisian Low Saxon across the border in Germany.

East Frisian Low Saxon is a Friso-Saxon dialect spoken in the East Frisian peninsula of northwestern Lower Saxony, Germany. It is intelligible with Gronings in the Netherlands. However, it has only 57% intelligibility with Bremen Low Saxon (Gooskens 2009). It has 230,000 speakers. There are still rural areas around here where the majority of people under age 40 speak the language. 50% of the population still speaks the dialect on a daily basis.

This dialect has an East Frisian substratum. There is dialectal diversity between the western and eastern branches. There are also speakers of this dialect in Iowa, about 500 of them, mostly over age 50. The classic variety of East Frisian Low Saxon probably looks something like this. Dialects include Hinte, Ems (Emsfriesisches), Weser (Weserfriesisches), Jeverländer, Harlingerländer, Ommelands and Mooringer.

Hinte East Frisian Low Saxon (Hintener) is a divergent dialect of East Frisian Low Saxon, but its intelligibility with the rest of East Frisian Low Saxon is not known. It is spoken in the town of Hinte in Germany on the Dutch-German border. Hinte is in Eastern Friesland (Ostfriesland) in Lower Saxony in Germany, and Groningen is on the Dutch side. It is somewhat similar to Twents.

Westerkwartiers is a group of Friso-Saxon dialects spoken in the far southwest of Groningen Province. This is the group of Friso-Saxon dialects that most resembles West Frisian. A good characterization of this group would be to say it is transitional from Gronings to West Frisian. The cities of Leek, Zuidhorn and Marum speak this dialect. The group includes Kollumerpompsters, Kollumerlands, Middle Westerkwartiers, South Westerkwartiers and Middaglands.

Kollumerpompsters is a Friso-Saxon Westerkwartiers dialect spoken in the city of Kollumerpomp and the surrounding area in the far east of Friesland. The municipality of Kollum speaks this dialect.

Gronings is a group of dialects spoken in all of Groningen Province, some of Drenthe Province, and a bit of Friesland Province in the far northeastern Netherlands. They have 320,000 speakers. They have a heavy Old Frisian (East Frisian) substrate.

Along with Limburgish, it is the group spoken in the Netherlands farthest from Dutch. Yet Gronings is intelligible with East Frisian Low Saxon across the border in Germany. Gronings is very close to Drents, but it is far from Achterhoeks, Twents and Stellingwerfs, and is not fully intelligible with Stellingwerfs or Veluws. Gronings appears to have good intelligibility of Drents (Felder 2015). Dutch speakers have 89-92% intelligibility of Gronings. But other Dutch speakers say that Gronings is often very hard to understand and sometimes they cannot understand anything at all of it (Felder 2015).

The original language of Groningen was Frisian, but there was a mass movement of Saxons from Drenthe to the area. They mostly settled in the city of Groningen, but then they radiated out from there. In addition, many East Frisian speakers came from across the border in Germany. This had to do with the reclamation of peat land in Groningen. The East Frisian language was supplanted by Low Saxon long ago, before the 1500’s. Traces of East Frisian still exist, but only in morphology and syntax and not in phonology (Heeringa 2004).

Gronings consists of North Drents, Hogelandsters, Stadsgronings, Westerwolds, Veenkoloniaals and Oldambtsters.

Hogelandsters is a Friso-Saxon dialect spoken in the far north of Groningen in a region called Hogeland. This is said to be the “purest” Gronings of all, and it is the hardest for AN speakers to understand. The cities of Leens, Ulrum, Baflo, Uithuizen, Bedum, Winsum, Loppersum and Uithuizermeeden are located in this region.

Stadsgronings is the Friso-Saxon dialect spoken in the city of Groningen itself. It is close to North Drents. The dialect is dying out in the city itself due to immigration of large numbers of students from outside the region who do not speak Gronings.

However, many people still speak Gronings in the city and some are more or less Gronings monolinguals who do not speak ABN well. These tend to be people age 40+ (Felder 2015).

Noordenvelds or North Drents is hard to analyze, but it is best analyzed as Friso-Saxon and not Drents proper. This dialect is close to Stadsgronings. It is spoken in the north of Drenthe Province in the towns of Roden, Norg, Eelde and Vries by 38,000 people. This is nearly the same speech as Stadsgronings (Felder 2015).

Oldambtsters-Reiderlands is a Friso-Saxon dialect spoken in a part of Groningen called Oldambt. It is related to Veenkoloniaals and Hogelandsters and has heavy Westphalian influence. Oldambtsters has a close relationship with the Rheiderlander dialect of East Frisian Low Saxon across the border in Germany; in fact, it is basically the same dialect. East Frisian was spoken here until 1400.

This dialect is steadily declining, but holds out best in the rural areas. German is still widely spoken in this part of the Netherlands, especially in the city of Winschoten. Oldambtsters is spoken in Winschoten, Scheemda, Noordbroek, Heiligerlee, Beerta and Nieuwe Schans.

Veenkoloniaals is a Friso-Saxon dialect spoken in eastern Groningen on the border between Groningen and Drenthe Provinces and over the border into Drenthe. This dialect came into being due to peat mining in the area. In recent years it has been expanding a lot, probably because it is closer to AN than neighboring lects.

Veenkoloniaals is close to Drents but even closer to Stellingwerfs. Veenkoloniaals lacks full intelligibility with Dutch. Veenkoloniaals is quite close to Stadsgronings and almost sounds like the same lect. There are a few differences between the two. This is a harder Gronings that is even harder for ABN speakers to understand than Stadsgronings (Felder 2015).

Westerwolds is another Friso-Saxon dialect that, like Veenkoloniaals, is spoken in eastern Groningen. Westerwolds is not fully intelligible with Dutch and has heavy influence from East Frisian Low Saxon spoken in Germany. Although it is Friso-Saxon, it is closer to Westphalian than to Frisian. It has a particularly close relationship to Ems Low Saxon spoken in Germany.

Lately it has been losing ground to Veenkoloniaals. It is spoken in a small corner of far southeast Groningen on the German border in the towns of Stadskanaal, Musselkanaal, Ter Appelkanaal, Ter Appel and Vledderveen. ABN speakers say that this is an extremely hard form of Gronings that is very hard to understand, even harder to understand than Veenkoloniaals (Felder 2015).

Stellingwerfs is a Friso-Saxon language spoken in the municipalities of Weststellingwerf and Ooststellingwerf in southeastern Friesland Province on the border with Drenthe and Overijssel Provinces and over the border into Drenthe and Overijssel.

It is spoken in towns such as Appelscha, Noordwolde, Tjalleberd, Luinjeberd, Donkerbroek, St. Johannesga, Rotsterhaule, Rotstergaast, Delfstrahuizen, Uffelte, Diever, Vledder, Echten, Steenwijk, Giethoorn, Tuk, Willemsoord, Oldemarkt, Kuinre, Smilde, Wolvega, Oldeberkoop, Oldeholtpa, Nijeholtpa, Dwingeloo and Oosterzee.

Frisian speakers moved into the formerly Drents-speaking area when peat-digging began. This began the process of Frisianization. Stellingwerfs is not usually put into Friso-Saxon, but Heeringa 2004 makes a good case for putting it into Friso-Saxon (Fig. 4, p. 97).

One way to look at Stellingwerfs is to see it as a Drents variety intermixed strongly with a Frisian layer (Heeringa 2004). The process of Frisianization began as early as the 1200’s. Stellingwerfs probably has over 300,000 speakers in two dialects, East Stellingwerfs and West Stellingwerfs. Stellingwerfs is not close to Gronings, Drents, Twents or Achterhoeks, and it is not fully intelligible with Dutch, nor with Gronings and Veluws.

Gelders-Oaveriessels is a dialect group within Dutch Low Saxon. It includes Urkers, Sallands, Drents and East Veluws. This group is also sometimes called West Dutch Low Saxon. This group has heavier Dutch (Low Franconian) influence than the rest of Dutch Low Saxon. The two other groups have heavy Frisian and Westphalian Low German influence respectively. The Dutch influence is primarily an archaic version of Hollandic from the 1600’s.

East Veluws is a Gelders-Overijssels Dutch Low Saxon dialect spoken in the Veluwe, a formerly heavily forested and swampy region along a ridge in northern Gelderland Province. This region has a lot of wildlife and used to be very popular with hunters. There are proposals to turn much of this region into a national park.

Although it is a part of Dutch Low Saxon, Veluws is marginal within this family (Smith 2009), with West Veluws looking a lot like Low Franconian (“Dutch”) proper, and East Veluws looking more like a typical Dutch Low Saxon. West Veluws and East Veluws can understand each other, and East Veluws and Twents are mutually intelligible. East Veluws is more intelligible with Dutch than any other type of Low Saxon, probably due to its close connection to West Veluws, a Low Franconian lect; however, East Veluws tends to have marginal intelligibility with AN.

Veluws is one of the lects where Low Saxon and Low Franconian are very close, similar to Gronings and East Frisian Low Saxon, except that Veluws is closer to Low Franconian, and Gronings is closer to Low Saxon. Nevertheless, Veluws is not fully intelligible with Stellingwerfs or Gronings. There are probably 300,000 speakers of all varieties of Veluws, but there are fewer Veluws speakers than speakers of Gronings, Stellingwerfs and Twents.

East Veluws is spoken in the towns of Apeldoorn, Doornspijk, Oldebroek, Elburg, Hattem, Heerde, Epe, Ernst, Vaassen, Het Loo, Twello, Gorssel, Brummen, Doesburg, Eerbeek and Dieren.

Sallands is a Gelders-Overijssels Dutch Low Saxon dialect spoken in the Salland region in the western part of Overijssel Province. Sallands has fewer than 300,000 speakers. Sallands lacks full intelligibility with Dutch, but is intelligible with Twents. Based on linguistic distance (Fig. 3) it may not be intelligible with Gronings. There is a transitional Sallands-Twents dialect spoken on the border with the northwest of the Twents-speaking area (ter Denge 2009). There is a lot of variability in Sallands.

Sallands is spoken in Zwolle, Zutphen, Nijverdal, Vroomshoop, Kloosterhaar, Marienberg, Hardenberg, Gramsbelgen, Lutten, Heemse, Witharen, Ommen, Oudleusen, Den Ham, Vilsteren, Dalfsen, Kampen, Heino, Lemereveld, Ittersum, Wijhe, Windesheim, Heeten, Olst, Espelo, Holten, Wesepe, Diepenveen, Lettele, Deventer, Bathmen, Genemuiden, Zwartsluis and Blokzijl.

Zwols is a Sallands dialect spoken in Zwolle, the capital of Overijssel Province. It has some similarities to Urkers nearby. 61% of the population still speaks Zwols. Nowadays, it is mostly spoken in the older districts. It contains many colorful slang expressions.

Dêmpters is the name of the Sallands dialect spoken in Deventer.

Zutphens is a transitional Achterhoeks-Sallands dialect that is spoken in Zutphen, a city in Gelderland. It is interesting because it has many Hollands features. Zutphens is still very heavily spoken by the population of the city.

Drents is a Dutch Low Saxon dialect that is in a group of its own. It has over 240,000 speakers in Drenthe Province, where it is spoken by about half the population, and it also has some speakers in Overijssel. In towns like Zuidwolde, the majority of people even aged 30-40 continue to speak Drents as the main everyday language.

Every town and village has its own dialect. Drents is quite far from Twents, Achterhoeks and Stellingwerfs, but it is very close to Gronings and intelligible with Twents. Drents is not intelligible with Dutch.

It is spoken in Assen, Rolde, Geiten, Annen, Anlo, Eext, Klooverstervee, Gasselte, Borger, Grollo, Buinem, Elp, Amen, Beilen, Odoorn, Schoonloo, Hijken, Emmen, Valthermond, Zoordsleen, Sleen, Hoogeveen, Noordbarge, Dalen, Coevorden, Schoonebeek, Eursinge, Zuidwolde, Nieuw Amsterdam, Klazienaveen, Nieuw Schoonebeek, Zwartemeer, De Krim, Linde, Staphorst, Ruinen, Balkbrug, Meppel, Dedemsvaart, Rouveen, Den Hulst and Havelte.

Urkers is a very divergent Gelders-Overijssels Dutch Low Saxon lect spoken in the small city of Urk, formerly an island in the Zuiderzee. It is a very conservative Protestant town with no fewer than 17 churches, where 97% of the population goes to church every week for about three hours a day. Women marry young, and cohabitation is unheard of.

Urkers is utterly incomprehensible to AN speakers, and on structural and intelligibility grounds, there is justification for making it a separate language. Further, a linguistic analysis based on Levenshtein distance suggests that Urkers is best analyzed as a separate language in its own right, apart from all other Dutch lects (Heeringa 2004).
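For readers unfamiliar with Levenshtein distance, here is a minimal sketch of the basic idea in Python. It counts the insertions, deletions and substitutions needed to turn one word form into another, then averages that over a word list. This is plain edit distance over characters, not the weighted phonetic version used in dialectometry studies such as Heeringa 2004. Dividing by the longer word's length is an assumption made here for illustration, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        current = [i]
        for j, symbol_b in enumerate(b, start=1):
            cost = 0 if symbol_a == symbol_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer word's length
    (an illustrative normalization, assumed for this sketch)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average normalized distance over paired word forms,
    one pair per concept, for two lects."""
    pairs = list(pairs)
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)
```

To compare two lects, one would feed `mean_distance` a list of paired transcriptions of the same concepts in each lect. A higher average suggests greater distance, though where to draw the line between dialect and language is a separate judgment.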

Westphalian Dutch Low Saxon is a branch of Dutch Low Saxon. It contains two dialects, Twents and Achterhoeks, is heavily Germanized, and lines up with the Westphalian Low German spoken across the border in Germany. Twents is one of the most divergent of all of the Dutch Low Saxon lects from AN, especially the dialects spoken in Vriezenveen, Rijssen and Wierden.

Twents is a Westphalian Dutch Low Saxon dialect with 328,000 speakers, or 62% of the population of Twente, a region in Overijssel.

Every town has its own dialect, but all dialects are mutually intelligible. Twents is not close to Stellingwerfs or Gronings, but it is intelligible with Drents, Sallands, Achterhoeks (ter Denge 2009) and East Veluws. Based on linguistic distance (Fig. 3) it may not be intelligible with Gronings.

In the northwest of the Twents region, there is a transitional Sallands-Twents dialect that has a largely Twents vocabulary with a Sallands inflection. In the towns of Rijssen and Enter, there is a variety of Twents spoken that uses diphthongs where other varieties have monophthongs. This may be a remnant of an earlier Westphalian variety that may have been generalized throughout the Twents region. On the border with the Achterhoeks region, there is no clear dialect border, as Twents and Achterhoeks slide into each other (ter Denge 2009).

Many Dutch speakers find Twents unintelligible.

Twents is spoken in the towns of Vriezenveen, Almelo, Rijssen, Hengelo, Borne, Enschede, Oldenzaal, Tubbergen, Ootmarsum, Weerselo, Reutum, Denekamp, Deurningen, Losser, Lonneker, Glanerbrug, Usselo, Boekelo, Haaksenbergen, Diepenheim, Goor, Delden, Markelo and Wierden.

Achterhoeks is a Westphalian Dutch Low Saxon dialect. Achterhoeks is far from Drents, Gronings and Stellingwerfs but is intelligible with Twents (ter Denge 2009). Based on linguistic distance (Fig. 3) it may not be intelligible with Gronings. Achterhoeks is not intelligible with Dutch. Achterhoeks is in very good shape and is widely used as an everyday language.

Achterhoeks is spoken in northern Gelderland east of East Veluws in towns such as Doetinchem, Terborg, Silvolde, Ulft, Dinxperlo, Alten, Winterswijk, Meddo, Groenle, Lichtenvoorde, Eibergen, Neede, Borculo, Ruunlo, Zelhem, Hengelo, Lochem, Laren, Almen and Vorden. Interestingly, Achterhoeks speakers in Dinxperlo can communicate with speakers of Westphalian German Low Saxon in Suderwick, Germany, across the border.

Plautdietsch is a Dutch Low Saxon language that originated in the Netherlands, but then spread to other parts of the world. It forms a subgroup of its own and is quite divergent from the rest of Dutch Low Saxon. It is not intelligible with many other Low German languages, Standard German, or Pennsylvania German. Plautdietsch has 50% intelligibility with Hutterite German.

This language was originally a Friesland Dutch Low Saxon lect, but its speakers moved to Prussia after they were persecuted for their religion, and later they moved to the US. This is the language of the Mennonites worldwide.

Fig. 2. Map showing the various languages spoken in Belgium, in the Dutch language. 1. West Flemish. 2. East Flemish. 3. Brabantian. 4. Limburgish. 5. Low Dietsch. 6. Ripuarian. 7. Luxembourgish. 8. Lorraine. 9. Champenois. 10. Walloon. 11. Picard. 1-5 are varieties of Dutch, 6-7 are varieties of German and 8-11 are varieties of French.

Flemish or Vlaams is a separate language, recognized as such by Ethnologue. Flemish has anywhere from 30% (Zweers 2009) to 66% (Van Bezooijen 1999) intelligibility with AN. However, it is more complicated than that, for in truth, Flemish is more than one language. The primary split is between West Flemish and East Flemish. It is now widely acknowledged that West Flemish and East Flemish are not completely mutually intelligible.

Hinrichs undated makes a strong case for the inclusion of Flemish as a recognized regional language in section III of the European Charter for Regional or Minority Languages based on linguistic distance to AN. The distance between Flemish and AN is as great as between Low Saxon and Dutch, and Low Saxon is recognized.

Fig. 3. Map of the major Dutch languages, including Hollandic, East Flemish, West Flemish, Zeelandic, Brabantian and Limburgish.

VRT-Nederlands, BRT-Nederlands, VT-Nederlands or BT-Nederlands are abbreviations for the form of AN spoken in Belgium. It may be thought of as “Dutch with a Flemish accent.” It is easily intelligible with AN, and is increasingly heard on Belgian TV. Further, many Flemings can also speak this language, which is pretty much what they are taught in school under the rubric of “Dutch” classes. There is tremendous confusion between this dialect and “Flemish.”

This dialect is simply a dialect of Dutch or AN. The varieties subsumed under Flemish are completely different languages altogether. This dialect is making increasing inroads in Belgian life and some Flemish speakers are becoming alarmed about this.

Standard Flemish, Verkavelingsvlaams, Vlaamse Tussentaal, VT or Soap Vlaams (henceforth VT) is a koine developed recently in Belgium that is understood by all Flemish speakers and is used often on TV. It is a mixture of an artificially created Standard Flemish and the local dialects, and AN speakers find it quite incomprehensible. It is nearly the same as Belgian Brabantian, and the two are fully intelligible with each other. VT probably has around 3.4 million speakers in Belgium.

West Flemish or West Vlaams is a highly divergent Low Franconian language that, along with French Flemish and Zeelandic, is part of Southwest Low Franconian and is the closest to the original Old Franconian. This group of languages is interesting because they have retained Ingvaeonic or North Sea Germanic features. Ingvaeonic is the postulated language that gave birth to Old English, Old Saxon and Old Frisian, possibly 2,000 YBP. It was spoken in what is now the Netherlands, northwest Germany and Denmark. There are also influences from langues d’oïl, not so much French proper as Picard, which is spoken adjacent to the West Flemish region.

West Flemish is spoken in Zeelandic Flanders in the Netherlands, West Flanders Province in Belgium and French Flanders in Nord Province in France (see map Fig. 1). East Flemish speakers have a hard time understanding West Flemish, especially the variety spoken in France. For example, West Flemish speakers regularly get subtitles on Belgian TV. Studies have shown that speakers of Antwerp East Flemish cannot understand the West Flemish of Oostende, Diksmuide, or Kortrijk, cities in West Flanders Province (De Houwer 2008).

West Flemish has 1 million regular speakers in West Flanders in Belgium and 70,000 in Zeelandic Flanders for a total of 1.07 million speakers. It also has a few speakers in Flemish Zeeland in the Netherlands.

Brugs is a West Flemish dialect spoken in and around the city of Bruges. It is quite divergent from other West Flemish dialects, and even other Flemish speakers find it hard to understand. However, precise intelligibility with West Flemish per se and not Flemish per se (whatever that means) is needed before we can determine whether or not it is a separate language. Brugs has been declining recently and is being replaced with a more widely spoken Flemish, possibly VT.

Kortrijks is a West Flemish dialect spoken in the city of Kortrijk in the southeast of West Flanders. It is also spoken in the towns of Kuurne, Wevelgem, Ledegem, Moorslede, Muelebeke, Tiens and Izegem. Past Tiens, it starts turning into the Brugs dialect. Past Moorslede, it starts turning into the Ypres dialect.

Ypres is a South Flanders dialect spoken in and around the city of Ypres in the south of West Flanders. It is different from Kortrijks.

Waregems is a dialect spoken in the West Flanders city of Waregem. It is different from Kortrijks and is unique in some ways. It is best seen as a West Flanders dialect heading out towards the East Flanders language. There is an entire area on the border between West Flanders and East Flanders where the dialects may be hard to characterize as belonging to either the West Flanders or East Flanders languages. There is a suggestion that only those from the immediate area can understand Waregems well, but until we get better data, it is premature to split it.

Vlaemsch or French West Flemish is a highly divergent West Flemish lect spoken in France that has been diverging from the rest of West Flemish for over 300 years since Louis XIV annexed it to France around 1680. Vlaemsch is full of French loan words, and other West Flemish speakers (such as Oostende West Flemish speakers) have a hard time understanding it, so it is probably a separate language.

Though it is recognized by the French government as a minority language (as “Dutch”), it gets no support from them and has been declining for centuries. It has 60,000 speakers, 20,000 of whom use it every day. The vast majority of Vlaemsch speakers are over age 60. Vlaemsch will probably go extinct in a matter of decades.

East Flemish or East Vlaams is a separate language spoken mostly in East Flanders in Belgium but also in Zeelandic Flanders in the Netherlands. It is not intelligible with AN. For example, the East Flemish speakers in Zeelandic Flanders have a hard time understanding the Brabantian Dutch speakers across the Schelde River. Also, East Flemish speakers have a hard time understanding West Flemish.

West Flemish speakers moving to Ghent in large numbers have created so many problems that the city council took action against them for “speaking a language that no one could understand,” that is, West Flemish.

Not only is East Flemish a separate language, but there is tremendous dialect diversity inside of East Flanders. In fact, it appears that East Flemish is more than one language. East Flemish probably has about 1.1 million speakers, almost all in Belgium, but that figure may be inflated. The true number of speakers is hard to determine. There are 1.4 million residents in the area, but they cannot all speak East Flemish.

Gents is a highly divergent East Flemish lect spoken in Ghent, Belgium. It is considered very hard to understand even by other East Flemish speakers, so it may be a separate language. To South Brabantian speakers, it may as well be Greek.

In fact, there are two different dialects of Gents, one on the west side of the city and another on the east side. In addition, the dialects of the villages around Ghent are also said to be different from Gents itself. Intelligibility data for the various dialects in and around Ghent is not known. This language has many features of a “language island,” in that it differs markedly from surrounding East Flemish lects. Gents has a strong French influence and many French loans.

Dendermonds is another highly divergent East Flemish lect spoken in the city of Dendermonde. Studies indicate that other East Flemish speakers have a hard time understanding it (De Houwer 2008), so it may well be a separate language. Dendermonde is about halfway between Antwerp and Ghent. This language has heavy Brabantian influence, and that is why it is so different from the rest of East Flemish.

Lokers is an East Flemish dialect spoken in the city of Lokeren in the northeast of East Flanders on the border with Brabantian. Here East Flemish is transitioning to a group of Brabantian dialects called Wase, spoken in the Waseland. This dialect may be close to Dendermonds.

Limburgish is an East Low Franconian language that is spoken in the Netherlands and Belgium. It is a separate language and is not intelligible with other forms of Low Franconian nor with any Low German. As a part of Meuse-Rhenish, it is transitional between Low Franconian (Dutch) and Low German (German).

Limburgish and Dutch had very different geneses – Limburgish came from Old East Low Franconian, and Dutch came from Old West Low Franconian. Limburgish has 1.6 million speakers. Each village and city has its own dialect, but they are all mutually intelligible. There are as many as 580 different Limburgish dialects.

Although Limburgish is said to be intelligible with Ripuarian, the truth is that it is not inherently intelligible with it. There are however some Limburgish and Ripuarian dialects on the borders of the two that are transitional between Ripuarian and Limburgish. See the South Guelderish and the Low Dietsch entries here for more on those transitional languages.

Limburgish is one of the Meuse-Rhenish languages. It is often claimed that Limburgish is intelligible with German, but this is not so. The intelligibility situation with regard to Limburgish and AN is confusing. Some say that Limburgish has marginal intelligibility with AN (Zweers 2009), but other Dutch speakers say that they can barely understand a word of Limburgish. A study concluded that Dutch speakers have about 89% intelligibility of Limburgish.

The real pure Limburgish is not intelligible with Standard Dutch at all, but what is most often spoken nowadays is a sort of a Dutch-Limburgish mixed language that is intelligible to most AN speakers. However, there are still some speakers of the real pure Limburgish around.

This Wikipedia article on Limburgish is wrong. It groups all of Bergish, South Guelderish, Southeast Limburgish and Dutch Limburgish into one “variety” or dialect, and then refuses to call that variety a language.

However, “Limburgish” is composed of at least four languages. Bergish is a separate language, not intelligible with Southeast Limburgish (60% intelligibility), South Guelderish, or Dutch Limburgish. Neither is Southeast Limburgish intelligible with Limburgish. And Venlo may well be a separate language all of its own.

Greater Limburgish, including Limburgish, SE Limburgish, Kleverlandish and Bergish.

Geleens is an East Limburgish dialect that is spoken in the city of Geleen in Limburg Province in the Netherlands. It differs quite a bit from the dialect of Sittard, even though the two cities have recently merged.

Sittards or Zittesj is an East Limburgish dialect that is spoken in Sittard in Limburg Province, the Netherlands. It’s quite different from Geleens. It is closest to dialects right across the German border, but otherwise it is a transitional Middle Limburgish-South Limburgish dialect, similar to Roermond.

Heerlen Dutch is a Limburgish-Dutch creole or dialect of Dutch spoken in the city of Heerlen in Limburg Province, the Netherlands. In the 1800’s, there were many coal miners in this area and everyone spoke Heerlen Limburgish. As the mines expanded, people came to work from all over the Netherlands and even the Kerkrade region of Germany.

None of them spoke Heerlen, and many didn’t even speak Limburgish. Later a sort of creole based on AN and Heerlen arose. What we have now is a Dutch dialect with a Heerlen base and a strong Limburgish flavor, not really a Limburgish dialect per se. Heerlen Dutch is apparently intelligible with AN.

Hasselts or Hessels is a Limburgish dialect spoken in Hasselt in Belgian Limburg. Dialect leveling has been occurring in the past 50 years as rural residents of the surrounding villages moved to Hasselt. It is best analyzed as a Belgian Limburgish dialect transitional with Brabantian.

Maastrichts is a Limburgish dialect spoken in the city of Maastricht in Dutch Limburg. It has 60,000 speakers and hence is the largest Limburgish dialect. It is still widely spoken in the city. Maastrichts differs significantly from the dialects of the neighboring villages.

Horsters is the Limburgish dialect spoken in the city of Horst in Dutch Limburg. Some say that everything north of Venlo is outside of Limburgish proper and into South Guelderish. That’s an interesting argument, but we will leave it in Limburgish for now, especially since Limburgish isoglosses extend to just north of Horst. Some see it as transitional between Limburgish and South Guelderish, Kleverlandish and North Limburgish.

Tegels is a Limburgish dialect spoken in the city of Tegelen in Dutch Limburg. Although Tegelen is very close to Venlo, Tegels is a typical Limburgish dialect, while Venlo is North Limburgish and is probably a separate language altogether.

They are so different because Tegelen was ruled by the Duchy of Gulik for 750 years, while Venlo was under the Duchy of Gelders for 400 years. The duchies did not end their rule of both cities until around 1800 or so. Tegelen did not go to the Netherlands until 1817, when it was traded to the Netherlands from Germany in exchange for the Dutch city of Herzogenrath, which was traded to Germany.

Weerts or Wieërts is a Limburgish dialect spoken in the city of Weert in Dutch Limburg. It is a Middle Limburgish dialect. Weerts, together with another Limburgish dialect spoken in Hamont in Belgian Limburg and a dialect of Bavarian, has more vowels than any other lect on Earth – 28 of them. The area around Weert has many forests, sand dunes, bogs and marshes. This part of the Netherlands is also very Catholic. In the far north, it tends to be a lot more Protestant.

Hamont is a Limburgish dialect spoken in Hamont, on the border with the Netherlands in Belgian Limburg.

The map below (Fig. 3) is quite interesting. As we can see below, Limburgish is further removed from Dutch than Veluws, Afrikaans, and Dutch Low Saxon. Much of Dutch Low Saxon is also further from Dutch than Afrikaans.

Fig. 3. Map of the Netherlands and Belgium, in Dutch, showing the positions of various lects. Higher numbers are farther from AN; lower numbers are closer to AN. As you can see, some varieties of West Flemish, Limburgish and Dutch Low Saxon are very far from AN. Afrikaans is also quite a ways away. The map shows Dutch Low Saxon, Hollandic, Brabantian, Kleverlandish, Limburgish, Low German, Zeelandic, West Flemish, East Flemish and French Flemish and Afrikaans in South Africa. It also shows Indonesia, Suriname and the Dutch Antilles as Dutch speaking regions.

South Low Franconian is the name for a lect spoken in Germany just east of Limburg Province in the Netherlands. Dialects include Jlabbacher Platt of central Mönchengladbach, Föschelner Platt of Fischeln in Krefeld, and Dremmener Platt of Dremmen near Heinsberg. The intelligibility of these German lects with the rest of Meuse-Rhenish is unclear, and it may be a separate language altogether. The closest in intelligibility would be to Bergish, Venloos and Southeast Limburgish in that order.

Southeast Limburgish (SE Limburgish) is an East Low Franconian language made up of a number of dialects that are transitional between Limburgish and Ripuarian. It has a close relationship with Limburgish. Some call SE Limburgish/Low Dietsch/Aachen German by an alternate name – Limburgish-Ripuarian of the Three Countries Area.

Some classifications put this language into Ripuarian, but it is possibly better analyzed as Limburgish or better yet Ripuarian-Limburgish transitional. The classification is important since if it is Ripuarian, this language is “German,” and if it is Limburgish, it is “Dutch.” But if we see it as Ripuarian-Limburgish transitional, this language may most properly be characterized as a Dutch-German transitional lect.

It is spoken in Belgium around Eupen, including Welkenraedt, Lontzen, Raeren, La Calamine, Eynatten, Gemmenich, and Moresnet; in the Netherlands between Ubach and Brunssum in the towns of Kerkrade, Bocholtz and Vaals, where it is known as Waals; and in a large area in North Rhine-Westphalia between the cities of Aachen and Eschweiler in the towns of Stolberg, Wurselen, Eilendorf and Kohlscheid. To the east over by Duren (Dürener Platt), we start moving into Ripuarian proper. It is also spoken in the far upper Eifel region around the Hurtgen Forest (Tulipan 2013).

It is a separate language, unintelligible to those outside the region. Most if not all Southeast Limburgish lects appear to be intelligible with each other (Tulipan 2013).

Bocholtzer is a SE Limburgish dialect spoken in the towns of Bocholtz, Bocholtzerheide and Baneheide in Limburg Province. It is still very widely spoken in the area. Intelligibility is about 90% with Stolberg German (Tulipan 2013).

Aachen German or Aachener Platt is a SE Limburgish dialect spoken in this same general region in Aachen, North Rhine-Westphalia on the border with Belgium. Aachen German has 60% intelligibility with Bergish, the form of Limburgish spoken across the border (Harms 2009). The common notion is that Aachen German and Bergish are the same language. Since they are not intelligible, this is not the case.

Intelligibility with Stolberg German is excellent (Tulipan 2013). Aachen German intelligibility with Ripuarian is variable, but averages 40% (Köhler 2015). Aachen German has 50% intelligibility with Dürener Platt, 30% with Kölsch, and 25% with Eupener Platt.

Stolberg German is a SE Limburgish dialect spoken in Stolberg, Germany, near Aachen. It is intelligible with Aachen German, though it has more Ripuarian influences, and it has 90% intelligibility with Kirchröadsj, Vaals, etc. Other than with Kirchröadsj, Vaals, etc., intelligibility is not good with the rest of the lects spoken in the Netherlands, including Limburgish proper. Stolberg German is still widely spoken (Tulipan 2013).

Kirchröadsj is a SE Limburgish dialect spoken in Kerkrade in the Netherlands. It is often put into Ripuarian, but we will put it in SE Limburgish instead. Kirchröadsj is not fully intelligible with Kölsch. But it along with Vaals and related lects is about 90% intelligible with Stolberg German (Tulipan 2013).

Low Dietsch is a lect, often thought to be a SE Limburgish dialect, that is made up of a number of subdialects that are transitional between Limburgish and Ripuarian. However, Low Dietsch is better seen as a separate language because intelligibility with Southeast Limburgish is poor (Köhler 2015). When people say that Limburgish and Ripuarian are mutually intelligible, what they mean is that there are languages like Low Dietsch and Southeast Limburgish that are transitional between Limburgish and Ripuarian.

Around Eupen a Low Dietsch dialect called Eupener Platt (Eupen German) is spoken. Eupener Platt has only 25% intelligibility with Aachen German. Aachen Platt speakers say that Eupener sounds funny, like a mixture of Platt, French and English (Köhler 2015). Intelligibility is difficult with Stolberg German (Tulipan 2013).

Low Dietsch has been slowly dying out since World War 1, almost a century ago, and it is not spoken much anymore. However, in recent years it has been undergoing a renaissance, and it is now being spoken more, even by young people, who seem to be spearheading the resurgence (Tulipan 2013). Eupener Platt has high but not full intelligibility with Kölsch (~70%) and the Middle Limburgish spoken in Heeren.

The following is an example of Eupener Platt.


De Ammerekaaner
By Siegfried Theissen

Wi de Ammerekaaner no Öëpe koëmte – iich gelöüf, et woër veerenvärrtech off voëvenvärrtech – wonnde ver ä gene Wéërt. Wi ver no hoërte dat-te Ammerekaaner ä gene Hollefter, a ge Schokkelaates, en gruëte Käüche oppgemaakt hoë, léïpe véër Kaïnder dahään, waïl aïnder es fertaut hoë, dadd-et ta Panneköük ömmesöss güëf. Änn taatsächlech, jédderéïne kräch esuvoël Panneköük, wi-e draage koss!

Änn véër Kaïnder krächte ouch noch en Taafel Schokkelaat, gätt watt fer allt lang neet mië geséë hoë. Dé Schokkelaat woër esu schwarrt wi di ammerekaanesche Köch.

Di Schwarrte doschde suwisuë märr Dénnsmättje schpéële! Obb-ene gouwe Daach gäng derr Vadder métt, änn éïne van di Schwarrte, dé gätt Döttsch koss, waïl-e e gannts Joër bi de Döttsche gevange gewässt woër, vrodde ann derr Vadder, off-e neet föël Gaïlt ferdeene wöül. Derr Vadder woër natüürlech mésstrouwesch änn saat: „ Watt möss-ech da davöër doë?“. – „Véër Schwarrte, saat-é Schwarrte, wäärde van de wétte Offtséëre esuë schléët behaïndelt, ver wäärde ouch esuë schléët betallt, dadd-iich nou oug ens gätt ferdéïne wéll!

Iich hann ene ganntse Kammjong voll Tsigerätte geklaut, änn dé wéll ech nou vöër voëvduusent Frang verkoupe. Et möss waal hü noch séë, waïl möëre wäärde ver versatt!“ Derr Vadder ho jo di voëvduusent Frang geschpaart, mä e saat, e möss terösch métt sinn Vro drövver kalle.

De Modder saat: „ Dat-tönnt fer! Esunne Kammjong Tsigerätte éss en Milljuën wäärt! Di Tsigerätte verkloppe ver ä Oëke, änn dé Kammjong wäärt fer béï ene Buër kwiit.“ Mä derr Vadder woër te bang. E woss neet, wu e dé Kammjong aunderschtélle köss, änn-e saat ouch: „Wänn de Ammerekaaner es schnappe, da schéëte di es, of-fer koëme joërelang ä gene Topp.“ Do saat-e Modder: „No hä ver ens Milljonäär wäärde könne, änn no hass-tou géïn Kuraasch!“ Mä derr Vadder saat märr: „Dou haas-tech förrege Wéëk allt genoch gelaïst!“

Iich woss néït, watt-e damétt maïnt, änn do vertaut de Modder: „ Ä gen Gosspertschtroët sönnd ouch Ammerekaaner änne su Huus, änn jéddesch Kiër wi ech da verbéïkoëmt, vrodde esunne Schwarrte: ‚ No Cognac? I give Cigarettes and Chocolate for Cognac!’ Iich ho allt lang géïne Konnjakk mië, mä ech ho waal noch en léëch Konnjakkflaïsch, médd-et Étikätt änn dréï Schtääre dropp.

Iich di Bubbel voll Tië gedoë, derr Schtopp dropp, alles fië togepläkkt änn no di Ammerekaaner. Wi di di Flaïsch soëge, paggde di mech en Schtang Tsigerätte änn dréï Taafele Schokkelaat änn en Tüüt, änn ië di di Flaïsch oppmaake kosste, léïp iich ewäkk, datt mech de Vokke vloëge. Wänn di mech kréëge häë, di häë miich kaut gemakkt! Mä saïtämm bänn ech neet mië dörrech gen Gosspertschtroët gegange!“

Hôessëlts is a Low Dietsch dialect spoken in Belgian Limburg in the small city of Hoeselt. It’s dying out, but a dictionary of it was recently published.

Fig. 4 Dutch to German transition dialects – Kleverlandish in the north, Niederfrankisch in the middle and Ripuarian in the south.

Aeres, Æres or Ourish is a West Central German Central Franconian language spoken around the German-Dutch border area that is closely related to, but very different from, Limburgish. It is spoken in several villages in the Dutch provinces of Gelderland and Overijssel and in the German state of Nordrhein-Westfalen.

It has 600 speakers, but there were formerly many more. Most speakers are elderly. Some say it is part of Dutch Low Saxon, others that it is close to Limburgish, and others that it is close to Frisian, so its classification is quite confused. Some people say that the whole idea of this language is a fraud since good sources are hard to find, but this seems questionable. On the other hand, the existence of this language has not been well proven.

South Guelderish/Kleverlandish is a Low Franconian language consisting of South Guelderish spoken in the Netherlands and Kleverlandish spoken in Germany. It is part of Meuse-Rhenish, and hence is transitional between Low Franconian (Dutch) and Low German (German).

Dialects include Rheden, Cleves (Kleve, Kleef), Oberhausen, Essen-Werder, Venlo, Venray, Liemers, Cuijk, Groesbeek and Zevenaar, and also the dialects of Northern Limburgish. The Cuijk dialect is typical. South Guelderish has a very heavy Frisian substratum. Based on its distance to AN alone (see Fig. 3) it must have difficult intelligibility with AN, probably along the lines of Zeelandic.

Overbetuws is a South Guelderish dialect spoken in the Upper Betuws region of Gelderland. Cities in this area include Valberg, Elst and Zetten. It was widely spoken until recently, when it began to decline. It is similar to Liemers.

Liemers is a South Guelderish dialect transitional to Achterhoeks that is spoken in the Liemers region in the far east of Gelderland east of Arnhem to the German border. It is spoken in the towns of Didam, Zevenaar, Lobith and Wehl. This dialect is basically South Guelderish transitional to Achterhoeks Low Saxon.

Kleverlandish is South Guelderish spoken in Germany along the border with the Netherlands. Kleverlandish lects are quite a bit different from South Guelderish, but intelligibility data is lacking.

This dialect is often referred to as Kleverländisch. It is spoken southeast of Munster along the border with the Netherlands and north of Cologne in North Rhine-Westphalia.

Kleverlandish is not intelligible with Bergish (Harms 2009), as one is an analogue of North Limburgish and the other an analogue of South Limburgish. Venlo Kleverlandish is incomprehensible to most Dutch speakers. Kleverlandish is still widely spoken in Wesels, Germany, at least by the older generation (Anonymous 2009).

Venloos is an extremely divergent Dutch lect spoken in the city of Venlo in the center of Limburg Province. In the north of Limburg, Limburgish is no longer spoken, and the lect changes to more of a Guelderish/Brabantian type.

Venloos is interesting because it is so different. It seems to be transitional between Limburgish, Ripuarian German, and Guelderish/Brabantian. On purely structural grounds, there are suggestions that it is a separate language, but since we are dividing only on intelligibility and not structural grounds here, that won’t cut it. In the linguistic literature, statements are made to the effect, “If Limburgish is a separate language, then Venloos must surely be also.”

Venloos is regarded as particularly incomprehensible by many AN speakers, much more so than Limburgish. Venloos may well be a separate language, as it appears to be poorly understood outside of the Venlo region. Venloos is still very widely spoken in Venlo, even by young people. The Heinisch dialects next to the Dutch border in Viersen (Viersener Platt), Breyellsch Platt of Breyell in Nettetal and Jriefrother Platt of Grefrath are intelligible with Venloos.

A map of the Kleverlandish language, including towns where it is spoken.

Bergish or Niederbergisch is a form of Low Rhenish that is analogous to Limburgish. It is essentially Limburgish spoken on the other side of the border in Germany, but the variety in Germany is a separate language.

There are two high-level splits in Niederbergisch: Südniederfränkisch or Bergisch, and Ostbergisch. However, the two appear to be mutually intelligible, so they are dialects of a single language (Harms 2009). The following nonbolded entries are all dialects of Niederbergisch Low Rhenish.

Ostbergisch or East Bergisch is spoken around Mülheim an der Ruhr, Saarn and Gummersbach. Gummersbach is a dialect of this language. All dialects are intelligible with Düsseldorver Platt Bergish (Harms 2009). Ostbergisch has a close relationship with the Sallands Gelders-Overijssels Dutch Low Saxon dialect spoken in Zutphen; however, the two are not completely intelligible. Dialects include Duisburg and Wuppertal.

Mülheim an der Ruhr is the classic form of Ostbergisch spoken in Mülheim an der Ruhr, Nordrhein-Westfalen (North Rhine-Westphalia), Germany. It is quite different, but it is still intelligible with the other dialects.

Saarn Mülheim an der Ruhr is spoken in the Saarn District of Mülheim an der Ruhr, Nordrhein-Westfalen (North Rhine-Westphalia), Germany, but it differs considerably from the standard version of Ostbergisch. Nevertheless, it is fully intelligible with the other dialects.

Bergish is one of two high-level splits in Niederbergisch. It is definitely not intelligible with Cleves Kleverlandish (Harms 2009). This language is based on Low Rhenish but has acquired a heavy Ripuarian layer such that speakers feel that their speech somewhat resembles the Ripuarian language Kölsch, which is nearby (Harms 2009).

There are various dialects of this language, including Krieewelsch, spoken in central Krefeld, Ödingsch of Uerdingen in Krefeld, Metmannsch Platt of Mettmann, Düsseldorver Platt of northern and central Düsseldorf, Vogteier, spoken in Nieukerk, Solinger Platt of Solingen, Remscheder Platt of Remscheid, Rotinger Platt of Ratingen, and Wülfrother Platt of Wülfrath, which is located between Düsseldorf and Wuppertal. Solingen, Krieewelsch and Wülfrath are all mutually intelligible (Harms 2009). It is also spoken in Neuss, Remscheid, Mönchengladbach and Heinsberg.

Düsseldorver Platt is intelligible with Ostbergisch but not with South Guelderish, Limburgish or Aachen German. Düsseldorver Platt has 60% intelligibility with Aachen German. Düsseldorver Platt is not fully intelligible with any of the various lects spoken in the Netherlands (Harms 2009).

Düsseldorver Platt is mostly spoken only by older people these days, who nevertheless keep it very much alive. Middle-aged people often have passive but not active competence, and young people may lack both, though some can understand the language when they hear it.

Solinger Platt is a form of Bergish spoken in Solingen, North Rhine-Westphalia, Germany. The link leads to a description of it and a transcription of a short story in the dialect. It is fully intelligible with Düsseldorver Platt (Harms 2009).

The Meuse-Rhenish lects, including SE Limburgish, Limburgish, Bergish, and Kleverlandish.


Anonymous. Wesels Kleverlandish native speaker, Wesels, Germany. Personal communication. July 2009.

Anonymous. Antwerps, AN and Verkavelingsvlaams speaker, Antwerp, Belgium. Personal communication. January 2010.

Berns, J.B. 1991. “De Kaart van de Nederlandse Dialecten”, in Herman Crompvoets and Ad Dams, eds., Kroesels op de Bozzem. Het Dialectenboek, Waalre:24-27

DeEllis, Jonathon. Dutch-English translator and former Venlo resident for 10 years. January 2010. Personal communication.

Felder, Lianne. May 2015. Resident of Groningen City, Netherlands, ABN speaker. Personal communication.

Gooskens, Charlotte & Heeringa, Wilbert. 2004. The Position of Frisian in the Germanic Language Area. In: Gilbert, D. &  Schreuder, M. &  Knevel, N. (eds.), On the Boundaries of Phonology and Phonetics, 61-87. Klankleergroep, Faculty of Arts, University of Groningen, Groningen. Dedicated to Tjeerd de Graaf.

Gooskens, Charlotte and Kürschner, Sebastian. 2009. On the Low Saxon Dialect Continuum – Terminology and Research. In Lenz, Alexandra N.; Gooskens, Charlotte and Reker, Siemon (Eds.). Low Saxon Dialects Across Borders – Niedersächsische Dialekte Über Grenzen Hinweg, Zeitschrift für Dialektologie und Linguistik. Beihefte 138:9-27.

Gooskens, Charlotte; Kürschner, Sebastian and van Bezooijen, Renée. Intelligibility of Low and High German to Speakers of Dutch. Dialectologia (submitted for publication, not yet published).

Grondelaers, Stef. Linguist, the Netherlands. Personal communication, August 2009.

Harms, Biggi. Düsseldorf Bergish native speaker. Personal communication. March 2009.

Heeringa, Wilbert. 2004. Dialect Variation in and Around Frisia: Classification and Relationships. Us Wurk; Tydskrift foar Frisistyk 53(4).

Heeringa, Wilbert. Jan. 2004. Measuring Dialect Pronunciation Differences Using Levenshtein Distance (Chapter 9). PhD Dissertation, University of Groningen.

Gooskens, Charlotte and van Bezooijen, Renée. 2005. How Easy Is It For Speakers of Dutch To Understand Spoken and Written Frisian and Afrikaans, and Why? In: J. Doetjes and J. van de Weijer (eds). Linguistics in the Netherlands 22:13-24.

Houwer, Annick; Remael, Aline and Vandekerckhove, Reinhild. July 2008. Intralingual Open Subtitling in Flanders: Audiovisual Translation, Linguistic Variation and Audience Needs. Journal of Specialized Translation 10.

Hinrichs, Erhard; Gerdemann, Dale and Nerbonne, John. Undated. Measuring Linguistic Unity and Diversity in Europe. Project Proposal. Rijksuniversiteit Groningen.

Köhler, Pascal. Eschweiler German and German native speaker. Personal communication. January 20, 2015.

Nerbonne, J.; Heeringa, W.; van den Hout, E.; van der Kooi, P.; Otten, S. and van de Vis, W. 1996. Phonetic Distance Between Dutch Dialects. In: G. Durieux, W. Daelemans, and S. Gillis (eds.). CLIN VI, Papers from the Sixth CLIN Meeting. Antwerpen. University of Antwerp, Center for Dutch Language and Speech, 185-202.

Smith, Norval. Linguistics professor, the Netherlands. Personal communication. March 2009.

ter Denge, Martin. Twents native speaker, Rijssen, the Netherlands. Personal communication. November 2009.

Tulipan, Laszlo. Stolberg German native speaker, Stolberg, Germany. Personal communication. April 2013.

van Bezooijen, Renée and van den Berg, Rob. 1999. Taalvariëteiten in Nederland en Vlaanderen: Hoe Staat Het Met Hun Verstaanbaarheid? Taal en Tongval 51(1): 15-33.

Zweers, Steven. Dutch native speaker, the Netherlands. Personal communication. March 2009.

This research takes a lot of time, and I do not get paid anything for it. If you think this website is valuable to you, please consider a contribution to support more of this valuable research.

A Reworking of the Classification of Philippine Language Group

I really don’t know much about the classification of the Philippine languages, other than that they are all Austronesian, and they are fairly close to one another. How close, I haven’t the faintest idea. The ones listed as different languages are certainly not intelligible though.
The Austronesian family outside of Taiwan is also huge, and it is also fairly close-knit. In fact, it is probably one of the closest large collections of related languages on Earth.
These include all of the languages of the Philippines, most of the languages of Indonesia and Malaysia, some New Guinea languages (on the coast), most Melanesian languages (some Melanesians appear to speak Papuan languages), and all Polynesian and Micronesian languages. Yes, indeed. Hawaiian, Samoan and Maori are all related to Tagalog, Bahasa Indonesia and Bahasa Malaysia. Even more shocking is that there are quite a few cognates. You might be surprised looking through a list of basic words in Tagalog and Hawaiian.
I don’t know enough about Philippine languages to figure out if the author is onto something here.
He also has a couple of maps of what the Philippine languages would have looked like without Spanish colonization and what they look like with Spanish colonization (the actually existing situation), but I don’t know enough about that to make sense of it either. But it’s interesting.

Weird Language – Yola

Note: Repost from the old blog.
Extinct language, related to English! Straight up from Old English, from there to a branch of Middle English via a group of Normans who went to Ireland in 1169. There are some texts on the page showing Yola with English alongside. Strange how one’s language can change so much in just 700 years. It’s almost unrecognizable and would surely be unintelligible.
You folks want to know how new languages are created? Check out this example.
Check out also the strange language called Fingalian.
Last speaker – Martin Parle, Carnsore Point, late 1870’s.

Check Out Leonese

This is a sample of the Leonese language spoken in northwestern Spain. If you know Spanish or Portuguese, you might want to check it out, as it is quite similar to both.
The Leonese language is part of the complex known as Asturian-Leonese, said to be a single language. However, it is actually several languages. Leonese and Asturian are surely separate languages. In addition, Fala and Mirandese, other members of this purported language, are in the opinion of Ethnologue also separate languages. However, Fala and Mirandese may well be intelligible with Galician.
All of these languages are more or less on the border of Portuguese and Spanish. More precisely, they are on the border of Spanish and Galician, and Galician is surely a separate language from Portuguese. Galician is sort of in between Spanish and Portuguese.
Here is a very rare sample of Leonese. Leonese is a dying language, and is only spoken by probably about 25,000 people at most in Leon near the border with northeastern Portugal. The language is seriously dying out, but there are efforts to revive it. At the moment, probably only Western Leonese, spoken near the Portuguese border, is viable. Central and Eastern Leonese are nearly extinct.
It is important to note that Extremaduran, in my opinion a separate language spoken by 500,000 speakers in Extremadura, is nothing but a very old Eastern Leonese dialect stranded in Extremadura. Mirandese in Portugal is best seen as an Extremaduran dialect stranded in Portugal, with heavy Galician admixture. Fala is more or less a similar thing. Fala is utterly incomprehensible to a Spanish speaker.
Barranquenho, really a separate language spoken in the small border town of Barrancos, Portugal, is a mix between Alentejan Portuguese and Extremaduran Spanish, and thus is a similar case. Alentejan itself is probably a separate language from Portuguese proper. Barranquenho is essentially incomprehensible to a Portuguese speaker.
Listening to Leonese, I found it much harder to understand than Asturian, though it sounded quite similar. Leonese seemed to be closer to Portuguese than Asturian.
There is a huge language revival movement going on with Leonese, but with no state support, it’s hard to see how far it goes.

A Look At the Catalan Language

Updated September 25, 2011.
Catalan is a Romance language that is most closely related to Occitan. Although Occitan-Catalan started forming in 700-800, Occitan and Catalan are usually thought of as splitting from 1000-1300. However, scholars such as María del Candau de Cevallos and others present evidence that Catalan was already breaking away from Catalan-Occitan as early as the 700’s-800’s.
An alternate method is to see Catalan as part of something called Ibero-Romance together with the Romance languages of the Iberian Peninsula and to put Occitan in Gallo-Romance together with French and related tongues. It’s better to just avoid this and create a whole new category called Catalan-Occitan.

The Catalan-speaking world. Catalan is mostly spoken in Catalunya and Valencia in Spain, a bit in Aragon in Spain, and also in far southwestern France in Roussillon. The three shaded islands on the map are the Balearics. The tiny shaded area on the island at the far right represents Algherese Catalan spoken in Alghero, Sardinia.

There is a common notion running about that Catalan speakers can understand Occitan. Although surely it differs with exposure, in general, Catalan speakers have a hard time understanding Occitan. Intelligibility between the two languages is probably on the order of 50%. But after only a few weeks of close contact and some intense coaching, they should be able to understand each other pretty well. On this basis, Occitan and Catalan are surely not dialects of a single tongue. However, Catalan and Occitan are very closely related languages.
The same type of folks (I call them “everyone can understand everyone” people or lumpers) also insist that Castilian and Catalan are mutually intelligible. If this were the case, there would be no grounds for a political fight in Catalunya from the Castilian speakers who do not wish to have Catalan shoved down their throats.
The truth is that Castilian speakers can only understand about 40% of written Catalan. Some estimates are that spoken Catalan and Spanish have less than 60% intelligibility. The actual figure may be even less. Catalan is surely not a dialect of Castilian.
There are claims that Catalan and Portuguese are mutually intelligible. This is not the case.
Catalan is also not intelligible with Aragonese. In the Medieval Period, Aragonese and Castilian were considered to be unintelligible to Catalan speakers in the Catalan region. Aragonese is not even intelligible within itself. Why would its speakers be able to understand Catalan too?
Catalan, when spoken, sounds like a cross between Castilian and French.
There is a lot of intense language politics swirling around Catalan. It is the language of an autonomous region of Spain called Catalunya. The fascist Franco tried to kill the language by forbidding its use.
Spanish nationalists are just as horrible as French nationalists, if not worse. As an example, there is a tiny part of Portugal that Spain has occupied for hundreds of years. As per a treaty of 1812, Spain was required to hand over this bit of territory. In the 197 years since then, they have flatly refused to do so. An imperialist Spain continues to occupy a few small islands of frankly Moroccan territory off the coast of Morocco in defiance of Moroccan insistence that they are Moroccan territory.
After the fascists were toppled, Spain was arm-twisted into making Galician, Basque and Catalan into official languages. During the dictatorship, Galician and Catalan were referred to as dialects of Castilian. Recently, Aranese, an Occitan dialect, was also recognized. There are other languages in Spain such as Asturian, Leonese, Murcian, Andalucian, Extremaduran and Aragonese. These are not yet recognized by the imperialist Spanish state.
There are problems in Catalunya. At home, about 1/2 the population speaks Catalan and 1/2 speaks Castilian. However, 95% can understand Catalan, 81% can read Catalan, 78% can speak Catalan and 62% can write Catalan. The Catalan government, understandably, has been mandating the amount of Catalan used on billboards, the percentage of foreign films translated into Catalan, the number of hours of school instruction that must be in Catalan and the hours of foreign language study in Catalan or Castilian.
For this, Castilian speakers have called them “fascist,” but it’s only normal for them to try to save their language, which is not necessarily doing all that well.
In Andorra, the official language is Catalan, and this is also the most widely spoken language. It is the only officially independent Catalan-speaking country on Earth. French and Castilian are also widely spoken.
All dialects of Catalan are said to be mutually intelligible.
However, people say that about the Occitan lects, about Dutch and German, about the Scandinavian languages, about Spanish and Portuguese, on and on, so that is not very reliable.
Further, there is a strong politicization movement similar to Occitan whereby a language in trouble wants to see its various lects as unified under a single language. The notion is that splitting will further endanger a troubled language. Hence, there is a tendency for Catalan nationalists to scream that they can easily understand every variety under the sun. That’s ultimately a politicized response, and it is not scientific.
It’s only natural to wonder whether Catalan is more than one language, so an investigation was undertaken.
Method: Literature and reports were examined and Catalan-speaking informants were interviewed to determine the intelligibility of the various dialects of Catalan. >90% intelligibility was considered to be a dialect of Catalan. <90% intelligibility was considered to be a separate language. The emphasis was on intelligibility rather than structural factors. Overtly political argumentation was ignored.
Results: The result of this investigation was to split Catalan from 1 to 2 languages. Below, separate languages are in bold, and dialects are in italics.
Discussion: Catalan is a very tight-knit language family. The vast majority of Catalan lects can more or less understand each other with few problems. The Blaverist Movement is politically motivated and is not linguistically justified.
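The 90% cutoff described in the Method can be written out as a small rule. What follows is only a minimal sketch of that rule, not the actual procedure used in the investigation: the function name `classify` and the constant name are made up for illustration, and the Method does not say what to do with a figure of exactly 90%, so the sketch returns "unspecified" for that case. The two sample figures are rough estimates quoted elsewhere in this piece (about 93% for Valencian and about 50% for Occitan, both against Catalan).

```python
# Cutoff taken from the Method paragraph: above 90% is a dialect,
# below 90% is a separate language.
DIALECT_THRESHOLD = 90.0  # percent


def classify(intelligibility_pct: float) -> str:
    """Classify a lect by its intelligibility (in percent) with a reference lect."""
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be between 0 and 100 percent")
    if intelligibility_pct > DIALECT_THRESHOLD:
        return "dialect"
    if intelligibility_pct < DIALECT_THRESHOLD:
        return "separate language"
    # Assumption: the Method leaves exactly 90% undefined, so we flag it.
    return "unspecified"


# Rough estimates quoted in this article, each measured against Catalan.
estimates = {"Valencian": 93.0, "Occitan": 50.0}

for lect, pct in estimates.items():
    print(f"{lect}: {pct:.0f}% -> {classify(pct)}")
```

On those two estimates, the rule puts Valencian on the dialect side and Occitan on the separate-language side, in line with the conclusions drawn in the text. In practice the hard part is getting a trustworthy percentage, not applying the cutoff.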
An excellent map of the languages of southwest Europe. Catalan languages and dialects are in dark green.

There are many dialects of Catalan.
Some are: Roussillonese (Northern Catalan), Valencian (Valenciano or Valencià), Balearic (Balear, Insular Catalan, Mallorqui, Menorqui and Eivissenc), Central Catalan, Algherese, Northwestern Catalan (Pallarese, Ribagorçan, Lleidatà and Aiguavivan).
Northern Catalan is actually spoken in France by about 100,000 speakers. It receives no support from the Jacobin French state. Northern Catalan is a very divergent Catalan dialect, although Catalan speakers say that they can understand it just fine. It has a lot of French influence in the lexicon. Northern Catalan sounds very much like French to Southern Catalan speakers. About 40% of the population can speak the language.
Roussillonese is the main dialect of Northern Catalan spoken in France. It’s in better shape than many say it is, but the future prospects are probably not too good.
Roussillonese is close to the Occitan language Languedocien.
There is a tremendous to-do over Valencian. Valencian activists, the Blaverists, insist that Valencian is a separate language from Catalan. This is a political issue, not a linguistic one. Linguistically, it is long settled. Valencian is simply a dialect of Catalan, and the two varieties have about 93% intelligibility. There is no scientific grounds for splitting Valencian into a separate language.
Balearic, Algherese and Roussillonese (Northern or French) Catalan are much further from Central Catalan than Valencian is.
Balearic is spoken in the Balearic Islands and is said to be quite different. Majorca Catalan is somewhat hard to understand for Valencians. It is even hard for Barcelonans to understand. Central Catalan speakers say they go to the islands and communicate without problems; however, others say that the old Catalan language of Ibiza is hard for Barcelonans to understand. Some Balearic speakers, like Valencians, say they speak a separate Catalan language.
Intelligibility between Balearic and Catalan Proper is said to be about the same as between Catalan and Valencian, which would mean that Balearic is a dialect of Catalan. We will tentatively split this off due to reports of intelligibility issues, but this remains very controversial. The best way to sort this out would be through intelligibility studies as have been done with Valencian.
Central Catalan is the main variety and is the most widely spoken. This is the variety of Barcelona, and this is what the literary language is loosely based on. Catalan TV usually uses this dialect.
Northwestern Catalan is extremely divergent.
Ribagorçan is transitional to the Aragonese language and is sometimes called a dialect of Aragonese. The truth is that the eastern part is Catalan transitional to Aragonese, the western part is Aragonese transitional to Catalan and the central part is Benasquese.
Pallarese is also spoken in the same area and is said to be very different.
Aiguavivan is spoken in the high valleys of the Pyrenees and is very different. Related varieties called Chapurriau are spoken in Castellote, Torrevelilla and Matarraña nearby in Aragon and across the border in Valencia. These are mixtures of Old Castilian, Castilian, Valencian, Aragonese and a bit of Catalan. The Valencian element predominates. Although these lects are intelligible with Catalan proper, the speakers insist that they do not speak Catalan.
Benasquese is spoken in the same region as Aiguavivan and is often said to be a Catalan dialect. It is not. It is either a transitional lect between Catalan and Aragonese, a divergent Aragonese dialect, or a separate language in between Aragonese and Catalan. At any rate, however we wish to characterize Benasquese, it is not a Catalan dialect.
All of NW Catalan appears to be intelligible with the rest of Catalan.
At last we come to Algherese, spoken in Sardinia in the town of Alghero. This language is dying out, but there are still 20-30,000 speakers, mostly older people.
Many say that structurally, this is by far the most divergent variety of Catalan, created when Catalans landed on the island over 500 years ago. Algherese has been split from Catalan for over 500 years now. The lect sounds like Medieval Catalan, and furthermore, a lot of Sardinian has entered the language. Catalan speakers say it sounds like Italian.
Reports indicate that Catalan travelers to Alghero can still understand Algherese quite well, albeit as a somewhat Medieval form of Catalan.
However, the venerable Encyclopedia of the World’s Endangered Languages treats Algherese as a separate language, as all of the lects listed are treated as languages. This treatment, by contrast, relies on intelligibility alone, and on that basis, Algherese is a dialect of Catalan, not a separate language.


Candau de Cevallos, María del C. 1985. Historia De La Lengua Española. Potomac, Md.: Scripta Humanistica.
Gulsoy, Joseph. 1982. “Catalan”, Chapter in Posner, Rebecca and Green, John N. (eds.), Trends in Romance Linguistics and Philology, Volume 3. The Hague, Paris, New York: Mouton.
Moseley, Christopher. 2007. Encyclopedia of the World’s Endangered Languages. Abingdon, UK: Routledge, Taylor and Francis Group.


A Reclassification of the Occitan Language

Updated May 29, 2015. Long, runs to 65 pages.

Map of Occitania showing the major dialect divisions including Vivaro-Alpine.

According to Ethnologue, Occitan is currently 1 language. This reanalysis will expand Occitan from 1 language to 22 languages.
Occitan, or Langue d’Oc, is spoken in general in a swath across the south of France. It goes a bit into Spain in the Pyrenees and into far northwestern Italy. There is also an Occitan outlier in Italy.
There are various classification methods for Occitan. One is to differentiate between Langue d’Oil (French) and Langue d’Oc (Occitan). I do not agree that Occitan is particularly close to French. Occitan is about as far from French as Spanish and Italian are.
I would put the Oil languages (including French) in a Northern Gallo-Romance group and put Rhaetian and the Gallo-Romance languages of Italy into a Southern Gallo-Romance group, with Arpitan as transitional between the two.
Another view of Occitan.

Ibero-Romance is a different split altogether. Occitan is better placed into a separate Romance category that I would call Catalan-Occitan. This analysis sees Occitan and Catalan as a single branch of Romance. Catalan-Occitan is then properly put into Ibero-Romance.
It also recognizes that Occitan and Catalan were once a single language stretching across the south of France and into northeastern Spain. This language was very widely spoken and used in the Middle Ages. It is “the language of the troubadours,” the wandering minstrels who plied their trade across Southern Europe in the Middle Ages, though in truth, the troubadours mostly came from Limousin and wrote their songs in a sort of Poitou-Limousin dialect that no longer exists.
Another map of Occitania. This one is a bit harder to read as it is largely in dialect, but if you study it a bit, it makes more sense.

From 500-1200, there was really only one language – Catalan-Occitan. Occitan only started distinguishing itself after 1200.
At the moment, Southern Languedocien has the closest relationship of all to Catalan – in fact, they are intelligible. Gascon is then the next closest to Catalan, but intelligibility data is lacking. The rest of Occitan is more distant from Catalan. Catalan speakers have a hard time understanding Auvergnat, Limousin, standard Languedocien and Provencal.
It is not the case, as often stated, that Catalan and Occitan are intelligible, but they are close. Catalan and Occitan probably have about 50% inherent intelligibility when spoken, much more in writing. Speakers of only Catalan and Spanish report a hard time understanding Languedocien Occitan TV broadcasts. Cultivated speakers could pick up the other language in a few weeks with coaching. The differences between Catalan and Occitan are said to be greater than among the Scandinavian languages.
A Third Map of Occitania. This is the best-laid out of all, and it is the easiest to read and make sense of.

Occitan has been in decline for a long time, as the Langue d’Oil has been supplanting it for centuries. The decline began in 1539 when a French king ordered that langue d’oil be the official language of all of France. Despite a brief revival in the 1800’s, it’s been downhill ever since. Occitan speakers did not start speaking French in large numbers until 1885. Before that, there was only minor French influence on spoken Occitan. Since 1885, French influence on spoken Occitan has increased, in some cases dramatically.
The codification of the Parisian Langue d’Oil as Standard French with the victory of the French Revolution and the corresponding fascist Jacobin war on all other languages caused Occitan to recede further into the background.
The French government is reactionary/fascist on the subject of language. The Jacobin Constitution baldly states that “French is the language of the state” and allows for no other languages. Hence, Occitan receives no state support in any way.
Occitan still has about 8 million people who can understand it and 3 million who can speak it to one degree or another. Estimates of the true number of speakers range from 1-3.7 million. Occitan is surely a modern language and does not lack for vocabulary – it has between 250,000 and 1 million words, though many say that this is an exaggeration.
Occitanists like to say that Occitan is all one language, but this is a political statement. They say this in order to unite the dying language and prevent it from splintering.
The Occitanist position is increasingly popular. For instance, Wikipedia is calling all of the Occitan languages “dialects.”
There are two centralized ways of writing all Occitan dialects, one based curiously enough on a Medieval standard. Around 1850, an Occitanist poet named Frederic Mistral invented a standard based on his own Provencal language, but this solution has not caught on well. The second is neo-Occitan, a new koine language created more recently.
Occitan is spoken most often by those over 50, except in Italy and Spain. Occitan is only protected in Spain and Italy, where the respective forms of Aranese and Transalpine Provencal are spoken.
If you hear Occitan, it sounds like some curious cross between Spanish and French, sort of the way that Catalan sounds.
It’s not true that Occitan is one language as the Occitanists centered in the south of the region insist.
The intelligibility among Occitan lects seems to be as I suspected. People with exposure to the other lects can pick them up pretty quickly, but someone who has never heard the other varieties has a hard time understanding them. This is called learned bilingualism. If learned bilingualism is the rationale for saying that Occitan is a single language, it stands on precarious scientific grounds.
Nevertheless, intelligibility in Occitan remains a very controversial subject. On the one hand, speakers say they can’t understand speakers of the same lect a few miles away; on the other hand, Occitan speakers say they can understand totally different varieties from far away very well.
At this point, it is time for some scientific intelligibility testing to sort all of this contradictory information out. Intelligibility testing has already been done with Occitan. It did find high, but by no means complete, intelligibility between major Occitan lects (Bec 1982). This suggests that intelligibility among Occitan lects is marginal, possibly on the order of 80% among major forms. I would appreciate it if anyone privy to this study could contact me.
Others have put the figure about where I did – at 70-85% intelligibility between major Occitan languages.
It is often said that French and Occitan speakers can communicate well enough. This is not the case. There are many French speakers living in Occitania who say that they cannot understand a word of Occitan.
A good overview of Occitan is here.
Method: Literature and reports were examined to determine the intelligibility of the various dialects of Occitan. A lect with >90% intelligibility was considered to be a dialect of a major Occitan language. A lect with <90% intelligibility was considered to be a separate language split off from Macro-Occitan. The emphasis was on intelligibility more than structural factors, but structural factors were included.
Results: This treatment expands Ethnologue’s 1 Occitan language to 22 Occitan languages.
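The 90% rule in the Method can be sketched in a few lines of Python. This is only an illustration of the threshold, not the procedure actually used for this treatment, which also weighed structural factors and relied on reports rather than measured scores. The function name classify and the constant DIALECT_THRESHOLD are invented for the example, and since the Method does not say how a score of exactly 90% is handled, the sketch assumes it counts as a dialect.

```python
# Illustrative sketch only: names here are hypothetical, not from the article.
DIALECT_THRESHOLD = 90.0  # percent intelligibility, taken from the Method


def classify(intelligibility_pct: float) -> str:
    """Classify a lect by its intelligibility with a major Occitan language."""
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage from 0 to 100")
    # Assumption: the Method leaves exactly 90% undefined;
    # this sketch treats it as a dialect.
    if intelligibility_pct >= DIALECT_THRESHOLD:
        return "dialect"
    return "separate language"
```

For instance, the 60% figure reported below for North Limousin speakers hearing the South Limousin of Ussel falls under the threshold, so classify(60) returns "separate language".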
A great map of all of the languages and dialects of SW Europe. It's in Spanish, but you should be able to understand it anyway. The Occitan languages are the light green stretching across all of southern France.

Gascon is a Southern Occitan macrolanguage spoken in southwestern France and barely over the border into Spain. It has 256,000 speakers, 250,000 in France, but other figures put the number at 500,000. Gascon has some affinities to Basque – it is said to have a Basque substrate – but it is not close to Basque at all. Gascon is probably closer to Catalan than anything else (even closer than it is to Aragonese). However, there is no continuum between Gascon and Aragonese at the border. Instead, there is an abrupt transition.
Gascon is best seen as its own variety of Occitan outside of both Northern and Southern Occitan. It has heavy influence of Basque and Aragonese that makes it very different from the rest of Occitan.
In addition to having a Basque substrate, Gascon is a transition between Ibero-Romance (Occitan proper – Aragonese – Catalan) and Gallo-Romance (langues d’oil).
In contradiction to the Occitan centralizers, Gascon speakers say they speak a separate language and resent both being referred to as speakers of a dialect of Occitan and what they see as the cultural imperialism of Occitan politics centered in Toulouse.
In France, Gascon is spoken in the departments of Landes, Gers, Hautes-Pyrénées, the eastern parts of Pyrénées-Atlantiques and the western parts of Haute-Garonne and Ariège, and it is still used actively by many people.
50 years ago, near La Réole, France, there were still monolingual Gascon speakers among the older people. People were still being brought up speaking Gascon as recently as the early 1980’s. However, in France, it is not being taught much to children.
Gascon speakers have a hard time understanding Limousin, and Languedocien speakers say it’s hard to understand Gascon and vice versa. However, it is easier for Gascon speakers to understand Languedocien (though intelligibility is still difficult) than vice versa due to the French-like regularity of Languedocien. For example, “How are you?” is “Quin hes?” and “Cossi fas?” respectively in Gascony and Languedoc. It goes on like that through the Gascon-Languedocien lexicon. It’s clear we have two completely separate languages here.
Around Agen, there is a transition between Guyennais, Gascon and Languedocien. One village can understand the next, but once you get 10-15 miles away, things get difficult. Provencal speakers say Gascon is a foreign language. Gascon speakers can’t understand a word of Auvergnat.
It makes sense to split Gascon into a West and East Gascon. The border would run from Artix-Pau in the south to Marmande-Agen in the north. East Gascon would then start at around Pau and Agen and West Gascon at Artix and Marmande. These distinctions represent the variant influences of Bordeaux in the west and Toulouse in the east.
West Gascon is spoken from Bordeaux in the north to the Basque country in the south. In the east, it runs to Artix in the south and to Marmande in the north. Its differences with East Gascon revolve around the influence of the large city of Bordeaux on the language.
Dialects include Bazadais, Marmandais, Bordalés and Médoc.
Bazadais is spoken around the town of Bazas, famous for its cows. Marmandais is spoken around the town of Marmande.
Bordalés is spoken around Bordeaux. It is probably in very bad shape. It was declining badly even 50 years ago. There is actually a transitional Occitan-langue d’oil (Saintongeais) region around Bordeaux.
The region around Bordeaux is notorious for its sharp linguistic breaks. One early chronicler estimated that the distance between West Gascon Limonde and Saintongeais-speaking Pays Gabay north of Limonde to the Saintonge border was 50% (further than English and German), while the distance between Limonde and Perigord Limousin speaking Montpon-Menesterol just to the east was 25%.
There is evidence of a very sharp ancient linguistic border between Gascon and Old Saintongeais-Limousin on the southern border of Saintonge from St. Cliers to Coutras.
Médoc is a dialect spoken on the Médoc Peninsula south of the Gironde.
Landes is a West Gascon language spoken in southwestern maritime France in the Aquitaine region. As it is not even intelligible within itself (it differs so much that it is hardly intelligible even from village to village), it must be a separate language. Some say that Landes is nearly a dead language, but others say that it is still spoken in the villages.
The coast near Biscarosse gave up Landes long ago, but now even in the inland villages like Rion de Landes and Parentis en Born it is hard to find a speaker. The real Landes died around 1950. The current dialect is very Frenchified.
East Gascon is spoken from Pau to the Ariege River in the south and from Agen to Toulouse in the north. It represents the influence of the large city of Toulouse. Even between the cities of Pau (East Gascon) and Artix (West Gascon), which are very close together, communication is nearly impossible.
This language is probably in very bad shape. It is probably extinct in the Rivière-Basse region around the towns of Marciac, Plaisance and Maubourguet and in the Vic-Bihl region just to the west around Riscle. In Tarbes, Lannemezan and Lourdes, speakers are almost impossible to find. The eastern border with Languedocien is in the Ariege.
Neraqués and Lomagne Gascon are two East Gascon dialects. Neraqués is spoken in Nerac, just southwest of Agen in the Lot et Garonne. Lomagne Gascon is spoken to the far northeast of the Gascon language, southeast of Agen down towards Toulouse.
Pyrenean Gascon is a macrolanguage that is unintelligible with the Gascon of the plains. This language is the most divergent member of Occitan, probably due to very strong Basque influence. Some would put it outside of Occitan proper altogether.
The borders of Pyrenean Gascon run from the Ariege in the east to Bearn in the west and to the Spanish border (except in the Aran Valley).
Pyrenean Gascon is nearly a dead language in France, only spoken by 1% of the population. In the region of Tarbes, Lourdes and Bagneres, there are almost no speakers left.
In Bearn, Pyrenean Gascon is still heavily used. In 1994, fully 26% of the population spoke Béarnais. Since then, the number has probably collapsed precipitously, possibly down to 5-6%.
It makes sense to split Pyrenean Gascon into three separate languages. Gascon speakers in the east of Bearnais have a hard time understanding the speakers in the west of Bearnais. They also have a hard time understanding the Couserans spoken in the Upper Ariege near the Foix and Andorra.
Although it makes no linguistic sense, Bearnese is often split off as a separate dialect of Pyrenean Gascon. Dialects of Bearnese include Aspés, Ossau Bearnese, and Palois. Bearnese is spoken in Bearn.
West Pyrenean Gascon covers the western part of Bearn. It is here that there is the heaviest Basque influence of all. Speakers in the east of Bearn can understand speakers just to the east in Bigorre and Lourdes better than they can the speakers of western Bearn.
Oloronais (Aspois) is a dialect of Béarnais spoken in Oloron that borders on Souletin Basque. The actual linguistic border between Béarnais and Basque is in between Aramits and Tardets.
Central Pyrenean Gascon covers most of the Pyrenean Gascon region from eastern Bearn all the way to Ariege. Intelligibility is poor with both western Bearn and Couserans in the Ariege.
Bigourdan is a dialect of Central Pyrenean Gascon spoken in the Bagneres de Bigorre region. Subdialects are Argelès, Aure, Bagnères, and Tarbais. Bagnères is spoken around the city of Bigorre itself, and Tarbais is spoken around the town of Tarbes.
Eastern Pyrenean Gascon is spoken in the far east of the Pyrenean Gascon region by the border with Languedocien and Catalan and over the border into the Aran Valley. Central Pyrenean Gascon speakers have a hard time understanding those in the Couserans in the Upper Ariege by Foix, Roussillon and Andorra.
Dialects include Aranese, Ariegois, Commingese, Couseranais, Sauratois, and Contadels.
Aranese is an Eastern Pyrenean Gascon dialect spoken by most of the 6,000 people living in the Aran Valley in the Spanish Pyrenees, where it has official status. It has Spanish, Aragonese and old Catalan influences, but at the moment it is under very heavy Catalan influence such that many Occitanists regard it as an outrageously degenerated dialect.
Aranese is intelligible with Commingese across the border in France. Aranese is not intelligible with Spanish, French, Catalan or the rest of Occitan.
Pujolo and Canejan are Aranese dialects.
Ariegois is a Pyrenean Gascon dialect spoken in the Upper Ariege.
Sauratois is an Ariegois dialect spoken in the Saurat region northeast of Tarascon on the Ariege River. Couseranais is an Ariegois dialect spoken in the Couserans northwest of Andorra. It still has a few speakers. Contadels is an Ariegois dialect spoken in Vicdessos north of Andorra. There is a very heavy Languedocien and Catalan influence on this dialect. This is actually a Gascon-Catalan transitional dialect.
Southern Occitan is a branch of Occitan that stretches across Southern France near the ocean. It includes Languedocien, Maritime Provencal, Nissart, and Rhodanian Provencal. This branch has more Iberian influence in the west and more Ligurian Southern Gallo-Romance influence in the east.
Languedocien is a Southern Occitan macrolanguage that has 1 million speakers in an area in a line going from north of Andorra – Aude – Fenoullens – Leucate in the south (border with Catalan), from Toulouse to Oust in the west (border with Gascon), in a line running from Toulouse – Albi – Agde in the north (border with Guyennais) and at Bassin de Thau in the east (border with Provencal).
Languedocien sounds like a mixture of Spanish and French in the north or Spanish and Catalan in the south.
Languedocien speakers have a hard time understanding Limousin, Auvergnat and Gascon. Languedocien speakers have a hard time being understood by the Provencal speakers in Toulouse.
Along with Provencal, this language is more conservative and closer to the Medieval Occitan.
If you try to learn Occitan now as a second language, you will learn Languedocien. Attempts to standardize writing of Languedocien have not been successful. An Occitan koine is being promoted out of the University of Montpellier that some Occitan speakers have referred to as an Occitan Esperanto.
All across Languedoc, most of the older people and many young people still speak Languedocien. In Carcassonne, all street signs are bilingual in Occitan, Occitan is an obligatory subject for primary school students, and there are 22,000 speakers in the city. Nevertheless, it is not being learned much by children in the region as a whole.
It makes sense to split Languedocien into an Ibero-Languedocien and a North Languedocien (or Franco-Languedocien), the first more like Catalan, Spanish, Gascon and Aragonese and the second more like French and the rest of Occitan.
North Languedocien is a Languedocien language with borders running from Toulouse – Albi – Bassin de Thau in the north and east and around the Bages-Sigues Lagoon in the south. This language lacks the strong Catalan influence of Ibero-Languedocien. Instead, it has more French influence.
There are various dialects within North Languedocien that are quite divergent. Dialects include Besierenc, Narbonés, Carcassés, and Pezenas.
These are spoken around the cities that they are named after and are said to be unrecognizable from one region to the next, but until we get specific intelligibility data, we can’t split them.
Ibero-Languedocien is spoken in the south from Toulouse and Albi down through the Ariege, the Foix, the Aude, the Fenouillines and over to the coast at Leucate, possibly extending north to Carcassonne and Narbonne. This language is rooted in Iberian phonetics.
Ibero-Languedocien speakers feel that they have excellent communication only with Catalan. With the rest of Occitan, they feel that they are speaking another language, and there are communication problems.
Ibero-Languedocien is intelligible with Catalan. This dialect is the closest of all Occitan lects to literary Catalan and is spoken in the part of southwestern France right next to Catalonia. Ibero-Languedocien speakers can understand Catalan more easily than they can understand Gascon.
The border between Ibero-Languedocien and Catalan proper begins in the Languedocien-speaking Fenouillèdes along the Agly River. To the south, Catalan is spoken – to the north, it is Languedocien. But that boundary is fairly sharp. On the coast, the transition zone occurs from Leucate to Le Barcares and Salces. The true transition zone occurs in the area north of Andorra. The Catalan of Formigueres is basically the same language as the Languedocien of Usson just to the north.
Tolosenc is a dialect of this language spoken around the city of Toulouse. It has Gascon influences. In the rural areas around Toulouse, almost everyone over 25 understands Tolosenc. In this area, many people over 40 were raised speaking Tolosenc as a first language, but most have forgotten it by now. However, in Toulouse proper, Occitan speakers have gone from 50% in 1950 to 10% today. Foissenc is spoken in the Foix.
Agathois is a divergent Languedocien lect spoken in the coastal town of Agde. It is very different from the Besierenc dialect spoken in Beziers and Vias, which were wine-growing regions. Beziers and Vias received many Spanish immigrants to pick grapes in the vineyards and received many more during the Spanish Civil War. As a result, Besierenc now has heavy Spanish admixture. But Agde, on the coast, received no Spanish influx, and now communication is sometimes difficult between Agathois and Besierenc speakers.
Provencal is a very famous Southern Occitan macrolanguage that is spoken further east than Languedocien all the way to the Italian border. It has 200,000 speakers. Provencal is spoken in the departments of Alpes-Maritimes (except the eastern corner), Bouches-du-Rhône, Var, Vaucluse, in the southern parts of Alpes de Haute-Provence, and the eastern parts of Gard.
Provencal is said to be close to the Gallo-Italic Piedmontese language.
Auvergnat speakers say they cannot understand the Mompelhierenc spoken in Montpellier, and there is marginal intelligibility with Nimes and Sète. People with one parent who spoke South Auvergnat and another who spoke Provencal were not taught Occitan because the lects were too different. This implies that even South Auvergnat has poor intelligibility with Provencal.
Limousin speakers who move to the Provencal region say that the two feel very much like separate languages. Provencal speakers say that Gascon is a foreign language, they cannot understand Vivaroalpine and they even have a hard time with Languedocien.
Provencal, along with Languedocien, is closer to the Medieval Occitan language and is more conservative.
Dialects include Cévenol, Maritime Provencal, Marsillargues, Mompelhierenc, Bas-vivarois, Lunellois, Aptois, Bagnoulen, Barjoulen-Draguignanen, Canenc, Coumtadin, Foursquare-Manousquin, Grassenc, Marsihés, Maures, Castellane Provençal, and Sestian.
Cévenol is spoken in the Cevennes Mountains north and northwest of Nîmes and is doing well. Maritime Provencal is spoken around the Cote d’Azur, is doing well and is widely spoken, especially as Marsillargues in Marseilles. Mompelhierenc, spoken in Montpellier, has heavy Languedocien influence. Bas Vivarois is spoken in the lower half of the Ardeche region. Lunellois is spoken in Lunel between Montpellier and Nimes and still has speakers.
Aptois is spoken around the town of Apt north of Marseilles. Barjoulen-Draguignanen is spoken around the towns of Barjemon and Draguignan in the hills north of the French Riviera. Canenc is spoken around Cannes. Grassenc is spoken on the French Riviera.
Rhodanian is spoken around Arles, Avignon and Nîmes, is apparently not intelligible with the rest of Provencal and may be more than one language. Rhodanian speakers from around Nîmes say that they cannot understand other speakers from villages only 12 miles away. This is actually a Languedocien language that underwent Provencal phonetic changes in the late 1700’s, resulting in a Provencal tongue. This probably accounts for its diversity.
Dialects include Arlaten, Bagnoulen, Camarguen and Nimoués. Arlaten is spoken around Arles. Bagnoulen is spoken around the town of Bagnols sur Centre. Camarguen is spoken around Camargue Bay. Nimoués is spoken in Nimes.
Nissart is a Southern Occitan dialect spoken in Nice. It has very heavy influence from the local Ligurian Gallo-Italian dialects. It is best seen as a transitional dialect between Gavot Vivaro-Alpine and Ligurian. Based on its history, a more proper analysis would be that it was a Ligurian language that became Provencalized by Alpine Provencal speaking immigrants from the mountains coming to work on the coast after 1861. However, others say that it has been part of the Occitan area since the Middle Ages.
The Nizzardos, residents of Nice, spoke a Ligurian dialect before Nice was taken over by France in 1860; since then, much French has gone in. It is similar to the Mentonasque and Monegasque spoken in Menton and Monaco (the first Occitan and the other Ligurian). Intelligibility between Nissart and Royasque Ligurian is very limited.
Nissart is in very bad shape; it is a dying language mostly spoken by older people, when it is spoken at all.
Dialects include Esteron, High Vésubie, and Northern Nissart.
Mentonasq is a curious Gavot Alpine Provencal dialect related to Nissart spoken near Monaco in and near the town of Menton. It has a lot of Ligurian influences like Nissart. This is intelligible with Nissart and is apparently a Nissart dialect.
This is best seen as transitional between Nissart and Intermelio to the east, a Ligurian dialect with strong Occitan influence. Studies have shown that Mentonasq is between Gavot Alpine Provencal (Nissart) and Royasque (Brigasc)-Pignasque (Ventimiglian) Ligurian (spoken in the Roya Valley in France and Pigna in Italy on the border), with an emphasis on the Occitan. About 2/3 of the words are Provencal.
There are still those who insist that this language is basically Ligurian with a strong overlay of Provencal. Intelligibility between Mentonasq and Ligurian Royasque is better than between Nissart and Royasque but is still somewhat marginal.
Although it is close to Nissart, Mentonasq is also quite different from it.
Monegasque is quite different from Mentonasq. It is mostly spoken by older people, fishermen and rural types. There is bilingual signage. But the language is in bad shape as the young do not speak it, and there are many tourists.
Roquebrunasq is a dialect of Mentonasq, spoken in Roquebrune-Cap-Martin just to the west of Menton. It is somewhat different from Mentonasq. It is dying out. The similar dialects Gorbarin and Castellarois are spoken in Gorbio and Castellar. Gorbarin is particularly close to Mentonasq. Like Nissart, these are Gavot dialects transitional to Ligurian.
Northern Occitan is a branch of Occitan that is spoken in the north of the Occitan region and also over by the Italian border. There are great differences between Northern and Southern Occitan. For instance, 30% of the vocabulary of Auvergnat is not found in Southern Occitan.
One way to look at this is to say that the languages in this region – Limousin, Auvergnat and Vivaro-Alpin – are part of something called Medio Gallo-Roman, which is really in between the langue d’oc proper of the south – Gascon, Languedoc and Provencal – and the langues d’oil to the north and Arpitan to the east. Another way to look at it is to say that Northern Occitan is closer to Arpitan than to the rest of Iberian-dominated Southern Occitan.
Limousin is a Northern Occitan macrolanguage spoken in France and has over 100,000 speakers. It is spoken in Limousin Province and over the western border into the far eastern part of Saintonge and the Perigord in North Aquitaine. North Perigord in Aquitaine has Saintongeais influences. South Perigord speaks Guyennais.
Limousin is still widely spoken in the Limousin region and in northern Dordogne in Aquitaine.
Limousin may have once been many separate languages, at least in the Dordogne department. Older residents in the Périgord Vert near Nontron report that from 1930-1970, it was not unusual for different villages to have Limousin dialects so different that one village could not understand the next, and they had to resort to the use of a koine.
Gascon, Provencal, Languedocien and Auvergnat speakers say they cannot understand speakers of Limousin.
Charente Limousin is a Limousin dialect that is very hard to classify. It extends from Confolens south to Aubeterre. This is an Occitan-Oil transition zone with an emphasis on the Occitan. So these are Limousin dialects transitioning to Charentais langue d’oil.
Between Confolens and Ruffec around Chatain, there is a transitional dialect between langue d’oc and langue d’oil that is nevertheless intelligible with the Charentais spoken in Ruffec. This is probably a Charentais dialect transitional to Limousin.
This province is generally langue d’oil speaking and has been so since the original Limousin speakers were eliminated by the Black Plague in the 1300’s and replaced by langue d’oil speakers, but the area around the Charente River in the far east of the province has long spoken Occitan and never underwent replacement.
Saint-Eutrope and Montberonés are Charente Limousin dialects. Montberonés is spoken in Montbron, and Saint-Eutrope is spoken in the town of the same name.
South Limousin is a separate language spoken south of Haute Vienne in Limousin south to the Limousin border. It is closer to Auvergnat and Languedocien.
Haute Vienne North Limousin speakers understand no more than 60% of the South Limousin of Ussel. Between Upper Limousin in Limoges and Lower Limousin in Brive, there are many confusing phonetic changes that make it hard for North Limousin speakers to understand Brive speakers.
Corrèzese is a dialect of South Limousin spoken around the city of Correze. Corrèzese speakers can understand Auvergnat and vice versa. Corrèzese is best seen as a Limousin dialect transitional to Auvergnat.
Sarladais is a South Limousin dialect spoken in Sarlat in Aquitaine just southeast of Limousin. It has strong Guyennais influences.
Monédières Limousin, a variety of South Limousin spoken in the Monédières Hills near Correze, is a separate language. For one thing, it does not even appear to be intelligible within itself. Some varieties of Bas Limousin in the Monédières Hills near Correze have a hard time understanding each other. For another, Limousin speakers say they have a harder time understanding Monédières Limousin than they do Auvergnat as a whole. This is more than one language.
Guyennais is a highly divergent lect, possibly a separate language, spoken in a swath across central Aquitaine, northern Languedocien and southwest Auvergnat. It is transitional between Gascon, Languedocien, Limousin and Auvergnat. In the South Perigord, the influences are Saintongeais, Gascon and Languedocien. To the east, the influences are Languedocien, Dauphinois Provencal and Auvergnat.
In the north, the boundary with Limousin and Auvergnat is a line from Bordeaux – Bergerac – Carlux – S. Cerre – Latronquiere – southern border of Auvergne to the Ardeche border. To the south, Guyennais borders Languedoc along a line running from Castelsarrasin – Montalban – Cordes – Albi to the border of Languedoc at Millau and the Cevennes.
Guyennais is still widely spoken. In Saint Cirq in Dordogne Department, all of the elderly speak Guyennais as a first language and continue to use it amongst themselves at all times.
Although Guyennais is typically lumped under the rubric of Languedocien, others lump Guyennais in with Limousin, saying that there is no way that Guyennais-Limousin is the same language as Languedocien-Gascon. The best view is that Guyennais was close to Limousin and Auvergnat, but it underwent extensive Languedocienization caused by the expansion of Toulouse to the north from the 800’s to the 1200’s. At the moment, it is probably closest to Limousin and possibly secondarily to Auvergnat.
There is difficult intelligibility on the border of Guyennais and Gascon. Quercynois, Rouergat and Carladezien are not intelligible with Languedocien.
Guyennais is very similar to the South Limousin spoken in Brive and South Auvergnat. Specific intelligibility data between Guyennais and Limousin and Auvergnat in general is not available.
There is a strong tendency to want to split this off as a separate high level language within Occitan, but there’s no legitimacy to do so yet based on the available intelligibility information.
Haut Quercinois, Bas Quercinois, Rouergat, Carladézien, Bergeracois, Agenais, Gevaudan, and Aurillacois are dialects of Guyennais.
Quercynois (Carcinòl) is spoken in the Quercy in Midi-Pyrenees. Rouergat is spoken in the Rouergue. Carladézien is spoken in Auvergne and is still doing very well. It is transitional to Auvergnat. Bergeracois is spoken around Bergerac. Agenais is spoken in Agen and has Gascon influences. Gévaudan is spoken in the southern part of Lozère, and Aurillacois is spoken in Aurillac. Both have Auvergnat influences.
North Limousin, spoken north of Correze in Haut Vienne to the Marche and over to Nontron in the west, is a separate language. North Limousin speakers only have 60% intelligibility of Ussel South Limousin. Confolentais, a dialect of North Limousin, is a very peculiar Limousin dialect spoken in Confolens in Saintongeais.
Millevaches is spoken on the Millevaches Plateau south of Limoges. Lemojaud is spoken in Limoges.
Monts de Blond Limousin is a North Limousin lect said to be so different from all other Limousin types that it must be a separate language. It is spoken in the Haut Vienne in the Monts de Blond region around Blond between Nantiat and Confolens near the Charente border. There is heavy influence from Charentais langue d’oil and Creusois.
Nontronnais is a North Limousin dialect that is so unusual that it must be a separate language. It is spoken in the North Perigord region around the town of Nontron near the Saintonge border. It has heavy Saintongeais langue d’oil influences.
Creusois (Marchois) is a language spoken in La Marche or the Croissant in north Limousin and over into Auvergne. It extends roughly from La Rochefoucauld in Charente to Saint-Priest-Laprugne just over the Auvergne border in Loire in the south and from Bellac in Limousin over to Montlucon and Moulins in Auvergne to the north. The eastern portion in Auvergne underwent much more extreme changes than the western portion.
It borders on and is influenced by the oil languages Berrichon and Bourbonnais in the north and east and Poitou and Charentais in the west but is intelligible with none of them. In the northeast, there is a 50 mile wide Creusois zone between Limousin and Berrichon. Some say it is a langue d’oil with heavy Occitan influence, but a better analysis is of a langue d’oc with heavy oil influence. To the southeast around Vichy, there is some Arpitan influence.
This language is still widely spoken in places. 15 years ago, the dialect of Saint-Priest-la-Feuille in northern Limousin was still spoken by everyone over 40. A bit to the west, 15 years ago, Gartempaud, spoken in the village Gartempe, was still spoken by most residents over the age of 50.
Dialects include Western Creusois, Eastern Creusois, Central Creusois and Montluçonnais.
Montluçonnais is spoken around the town of Montlucon in Auvergne.
It is often thought to be a part of Limousin, but Creusois speakers have a hard time understanding Limousin. Auvergnat speakers cannot understand Creusois. There is poor intelligibility with Berrichon, a langue d’oil.
This is basically an Occitan-Oil transitional dialect with an emphasis on Occitan.
Map of the Bourbonnais region in north Auvergne and southeast Berry showing Bourbonnais langue d’oil, Auvergnat Occitan and Forez Arpitan.

Auvergnat is a North Occitan macrolanguage that has 1.35-1.5 million speakers. Auvergnat is spoken in an area covering the departments of Cantal (except the Aurillac region), Haute-Loire, and Puy-de-Dôme and extending to the Gannat region in Allier, the Saint-Bonnet-le-Château region in Loire, and the western border areas in Ardèche in the Rhône-Alpes region.
In Auvergne, reports indicate that nearly everyone over age 35 can speak Occitan, and perhaps 50% of those age 15-35 can at least understand the language. 49% of the population supports bilingual signage.
A neo-language called Aleppo (Literary and Pedagogical Auvergnat) has been created. It is used to teach students who come from a variety of educational backgrounds and by writers who wish to enrich their prose by using loans from other dialects.
Every village has its own dialect, and there is often problematic intelligibility even from one village to the next.
People who learn standardized Occitan fairly well are completely lost listening to Auvergnat. Auvergnat speakers in general cannot understand Limousin, with the exception of the dialect spoken in Corrèze. The reason is that the phonetics, inflections and vocabulary of Limousin are completely different from those of Auvergnat.
Auvergnat speakers are completely lost with the Languedocien speech of Toulouse and Carcassonne. Auvergnat speakers cannot understand Creusois. Auvergnat is utterly unintelligible to Gascon speakers.
Auvergnat speakers cannot understand the Provencal spoken in Montpellier, and there is marginal intelligibility with Nimes and Sète. The Languedocien influence on these Provencal dialects is what makes them hard to understand for Auvergnat speakers. People with one parent who spoke South Auvergnat and another who spoke Provencal were not taught Occitan because the lects were too different, implying that South Auvergnat has poor intelligibility with Provencal.
Auvergnat is closer to French than the rest of Occitan, and it has the strongest Arpitan influences of any Occitan language.
There are two major splits – South Auvergnat or Upper Auvergnat in the south of the region and North Auvergnat or Lower Auvergnat in the north of the region, which are separate languages. The names upper and lower do not correspond with north and south here, which is curious.
South Auvergnat is spoken from Mauriac in the west through Brioude in the center to Craponne sur Arzon south to the border of Auvergne. It has difficult intelligibility with the North Auvergnat spoken in Allier and Puy de Dôme. South Auvergnat is still in good shape, with 67% of the residents of Cantal speaking the language. South Auvergnat is quite similar to Sarlat Guyennais and the South Limousin spoken around Brive.
Dialects include Brivadois, Mauriacois, Yssingelais, and Sanfloran.
Brivadois is spoken around Brioude and Sanfloran around Saint Flour. Brivadois speakers cannot understand the North Auvergnat spoken in Allier and Puy de Dome. Brivadois is in between North and South Auvergnat but is best characterized as South Auvergnat.
Mauriacois is spoken in the southwest in Mauriac, but it is very different from Aurillacois. It has some old influences from San Floran and Gevaudan. Yssingelais is spoken in Yssingeaux in far southeast Auvergne. It has strong Arpitan and Alpine Provencal influences. Some have classed this as an Alpine Provencal dialect, but this seems uncertain. Intelligibility data is lacking. San Floran is spoken in St. Flour. This is a very influential dialect, having influenced many nearby dialects.
North Auvergnat is a macrolanguage spoken in Allier and Puy de Dome. It is close to the langues d’oil, especially Bourbonnais but is probably not intelligible with them. North Auvergnat is not doing well.
Speakers of Brivadois, a South Auvergnat dialect transitional to North Auvergnat, have a hard time understanding the North Auvergnat of Allier and Puy de Dome, so North Auvergnat is separate from South Auvergnat. North Auvergnat, especially in the east, is possibly the most divergent lect in Occitan after Gascon due to very heavy Bourbonnais and Arpitan influence. Some even think it is outside of Occitan proper altogether.
North Auvergnat can be divided into two separate languages – Northwest Auvergnat and Northeast Auvergnat. The differences are so dramatic that they must be separate languages.
Northeast Auvergnat is spoken in the eastern part of the North Auvergnat area from Jumeaux and Arlanc north to the west bank of the Allier River near Vichy and Cusset. From Vichy-Cusset to the Loire border, Forez Arpitan was formerly spoken. North of Vichy-Cusset to the Champagne-Ardennes border, langue d’oil Bourbonnais used to be spoken. Northeast Auvergnat has very heavy Arpitan influences that make it so different from Northwest Auvergnat that it must be a separate language. In fact, Livradois speakers cannot understand Besse-en-Chandesse speakers.
Livradois is a Northeast Auvergnat dialect spoken on the broad Lemange Plain in the east-central part of Auvergne bordering on Loire. In the southern part of Livradois around St. Antheme, there are strong Forez Arpitan influences.
Northwest Auvergnat is spoken from about Champes sur Tarentaine to Lempdes north to Pionsat and Gannat. The heavy Arpitan influence on Northeast Auvergnat makes it so different that it must be separate from Northwest Auvergnat. And it is true that Besse-en-Chandesse Northwest Auvergnat speakers cannot understand Livradois Northeast Auvergnat speakers.
Alpine Provencal (Vivaro-Alpine) is a macrolanguage, part of the Provencal macrolanguage, and is often considered to be a separate branch of Northern Occitan.
An overview of Alpine Provencal is here. In France, Alpine Provençal is spoken by perhaps over 100,000 speakers, but most of them are middle-aged or elderly.
Maritime and Rhodanian Provencal speakers cannot understand Vivaroalpine, so it is a separate entity.
Dauphine Provencal (Vivaro-Dauphine) is a separate language within Alpine Provencal. It is spoken in the departments of Ardèche (except the north and the western border areas), Drôme (except the north) and the southernmost parts of Isère.
Dialects include Ardechois (Mid Vivarois), spoken in the center of the Ardeche and Dauphinois or Drômois, spoken in the Drôme River area.
Gerbier de Jonc is an Ardechois dialect spoken in the Ardeche region of that name. It differs greatly from the north to the south, with words changing from village to village.
Other dialects are Albenassien, Albonnais, Annonéen, Southeast Ardèchois, Boutierot, Northeast Drômois, Southeast Drômois, Montilien, Privadois, Valentinois, and Vernoux-Doux.
Privadois is spoken in Privas in the Ardeche. Montilien is spoken in Montélimar in the Drome. Albonnais, spoken in the village of Albon in the commune of St. Pierreville in the central Ardeche, was still widely used in everyday life in the town as of 15 years ago.
In areas east of Haute-Loire in the Southern Auvergnat region, Dauphine Provencal resembles South Auvergnat. It is apparently not intelligible with Rhodanian or Maritime Provencal.
Gavot Provencal is a divergent Northern Occitan language within Alpine Provencal in France. There are intelligibility problems between this and the Dauphine Provencal spoken in the Drome and the Ardeche, such that some say that Gavot is a separate language.
This language is spoken in an area bounded by Gap – Embrun – St. Paul on the north, Sisteron – Digne – Annot – Nice on the west, Nice to Menton on the south and Menton – Roya Valley – Italian border to St. Paul on the east. Gavot is an eastern group of Vivaro-Alpine spoken in the French Occitan Alps. Speakers of Maritime and Rhodanian Provencal say they cannot understand Gavot.
Apparently all Gavot dialects, while differing from village to village in vocabulary, morphology, verbal conjugation and phonetics, are mutually intelligible.
Dialects include Molliérois, Embrunais, and Seynois. Molliérois is a dialect of Gavot spoken north of St. Martin Vesubie and Beaui near the Italian border. It differs significantly from the dialects of St. Martin Vesubie and Isola very close by. Embrunais is spoken in Embrun. Embrunais has problematic intelligibility with the Transalpin Provencal spoken in Briançon. Seynois is spoken in the town of Seyne and in the surrounding towns of Auzet, Barles, Montclar, Selonnet, and Le Vernet.
Transalpin Provencal is a Northern Occitan language, the Italian group of the eastern section of Alpine Provencal, spoken in the Piedmontese Valleys in the Alps along the northwestern Italian border with France and just over the border with France in the Briançon region. There are about 100,000 speakers in Italy, about 50% of the population in the region. The language is in much worse shape in France, where it is near extinction. It is best seen as a Gavot-Piedmontese transitional language.
It is spoken in 14 Piedmontese valleys in the Alps (in the provinces of Cuneo and Torino) and in one community (Olivetta San Michele) and a few hamlets in the Liguria region (in the province of Imperia).
A lot of parents in this region still pass Transalpin Provencal on to their children, but the language is declining, being replaced with Piedmontese or Italian. It is spoken in the highest valleys only, having been replaced in the lowest valleys first and then the middle valleys. The highest valleys often lack schools, courts, post offices, etc. The people live in homes that often lack heating and bathrooms and sometimes lack electricity.
Of the young people under age 20, 40-50% of them speak the language. There is an increase in the number of cases where two Occitan speaking parents speak Italian only to their children. Of the population of 180,000, about 50,000 are Occitan 1st language speakers.
In Italy, it is spoken in the upper valleys of Piedmont (Val Maira, Val Varacho, Val d’Esturo, Entraigas, Limoun, Vinai, Pignerol, and Sestriero) by speakers of all ages, but younger people are reportedly shifting to Italian. Nevertheless, there are reports that the number of speakers of this language has actually risen in recent years, and it is now recognized as an official language by the state of Italy.
In the Estura Valley, Piedmontese (with heavy Transalpin Provencal influence) is spoken in the lower valley from Demonde up the valley to Aisone, and Transalpin Provencal is spoken from Aisone to the top of the valley. In this area, 100% of the population speaks either Transalpin Provencal or Piedmontese or both. It is only down by Cuneo that you start running into a lot of Italian speakers.
Transalpin Provencal is not intelligible outside of the region.
Escarton is a dialect of Transalpin Provencal that is spoken in France and Italy near the town of Briançon on the border of France and Italy where the Gavot Provencal, Piedmontese and Savoyard Arpitan languages all come together. All three languages influence this dialect, especially Savoyard, but at base it remains an Occitan dialect. It is spoken in the Cottian Alps.
There are many different dialects included under the Escarton rubric. Briançon dialects include Viaran and Montegenevre. Escarton also includes Queyras, spoken around Abries and Aigilles in France to the southeast. In Italy, it includes Oulx in Oulx, Bardonecchia in Val Susa and Val Chisone in the town of Sestriere in Val Chisone.
Escarton has difficult intelligibility with the rest of Occitan. It has better intelligibility with the Transalpin Provencal across the border in Italy than with the Embrunais Gavot of the lower valley in France.
Gardiol is a diaspora Alpine Provencal language spoken in Guardia Piedmontese, an Occitan-speaking town in southern Italy. The town, located in the Calabria region in Cosenza Province, was established in the 1300’s by people from the Waldensian or Vaudois Protestant movements who were fleeing Catholic religious persecution. They were thought to be heretics and were massacred in the 1300’s.
The language is a Vivaroalpenc dialect formerly spoken in Briançon and in the Varaita and Pellice valleys of France. It is still taught from K-12 in school and has 340 speakers. Gardiol is under strong southern Italian influence. Gardiol is said to be incomprehensible to French Occitan speakers due to the fact that it has been diverging for over 700 years in isolation in Italy.
The outlandish costumes of the women of Guardia Piedmontese, Italy, based on clothing from the 1300’s in southeastern France.

There are more Gardiol speakers in Germany’s Württemberg, in the US (especially in North Carolina in the town of Valdese), in the Argentinian town of Pigüé, and in Canada’s province of Quebec. Intelligibility of these diaspora lects with the language in Italy is not known.

Bec, Pierre. 1982. “Occitan”, in Posner, Rebecca and Green, John N. (eds.), Trends in Romance Linguistics and Philology, Volume 3. The Hague, Paris, New York: Mouton.


A Reworking of German Language Classification Part 3: Upper German

Updated May 10, 2017. This post will be regularly updated for some time. Warning! This essay is very long; it runs to 101 pages.
This is Part 3 of my reclassification of the German language. Part 3 deals with Upper German.
Part 2, dealing with Middle German, is here, and Part 1, dealing with Low German, is here.
This classification splits Upper German from 10 languages into 81 languages using the criterion of >90% intelligibility = dialect and <90% intelligibility = language.
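The criterion above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not a tool used in this classification. The function name `classify` and the handling of a score of exactly 90% are assumptions, since the criterion only speaks of scores above and below 90%.

```python
THRESHOLD = 90.0  # percent; the cutoff stated in the text


def classify(intelligibility_pct, threshold=THRESHOLD):
    """Label a pair of lects from one intelligibility percentage.

    Above the threshold the two lects count as dialects of one language;
    below it they count as separate languages.
    """
    if not 0 <= intelligibility_pct <= 100:
        raise ValueError("intelligibility must be a percentage from 0 to 100")
    if intelligibility_pct > threshold:
        return "dialect"
    if intelligibility_pct < threshold:
        return "language"
    # Assumption: the text does not cover a score of exactly 90%,
    # so this sketch leaves that case open.
    return "undetermined"


# 40% is the figure this essay reports for Swabian versus Standard German.
print(classify(40))
# 95% is a hypothetical score, used only to show the other branch.
print(classify(95))
```

In practice a single percentage hides a lot, such as who was tested and in which direction, so a real application would need more than one number per pair of lects.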
There is much confusion about the phrase High German or Upper German. Standard German is referred to as Hochdeutsch, or High German, and many think that that means that Standard German is a High German or Upper German language. In fact, it is a Middle German language. However, there is a conflation of Middle German and High or Upper German in which both are subsumed under the mantle of High German. In reality, though, Middle German and High or Upper German are quite different.
The Upper German lects are in pretty good shape. They are located in Southern Germany, and most are doing extremely well. The Upper German Franconian lects are doing fine. The Bavarian lects are going strong. Swabisch and Badisch are doing great.
Low Alemannic in southern Germany is doing fine. Bavarian is the standard language of communication in Austria, and Swiss German is the standard language of communication in Switzerland. Only Alsatian, spoken in France, is somewhat in trouble due to France’s one-language policy.
It is uncertain why Standard German has been unable to displace the Upper German languages, but Southern Germany has always been isolated from the rest due to mountainous terrain and an independent spirit. Bavarian and Swiss German are guaranteed as official languages of nations and are in no danger. A few small Upper German languages in Italy are in trouble, but that is mostly due to their being linguistic islands in a sea of Italian. Upper German Hutterite is doing very well.
This treatment breaks Upper German from Ethnologue’s 10 languages into 82 separate languages.

The Alemannic languages, including Swabish and South Franconian.

Sudfrankisch (South Franconian) is an Upper German language transitional between Central and Upper German. It is spoken in northwest and north-central Baden-Württemberg around Heidelberg, Karlsruhe, Pforzheim and Rastatt. It has a low number of speakers, and some do not even consider this lect to be a separate entity, so its treatment here is tentative.
The very existence of this language is controversial. For instance, although Karlsruhe and Heidelberg are said to be South Franconian-speaking, in other analyses, the language is “Kurpfalzisch”. This language, or at least the variety spoken in Heidelberg and Karlsruhe, is very hard for Standard German speakers to understand.
Dialects include Bad Schönborn, spoken around the city of the same name, Odenwäldisch, Kraichgauisch, spoken around the cities of Kraichgau and Santkanna, Unterländisch, spoken in and around Heilbronn, Central North Badisch (Zentral Nordbadisch), and Southern North Badisch (Süd Nordbadisch). Intelligibility is apparently good between all dialects (Costin 2015).
The Swabish speaking area in Germany.

Schwabian is an Alemannic lect that has about 40% intelligibility with Standard German. Speakers of Standard German say they find it almost impossible to understand. Commercials and TV series in Swabian are shown on German TV with subtitles. It is spoken in southwest Germany in a region called Swabia.
The southern border of the Swabian language is Villingen-Schwenningen. After that, it follows the Danube to the east. In the east, the border is a line from Augsburg south to the Aargau. Reutte/Außerfern, a dialect in upper East Tirol on the Lech River just south of the Bavarian border, is considered to be Swabian. Stuttgart is in the Schwabian speaking area and the standard version of Swabian is spoken in Stuttgart.
It has 820,000 speakers. Swabian has great dialectal diversity, and there is more than one language in Swabian.
Badisch and Swabian form a dialect chain in which the dialects at the far ends of the chain are not intelligible with each other. The Western Swabian dialects are most comprehensible with the eastern Badisch dialects. Swabian is not intelligible with Alsatian, Swiss German or Bavarian. In fact, the differences between Swabian and Swiss German are tremendous. This is important to note because there are claims that the two are mutually intelligible.
Swabian has many lects. Some of the major groupings are Lower Swabian (Niederschwäbisch or Neckarschwäbisch), East Swabian (Ostschwäbisch), Upper Swabian (Oberschwäbisch), and Southwest Swabian (Südwestschwäbisch). Schwäbisch vom Haiberg is an unclassified dialect spoken in the Swabian Alps.
Lower Swabian is spoken in and around Stuttgart and in the Eastern Black Forest. It is not fully intelligible with Upper Swabian (see the Würtingen entry below). In fact, separate languages develop quickly only a few miles from the Upper Swabian/Lower Swabian border. Lower Swabian is also spoken north of Stuttgart up to around Pforzheim and Heilbronn, where it starts shading into East Franconian. Some of the big cities in the Lower Swabia area include Esslingen, Reutlingen and Tubingen.
Würtingen Lower Swabian is a divergent Lower Swabian dialect spoken in Würtingen, 35 miles south of Stuttgart. It is not intelligible with the Upper Swabian spoken just six miles away and may not be intelligible with the rest of Lower Swabian; investigation is needed to determine this.
The dialects of Würtingen and Dettingen 35 miles south of Stuttgart are so different as to represent separate languages, Würtingen Lower Swabian and Dettingen Upper Swabian, yet they are only 6 miles away from each other. Dettingen seems to be an Upper Swabian dialect, and Würtingen seems to be a Lower Swabian dialect. This is in the area around Reutlingen, where there are several distinct dialects of Swabian spoken.
Upper Swabian is a language spoken in Upper Swabia in the Swabian Mountains (Swabian Alps) in Baden-Württemberg. Tuttlingen is a main city in this area. Upper Swabia is the region from the Swabian Alps south to the Danube. At least the type spoken in Albstadt seems to be unintelligible with the rest of Swabian, in particular with the Swabian spoken in Tuttlingen and Esslingen. Even in and around Albstadt, there are villages only three miles away that speak completely separate languages of Alpine Swabian that are not intelligible with each other, so clearly there are multiple languages within Upper Swabian.
Dettingen Upper Swabian is spoken in and around Dettingen, 35 miles south of Stuttgart. It is not intelligible with the Lower Swabian spoken in Würtingen nearby.
Bavarian Swabian (Bayerisch Schwaben or Rieser Schwäbisch) is a major division of this language that is spoken in the Donau-Ries, a region of Bavaria. It can be seen on the map as the Swabish speaking area of Bavaria north of the Danube. This is the form of Upper Swabian spoken in the Schwaben region of southwest Bavaria. According to residents, it is not intelligible with either Bavarian or with the rest of Swabian spoken in Baden-Württemberg (Kirmaier 2009), hence it is a separate language. Dialects include Augsburg and Lechhausen. Lechhausen is quite different. Other towns in the area include Brenz, Iller, and Lech. The town of Lech is said to be the border between Bavarian Swabian and Bavarian.
East Swabian is spoken in the Eastern Swabish Alps. It is also spoken in East Württemberg. Major towns in East Württemberg include Aalen, Ellwangen, Heidenheim an der Brenz, and Schwäbisch Gmünd.
Southwest Swabian is spoken in the Neckar Mountains.
Allgäu Swabian (Schwäbisch-Allgäuerisch) is spoken in the Allgäu region on the border of Switzerland, Swabia and Bavaria. It contains three divisions: Lower Allgäu Swabian (Unterallgäuerisch), Northern Upper Allgäu Swabian (Nord Oberallgäuerisch) and East Allgäu Swabian (Ostallgäuerisch). Wolfegg, Biberach an der Riß, and Bergatreute are dialects.
Reports indicate that the type of Swabian spoken where Austria, Switzerland and Germany all come together is not understood anywhere else in Germany. On that basis, we can assume that Allgäu Swabian is a separate language. Internal intelligibility data for the dialects is lacking.
Russian German Swabish is a divergent Swabish dialect spoken by Russian Germans in their widespread colonies. In general, it is not understood by anyone in Germany. There are only a few elderly speakers left. Whether or not it is intelligible with specific Swabish lects is not known. This is an old Swabish from around 200 years ago.
Low Alemannic is a group of Alemannic Upper German lects that are spoken in southern Baden-Württemberg, across the border into France, a bit into Switzerland, and over into southwestern Bavaria.
A chart of the Alemannic languages in 1950 based on the work of Karl Bohnenberger in 1953. Bodensee and Upper Rhine Alemannic were added based on Hugo Steger’s 1983 work.

Upper Rhine Alemannic (Oberrhiinalemannisch) is a Low Alemannic superfamily division based on the work of linguist Karl Bohnenberger. This group includes Alsatian, Badisch, Upper Rhine Alemannic proper, and Basel German.
South Badisch is a group of dialects, apparently a separate language, spoken along the French border of Germany and east a ways to the border with Swabian starting near Freiburg im Breisgau and heading up towards Karlsruhe, where it borders South Franconian. The differences between South Badisch and South Alemannic spoken just to the South are considerable, and the two are probably separate languages.
Dialects include Ortenau (Ortenauer), Gottenheim, Freiburg-Opfingen, Elz, Kuppenheim, Iffezheim, Zell am Harmersbach, Kämpflbach, Breisgau (Breisgauer), Middle Kinzig River, and Black Forest (Schwarzwälder).
Elz, a subdialect of Black Forest, is spoken around the city of Waldkirch in the Elz Valley. Gottenheim is spoken 6 miles northwest of Freiburg. Freiburg-Opfingen is spoken in and around the city of Freiburg and is composed of two dialects, Freiburg and Opfingen. Zell am Harmersbach is a dialect of Middle Kinzig River.
Badisch forms a dialect chain with Swabian in which the far ends of the chain are not intelligible. The eastern dialects of Badisch are intelligible with the western dialects of Swabian. Intelligibility data between this and Alsatian is needed. Badisch is not at all intelligible with Standard German.
Alemán Coloniero (Colonia Tovar) is a Low Alemannic language spoken in Venezuela. It is not intelligible with Standard German. It is originally derived from a Badisch-type lect.
Baar Alemannic (Baar Alemannisch) is a Low Alemannic dialect. It is spoken in a region called the Baar in the upper headwaters of the Danube River in far southern Baden-Württemberg.
Towns in this region include Löffingen, Tuttlingen, Bad Dürrheim, St. Georgen, Furtwangen, Villingen-Schwenningen, Rottweil, Trossingen, Hüfingen, Spaichingen, Geisingen, and Donaueschingen. Intelligibility data between this lect, Basel German, South Badisch and Upper Rhine Alemannic is needed. Rottweil is a dialect spoken in the town of the same name.
A map of the languages of Alsace. Alsatian proper is in shades of green. Purple is Rhenish Franconian, and light blue is Pfalzisch. Orange and pink are langues d’oil – orange is Welche, and pink is Franche-Comté. As you can see, more languages than just Alsatian are spoken in the Alsace.

Alsatian is a Low Alemannic language spoken in Alsace, France around Strasbourg, and is not intelligible with Standard German, Swabian, Swiss German or Bavarian. In Alsace, it is mostly spoken in the Sundgau region of south Alsace and in the rural areas of the center.
It is an Upper German language related to Schwabian, Swiss German and Walliser. It has 700,000 speakers. The language is still widely spoken despite the fact that it gets little to no support from the French state. 20 years ago, 70% of teenagers said they could speak the language well.
This is a strange area where there are speakers of French, German, and languages that are neither French nor German but are transitional between the two. In this way it resembles the Limburgs region in the Netherlands, Belgium and Germany. Alsatian borders Upper Rhine Alemannic on the east and Alsatian speakers say it is not the same language as what they speak (Auer 2005). Furthermore, Alsatian is not intelligible with the Upper Rhine Alemannic spoken over the border.
The reason is that Alsace has been cut off from the culture of Germany and Switzerland for so long that it has retained many archaic forms that died out to the east. At the same time, a huge amount of French has gone into Alsatian that has not gone into the languages to the east.
Alsatian is actually a number of dialects, not all of which are completely mutually intelligible, although this is somewhat controversial. The language changes from village to village, and it is common for Alsatians to not understand each other. This implies that Alsatian is actually more than one language, but we don’t yet have enough data about intelligibility between varieties to split any of them.
However, the Strasbourg variety has been promoted as the standard and is used on the local TV station (Osorio 2001). Dialects include Strasbourg, Colmar, Vosges, Orbey Valley, and Mulhouse.
Alsatian has only 40% intelligibility with Standard German (Minahan 2002).
Another map of the languages of the Alsace.

Lake Constance Alemannic (Bodenseealemannisch) is a super split in the Low Alemannic languages according to linguist Karl Bohnenberger. It includes Allgäuisch, Vorarlbergerisch, and South Württembergish (Süd Württembergisch), all separate languages.
It has a strong French influence. It has 50% intelligibility with Central Bavarian. It is probably not intelligible with Swiss German.
This language family is spoken in Vaduz, Liechtenstein; Bregenz, Austria; and Ravensburg and Tuttlingen in Baden-Württemberg. In Tuttlingen, it borders on Swabish.
Allgäuisch (Allgäuerisch) is a group of Low Alemannic lects spoken in the far southwestern part of German Bavaria on the border with Switzerland, Austria and Baden-Württemberg. This is part of the Lake Constance Alemannic superfamily. It is not intelligible with Swabish.
It probably resembles Swiss German, but considering that you need a dictionary to translate between Allgäuisch and Swiss German, they must be separate languages.
This language is probably closest to Swabish and the Vorarlbergerisch spoken in far western Austria, to which it is geographically close. This language has heavy French influence.
There are four different Allgäuisch subdialects in each of the four major valleys in the region. One of the dialects is Bernbueren, spoken near Schongau and Weilheim. Other dialects include West Allgäuisch (Westallgäuerisch), East Allgäuisch, Upper Allgäuisch and Lower Allgäuisch. Upper Allgäuisch is further divided into Southern Upper Allgäuisch (Süd Oberallgäuerisch) and Northern Upper Allgäuisch.
Lower Allgäuisch is spoken in the northern Allgäu.
Opfenbach West Allgäuisch is a West Allgäuisch dialect. It is spoken at least in and around the town of Opfenbach in far southwestern Bavaria between Wangen and Lindenberg.
East Allgäuisch is spoken in the East Allgäu and in the area around Füssen and the Upper Lech River.
Upper Allgäuisch is spoken in the southern Allgäu and in central Allgäu around Immenstadt and Kempten. The area around Immenstadt and Kempten is probably where Northern Upper Allgäuisch is spoken. Oberstdorf is a dialect of Southern Upper Allgäuisch.
Vorarlbergerisch is a group of Low Alemannic languages that is part of the Lake Constance Alemannic family. It is similar to Swiss German. Vorarlbergerisch was originally a Swabian language. For the most part, the Vorarlbergers came from Valais in Switzerland in the 1200’s and 1300’s.
This language is spoken in Austria and is not intelligible with Bavarian, Standard German or other German languages. It is spoken in Vorarlberg, a region in far western Austria near the Swiss border.
This is a very different form of Eastern Upper Alemannic Swiss German that is still widely spoken in the area of Vorarlberg. Most reports on the lect indicate that it seems to be a separate language, unintelligible with all other German, Swiss German and Austrian lects other than West Allgäuisch and Appenzell Swiss German.
Most towns in Vorarlberg have their own dialects. It has elements of Swiss German along with Tyrolean and Bavarian. Vorarlbergerisch is so different that speakers are given subtitles when they speak on Austrian TV. Many Vorarlbergerisch speakers either cannot or do not speak Standard German. There are three main divisions of Vorarlberg – Montafon, Lustenauerisch and Bregenzwalderisch.
Feldkirch, Lustenauerisch, and Dornbirn are listed as dialects, but Lustenauerisch is so different that it is a separate language. Most Vorarlbergers have some difficulty understanding Lustenauerisch, Muntafunerisch and Wälderisch.
South Württembergish (Süd Württembergisch) is a major division of Lake Constance Alemannic. It is spoken east of Tuttlingen and the Baar along the Upper Danube, south to the Swiss border and over to the border with Bavaria. This language has a heavy French flavor. South Württembergish has good intelligibility of Vorarlbergerisch (Scheffknecht 2015) and is best seen as a form of that language.
Überlingen, Radolfzell, and Konstanz are dialects. Konstanz is spoken in the city of Konstanz on Lake Constance straddling the Swiss border. It is very different from the Thurgau Swiss German spoken across the border in Kreuzlingen (Auer 2005).
Lustenauerisch is so different that it itself is a separate language. Most people in Vorarlberg say that they cannot completely understand Lustenauerisch when it is spoken. That is because for many vocabulary items, the words are completely different. In addition, vowels also differ (Scheffknecht 2015).
Bregenz Forest Vorarlbergerisch (Bregenzwalderisch or Wälderisch) is a very distinct form of Vorarlbergerisch spoken in the Bregenz Forest (Bregenzerwald) in far northwest Vorarlberg on the borders of Switzerland and Germany. Other Vorarlbergerisch speakers from elsewhere in Vorarlberg have some difficulty understanding Bregenzerwald speakers, so it may be a separate language (Scheffknecht 2015). This area is very famous for its dairy products, especially its cheeses. Lustenauerisch speakers say this is a different language from both Vorarlbergerisch and Lustenauerisch.
There are two main dialects of this language – Vorderwald and Hinterwald – and they are quite different. Nearly every village has its own dialect. Intelligibility between dialects is not known. Egg is a dialect of this language.
West Allgäuisch is spoken in the western Allgäu, in the Alemannic-Swabish transition zone of the Allgäu and in the city of Lindau and the area around far eastern Lake Bodensee. West Allgäuisch is close to Swiss German and especially the form of Vorarlbergerisch spoken in the Bregenz Forest (Bregenzerwald) in the northern part of Vorarlberg on the German border. This dialect is apparently intelligible with Vorarlbergerisch (Scheffknecht 2015), at least with Bregenz Vorarlbergerisch. This lect is best seen as a form of Bregenz Vorarlbergerisch.
Montafon Vorarlbergerisch (Muntafunerisch) is a Vorarlbergerisch language that is spoken in the Montafon Valley in Vorarlberg, Austria. This valley extends from about Bludenz to the Silvretta Mountains on the border with Switzerland. Speakers say that the situation is better described as other Vorarlbergerisch speakers having some difficulty understanding Muntafunerisch (Scheffknecht 2015).
It has Romansch influences since it is spoken near the Romansch-speaking part of Switzerland. Even villages 15-20 miles away cannot understand this language. This language is utterly unintelligible to any German. Schruns is a dialect of this language.
Appenzell Swiss German (Appenzellerisch) is an Eastern Upper Alemannic Swiss German lect that, while not intelligible with other forms of Swiss German, is actually intelligible with Vorarlbergerisch (Scheffknecht 2015) and is best seen as a form of that language. It is spoken in Appenzell Canton in Switzerland near the border with Germany, Austria and Liechtenstein. Appenzell Innerrhoden and St. Gallen (Sankt Gallener or St. Galler Deutsch) are dialects of this language.
High Alemannic is a group of lects that are spoken primarily in Switzerland. However, a few are also spoken in Baden-Württemberg right on the border with Switzerland. The most famous High Alemannic language is Swiss German. Central Bavarian has 50% intelligibility with High Alemannic languages.
South Alemannic is a group of High Alemannic dialects, apparently a separate language, spoken in far southwestern Baden-Württemberg in regions called Markgräflerland and Hotzenwald. Markgräflerland goes from about Basel to about Bad Krozingen in the north and to the Black Forest in the east. Hotzenwald is a region around the Swiss border from Wehr to Waldshut-Tiengen, otherwise known as the Waldshut District. The differences between South Alemannic and Banish are considerable, and the two are probably separate languages.
Klettgau is a South Alemannic dialect spoken on the Swiss border in the Waldshut District. Other dialects include Markgräflerland (Markgräflerisch), Hotzenwald (Hotzenwälderisch), Rheinfelden, and High Rhine Alemannic (Hochrhein Alemannisch). Intelligibility between this and Swiss German in Switzerland and South Sundgau in Germany is not known, although it is probably not fully intelligible with Swiss German. Within Markgräflerland, there are subdialects such as Lörrach, Grenzach-Wyhlen, and Weil am Rhein.
South Sundgau (Süd Sundgauisch) is a High Alemannic dialect spoken in southern Baden down around the Swiss border. Intelligibility between this and Swiss German is not known, but it is said that once you leave Switzerland and cross the border, people are no longer speaking anything close to Swiss German.
Standard Swiss German (Schwyzerdütsch) is a High Alemannic language that is 20% intelligible with Standard German. For many Germans, Swiss German is about as intelligible as Dutch. It has over 6 million speakers. There are dozens of varieties, and every canton in Switzerland has its own lect. Two major varieties are Zurich and Bernese German. However, Bernese is not intelligible with Swiss German proper. Thurgau is very different.
The city of Vaduz, Liechtenstein, also speaks Swiss German. There are 20-70 different lects within Swiss German, and according to Ethnologue, many of them are not mutually intelligible. Swiss German is so diverse that speakers are given subtitles when they speak on Austrian and German TV.
The dialectal situation of Swiss German is very complex. About 30-40 years ago, before people started moving around a lot, there were many full Swiss German languages that were not intelligible to other speakers. We can call these the pure dialects. However, the situation has changed a lot since then. A form of Swiss German, call it Standard Swiss German, is now used across Switzerland when communicating with people who speak another form of the language.
Many of the dialects seem to be changing from full languages into mutually intelligible forms of Standard Swiss German with regional dialects, similar to the situation in the US with our mutually intelligible regional dialects. When people are interviewed on Swiss TV, they typically speak in this standard language to make sure that they are understood.
There are some elderly people who can speak only their regional form of Swiss German and not the standard version, and sometimes they cannot communicate with people in a similar situation speaking another version of the language.
However, if you recorded speakers of many of the various forms of Swiss German speaking among themselves and then presented it to speakers of other forms of the language, you would probably need subtitles for them to understand it. In terms of lexicon, the Swiss German lects differ dramatically. There may be 40 different words for the same term in 40 different lects.
Many Swiss German speakers dislike speaking Hochdeutsch, only speak it if they have to, and may refuse to speak it unless it is mandatory. Hochdeutsch classes are now mandatory in the schools, but most Swiss hate to study the language, and this requirement is resented by many Swiss. Some can understand the Hochdeutsch spoken on TV but may not understand the Hochdeutsch of a visitor. Some older Swiss cannot understand Hochdeutsch at all.
However, most Swiss, even the elderly, can speak some form of Hochdeutsch (Chervet 2016).
Although Swiss German is considered to be an Upper German language, it has Low, High and Highest Alemannic forms inside of it. Hence, “Swiss German” is something of a trashcan description for forms of German spoken in Switzerland. The Pündner dialect is unclassified.
Basel German (Baseldeutsch, Baslerdütsch, Baslerdietsch, Baseldütsch) is a type of Low Alemannic Swiss German spoken in and around Basel, Switzerland, that is not intelligible with High Alemannic Swiss German.
However, the watered-down lect spoken in the city of Basel itself nowadays is indeed intelligible with Swiss German Proper (Chervet 2016). It is spoken across the border a bit into France west of Basel and north and northeast of Basel up into Baden-Württemberg to Freiburg.
There are different dialects spoken in Baselstadt (a canton encompassing the city of Basel) and Baselland (Basel Canton), but it is not known how much they differ. Intelligibility between Basel German and South Alemannic spoken to the north is not known, but it is said that when you cross from Germany to Switzerland in this region, people are no longer speaking the same language.
Bernese Swiss German (Bärndütsch, Bäärndüütsch, Berndüütsche, Baernduetsch, Bern Deutsch) is a Western High Alemannic Swiss German language that is not intelligible with Swiss German proper and is thus a separate language. Langenthal is a dialect of this language.
Other Western High Alemannic Swiss German dialects include Solothurn (Solothurner, Solothurnerdütsch), Olten, West Aargau (Westaargauisch), Lower Frick Valley (Unterfricktal), Möhlin, Upper Frick Valley (Oberfricktal), Laufenburg, Central Aargau, Aargau, Middle Bernese (Mittelbernisch), Entlebuchisch, Lucerne (Lozärno, Lozärnerdütsch), and Zug (Zogerdütsch).
The Frick Valley is located in northwest Aargau Canton. Möhlin is a subdialect of Lower Frick Valley and Laufenburg is a subdialect of Upper Frick Valley. Olten is a subdialect of Solothurn. Intelligibility data between the lects is not known.
Ettiswil Bernese Swiss German is spoken in the town of Ettiswil in the canton Bern. It is so divergent that it may well be a separate language.
Zurich Swiss German (Zuridootch, Züridüütsch, Zürcher, Züritüüstcht, Züritütsch, Züridütsch, Zöridütsch, Zuerideutsch or Zürischnüre) is not readily intelligible to speakers of Standard Swiss German. It is spoken in Zurich.
As most Swiss hear this language a lot on TV, they are familiar with it and it is probably intelligible to most of them, but that does not mean it’s inherently mutually intelligible, because it’s not. Züridüütsch is a Central Swiss German dialect. Zurich Oberland and Goldbach are dialects of this language.
Other Central Swiss German dialects include Stadtzürcherisch, Ämtler, See, Oberländer, Winterthurer and Unterländer.
Also listed are Schaffhausen (Neu Schaffhauserdeutsch, Schaffhuserisch), Zurich Weinland (Zürcher Weinländerdeutsch), Davos, Lower Toggenburg (Untertoggenburgerisch), Upper Toggenburg (Obertoggenburgerisch), Rheintal (Rheintalerisch), and Seeztal (Seeztalerdeutsch).
Other dialects in the same group include Middle Lucerne/South Aargau (Mittelland Luzerndeutsch/Südaargauisch), Sursee, East Aargau (Ostaargauisch), Schaan, Balzers, Lucerne (Luzerndeutsch, Luzerner, Luzärnerisch, Luzärner), Bünd (Bündnerisch, Bündner, Bündnerdüütsh, Bündnerdütsh), Bad Ragaz, Chur (Churertütsch, Churer) and Graubünden (Graubündnerisch).
Intelligibility data is lacking. Lucerne contains the following subdialects: Lucerne Hinterland (Hinterland Luzerndeutsch), Middle Lucerne (Lucerne Mittelland), Rigi, Sursee, Entlebuch and Lucerne/Hochdorf. Bad Ragaz is a subdialect of St. Gallen. Chur and Davos are subdialects of Graubünden. Schaan and Balzers are spoken in Liechtenstein.
Thurgau Swiss German is an Eastern High Alemannic Swiss German language that is hard for many Swiss German speakers to understand. Dialects include West Thurgau (West Thurgauerisch), East Thurgau (Ost Thurgauerisch) and Upper Thurgau.
Inner Swiss German is a group of Swiss German lects that are transitional between High Alemannic Swiss German and Highest Alemannic Swiss German. Intelligibility data is lacking. Dialects include West Oberland (Westoberländisch), Haslital (Haslitalerisch), Lungern, North Urn (Nord Urnerdeutsch), South Urn (Süd Urnerdeutsch), Obwalden (Obwaldnerisch), Nidwalden (Nidwaldnerisch), Engelberg (Engelbergisch) and West Obwalden (Westobwaldnerisch). Lungern is a dialect of Obwalden.
Nidwalden Swiss German (Nidwaldnerisch) is an Inner Swiss German language that is not intelligible with other Swiss German lects, especially with Zurich Swiss German. Intelligibility with other Inner Swiss German lects is not known.
Fribourg Swiss German (Fribourgerisch, Friburgerisch) is a Highest Alemannic Swiss German language that is not intelligible to other speakers of Swiss German and must be a separate language. It is spoken in Fribourg Canton southwest of Bern in southwest Switzerland. Intelligibility with other Highest Alemannic Swiss German lects is not known. Jaun, Sensebezirk and St. Antoni are dialects of this language.
Other Highest Alemannic Swiss German lects include Unterwalden and Glarus (Glarnerdeutsch, Glarner). Since Highest Alemannic languages seem to be hard for High Alemannic Swiss German speakers to understand, it is questionable to what degree the lects above are intelligible to High Alemannic. Intelligibility testing is in order.
Bernese Oberland Swiss German is a Highest Alemannic Swiss German language notorious for having poor intelligibility even with native speakers of Swiss German. It therefore qualifies as a separate language. Intelligibility with other Highest Alemannic Swiss German lects is not known.
Uri Swiss German (Ursnerisch) is a Highest Alemannic Swiss German language that has poor intelligibility with other Swiss German speakers, in particular with Zurich. It is spoken in Uri Canton. Intelligibility with other Highest Alemannic Swiss German lects is not known. Attinghausen is a dialect.
Schwyz Swiss German is a Highest Alemannic Swiss German language that is not intelligible to other Swiss German speakers, especially speakers of Zurich. It is spoken in the canton of Schwyz. Intelligibility with other Highest Alemannic Swiss German lects is not known.
Walser German is a Highest Alemannic language spoken in Switzerland, Italy, Austria, Liechtenstein and Germany. It is spoken by 22,780 speakers. It is very different from all other Alemannic languages and is not intelligible with any of them. It is also very different from the Walliser language, which is a variety of Swiss German spoken in Wallis Canton.
The Walsers split off from the Walliser group in about 1200 and moved to other areas. The Walsers moved into many areas of the Alps, often displacing or attempting to displace Romansch speakers. In many places, settlements failed, but they held in a few others.
By the mid-1300’s, the Black Plague ended the Walser migrations by devastating both the source and the destinations of the migrants.
Most Walser dialects are very different even from one another, so there may be more than three languages in Walser. A process of assimilation is occurring in Switzerland whereby Walser speakers are assimilating to the German-speaking culture around them and in the process losing their language. Intelligibility between the widely variant dialects, other than Toitschu, is not known.
The Walser are expert dairymen, woodworkers, weavers and mountain-climbers who often build a distinctive style of house called a Walser house.
Walser has many dialects.
Prättigau (Prättigauer), Avers, Obersaxen, Davos and Rheinwald are spoken in Grisons Canton.
Triesenberg is spoken in Liechtenstein and has the support of the local government.
Kleinwalsertal is spoken in Austria and has been on the decline lately.
Rimella, Rima San Giuseppe, Alagna Valsesia, Macugnaga and Formazza are dialects of Walser spoken in northwest Italy. The dictionary for Alagna Walser has an incredible 22,000 words. Intelligibility data among dialects is not known.
Gurin Walser German (Gurinerdeutsch) is a Walser dialect spoken in Bosco-Gurin, Ticino (Italian-speaking) Canton, Switzerland. It has remained isolated from other German varieties for centuries and may well be a separate language. This is close to the forms of Walser spoken in Italy. It must be unintelligible with other forms of Walser other than Italian Walser, and since Italian Walser is not even intelligible to the villages right next door, Gurin Walser must be a separate language.
There are only 23 speakers of this language left in the village of Bosco-Gurin, and it seems to be dying out (PFECMR 2006). However, including speakers outside the town, there are 120 speakers. In addition, 40 people have receptive but not productive competence in the language (COE 2006).
Toitschu Walser German is an outlying language related to Walser that is spoken in the village of Issime in the Upper Lys Valley in Valle d’Aosta in far northwest Italy.
Toitschu is a highly divergent Walser lect that has been heavily influenced by Piedmontese and Francoprovencal. It is unintelligible with the rest of Walser and is a separate language. Both Toitschu and Titsch have 600 speakers and are endangered languages.
Titsch Walser German is spoken in the same region as Toitschu in the Italian Alps of northwest Italy in the nearby villages of Gressoney-Saint-Jean and Gressoney-La-Trinité. There are currently major efforts underway to preserve both Toitschu and Titsch, but the regional Italian government does not seem very cooperative.
Both languages are quickly giving way, especially to Italian, and both lack many words for modern things. Titsch is very different from Toitschu, as it seems to have continued to evolve over time, while Toitschu seems to have been frozen back in 1200 or so.
There is poor intelligibility between Toitschu and Titsch, and both must be separate languages. Major dictionary projects have just been completed, and a large conference on both languages was held in the region recently, which resulted in the publication of an amazing 163-page document exclusively about the Walser language. The dictionary of Titsch has an incredible 125,000 words, only 4% of which are foreign loans.
Walliser German has about 250,000 speakers in the German part of Wallis (Valais) Canton, Central Switzerland. It is a Highest Alemannic language. It is not intelligible with Standard German or with Walser. This is the more modern form of older, archaic Walser German.
There are six dialects: Gomer, Briger, Saaser, Zermatter (spoken in Zermatt), Lötschentaler and Raron. Simplon is a dialect of Gomer. There is currently a petition before SIL to have it recognized as a separate language. The petition states that all of the dialects are mutually intelligible.
Gomer differs in having a vowel shift öi > ö. Briger is the most commonly spoken dialect. Zermatter has a different phonology and sounds melodic. Saaser is similar to Zermatter but not as melodic-sounding. It has a lot of unique vocabulary.
Lötschentaler Walliser is the most archaic dialect, about halfway between the archaic Walser German and the modern Walliser German. It also has a lot of unique vocabulary. It is so different that other Walliser German speakers have a hard time understanding it (Chervet 2016). Therefore it makes sense to split it off into a separate language.
Raron is characterized by a vowel shift ä > e. General Walliser Cäse > Raron Cese.
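A vowel shift of this kind can be written as a simple substitution rule. The following is only a minimal sketch: the function name `apply_shift` and the rule table `RARON_SHIFT` are hypothetical names made up for illustration, and the only data used is the single example given above. Real sound changes usually depend on the surrounding sounds, which this toy version ignores.

```python
def apply_shift(word, shift):
    """Replace every occurrence of each source vowel with its target."""
    for source, target in shift.items():
        word = word.replace(source, target)
    return word


# The one rule stated in the text: General Walliser ä corresponds to Raron e.
RARON_SHIFT = {"ä": "e"}

print(apply_shift("Cäse", RARON_SHIFT))  # Cese
```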
The main city here is Brig.
The language arose from immigrants from the Bern region who came to Wallis in the 700’s. Two different immigration waves led to two different Walliser dialect groups. In the 1100’s, a Walliser group split off and moved to other parts of the Alps. This group became the Walser German language speakers.
Bavarian. North Bavarian is in yellow, Central Bavarian in pink, and Southern Bavarian is in blue.

Bavarian is a macro-language with three main varieties: Northern Bavarian, Central Bavarian and Southern Bavarian.
There are claims that broad Bavarian is mutually intelligible across its length and breadth, but these claims seem somewhat dubious if not false in light of the 40% intelligibility figure with Standard German and in light of my interviews with native speakers.
Also, there are claims that the diversity of dialects of Bavarian makes it impossible to create one unified dialect for writing Bavarian, as the debate over the Bavarian Wikipedia shows. Even Northern and Central Bavarian, supposedly mutually intelligible, are so different that to create one written form to unite them is impossible.
For these reasons, intelligibility testing is imperative for Bavarian.
Central Bavarian is described as extremely diverse. The various Vienna dialects have all died in the last 20 years, and Viennese now speak a Bavarian-Standard German mixed language based on an old East Viennese dialect mixed with Standard German and no longer speak pure Bavarian.
The differences between Tyrolean Southern Bavarian, Carinthian Southern Bavarian, Styrian Southern Bavarian and Viennese are described as great. An attempt on the Internet to compare Bavarian with Texan English was described as ridiculous.
All of this suggests that intelligibility inside of Bavarian is not all it is cracked up to be.
Bavaria itself is very diverse linguistically, and the state is not synonymous with the language. In Southwestern Bavaria, Bavarian Swabian is spoken; the northern half of Bavaria speaks several Middle German Franconian lects (Bavarian is Upper German); and the far northwest of Bavaria speaks a Palatinian Rhine-Franconian language.
Hence, less than 1/4 of Bavaria actually speaks Bavarian, adding up to about 1/3 of the population of the region. Each Bavarian-speaking village in Germany is said to have its own dialect.
Bavarian is not intelligible with Swabian, Alsatian or Swiss German.
A nice chart of the various Bavarian lects is here.
Northern Bavarian or German Bavarian is spoken in Upper Palatinate, Bavaria. It is not intelligible with Central Bavarian (Kirmaier 2009).
Another map of the various Bavarian languages.

Oberpfälz North Bavarian (Oberpfälzerisch or Oberpfälzisch) is a language spoken in southeastern Germany in central eastern and northeastern Bavaria from Regensburg, Kelheim and the Bavarian Forest north along the Naab River to the Fichtelgebirge (Fir Mountains) and in the Northern Bohemian Forest along the border with the Czech Republic. It is also spoken up by Neumarkt.
According to residents (Kirmaier 2009), this is a separate language, not intelligible with other German Bavarian lects. Dialects of this language include Danube Oberpfälzisch, which, though different, is fully intelligible with the Oberpfälzisch spoken in Neumarkt. This is the Oberpfälzisch spoken along the Danube around the towns of Kelheim and Regensburg.
Bohemian German (Boehmerwaelderisch) is an Upper German language spoken in the Czech Republic, Germany and the US. It looks like both North and Central Bavarian.
Starting in the 1200’s, Germans began moving into the Sudetenland, often invited by Bohemian kings. Over the centuries, they pushed out the Czechs and Slavs living in the area and took it over for farming. Although intelligibility data for Bohemian German is lacking, it is often considered to be a full language of its own, so we will treat it as one in this analysis.
Actually, since it ranges from East Middle German to Bavarian Upper German, Bohemian German seems to be a wastebasket designation for the varying lects spoken in the Sudetenland.
On the border of Silesia, it resembled Silesian. On the border of the Erzgebirge, it looked like Erzgebirgisch. In the far northeast, where the Riesengebirge separated Bohemia from Silesia, in the Hultschiner Laendle, the people had a very divergent lect of their own.
To the south of the city of Mies, along the Bohemian Mountains, it looked like Niederbayerisch. A dialect called Böhmish is spoken in the Böhmerwald or Bohemian Forest. In the south, extending all the way towards Moravia, it looked very much like the Central Bavarian spoken in Austria. Sorting all of this out and determining what was a dialect and what was a separate language is going to be difficult. Schönhengst is a dialect of this language spoken in Moravia. Some Bohemian German speakers migrated to New Ulm, Minnesota. Quite a few others could be found in Bukovina, Romania.
Egerland Bohemian German (Egerlaenderisch) is spoken in Bischofteinitz, Mies, Tachau and Taus Counties in the Czech Republic in Western Bohemia and in and around New Ulm, Minnesota, where there are still speakers ranging from 52-98 years old. In the Czech Republic, each village had a separate dialect, but all dialects are mutually intelligible. This appears to be a separate language from Oberpfalz Northern Bavarian. German speakers visiting New Ulm say that they cannot understand one word of this language.
This seems to be the same language as Sechsämterland spoken across the border. The Sechsämterland dialect is spoken in the area around Selb, Wunsiedel, Hohenberg and Thierstein in the far northeast of Bavaria near the border with the Czech Republic and Saxony.
Dialectal diversity is very high in this area, and every village has its own dialect.
Lauterbach is a divergent dialect spoken east of Tirschenreuth on the Czech border. Tiss is a divergent subdialect of Egerland. Sangerberg is a divergent Egerlaenderisch dialect spoken in Prameny, Czech Republic. Eger is spoken in the large German city of the same name. Tachauer is a dialect that formed the basis for the Machliniec dialect formerly spoken by the Carpathian Germans in their language island in the Machliniec area of Ukraine. They left during WW2.
German Central Bavarian is a group of Bavarian lects that are spoken in Germany. This group includes Lower Bavarian, Upper Bavarian and Lechrain Bavarian (Lechrainisch). It has 50% intelligibility of High Alemannic and Lake Constance Low Alemannic. Lechrain Bavarian is spoken in Western Bavaria and is transitional to Swabian. Map of the Lechrain region. Lechrain is very different from the rest of Bavarian, but intelligibility data is lacking.
Lower Bavarian includes the Bohemian Forest language and many dialects.
Upper Bavarian includes the Starnberg, Highland and Miesbach languages and many dialects.
Lower Bavarian Central Bavarian (Niederbayerisch) is spoken in the Lower Bavarian region of German Bavaria. Major cities include Landshut.
According to residents (Kirmaier 2009), this is a full language unintelligible with other German Bavarian lects. Speakers of Landshut Lower Bavarian Central Bavarian claim that Landshut is intelligible with Münchnerisch.
On the other hand, some speakers of Münchnerisch find Regensburg Niederbayerisch almost impossible to understand. Dialects include Landshut, Regensburg, Passau, Straubing, Rottal-Inn, Breitenberg, Neureichenau, Thalberg, Germannsdorf, Untergriesbach, Wegscheid, Geiselhöring, Rattenberg and Landau.
Rottal-Inn is spoken in the Rottal-Inn district east of Munich. Towns here include Eggenfelden, Pfarrkirchen and Simbach am Inn. Rottal-Inn is a fairly typical Central Bavarian dialect. Nevertheless, the dialect of Simbach is different from the dialect spoken just across the border in Braunau.
Breitenberg, Neureichenau, Thalberg, Germannsdorf, Untergriesbach and Wegscheid are spoken in far southeast Bavaria near the Austrian and Czech border and are very divergent. Geiselhöring is spoken in the Straubing-Bogen area of the Bavarian Forest. Rattenberg is also spoken in the Straubing-Bogen area and sounds like Viennese.
Bohemian Forest Lower Bavarian is spoken in the far southern Bohemian Forest, at least along the Regen River and around the town of Zwiesel, where a dialect called Zwieslerisch is spoken. Zwieslerisch, at least, is not intelligible with the Niederbayerisch spoken around Straubing, which is only 60 miles away. This language is interesting because it has significant influence from Mühlviertel Lower Bavarian in Austria.
Upper Bavarian Central Bavarian (Oberbayerisch) is spoken in the Upper Bavarian region of German Bavaria. The major city in this region is Munich. According to residents, it is a separate language not intelligible with the rest of German Central Bavarian (Kirmaier 2009).
Upper Bavarian Central Bavarian is said to be intelligible across the border into Austria for some ways, but this notion needs clarification since it is said that if you go 15-20 miles in any direction outside of Munich, you are dealing with separate languages.
Some say that people in Munich do not speak Bavarian anymore, but this does not seem to be the case. On the contrary, 20% of the population are Bavarian native speakers and with them, nearly all casual conversation is carried on in Oberbayerisch, and they often refuse to speak Standard German on principle at parties and such.
However, the variety spoken in Munich (Münchnerisch) is a very watered-down type of Bavarian that is no longer the real deal. Nevertheless, speakers of Standard German often find it baffling. The pure Bavarian Münchnerisch seems to be dying in Munich with the massive influx of immigrants from all over Germany. Münchnerisch is still holding on very well in the boroughs of Sendling, Giesing, Obermenzing and parts of Neuhausen.
The type of broad Central Bavarian spoken in Munich is widely understood in the urban centers from Munich to Vienna. There are at least 19 major Central Bavarian dialects, some of which are separate languages.
Dialects include Oberschweinbach, Friedberg, Holledau and Bad Reichenhall. Holledau is spoken in a region north of Munich roughly bounded by Moosburg, Pfaffenhofen, Ingolstadt and Neustadt. This is the largest hops-growing region in the world.
Oberschweinbach is spoken in the Fürstenfeldbruck district west of Munich. Bad Reichenhall is spoken southeast of Munich on the border with Austria, near Salzburg. Friedberg, while located in Bavarian Swabia, speaks Bavarian, not Swabian.
Starnberg Upper Bavarian is spoken in and around the city of Starnberg, 12 miles southwest of Munich. It has poor intelligibility with Munich Upper Bavarian. This language is mutually intelligible for some distance around it, but speakers cannot understand the Highland Upper Bavarian spoken 20 miles to the south (Anonymous July 2009).
Highland Upper Bavarian is spoken along the German-Austrian border in Germany and Austria in the regions of Rosenheim, Miesbach and Garmisch-Partenkirchen in Germany and across the border in the Karwendel Mountains in Austria.
Rosenheim Upper Bavarian is spoken in the Rosenheim District south of Munich near the Austrian border, especially along the Mangfall River in the foothills of the Alps, the Chiemgau Mountains. Towns here include Rosenheim and Bad Aibling. It has very poor intelligibility with Münchnerisch. Intelligibility testing is needed between this language, Garmisch-Partenkirchen and Miesbach.
Miesbach Upper Bavarian is a Bavarian language spoken in the Miesbach district of Bavaria in the towns of Miesbach, Finsterwald and possibly others. It is not intelligible with at least some other highland Bavarian lects (de Gyurky 2006). Intelligibility testing is needed between this and other highland Bavarian languages, especially Garmisch-Partenkirchen and Rosenheim, which are close by. Rosenheim is actually the next district over.
Garmisch-Partenkirchen Upper Bavarian is a separate language that is spoken in Garmisch-Partenkirchen, 50 miles southwest of Munich and 6 miles from the Austrian border. This language is not intelligible at all with Münchnerisch. There are 2 dialects in this language – Garmisch and Partenkirchen. Intelligibility between the two is not known, and intelligibility between this language, Rosenheim and Miesbach is also unknown. This language is also spoken across the border in the Karwendel Mountains in Austria.
This language is said to resemble the Tirol Bavarian spoken in Innsbruck, and may not even be a Central Bavarian language.
Austrian Standard Central Bavarian is a koine language that is understood in most of Austria except for many in Vorarlberg who speak Vorarlbergerisch. It is based somewhat on the Vienna dialect, but it seems to have diverged quite a bit from the true pure Viennese. It is even understood in Tirol.
This language differs dramatically from the Central Bavarian spoken across the border in Munich and in general is often not intelligible with it. There is a wide diversity of lects in Austrian Bavarian. It is not unusual for one lect to not be understood 50-80 miles away. In Austria as a whole, one source describes the dialects of the country as akin to dozens of different languages, which implies that there are more than 20 languages spoken here. Other sources say that there is a different dialect in each Austrian region, and none of them are intelligible with each other. Based on that, further investigation into Austrian Bavarian intelligibility is urgently needed.
The lects are reasonably stable compared to the situation in Germany because most Austrians still grow up and live most of their lives in one area. Nevertheless, the situation is still poorly understood. Central Bavarian is not intelligible with the Southern Bavarian spoken in Tirol, Carinthia or Styria in Austria.
Austrian Central Bavarian has two major divisions, Austrian Central Bavarian proper and Austrian Southern Central Bavarian.
Southern Central Bavarian includes two main divisions – Styrian and West Southern Central Bavarian. Styrian includes West Styrian (Weststeirisch), Middle Styrian (Mittelsteirisch), Upper Styrian (Obersteirisch), East Styrian (Oststeirisch), Southeast Lower Austrian (Südostniederösterreichisch) and Burgenland (Burgenländisch).
West Southern Central Bavarian includes dialects such as Salzburg (Salzburgisch), Ausseerland (Ausseerländisch), North Tirol (Nordtirolerisch) and Werdenfelsisch.
Dialects include Innviertlerisch, Linz, Upper Pielachtal, Salzburgerisch, Wienerwald, Braunau, Bad Aussee, Bad Goisern, St. Johann in Tirol, Salzkammergut, Kufstein and many more.
Viennese and Linz are very different. Innviertlerisch is spoken in the Innviertel Mountains in Upper Austria near the Bavarian border. Intelligibility testing is needed between this and Mühlviertlerisch. Upper Pielachtal is spoken along the Mariazellerbahn Railway from Mariazell to St. Pölten in Lower Austria.
Salzburgerisch is spoken in Salzburg. Wienerwald is spoken in the Vienna Forest west of Vienna. Bad Aussee is spoken in far northwest Styria near the border with Upper Austria. Bad Goisern is spoken in far southern Upper Austria near the borders with Salzburg and Styria. Braunau is spoken on the border with Bavaria.
St. Johann in Tirol and Kufstein are actually spoken in Tirol – there are a few Central Bavarian lects spoken there. St. Johann is spoken in the Kitzbühel district in the far northeast of Tirol near the border with Salzburg. Kufstein is spoken in the Kufstein district in northeast Tirol near the Bavarian border.
Central Bavarian is a dialect chain in which, while the lects of two adjoining cities are similar, the lects of major cities can differ dramatically. Speakers of Standard German sometimes say that they cannot understand a word of Viennese Central Bavarian.
Thalgau Central Bavarian is spoken at the very least in and around the town of Thalgau east of Salzburg in Salzburg state. It is utterly unintelligible with other forms of Central Bavarian.
Salzburg Central Bavarian (Salzburgerisch) is spoken in and around Salzburg, Austria. However, as of 30-35 years ago, it had poor intelligibility with Pongauer, Pinzgauer and Flachgauer. Hence, it may well be a separate language. The situation today is not known except that dialect use has dropped off alarmingly in Salzburg since then.
Pongau Central Bavarian is spoken in the Pongau region south of Salzburg in Austria. Towns in the area include Bad Hofgastein, Schwarzach, Werfen, Bad Gastein, Dorfgastein, Radstadt, Flachau, and Bischofshofen. 30-35 years ago, it had poor intelligibility with Pinzgauer, Salzburger and Flachgauer. Thus it may well be a separate language. Pongauer has Danube Bavarian influences. The situation today is unknown.
Pinzgau Central Bavarian is spoken in the Pinzgau region southwest of Salzburg on the German border near the border with Tirol. The principal town in this region is Zell am See. Towns in the region include Bruck an der Großglocknerstraße, Dienten am Hochkönig, Ferleiten, Fusch an der Großglocknerstraße, Hollersbach im Pinzgau, Kaprun, Krimml, Lend, Lofer, Mittersill, Neukirchen am Großvenediger, Rauris, Saalbach-Hinterglemm, Saalfelden am Steinernen Meer, Taxenbach, Unken, and Uttendorf.
Dialect use remains very high in this area. Pinzgauer is transitional between Central Bavarian and Southern Bavarian, but it is utterly unintelligible with Tyrolerisch. As of 30-35 years ago, it had poor intelligibility with Pongauer, Salzburger and Flachgauer. The situation today is not known.
Flachgau Central Bavarian is spoken in the Flachgau region surrounding Salzburg. 30-35 years ago, it had poor intelligibility with Pinzgauer, Salzburger and Pongauer. The situation today is not known. Like Pongauer, it is similar to the Danube Bavarian spoken across the border to the west in Germany. Towns in the area include Neumarkt am Wallersee, Seekirchen am Wallersee, Mattsee, Anif, Fuschl am See, Sankt Gilgen, Lamprechtshausen, Oberndorf bei Salzburg and Straßwalchen.
Lungau Central Bavarian is a lect spoken in the Lungau District in the southeast part of Salzburg state. It is quite different from surrounding lects. It is transitional between South Bavarian (Tyrolean, Styrian and Carinthian) and Central Bavarian (Salzburg, Upper Austria, Lower Austria). Carinthian influences are most prominent. It has 20,000 speakers. Use of this dialect has dropped off a lot in recent decades.
Intelligibility data with surrounding Bavarian languages is not known, but considering that the other Salzburg district dialects have poor intelligibility with each other, and the uniqueness of Lungauer, Lungauer is probably a separate language.
Mühlviertel Central Bavarian (Mühlviertlerisch) is spoken in the Mühlviertel, or Bohemian Forest, region of Austria where Austria, the Czech Republic and Germany all come together. It has poor intelligibility with other types of Austrian Central Bavarian. This language is extremely variable, with each village having its own dialect, and the dialects of neighboring villages often differ markedly.
It does not appear to be readily intelligible with the Linz dialect spoken in the biggest city of Upper Austria either. Intelligibility is unknown between this language and Bohemian Forest Lower Bavarian spoken in the German part of the Bohemian Forest. Rural Upper Austrian Central Bavarian in general is unintelligible in both Vienna and Graz.
Viennese Central Bavarian (Wienerisch) itself seems to be a separate language. The stronger form of the dialect spoken by low level workers, taxi drivers, etc. is hard to understand even for other Austrians speaking closely related lects. It is therefore reasonable to assume that this hard form of Wienerisch is a separate language. It is still alive in some suburbs such as Ottakring.
Viennese has many unusual words that other forms of German lack. It has a comical quality that is sometimes imitated in parodies.
Most Viennese now speak a Viennese German dialect that is readily understandable to any speaker of German. It is quite similar to the standard German spoken on news outlets. There are a few words that are different for body parts, expletives and food, but other than that, the vocabulary is the same as Hochdeutsch. The accent is less different than the difference between British and American English.
However, Lower Austrian Central Bavarian is still spoken, mostly by older people, in the countryside outside Vienna. It is only about 50% intelligible with Viennese Central Bavarian, so it is a separate language. Further investigation is needed to determine the exact names of these various rural lects and how well they can communicate with each other.
Carpathian Central Bavarian was formerly spoken in Slovakia by scattered German colonies. They were ethnically cleansed after WW2, and most ended up in Germany. There still appear to be some speakers left, but they are probably elderly and the languages appear to be moribund.
Dialects included Pressburg, Zipser and Hauerlaender. Pressburg was spoken near the city of Pressburg, and Zips and Hauerlaender were spoken near areas of the same names. Pressburg is a dialect of Viennese, but Zips and Hauerlaender are so diverse that they are not intelligible with any other forms of Bavarian.
Zipser Carpathian Central Bavarian was spoken in an area of Slovakia called the Zips. Speakers were ethnically cleansed after WW2. Scattered elderly speakers probably remain, mostly in Germany. It is not intelligible with any other forms of Bavarian (sample).
Hauerlaender Carpathian Central Bavarian was spoken in and around an area called the Hauerland in Slovakia. Speakers were ethnically cleansed after WW2. Scattered elderly speakers probably remain, mostly in Germany. It is not intelligible with any other forms of Bavarian (sample).
Landler Central Bavarian is spoken by Transylvanian Saxons who lived in Transylvania in Romania. They were deported from the Salzkammergut region of Austria northeast of Salzburg in the 1730’s. They were ethnically cleansed after WW2, but then were allowed to return.
The language is still spoken in Neppendorf, Großau, and Großpold in Romania and in Germany, where many of the Landler fled after the war. They originally spoke a Salzkammergut Central Bavarian lect, but over time, it changed so much that it must surely be a separate language, and that is the impression that Tapani Salminen, top expert on European languages, gives in a recent assessment.
Southern Bavarian is spoken in Austria and in Alto Adige-Südtirol in Italy and includes the cities of Graz, Klagenfurt, Lienz and Innsbruck in Austria and Bozen and Meran in Italy. It is also spoken in the Samnaun region in Switzerland.
Some of the Tyrolean lects in Austria, referred to here for convenience's sake as Tyrolean Southern Bavarian (Tirolerisch), are so divergent that they are not intelligible with the rest of even Southern Bavarian; further, each valley has its own lect, and some are not intelligible even with each other. Hence, Austrian Tyrolean Southern Bavarian is a separate language.
In Innsbruck, the main city in the Tyrolean Bavarian region, speakers have a hard time understanding many of the Tyrolean Bavarian lects spoken in many of the surrounding valleys.
There are several main divisions in this language, including Tirol Highlands (Tiroler Oberländisch), Central Tirol (Zentral Tirolerisch), Tirol Lowlands (Tiroler Unterländisch) and East Tirol (Osttirolerisch). Smaller dialects include Innsbruck, Galtür, West Steeg, West Stuben, West Ischgl, West Lech, West Warth, West St. Anton/Tirol, Imst and Zillertal. Zillertal is spoken in the Zillertal Valley.
Samnaun is an isolated dialect of this language spoken in the Samnaun region of the Lower Engadine Valley on the border of Austria and Switzerland. It is also spoken in the town of Samnaun in Switzerland, making it the only Bavarian lect spoken in that country. It is said to be very different from the rest of Southern Bavarian, possibly due to its heavy Romansch influence. The Samnaun area was Puter Romansch speaking all the way up into the 1800’s. Intelligibility between Samnaun and the rest of Austrian Tyrolean Bavarian is not known.
Zillertal Tyrolean Southern Bavarian is not intelligible with Kitzbühel spoken to the northwest; therefore, it is a separate language. Zillertal is transitional with Salzburg Central Bavarian to the east.
Kitzbühel Tyrolean Southern Bavarian has poor intelligibility with Zillertal; therefore, it is a separate language. Kitzbühel has probably even more Salzburg Central Bavarian influence than Zillertal. Kitzbühel is spoken in the Kitzbühel Alps on the eastern border of Tirol Province.
Ötztal Tyrolean Southern Bavarian is one of the most ancient and divergent lects in Austrian Tyrolean Southern Bavarian. It has about 8,000-15,000 speakers. It recently received a UNESCO cultural heritage award. There is no single Ötztal lect; instead, there are separate dialects in every little village, and they often vary dramatically.
It is spoken in the Ötztal Valley in Austria and is understood at least into the Upper Inn Valley in Austria and over the border in Italy to the Schnals region northwest of Merano. Ötztal appears to be secure for the next few generations anyway and is the common means of communication among people of all ages. Since Ötztal is not understood outside the region, it must be a separate language.
Lower Inn Valley Tyrolean Southern Bavarian is spoken in the lower valley of the Inn River west of Innsbruck. It is not intelligible with Lechtal Tyrolean Southern Bavarian spoken just to the northwest. Therefore, it is a separate language.
Lechtal Tyrolean Southern Bavarian has poor intelligibility with Lower Inn Valley Tyrolean Southern Bavarian spoken just to the southeast. This language is spoken in the Lechtaler Mountains west of Innsbruck. Towns in the region include Steeg, Bach, Elbigenalp, Elmen, Stanzach, Weissbach and Reutte. This language is on the border between the Alemannic and Bavarian language groups, and it also has an Allgauish flavor.
Pitztal Tyrolean Southern Bavarian is spoken in the Pitztal Mountains west of Innsbruck. Towns in this region include Arzl and St. Leonhard. Pitztal is very different from Ötztal Austrian Tyrolean Southern Bavarian and communication between the two lects is difficult. Therefore, Pitztal is a separate language.
West Tyrolean Galtür was Swiss German speaking until 1900, and today its dialect is more Alemannic than other Tyrolean lects. The West Tyrolean areas of West Steeg, West Stuben, West Ischgl, West Lech, West Warth and West St. Anton/Tirol, all along the border of West Tyrol and Vorarlberg, were originally Highest Alemannic Walser settlements like Vorarlberg. All of West Tyrol was Swabian-Bavarian speaking until the Middle Ages.
Onto this Swabian base came influence from the Walser and Swiss German villages described above, and all of this on top of an earlier Romansch base, as the whole region was also Romansch-speaking. All of these have receded, leaving only Tyrolean Bavarian, but these are the substantial inputs into Western Tyrolean Bavarian.
Western Styrian or Western Styrian Southern Bavarian (Steirisch) is said to be unintelligible outside of the region, and hence must be a separate language. Another lect spoken in Styria, this one in the southern part, is South Styrian. Intelligibility data is not available.
Speakers of Central Austrian spoken on the Austrian flats cannot understand Carinthian Southern Bavarian (Kärntnerisch) either, so it looks like a separate language too. There are three principal dialects of Carinthian, Upper Carinthian (Oberkärntnerisch), Middle Carinthian (Mittelkärntnerisch) and Lower Carinthian (Unterkärntnerisch). Intelligibility data is lacking. Carinthian has heavy Slavic influence due to its proximity to Slovenia.
There are also speakers of Carinthian Southern Bavarian in the Canale Valley/Val Canale area of Udine in Italy. This area used to be part of Austria, but it changed hands after WW2, and most of the German speakers moved to Austria. Now about 80% of the population speaks Italian and Friulian and 20% speak Carinthian. This appears to be the same language in Italy and Austria. In Carinthia, there are at least 10 separate dialects of this language.
Intelligibility testing is needed between Tyrolean Southern Bavarian and Carinthian Southern Bavarian.
Gottschee Southern Bavarian (Göttscheabarisch or Gottscheerisch) is an outlying Bavarian language spoken by people called the Gottscheers in Kočevje, Slovenia. They apparently originally came to the region in the 1300’s from the Carinthian/Tyrolean border area. It is heavily influenced by the Slovene Carniolan dialects.
It is closely related to the lects of other outlying German colonies in the area, including Zahre (Sauris in Italian), Tischelwang (Paluzza-Timau in Italian) and Pladen (Sappada in Italian) in Northern Italy. The Italian settlements were settled around 1420.
Pladen/Sappada is in the eastern Upper Italian province of Belluno at the far end of the Piave Valley, to the south of the Carnic Alps. These people originally came from the East Tyrolean Pustertal Valley in Austria in the vicinity of Sillian-Heimfels near the towns of Villgraten, Tilliach, Kartitsch, Abfaltersbach and Maria Luggau. Pladen Southern Bavarian is spoken here by about 1,000 of the 1,500 residents, but many also speak Friulian (Maurer-Lausegger 2007).
Zahre Southern Bavarian is spoken in Zahre and is based on an old East Tyrolean language from the Lesach Valley, which the settlers left in 1280. Zahre is very similar to Pladen, but has more influence from the Romance family, particularly Italian (Maurer-Lausegger 2007). However, Zahre has been isolated from Pladen for 700 years (Denison 1971). This time period is so long that the two lects are probably no longer mutually intelligible.
Zahre is still very much alive and spoken in the town, but it is being displaced by Friulian among young adults and by Italian among children. The Zahre lect was pronounced nearly extinct in 1849 and again in 1897 by visitors.
In Timau, Tischelwang Southern Bavarian is spoken in the But valley, on a tributary of the Tagliamento River on the southern slopes of the Plöcken Pass in the Carnic Alps in the province of Udine. This is actually a Carinthian lect that is probably not intelligible with the Pladen and Zahre lects, though intelligibility data is needed (Maurer-Lausegger 2007).
Therefore, Tischelwang Southern Bavarian is in all probability a separate language. Pladen and Zahre are probably no longer intelligible with lects in Austria, considering they have been isolated from their Austrian parents for 700 years, hence they are probably separate languages. Pladen and Zahre have been isolated from each other for 700 years since the migration, hence they are probably two separate languages, Pladen Southern Bavarian and Zahre Southern Bavarian.
Tischelwang has been heavily influenced by the Friulian language.
Gottscheerisch has maintained many of the features of the Medieval Bavarian languages and it is said to be the oldest living Bavarian language. Speakers were ethnically cleansed after WW2, and now they are scattered about the world. There are about 3,000 native speakers left in the world, many of them living in Ridgewood, New York, where speakers still maintain the language. All remaining speakers are elderly.
It does not appear to be intelligible with the rest of Bavarian or with other German languages and is therefore a separate language.
In Italy, Italian Southern Bavarian encompasses three different lects that differ dramatically from one another. It is spoken in Belluno, Trento and Udine (Maurer-Lausegger 2007).
The Fersina Valley/Valle del Fersina is in Eastern Upper Italy, to the north of Pergine (Persen) near the capital of Trento in the province of Trentino. There are many Bavarian speakers here. They originally came from various valleys in North and South Tyrol. They speak an old mixed Tyrolean vernacular from the 1200’s with a lot of unique developments.
In addition, in the Fersina Valley, every village has its own subdialect. Fersina Valley Southern Bavarian is probably a separate language and is probably not intelligible with other Bavarian lects (Maurer-Lausegger 2007).
In this area, everyone speaks Italian too. This variety of Bavarian has heavy Italian influence.
There is also a South Tyrol Standard Southern Bavarian (Südtirolerisch) that is beginning to emerge in this part of Italy so the three dialects can talk to each other (Maurer-Lausegger 2007). Although intelligibility data between this koine and the rest of Southern Bavarian is not known, it does appear to be a separate language, as most koines are.
One Tyrolean lect spoken in this area is called Eisacktalerisch. It is spoken in the Eisack Valley of South Tyrol and is about halfway between the Innsbruck dialect and the lect spoken in Bolzano. Intelligibility data is not known.
Since the three dialects of Southern Bavarian in Italy cannot understand each other, we may as well split them off.
Udine Southern Bavarian is spoken in the province of Udine in the Friuli-Venezia Giulia region. It is not intelligible with the varieties of Southern Bavarian spoken in Trentino or Belluno.
Belluno Southern Bavarian is a Bavarian language spoken in the province of Belluno in the Veneto region of Italy. It is not intelligible with either Trento Southern Bavarian or Udine Southern Bavarian. One dialect of Belluno is called Puschterisch and is spoken in the area of Brunico only 15 miles south of East Tirol. Intelligibility with the rest of Belluno is not known.
Trento Southern Bavarian is spoken in the province of Trento in the Trentino-Alto Adige/Südtirol region of Italy. It is not intelligible with Belluno Southern Bavarian or Udine Southern Bavarian.
Hianzen Southern Bavarian (Hianzisch) is spoken in southern Burgenland, Austria, along the Hungarian border, particularly around the town of Güssing. It seems to have poor intelligibility even with other nearby forms of Southern Bavarian.
Cimbrian is a Bavarian macrolanguage spoken in northeastern Italy. It is not intelligible with Standard German or with other Bavarian languages. It has 2,230 speakers. Cimbrian is actually three separate languages.
Lusernese (Lusern) Cimbrian is a separate Cimbrian language not intelligible with other types of Cimbrian. It is spoken in the province of Trento, Italy, where it has 500 speakers in Trentino Alto Adige 40 km southeast of the city of Trento.
Tredici Comuni (Dreizehn Gemeinden) Cimbrian (Tauch) is a separate Cimbrian language not intelligible with other types of Cimbrian. It has 230 speakers near Verona, Italy, where it is currently spoken only in the village of Giazza-Ljetzan.
Sette Comuni (Sieben Gemeinden) Cimbrian is a separate Cimbrian language not intelligible with other types of Cimbrian. It is spoken near Asiago, Italy, where it is currently spoken only in the village of Roana-Robaan. It has 1,500 speakers.
Mocheno is a Bavarian language spoken in Alto Adige-Südtirol, Italy. It is not intelligible with Standard German or with other Bavarian languages. It has 3,500 speakers.
Hutterite German is a Bavarian language spoken in Canada and the US. Intelligibility: 70% intelligible with Pennsylvania German, a Palatine language, but only 50% intelligible with the Low German Plautdietsch and Standard German. Hutterite is derived from a Carinthian Bavarian lect.
Yiddish is a language spoken by European Jews that has heavy Hebrew influence on a Germanic background. It branched off from Medieval Middle German (mostly Rhenish languages) and was influenced by modern German in the 1800’s and 1900’s. It is not a dialect of German as commonly thought, but is instead a full language. It contains two languages, Western Yiddish and Eastern Yiddish.
Eastern Yiddish is spoken in Israel by 215,000 speakers and by 3,142,560 Jewish speakers worldwide. It has poor intelligibility with Western Yiddish. Eastern Yiddish originated east of the Oder River through Poland, in an area moving into Belarus, Russia (to Smolensk), Lithuania, Latvia, Hungary, Romania, Ukraine, and Palestine before 1917 (in Jerusalem and Safed).
There are three dialects: Southeastern, Mideastern and Northeastern. Dialects are apparently intelligible. Southeastern is spoken in Ukraine and Romania, Mideastern is spoken in Poland and Hungary and Northeastern is spoken in Lithuania and Belarus. Eastern Yiddish is not intelligible with Standard German or any other form of German.
Linguist Paul Wexler argues that Eastern Yiddish is a version of West Yiddish creolized over a Kiev-Polessian Slavic lect. Hence, it is a Germano-Slavic creole.
Western Yiddish is a language spoken in Germany by 49,210 Jewish speakers. There are also speakers in Belgium, France, Hungary, Israel, the Netherlands and Switzerland. There are three dialects: Southwestern, Midwestern and Northwestern.
Southwestern is spoken in southern Germany, Switzerland, and Alsace (France). Midwestern is spoken in central Germany and parts of the Czech Republic and Slovakia. Northwestern is spoken in northern Germany and the Netherlands. West Yiddish has poor intelligibility with East Yiddish. Western Yiddish is not intelligible with Standard German or any other form of German.
Linguist Paul Wexler has argued that Western Yiddish is a Germano-Sorbian creole.
Crimean German is an extremely divergent lect of German that must be a separate language. There are probably few speakers of this language left. It is poorly known.
Baltic German (Baltendeutsch) is another extremely divergent lect of German that in all probability is a separate language. It was formerly spoken by German colonies in the Baltic states. Its speakers were ethnically cleansed by the Soviets in 1939, and most of them left for Germany after World War 2. About 10% of the words are unique to Baltic German. The last remaining speakers are mostly over age 45, and the language is not being taught to children. There are about 300-400 speakers left in Canada who grew up speaking the language, but the youngest of them are age 45.


Anonymous A and B. Starnberg Upper Bavarian speakers. Oakhurst, California, USA. Personal communication. July 2009.
Auer, Peter. The Construction Of Linguistic Borders And The Linguistic Construction Of Borders. 2005. In Filppula, Markku, Palander, Marjatta and Penttilä, Esa (eds.) Dialects Across Borders: Selected Papers From the 11th International Conference on Methods in Dialectology (Methods XI), Joensuu, August 2002. Current Issues in Linguistic Theory 273. Amsterdam: John Benjamins Publishing Company.
Bindorffer, Györgyi. 2004. Hungarian Germans. Identity Questions: Past and Present. Ethnologia Balkanica 8:115-127.
Chervet, Ben. Swiss German speaker, Bern, Switzerland. Personal communication. February 2016.
Costin, Paul. Karlsruhe South Franconian native speaker. Personal communication. May 2015.
Council of Europe (COE). May 26, 2006. Periodical Report Relating to the European Charter for Regional or Minority Languages Third Report – Switzerland. Strasbourg, France.
de Gyurky, Szabolcs Michael. 2006. The Cognitive Dynamics of Computer Science: Cost-Effective Large Scale Software Development, p. 86. Hoboken, NJ: Wiley-IEEE Computer Society, John Wiley & Sons, Inc.
Denison, Norman. 1971. Some Observations on Language Variety and Plurilingualism, chapter 7 in Ardener, Edwin. Social Anthropology and Language. London: Tavistock Publications.
Jeep, John M., editor. 2001. Medieval Germany: An Encyclopedia. New York and London: Garland.
Kirmaier, Andrea. Oberpfälzisch North Bavarian native speaker, Neumark, Germany. Personal communication. March 2009.
Maurer-Lausegger, H. May 21, 2007. The Diversity of Languages in the Alpine-Adriatic Region I. Linguistic Minorities and Enclaves in Northern Italy. Tidsskrift for Sprogforskning, North America.
Minahan, James. 2002. Encyclopedia of the Stateless Nations, Illustrated Edition, p. 42. Westport, CT: Greenwood Publishing Group.
Myhill, John. 2006. Language, Religion and National Identity in Europe and the Middle East: A Historical Study. Amsterdam: John Benjamins Publishing Company.
Osorio, Fransisco. 2001. Mass Media Anthropology. Unpublished PhD thesis: Santiago: University of Chile.
Public Foundation for European Comparative Minority Research (PFECMR). 2006. Walser German In Switzerland – Through the Lenses of the European Charter For Regional or Minority Languages. Council of Europe.
Ross, Charles. 1989. The Dialects of Modern German: A Linguistics Survey. London: Routledge.
Scheffknecht, Sibylle. Lustenauerisch native speaker. Lustenau, Austria. Personal communication, March-April 2015.


A Reworking of German Language Classification Part 2: Middle German

Updated June 27, 2016. This post will be regularly updated for some time. Warning! This essay is very long; it runs to 79 pages.
This Part 2 post deals with the huge language family known as Middle German. Part 1 deals with Low German and Part 3 deals with High German.
This classification splits Middle German from 15 languages into 43 languages using the criterion of >90% intelligibility = dialect and <90% intelligibility = language.
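The criterion above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not a tool used in this classification. The function name `classify` and the choice to treat exactly 90% as a dialect are assumptions, since the rule as stated does not cover that boundary case. The sample figures are the intelligibility estimates quoted in Part 1 for Hutterite German and Lower Austrian Central Bavarian.

```python
# Threshold from the text: >90% intelligibility = dialect, <90% = language.
THRESHOLD = 90.0


def classify(intelligibility_pct: float) -> str:
    """Label a pair of lects as 'dialect' or 'language' by percent intelligibility."""
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage between 0 and 100")
    # Assumption: the text does not say how exactly 90% is handled;
    # this sketch counts it as a dialect.
    return "dialect" if intelligibility_pct >= THRESHOLD else "language"


# Intelligibility estimates quoted in Part 1 of this classification.
pairs = {
    ("Hutterite German", "Pennsylvania German"): 70,
    ("Hutterite German", "Standard German"): 50,
    ("Lower Austrian Central Bavarian", "Viennese Central Bavarian"): 50,
}

for (lect_a, lect_b), pct in pairs.items():
    print(f"{lect_a} / {lect_b}: {pct}% -> {classify(pct)}")
```

All three sample pairs fall below the threshold, so each is labeled a separate language, which matches how those lects are treated in Part 1.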
Middle German is easily the most famous German division of them all, and it is doing better than Low German or High German. That is because Standard German is a Middle German language.
There is much confusion about this because Middle German and High German are often lumped in together as “High German” with Middle German being a branch of High German. This is reflected in the term for Standard German “Hochdeutsch”, which means High German.
Thuringian (Thüringisch) is a group of East Middle German lects related to Upper Saxon spoken to the west of Berlinisch and Upper Saxon. The status of Thuringian is very confused. It’s often said to be easy to understand, but some of the individual dialects are quite hard for Standard German speakers to understand.
However, at least people from Schleswig-Holstein cannot understand Thuringian, so it is not true that all Hochdeutsch speakers can understand this lect. Therefore, Thuringian is best seen as a separate language.
Thuringian has a sing-song quality and is one of the easier lects for Standard German speakers to understand. The southern linguistic boundary of Thuringian with East Franconian is formed by the ridge of the Thuringian forest. Thuringian has many dialects.
Northeast Thuringian (Nordostthüringisch) is a Thuringian language spoken in Halle, Merseburg and Bernburg (Saale) in Saxony-Anhalt and in Artern in Thuringia. At least the Halle form of this language is very difficult for outsiders to understand. It is spoken very near the Upper Saxon zone, so possibly it has been influenced by Upper Saxon.
Mansfeldisch is a dialect of Northeast Thuringian spoken in Hettstedt, Mansfeld, and Eisleben in Saxony-Anhalt.
Eichsfeldisch is a Thuringian language spoken in Heilbad Heiligenstadt, Leinefelde, Worbis, and Mühlhausen in Thuringia and in Eschwege, Bad Sooden-Allendorf, and Witzenhausen in Hessen. This dialect is quite divergent and has Eastphalian and North Hessian features.
This language appears to be quite diverse internally, and there may be more than one language in it. It is not intelligible with Eastphalian, North Hessian or other types of Thuringian.
Central Thuringian (Zentralthüringisch) is a dialect of Thuringian that is spoken in a triangle in central Germany formed by Arnstadt, Erfurt, and Gotha.
Ilm Thuringian (Ilmthüringisch) is a dialect of Thuringian spoken in Königsee, Bad Blankenburg, Rudolstadt, Weimar, Jena, and Apolda in Thuringia and in Bad Bibra in Saxony-Anhalt. It has two subdialects, North Ilm Thuringian (Nord Ilmthüringisch) and South Ilm Thuringian (Süd Ilmthüringisch). The lect spoken in Weimar at least is hard for speakers of Standard German to understand.
North Thuringian (Nordthüringisch) is a dialect of Thuringian spoken in Nordhausen, Bad Frankenhausen, and Sondershausen in Thuringia, in Sangerhausen, Harzgerode, and Stolberg (Harz) in Saxony-Anhalt, and in Bad Lauterberg and Bad Sachsa in Lower Saxony.
Stiege North Thuringian is a North Thuringian dialect spoken in the town of Stiege in the Lower Harz Mountains in Saxony-Anhalt just north of the Thuringian border. This area is transitional between Low German and Middle German. The town of Hasselfelde four miles to the north was Eastphalian speaking, but Stiege is Thuringian. So Stieger is a Thuringian language with strong Eastphalian influence. A century ago, Stieger was not intelligible with the rest of North Thuringian (Liesenberg 2008).
West Thuringian (Westthüringisch) is a dialect of Thuringian spoken in Eisenach, Bad Liebenstein, Bad Salzungen, and Ruhla in Thuringia.
Southeast Thuringian (Südostthüringisch) is a dialect of Thuringian spoken in Saalfeld/Saale, Gera, Greiz, Neustadt, and Bad Lobenstein in Thuringia, in Mühltroff and Elsterberg in Saxony and in Ludwigsstadt and Teuschnitz in Bavaria.
Upper Saxon is an East Middle German language that is not mutually intelligible with Standard German. What’s odd is that Standard German was based on a specific Upper Saxon dialect as spoken in about 1700. It has since drifted into a language of its own. Intelligibility between Upper Saxon and Standard German is very poor, worse than intelligibility with Bavarian, and is probably less than 40%.
It is spoken in southeastern Germany, southwest of Berlin near Saxony, in Dresden, Leipzig and Chemnitz in Saxony and around Halle in Saxony-Anhalt. Some other Germans, especially from southern Germany, find Upper Saxon almost impossible to understand (Kirmaier 2009). Upper Saxon is considered by many Germans to be among the hardest dialects of all to understand, if speaking of dialects spoken in Germany proper.
It has extensive Slavic borrowings. Since German reunification in 1990, Upper Saxon has been giving way to Standard German. It has 2-4 million speakers. Upper Saxon has nine different dialects within it.
Standard German (Hochdeutsch) is an East Middle German language based on Upper Saxon, the pluricentric language of German, and the official and uniting language of all German speakers. Genetically, it is closest to Thuringian, Upper Saxon and Lower Silesian, but it has diverged dramatically. It was originally based on a certain Upper Saxon dialect, and there is a dialect of Upper Saxon today that still bears remarkable similarity to Standard German.
The best version of Standard German spoken today (the one that “lacks an accent”) is said to be the speech of Hanover in central Lower Saxony. This is in the Eastphalian Low Saxon area, but Standard German has pretty much cleaned out the Low Saxon in the area and has almost completely replaced it.
It is also known as Hochdeutsch. Most German dialect speakers also speak Standard German, but in a few places there are speakers of German type languages in and around Germany that cannot speak Hochdeutsch, notably in far western Austria, to some extent in Switzerland, and a few older people in Hessen.
Further, the Dutch Low Saxon speakers in the Netherlands, treated as Macro-German speakers in this analysis, may not speak Standard German, though many Dutch have at least some understanding of German. It is possible that some of the South Meuse-Rhenish transitional lects may not speak German either.
Standard German has been seriously impacting Low German since the 1700s, but it has only affected other German languages recently. Like other pluricentric languages, Standard German serves the function of being a common language for many Macro-German speakers who would not ordinarily have one.
Unserdeutsch is a German-based creole spoken in New Guinea by only about 100 remaining speakers, some of whom are middle aged. It originated based on the Standard German spoken in German colonial times.
It was formed, oddly enough, by New Guinean children who were raised in an orphanage run by German speakers. It then came to be spoken by the White-New Guinean Catholic Vunapope community in the Gazelle Peninsula of New Britain. It is one of only two German-based creoles.
Belgranodeutsch is a German-Spanish creole spoken in Buenos Aires, Argentina. It is a mixture between Spanish and Standard German and is no doubt not intelligible with Standard German. The Belgrano district is a part of Buenos Aires that has many German speakers.
Namibian Black German (Küchendeutsch) is a German pidgin spoken in Namibia based on Standard German. It is presently nearly extinct. It used to be spoken by Namibian Blacks who were servants for their German colonial masters in the German colony of South West Africa (Südwestafrika). It is probably not intelligible with Standard German.
Berlinerisch is an East Middle German lect and is one of the easiest German dialects to understand. However, some speakers of Standard German say it takes them several months to learn to understand it completely, so in that sense it may be a separate language. But those speakers were generally Americans who spoke German as a second language. It is said by Berliners that any German can understand Berlinerisch, but there are reports that Upper Saxon speakers have a hard time with Berlinerisch, so it appears to be a separate language.
Ruhrdeutsch is an East Middle German language spoken in the Ruhr far away from the other East Middle German languages. It is a strange language which is spoken around Essen in North Rhine-Westphalia. It has elements of Low Franconian Bergisch lects and Westphalian Low Saxon. It is quite distinct and is probably a separate language (sample).
North Upper Saxon (Nordobersächsisch). This Upper Saxon dialect is spoken in the Elbe-Elster Region. Intelligibility data is needed between this and other forms of Upper Saxon. It is not well-known.
Anhaltisch is an Upper Saxon dialect spoken in Dessau, Köthen, Bernburg, Staßfurt and Aschersleben in central Saxony-Anhalt south of Magdeburg. It is also spoken down around Zeitz and Hohenmölsen in far southern Saxony-Anhalt where it meets Thuringia and Saxony. It is very divergent. Anhaltisch is transitional between Upper Saxon and Thuringian. Anhaltisch speakers say other Germans find Anhaltisch almost impossible to understand (Wahl July 2014).
Dialects include Gladitz and Trebnitz. The two are very different. Trebnitz is closer to Upper Saxon. Gladitz at least has poor intelligibility even with Brandenburgish.
Osterlandic (Osterländisch) is an Upper Saxon dialect that is spoken in Delitzsch and Torgau in far northwest Saxony, across the border into far southern Saxony-Anhalt in Wittenberg and Bitterfeld, and into far southeastern Brandenburg in Liebenwerda and Elsterwerda. Osterlandic is not intelligible with any Upper Saxon lects spoken in Saxony, nor with Erzgebirgish. This language is still doing very well.
Dialects include Northeast Osterlandic (Nordost Osterländisch), Southwest Osterlandic (Südwest Osterländisch), Southeast Osterlandic (Südost Osterländisch) and Schraden Osterlandic (Schraden Osterländisch).
Meissenish is a group of Upper Saxon lects spoken in Saxony.
North Meissenish (Nordmeißenisch) is an Upper Saxon dialect spoken around the cities of Grimma, Döbeln and Riesa in northern Saxony east of Leipzig. It is little known, but there are still many speakers. This language is incredibly hard for Standard German speakers to understand.
Northeast Meissenish (Nordostmeißenisch) is an Upper Saxon dialect spoken in a small area around Lommatzsch and Großenhain in Saxony northwest of Dresden. It is little known, but must still have many speakers. Intelligibility data is needed between this and other forms of Upper Saxon.
West Meissenish (Westmeißenisch) is an Upper Saxon dialect spoken in Saxony on both sides of the lower Zwickauer Mulde River around Rochlitz, Mittweida, and Borna north and northwest of Chemnitz. It occupies an intermediate position between North Meissenish and South Meissenish on one side and a Thuringian dialect called Altenburg (Altenburgish) on the other side.
It has Thuringian and Hessian characteristics. It is little known, but still has many speakers. Intelligibility data is needed between this and other forms of Upper Saxon.
South Meissenish (Südmeißenisch) is an Upper Saxon dialect spoken in an area of Saxony around the cities of Öderan, Frankenberg, Hainichen, and Freiberg northeast of Chemnitz. It is poorly known, but still has quite a few speakers. It has poor intelligibility with Southeast Meissenish.
Southeast Meissenish (Südostmeißnisch) is an Upper Saxon language spoken in Saxony in a circle around Dresden around the cities of Dippoldiswalde, Meißen, Radeburg, Pirna, and Bad Schandau. It was heavily influenced by the old language spoken in Dresden. Many speakers remain, but it is poorly known.
Southeast Meissenish is utterly unintelligible even with other East German lects such as the Havelländisch Markish spoken in Brandenburg west of Berlin. Southeast Meissenish speakers have a hard time understanding South Meissenish.
Northern Bohemian is an Upper Saxon language formerly spoken in the part of the Sudetenland region of Czechoslovakia near Saxony. It was spoken in the towns of Děčín, Ústí nad Labem, and Teplice south and southeast of Dresden. It is apparently now extinct. The percentage of German speakers in the Czech Republic is down from 30% before WW2 to 1% today, or 18,000 speakers. No one speaks Northern Bohemian anymore; all German speakers speak Standard German instead.
East Thuringian (Ostthüringisch) is an Upper Saxon language spoken in Eisenberg and Altenburg in Thuringia and in Zeitz, Naumburg (Saale), and Hohenmölsen in Saxony-Anhalt. It is almost completely unintelligible with Standard German. Intelligibility with other Upper Saxon lects is unknown.
Fore Vogtländisch (Vorvogtländisch) is an Upper Saxon lect that is transitional to East Franconian. It is little known.
Lusatian (Lausitzisch) is an East Middle German language spoken in Eastern Germany. There is difficult intelligibility between this language and Standard German. It has some traces that go back to Dutch for some reason. There are various dialects of Lusatian. Dialects include West Lusatian, East Lusatian, New Lusatian, Upper Lusatian and Lower Lusatian. This language has very marginal intelligibility with Standard German, and intelligibility testing is indicated.
West Lusatian (Westlausitzisch) is a lect spoken in eastern Saxony, east of the upper Pulsnitz River, west of the Lausitzian speakers in the Sorbian area and northwest of Dresden in an isolated region around Pulsnitz, Bischofswerda and Kamenz. This dialect is transitional between Upper Saxon and Upper Lusatian. This lect is little known, but it still has 50,000 speakers.
New Lusatian (Neulausitzer) is spoken in Saxony in the Sorb-speaking area around Bautzen and Hoyerswerda.
Upper Lusatian (Oberlausitzer) is a lect spoken in southeastern Saxony near Zittau by the border of Germany, Czechoslovakia and Poland. It is spoken in the village of Schönbach and other areas. This lect is still the primary means of communication in the region. Difficult intelligibility with Standard German (90% comprehension with slow speech). Upper Lusatian lects only 5 miles away can differ markedly. Intelligibility within these lects is not known. This dialect has a lot of Slavic in it and also some French.
Lower Lusatian (Niederlausitzisch) is spoken around Cottbus, Finsterwalde, Senftenberg, and Spreewald in far southern Brandenburg and south across the border in Hoyerswerda and Weißwasser in far northern Saxony.
Lower Lusatian and Lower Silesian overlap geographically with High Prussian in central East Prussia and neighboring West Prussia. This dialect is transitional between Low German and Middle German. This dialect appears to be in good shape, has many speakers at least in Cottbus and has poor intelligibility with Standard German.
Silesian* (Schlesisch) is a group of two East Middle German languages. Lower Silesian is recognized as a separate language by Ethnologue. Silesian is spoken north of the Riesengebirge (Giants Mountains) around Glatz in eastern Bohemia, Czechoslovakia and in Kuhländchen in the upper Oder area. It was formerly spoken in Western Moravia. By the 1100s, this region was covered with German settlements and was completely Germanophone. As a high-level German dialect grouping, it must be a separate language.
Silesian dialects include Neiderländisch, Kräuter Silesian (Kräuterschlesisch), Mountain Silesian (Gebirgsschlesisch), Glätzisch, Brieg-Grottkau Silesian (Brieg-Grottkauer Schlesisch), Reichenberg, and Upper Silesian (Oberschlesisch). This language is often described as a sort of German Creole with heavy Polish elements in it.
Lower Silesian (Niederschlesisch) is an East Middle German language spoken southeast of Berlin close to the Polish border near Bautzen and in Eastern Bohemia in Czechoslovakia. It was formerly spoken extensively in Poland. It is not mutually intelligible with Standard German. In some places, it is still spoken by young people. It still has quite a few speakers, possibly as many as 500,000. It overlaps with High Prussian in central East Prussia and West Prussia. Oppelner is a dialect of this language.
Hultschiner Laendle Bohemian German is a Silesian dialect spoken in a pocket of the Sudetenland where Bohemia borders Silesia. This divergent lect is considered to be a separate lect from the rest of Bohemian German and is probably close to and may be a dialect of Lower Silesian. Poorly known, but there are probably still some speakers left.
High Prussian (Hochpreußisch) is an East Middle German Silesian language that was formerly spoken extensively in East Prussia, now part of Poland. The language is moribund with the expulsion of Germans from Poland after WW2, and there are only a few elderly speakers left. It must surely be a separate language and must be unintelligible with other German languages or with Standard German.
The language originated from Silesian speakers who moved to the area in the 1200’s-1400’s. It was then influenced by the extinct (since 1700’s) Baltic Low Prussian language, a West Baltic language related to Lithuanian and Latvian. All West Baltic tongues have gone extinct. Baltic Low Prussian went extinct around 1710 when a series of famines and bubonic plague epidemics swept through the population, decimating the speakers.
Dialects of High Prussian include Oberländisch and Breslau (Breslausch).
Barossa German is a moribund language spoken in Australia by a few remaining elderly speakers. Speakers came from the High Prussian and Silesian regions, so the language is an East Middle German tongue.
It is very strange, barely intelligible at all to Standard German speakers and probably not intelligible with any other German lects either. German settlers arriving around 1840 settled in the Barossa Valley in South Australia. It declined with the suppression of Germans and the German language in Australia during WW1.
Vilamovian (Wilamowicean or Wymysorys) is a West Germanic language spoken in Wilamowice (Wymysoj), Poland near Bielsko-Biała, on the border between the regions of Silesia and Lesser Poland. This is in the far southwestern part of Poland near Germany and Czechoslovakia. It is derived from the Middle German circa the 1200’s spoken by settlers who came to the area from Germany, Scotland and the Netherlands.
Why they all decided to speak a German language is not known. Further, despite their disparate origins, they all decided on a Dutch identity, while speaking German nevertheless. Very confusing. Low German, Dutch, Frisian, Old English and Polish went into the mix.
The Polish Communists banned the language after WW2, but the ban was lifted in 1956. Nevertheless, the language has been replaced by Polish, and the only speakers now are 70 elderly speakers, so it is moribund.

The Middle German languages, with West Middle German on the left and East Middle German on the right and Thuringian in light blue in the middle.

Ripuarian Franconian is a West Central German Central Franconian language spoken in northwest Germany on the borders of the Netherlands and Belgium in North Rhine-Westphalia around the city of Cologne. Ripuarian Franconian consists of 150 different, often quite divergent, lects for which dictionaries have been published. The Ripuarian lects are not intelligible at all with Standard German.
They form a dialect chain whereby one city can understand itself and the cities next to it, but once you get a couple of cities over, they can’t understand each other anymore. At the extremes of the Ripuarian dialect chain, intelligibility is as low as 20%. Therefore, Ripuarian is clearly more than one language. There are 1 million speakers of Ripuarian.
Some of the varieties include Bonn German or Bönnsch, Homburgisch, Lammersdorf, Neusser, Bad Neuenahr/Ahrweiler, and Bocholtz German, which may well be separate languages, but we will need more evidence before splitting them.
Kölsch is the specific variety of Ripuarian Franconian spoken in Cologne, North Rhine-Westphalia. Kölsch is not intelligible with the rest of Ripuarian Franconian. It has about 250,000 speakers. Here is a sample of Kölsch. You can see it doesn’t look much like any of the surrounding languages. Kölsch has 30% intelligibility with Aachen Platt and 75% with Eschweiler German (Köhler 2015).
Hommersch is a Ripuarian Franconian language at the other end of the huge Ripuarian dialect chain from Eschweiler German, and therefore it makes sense to split it into a separate language. Intelligibility with Kölsch is not known but is probably difficult.
Eschweiler German, spoken in Eschweiler, Germany, is often considered to be a Southeast Limburgish lect, but actually it is a Ripuarian lect, as it has difficult intelligibility with Southeast Limburgs. It is intelligible with Stolberg German, but it has more Ripuarian influences (Tulipan 2013). It is said to be halfway between Aachen Platt and Kölsch. Intelligibility is 55% with Aachen German and 75% with Kölsch (Köhler 2015). As it is not fully intelligible with Kölsch, it must be a separate language.
The various Low Franconian and Middle Franconian languages. D is Ripuarian, E is Moselle Franconian, F is Luxembourgish, G is Rhenish Franconian, H is South Franconian and I is East Franconian.
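To show how the 90% criterion plays out on the three figures quoted above (Köhler 2015), here is a rough sketch that groups lects into languages. It is an illustration only, not the procedure actually used for this classification. The function name `group_lects` is hypothetical, linking any pair that scores above the cutoff is an assumption, and the sketch assumes that "Aachen Platt" and "Aachen German" refer to the same lect. Pairs with no quoted figure, such as Eschweiler and Stolberg, are left out rather than guessed.

```python
from itertools import combinations

THRESHOLD = 90.0  # percent; the cutoff used in this classification

# Pairwise intelligibility figures quoted in the text (Köhler 2015).
scores = {
    frozenset({"Kölsch", "Aachen Platt"}): 30.0,
    frozenset({"Kölsch", "Eschweiler German"}): 75.0,
    frozenset({"Eschweiler German", "Aachen Platt"}): 55.0,
}


def group_lects(lects, scores, threshold=THRESHOLD):
    """Group lects so that any pair scoring above the threshold is merged."""
    parent = {lect: lect for lect in lects}

    def find(x):
        # Follow parent links to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(lects, 2):
        pct = scores.get(frozenset({a, b}))
        if pct is not None and pct > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for lect in lects:
        groups.setdefault(find(lect), []).append(lect)
    return sorted(sorted(group) for group in groups.values())


lects = ["Kölsch", "Aachen Platt", "Eschweiler German"]
languages = group_lects(lects, scores)
print(languages)
```

Since none of the three pairs exceeds 90%, each lect ends up in its own group. Note that this kind of pairwise linking would merge an entire dialect chain if every neighboring pair scored above the cutoff, even when the ends of the chain cannot understand each other, so it should be treated as a simplification.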

Moselle Franconian (Moselfränkisch) is a West Central German Central Franconian language spoken south of Ripuarian Franconian in Germany on the borders of Belgium and France and which also shades into Belgium and France. It is not intelligible with Luxembourgish other than that the westernmost dialects of Moselle are intelligible with the easternmost dialects of Luxembourgish near the Luxembourg border. Trier is a major city in this speaking area.
There are many Moselle Franconian lects. All are spoken in Germany unless otherwise noted.
Lorraine Franconian is simply Moselle Franconian as spoken in France. It is apparently intelligible with the Moselle Franconian spoken across the border in Germany. It is not intelligible with Standard German, Luxembourgish, or the Alemannic High German language Alsatian with which it is often paired, or with the Rhenish Franconian spoken in the Lorraine. It has 78,000 speakers. Use is decreasing, and only 20% of children under age 15 are able to speak it.
It is mostly spoken in the Moselle Department of the state of Lorraine around Thionville. Hettangeois, Bitscherland, and Rodener are some of the dialects of Lorraine Franconian.
Trierisch is spoken in Trier, Germany. The Trierisch dialect differs even within the city of Trier. Outside the city of Trier, the dialect is clearly different from that spoken in the city, and village residents do not refer to their lects as Trierisch. Eifler speakers cannot understand Trierisch and vice versa. However, Trierisch and Konz are intelligible with the East Luxembourgeois spoken in the far east of Luxembourg on the German border in the towns of Grevenmacher and Echternach.
The following Moselle Franconian dialects are spoken in Germany.
Reiler is spoken in and around Reil. There are various Moselle Franconian dialects spoken around the Schneifel Mountains and the Venn Region over near the Belgian border. Wäller is spoken in eastern Westerwald, on the border between Moselfränkisch and Hessisch. Westerwälder or West Westerwäldisch is spoken in the Westerwald. Wittlicher is spoken in Wittlich. Andernacher/Annenach/Annenache Platt is spoken in Andernach. It has more Ripuarian features than other Moselle lects due to its connection to Cologne. Unter Moselfränkisch is another Moselle dialect, but I am not sure where it is spoken.
Kröver is a Moselle Franconian language spoken in the town of Kröv on the Mosel River. Germans who lived in the area refer to it as a separate language; however, precise intelligibility data is unknown.
Eifler or Eifelplatt is a Moselle Franconian dialect spoken in the Eifel Mountains in Germany. Eifler is so strange that intelligibility with Standard German is close to zero. It is said to be not intelligible to outsiders, but it is intelligible with the Bad Hönningen dialect just to the west of Eifler. Eifler has dialects of its own, including Demerather, Maifeld, Southeifel, and Uebereltz. South Eifel is similar to Luxembourgish. In fact, the South Eifel spoken in Bitburg is often referred to simply as Luxembourgeois. Eifler is also spoken in Belgium around St. Vith. This was where the Battle of the Bulge was fought.
Siegerländer or Siegerländish is a Moselle Franconian dialect spoken in the Siegerland region in Nordrhein-Westfalen.
Luxembourgish (Lëtzebuergesch or Letzeburgisch) is a Moselle Franconian language spoken in Luxembourg. It is not intelligible with Moselle Franconian other than those Moselle dialects right next to the German border with Luxembourg, where the easternmost dialects of Luxembourgish are intelligible with Moselle Franconian. Luxembourgish is close to the South Eifel dialect.
There are several distinct varieties of Luxembourgish, but there is a Standard Luxembourgish emerging now in place of them. Luxembourgish is not used in the classroom, and there is a tendency of the state to use German and French in public announcements. Both languages are heavily promoted, such that Luxembourgers are typically trilingual.
Although almost everyone speaks Luxembourgish, there is frustration on the part of speakers that the language cannot accommodate many modern and technical terms, for which German and French are often used instead. There is a heavy French influence. Luxembourgish has 40% intelligibility with Standard German.
It is also spoken in Belgium and France. In France, it is spoken in the Moselle Department of Lorraine around the Thionville area where Lorraine Franconian is also spoken (Hughes 2005). In Belgium, it is spoken in Arlerland in Eastern Belgium.
East Luxembourgish is a language spoken in the far east of Luxembourg around the cities of Grevenmacher and Echternach on the border with Germany. It is not intelligible with the rest of Luxembourgish, so it appears to be a separate language. However, it is intelligible with the West Moselle Franconian spoken across the border in Trier and Konz. In Germany, it is probably also spoken in Nittel, Welschbillig, Irrel, and Wasserbillig, and elsewhere in Luxembourg in Mertert, Mombach, and Rosport.
Since this is considered to be part of Luxembourgish and not Moselle Franconian, it is best to split it off as a separate Luxembourgish language. The exact borders of this language are not known – we do not know where the border between this language and Luxembourgeois is, we do not know where Trierisch ends and Eifler begins in the Eifel Mountains, we do not know where Trierisch lects end and other Moselle Franconian lects begin heading up the Moselle River Valley, and we do not know the status of the Moselle Franconian lects to the southeast heading into northern Saarland.
Transylvanian Saxon (Siebenbürger Sächsisch) itself is a macrolanguage of Germans in Romania. It is derived from a movement of German settlers to Transylvania from 1150-1230, so it has been split off from other German lects for a very long time. The first phase were settlers from the Luxembourg and Moselle region. The second phase from 1200-1230 consisted mostly of settlers from the Rhineland, southern Netherlands and Belgium and the Moselle region once again. A few others came from Thuringia, Bavaria and even France. The main area where they settled in the center of Romania is called the Transylvanian Saxon Triangle.
Originally, it was basically an ancient form of the Moselle Franconian language because this is where most of the original settlers were from. It has incredible dialectal diversity, with over 250 documented dialects. Obviously, it is not intelligible with Standard German, but intelligibility data is lacking with the rest of Moselle Franconian, though considering the diversity of Moselle itself, this is probably for all intents and purposes a separate language from Moselle Franconian.
Transylvanian Saxon has 80% lexical similarity with Luxembourgeois, but that often boils down to less than 50% intelligibility in the real world. Transylvanian Saxon is largely unintelligible with Danube Swabian, with only 30-40% intelligibility (Costin 2014), and is also not intelligible with Standard German or even Franconian.
Every village has its own dialect, and dialects can be quite different. Intelligibility data for the various Transylvanian Saxon dialects is lacking and urgently needed. In 1855 at least, there were 7 distinct dialects of Transylvanian Saxon, and at that time some dialects of Transylvanian Saxon were mutually unintelligible (separate languages).
In fact, prior to WW2, there were unintelligible dialects of Transylvanian Saxon between various villages, yet the Saxons had developed a Standard Transylvanian Saxon koine in order to communicate. Two separate languages may have been Hianzisch and Hittisch. They may well be extinct.
Transylvanian Saxon originally was a dialect chain, but villages that were too far apart could not understand each other. At some point in the last 150 years, Transylvanian Saxon was made up of separate languages. Apparently since at least World War 2, however, heavy dialect leveling and dialect merger have occurred along with koine adoption, such that present-day Saxons can understand each other well. No one really knows why it is called Saxon, except that before the immigrants moved into the area, they moved through the state of Saxony.
With the return to capitalism in 1990, 90% of the 500,000 Saxon population left for Germany, leaving only 50,000 behind.
People in the area around the cities of Mediaș and Sibiu speak this language, and they call it Siebenbürger Sachsen. It is still spoken today even by people in their 20s.
Rhine Franconian (Rheinfränkisch) is a family of lects that are spoken in the western German regions of Saarland, Rhineland-Palatinate and Hessen and in northern Bas-Rhin in the state of Alsace.
This Franconian language group, which is related to Hessian, is spoken in the Rhine River Valley in Germany over to the French border. It is not intelligible with Standard German. Rhine Franconian is also not intelligible with Moselle Franconian or with Luxembourgish. It is spoken to the southwest of the Hessian zone.
In the western part of this area, Pfälzisch is spoken in restaurants, stores, offices, schools, theaters. You would almost think that Standard German is the foreign language. Pfälzisch is still very popular and most kids still grow up speaking it. Every village has a different dialect.
Some of the dialects are Großrosseln, Saarbrücken, Zaisenhausen, Altrip, Bann, Gabsheim, Odenwälderisch, Rülzheim, Nordpfälzisch, Südpfälzisch, Wissembourg, Thaleischweiler-Fröschen and Pirmasens. Dialectal intelligibility is not known.
Odenwälderisch, spoken in the Odenwald, a mountain chain in southern Hesse, northern Bavaria and northern Baden-Württemberg, has been influenced by South Hessian. Rülzheim is spoken in the Germersheim area along the Rhine.
Pirmasens is spoken in the town of the same name in far southwest Palatine near the French border. Gabsheim is spoken in northern Palatine between Mainz and Alzey near the town of Wörrstadt.
Bann is spoken in the town of Bann, near Kaiserslautern. Thaleischweiler-Fröschen is spoken in the Palatine Forest 4 miles north of Pirmasens. Altrip is spoken in the city of Altrip 4 miles south of Ludwigshafen. Großrosseln and Saarbrücken, spoken in the Saarland, are partway between Rhine Franconian and Moselle Franconian. Großrosseln is almost a suburb of Saarbrücken. Wissembourg is spoken in northern Alsace, France.
Mainzerisch is spoken around the city of Mainz.
Westpalatine German or West Pfälzisch is a major level split in the Rhine Franconian lects. This is a separate language from Rhine Franconian proper because speakers refer to Rhine Franconian as one language and this as another language. For instance, Saarländisch speakers say that Frankish (Rhine Franconian) is a different language from what they speak (Anonymous 2014).
Lects included in this grouping include Saarländisch, Saarpfälzisch, Westrichisch, Pfälzer-Bergländisch, Pfälzer-Wäldisch, Schwarzwälder-Hochwäldisch, Idarwäldisch, Hunsrückisch, Naheländisch, Rheinhessisch, Kaulbach, and Waldpfälzisch.
Saarländisch, Saarpfälzisch and Westrichisch are spoken in the Saarland. Most of the rest are spoken in the Rheinland Palatine.
Saarlännisch is a form of Westpfälzisch spoken in the Saarland. It is intelligible with the Lorraine Pfalzisch spoken right across the border in the French state of Lorraine (Hughes 2005). Saarlandisch is not intelligible with the rest of Westpfalzisch spoken in the Rhineland Palatinate (Anonymous 2014).
Here is a sample of the Saarland dialect. There are various subdialects within Saarlännisch, including Eschringen, Ensheim, Saarlouis, and Irsch.
Saarlouis is still spoken by almost everyone in the town, including teachers.
Lorraine Pfalzisch is a Rhenish Franconian Westpfälzisch tongue spoken in northeast France in the eastern Moselle Department of the Lorraine region, between Forbach and Bitche.
The Rhenish Franconian spoken in Lorraine is not intelligible with the Luxembourgeois or with Lorraine Franconian spoken there. It is however intelligible with the Saarlännisch spoken just across the border in Germany in the state of Saarland (Hughes 2005).
Lothringian Pfalzisch is a separate language spoken in the transitional area in Lorraine between the Lorraine Franconian (Moselle Franconian) speakers and the Lorraine Pfalzisch (Rhenish Franconian) speakers. This language is transitional between these two lects. However, the hard Lothringian Pfalzisch is not intelligible with the Saarländisch spoken over the border in Germany (Anonymous 2014). Since it is not intelligible with Saarländisch, it is no doubt also not intelligible with Lorraine Pfalzisch, which is a part of Saarländisch. Intelligibility with Moselle Franconian is not known but is probably not full. St. Arnold is a dialect. Intelligibility with Hunsrücker, a similar lect, is not known.
Hunsrückisch, or Hunsrücker, is a Westpfälzisch dialect that is partway between the Rheinfränkisch and Moselfränkisch languages. Intelligibility with Lothringian Pfalzisch, a similar lect, is not known.
Riograndenser Hunsrückisch is a variety of Hunsrückisch that is widely spoken in southern Brazil. Although it resembles Hunsrückisch as of 100 years ago, it has also received many inputs from other German languages, including Low German languages like Pomeranian and Plautdietsch, other European languages such as Italian and Venetian and of course lots of Portuguese. Intelligibility between this and Hunsrückisch in Germany is not known.
The German grammar has been largely replaced by Brazilian Portuguese grammar. It is apparently not at all intelligible with Standard German.
Danube Swabian is spoken by former residents of the Danube region of Europe, especially Hungary, but also in Slovenia, Bosnia, Croatia, Romania and Bulgaria. Most were expelled from the area after WW2. They now live in Brazil and Germany. This language is not fully intelligible with Standard German or with the Transylvanian Saxon that is spoken by other Hungarian Germans. Danube Swabian is ~30-40% intelligible with Transylvanian Saxon (Costin 2014) and is 65% intelligible with Standard German (Giesser 2014).
This language is only nominally Swabian, but it does sound like a combination of the earlier forms of many German languages such as Swabian, Pfälzisch, Alemannic and Alsatian circa the 1700’s, because that is apparently what it is. But actually the High German aspect is a secondary layer to the essential core of the language, which is, to pin it down, best thought of as a Rhine Franconian lect similar to Saarlännisch.
It had many dialects, and you could often tell the particular village a person came from by his speech. It has many Hungarian borrowings. It is still spoken, but less and less. There are still many middle-aged speakers aged 45 and over.
It was still widely spoken in 1989, and many people could not even speak Hungarian but only spoke Swabian. Mandatory classes in Standard German have been introduced, and Danube Swabian is spoken less often.
The dialects are, incredibly enough, often regarded as largely mutually intelligible. However, other reports say that in Hungary, each village had its own dialect and adjacent villages sometimes could not understand each other (Bindorffer 2004). The latter report casts doubt on the mutual intelligibility of the Danube Swabian lects.
In the Banat (a region encompassing parts of Romania, Serbia and Hungary) alone, at one time there may have been as many as 24 different dialects. Danube Swabian has high intelligibility with Black Sea German, a form of German spoken in the southern Ukraine.
Kurpfälzisch is spoken in the northern part of Baden-Württemberg from Karlsruhe north to around Heidelberg and Mannheim. This is a Palatine lect. It is unintelligible with Standard German, but it is intelligible with Rhine Franconian in general. Even the young people of Heidelberg today no longer understand the dialect of their own city.
Other Germans find the language spoken in Mannheim to be nearly incomprehensible, on the order of Swabian and German Bavarian. It is probably about 40% intelligible with Standard German. This is a Rhenish language related to Pfälzisch and Hessian.
There are different dialects of this language, including Heidelberg, Viernheim, Sandheim, Seckenheim, and Mannheim.
The Mannheim dialect is described as “completely different” in the northern and southern parts of Mannheim, so there are two dialects, North Mannheim and South Mannheim. The Mannheim dialect is in excellent shape, and most of the town speaks it habitually. Sandhofen is one of the dialects spoken in the north of Mannheim. There seems to be a broad Mannheim dialect that is understood all across the general Mannheim region.
Pennsylvania German is a West Middle German Rhine Franconian (Rhenish Palatinate) macrolanguage that is descended from Pfälzisch. It is spoken in the USA. It is 70% intelligible with the Bavarian language Hutterite German. Although there are reports that it is mostly not intelligible with Pfälzisch nowadays, other reports say it is intelligible with the Mainzerisch spoken in Mainz. It seems to bear specific resemblance to the Kurpfälzisch spoken in Mannheim. There are also reports that it is intelligible with Riograndenser Hunsrückisch.
There are 2-3 million speakers of Pennsylvania German in the US, Canada and Central and South America. It has high intelligibility with Danube Swabian. Speakers are generally members of the Amish religious sect, though not all. Most speakers live in Pennsylvania, Ohio and Indiana in the US.
It is often said that Pennsylvania German is one language with many dialects that are all mutually intelligible. However, recent data shows that this is not the case. There are differences in this language even within the same branch of Anabaptism. The differences are sometimes serious enough to cause major disruptions in communication (Bowie 1997).
In Ohio at least, there are two groups of Amish Pennsylvania German speakers. The first group is descended from Amish who moved to Indiana from Pennsylvania and Ohio starting in the 1840’s.
Swiss Pennsylvania German is the language spoken by the second group, consisting of the second wave of Amish who came to the US from 1815-1860. They came mostly from Switzerland and went straight to Ohio. Swiss Pennsylvania German speakers and regular Pennsylvania German speakers in Ohio have poor mutual intelligibility, hence they tend to communicate in English.
Forepalatine German or Vorderpfälzisch is a high level split in Rhenish Franconian that is probably a separate language. It is spoken in the Vorderpfalz Middle Rhine region near Mannheim in the southeast of Rhineland-Palatinate.
Dialects include Alsatian Pfälzisch (Elsässisch-Pfälzisch) spoken near Weißenburg (Wissembourg) and Nordelsass in France, Haardtgebirgisch, spoken near Haardtgebirge, Germany, Speyerisch-Landauisch, spoken near Speyer and Landau, Germany, Ludwigshafenerisch, spoken near Ludwigshafen, Germany, and Wormserisch, spoken around Worms, Germany. Speyerisch-Landauisch is still very commonly used in everyday life. Dialect intelligibility is lacking.
Alsatian Pfalzisch is a dialect of Vorderpfälzisch that was spoken by many German colonists in the Black Sea area. This variety of Black Sea German derived from many immigrants that came from the Pfalzisch-speaking part of Alsace in France and settled in Russia during the 1800’s. Many then moved from the Black Sea to North Dakota in the US.
All forms – that spoken in the Black Sea, the form spoken in Alsace, and the form spoken in North Dakota – appear to be intelligible. It is not intelligible with the Low Alemannic Alsatian language widely spoken in Alsace. Intelligibility between this and the Lorraine Pfalzisch in Lorraine and Saarlännisch is not known.
Texas German is apparently a Forepalatine dialect of German spoken by German settlers who came to central Texas in the 1840’s in an attempt to establish a New Germany in the US. It is an endangered language, and there are now projects to try to save it. The youngest speaker is 47 years old. Although it is a unique dialect, mutual intelligibility with Standard German is 95%, so it is not a separate language. It is most closely related to Forepalatine dialects west of Mannheim in the Ludwigshafen area (Guion 1996).
South Hessian (Südhessisch) is a form of Rhenish Franconian. Some South Hessian dialects and languages are Biblis, Darmstadt, Dörnigheim, Haaner, Hanau, Heppenheim, Langenselbold, Seligenstadt, Mainzerisch, Orwisch, Rodgau, Ronneburg, Wetterauisch, Taunus-Hessisch, Untermainländisch, Riedhessisch, and Odenwälderisch. Haaner is spoken in Dreieichenhain, 15 miles south of Frankfurt. Darmstadt South Hessian is still spoken in Sheboygan County, Wisconsin, today.
Frankfurterisch South Hessian is a South Hessian dialect spoken in the city of Frankfurt. It has poor intelligibility with Bad Homburg South Hessian 10 miles north of town. It is intelligible with the Kurpfälzisch spoken in Heidelberg.
Bad Homburg South Hessian is a South Hessian language that is spoken in and around Bad Homburg 10 miles north of Frankfurt in Southern Hessen. It is not intelligible with Frankfurterisch, but it is intelligible with lects spoken around it.
Rhenish Hessian (Rheinhessisch) is a South Hessian dialect spoken in Rhenish Hessen around Mainz, Bingen, Bad Kreuznach, and in Hessen in the Rheingau area and Wiesbaden. Others place this language within Westpfälzisch.
Rheingauer Rhenish Hessian is a dialect of Rhenish Hessian that is spoken in and around the wine-growing region of the Rheingau. It is intelligible with the rest of Rhenish Hessian.
The Franconian languages. Low Franconian (Dutch) in yellow, Middle Franconian in green and Upper Franconian in blue.

Hessian is a West Middle German language spoken in Hessen that is closest to Pfalzisch. Hessian is only about 40% intelligible with Standard German. In many cases, Standard German speakers say they can scarcely understand a word of Hessian.
There are many Hessian lects, but intelligibility data is generally lacking between them. Some Hessian lects are surely separate languages. Hessian is spoken northeast of Pfälzisch. Hessian is not intelligible with Luxembourgish or with other Rhenish lects.
There are several main varieties of Hessian: Lower Hessian (or Niederhessisch), Upper Hessian or Central Hessian (Oberhessisch or Mittelhessisch), West Hessian (Westhessisch), South Hessian (Südhessisch) and Wittgenstein Hessian. Within Lower Hessian, there are two subvariants, North Hessian (Nordhessisch) and East Hessian (Osthessisch).
All of these have subdialects.
Lower Hessian (Niederhessisch) is a family of German dialects which contains two large dialects, North Hessian and East Hessian.
North Hessian (Nordhessisch) is a German dialect within Lower Hessian. Further, Hessian is an extremely diverse family. Schenklengsfeld and Kassel are dialects of North Hessian. North Hessian is no longer spoken in many parts of this region, especially in the cities. However, it is still quite alive in small villages, particularly around the Ohm River.
East Hessian (Osthessisch) is part of the Lower Hessian group and is a separate language. Dialects include Salzung.
Fulda East Hessian is not inherently intelligible with Schlitz East Hessian, though many Schlitz East Hessian speakers have learned to speak it (Wahl May 2009). Intelligibility with Voralberg East Hessian is unknown and needs investigation.
Schlitz East Hessian (Schlitzerplatt) is a form of East Hessian spoken in the town of Schlitz and surrounding villages in Hessen. Speakers say (Wahl 2009) that it is not intelligible with any other German lects. Hence, it is a separate language. Schlitzerplatt is now nearly extinct in Schlitz itself, but it may still have a few elderly speakers in outlying villages (Wahl 2014).
Voralberg East Hessian is a form of East Hessian spoken in the Voralberg region of the state of Hessen. It has poor intelligibility even with the nearby Schlitz East Hessian, even into the 2000’s. The region is described as desolate and the residents as poor farmers. The language is said to reflect the region and its residents and is described as “harsh, cold and brutal” (Wahl February 2010). The language is still widely spoken even recently. Since it has poor intelligibility even with Schlitzerplatt, it may well be a separate language.
Upper Hessian or Central Hessian (Oberhessisch or Mittelhessisch) is another high level split in the Hessian family that is definitely a separate language. Central Hessian dialects include Holzhausen, Ruttershausen, Langenbach, and Hüttenberger Land (the area around Wetzlar and Gießen). Within Central Hessian, there are probably numerous languages because it is not uncommon for village dialects to not be understood even a few miles away.
The Hessian Hinterland 150-200 years ago

Hinterlander Central Hessian (Hinterländer Platt) is a Central Hessian dialect spoken in the Hinterlander region of Hessen.
Wittgenstein Hessian is a highly divergent Hessian dialect that may or may not be a separate language. It is spoken in Wittgenstein in North Rhine-Westphalia.
Volga German is a language or series of languages spoken by Germans in the Volga Region of Russia. Beginning in 1763, Catherine the Great urged Germans to come to Russia to farm empty lands. Many deeply impoverished Germans took up the call and migrated to Russia, where they were given land in the Volga Region.
Volga German in general seems to be a West Middle German Rhenish Franconian language with deep affinities to Hessian, though there are Swabian influences too.
It is almost completely unintelligible with Standard German. This language, like Yiddish, has been deeply influenced by Russian in terms of both lexicon and syntax. Since 1990, many have left Russia for Germany. As in the case of Bohemian German, this may be another trash can category for a variety of lects spoken by different German groups in the Volga.
However, arguing against this is evidence from the region in 1850 that a Standard Volga German koine (Kolonistendeutsch) was already developing. The language may have an archaic character. German visitors in 1924 noted that it sounded like 17th Century German.
Amana German is still spoken in the Amana Colonies of Iowa. This area was settled by a fundamentalist German Lutheran group called the Inspirationists around 1850. Amana is a mixture of many different German lects, but it is primarily a Buedingen-Geldhausen Hessian lect. There is also major Swabian influence.
Even in the US, each village continued to have its own dialect until major changes occurred in 1932, after which a Standard Amana German developed. Intelligibility between Amana German and the rest of German is not known.
Ostfränkisch (East Franconian) is a High German language group transitional between Central and High German. It is spoken in Thuringia, Bavaria, Hessen and Baden-Württemberg around Eisenach, Coburg, Würzburg, Hof, Bayreuth, Plauen and Bamberg, in the area east of Frankfurt, to southern and western Thuringia and out to the Vogtland. It has a very high number of speakers. Klein-Allmerspan, Oberschefflenz, and Kupfer River are dialects. It is not intelligible at all with any type of Bavarian, even the Bavarian spoken nearby, and it is said to be only understandable by those who live there.
Map of the East Franconian lects

Main-Franconian is one of the Ostfränkisch (East Franconian) High German languages that are transitional between Central and High German. It is spoken along the Main River, which runs into the Rhine.
It is spoken in Germany in the Main-Tauber District of Baden-Württemberg, in Upper Franconia (Oberfranken) in Bavaria, and in Schmalkalden-Meiningen, Hildburghausen, Sonneberg, and the city of Suhl in southern Thuringia. It is also spoken around Schlüchtern in Eastern Hesse near the border with northwest Bavaria.
Major cities where it is spoken include Bayreuth. This language is not intelligible at all with German Bavarian (Kirmaier 2009).
There are many Main-Franconian lects.
Taubergründisch is an East Franconian lect spoken in Bavaria in Euerhausen and Sonderhofen, and in Baden-Württemberg in Weikersheim, Bad Mergentheim, and Tauberbischofsheim. This lect borders on South Franconian.
Ansbachisch is an East Franconian lect. I am not sure where it is spoken.
Lower Franconian (Unterfränkisch) is a Main-Franconian lect spoken in Würzburg and Schweinfurt in the Unterfranken or Lower Franconian region of Bavaria. There is high but not complete intelligibility between Lower Franconian and the rest of Main-Franconian (Kirmaier 2009). That Lower Franconian is not fully intelligible with the rest of Main-Franconian is not surprising, since Lower Franconian is not even fully intelligible within itself.
Lower Franconian has huge dialectal diversity. There are apparently over 250 dialects of Lower Franconian alone. Some of these dialects are not very different, but others are so different that intelligibility is poor. Villages spaced far apart often have poor intelligibility. There are a number of separate languages in Lower Franconian, but until we can begin to delineate them, we can’t list any. These small lects appear to be dying out lately.
Grabfeldisch is a Main-Franconian lect spoken in Bad Königshofen and Mellrichstadt in Bavaria, in Römhild and Frankenheim in Thuringia, and in Gersfeld and Hilders in Hessen. Schlüchtern may be a dialect.
Bambergerisch is a Main-Franconian lect spoken in Bamberg, Forchheim, and Erlangen in Bavaria.
Hennebergisch is a Main-Franconian language spoken in Schmalkalden, Meiningen, Zella-Mehlis, Suhl, and Schleusingen in Thuringia.
Itzgründisch is a Main-Franconian language spoken in Coburg, Neustadt and Bad Staffelstein in Bavaria and in Sonneberg, Effelder-Rauenstein, and Hildburghausen in Thuringia. Itzgründisch is not intelligible with Upper Saxon and probably with none of the other Main-Franconian lects either. Sonnebarger is a dialect spoken in Sonneberg.
Frammersbacher Welschen is spoken in the town of Frammersbach in the Spessart area. It is a secret language, so not a dialect proper.
Upper Franconian (Oberfränkisch) is a Main-Franconian lect spoken in Bavaria in Bayreuth, Kulmbach, Kronach, Hof, and Lichtenfels. It has high, but not full, intelligibility with Lower Franconian.
Hof Upper Franconian (Hofer) is the Upper Franconian language spoken in Hof. Hofer is very divergent, even within itself, and in all probability it is a separate language.
Rhöner Platt East Franconian is a complex dialect, possibly a separate language, that is hard to characterize and is spoken around the Rhön area of eastern Hessen. This is a Middle German language, but it is hard to say if it is East or West Middle German because it has been influenced by both.
It has been influenced by East Franconian, Hessian and Thuringian. This language is spoken around the Fulda Gap between the former nations of East and West Germany. It is not intelligible to people even 15 miles away, so it must be a separate language.
Central Franconian (Mittelfränkisch) is spoken in the Mittelfranken or Central Franconian region. This is a language spoken in and around Nuremberg that is very different from the rest of East Franconian to the extent that it is not intelligible with it (Kirmaier 2009). Therefore, it is a separate language.
There are some dialects of this language. Hetzle is spoken in the village of the same name near Nuremberg. Fürther is spoken in the town of Fürth near Nuremberg. Nuernbergerisch is spoken in the city of Nuremberg. There are apparently some dialects of Central Franconian in the rural areas that are impossible for even native speakers of Fürther to understand. Therefore, there is more than one language in Central Franconian, but we need some details before we proceed.
Hohenlohisch is an East Franconian dialect that is spoken around Bad Mergentheim, Crailsheim, Gerabronn, Künzelsau, Öhringen, and Schwäbisch Hall in Baden-Württemberg. This is probably the same language as Schwäbisch-Fränkisch, a type of East Franconian that is spoken on the border of the Swabian speaking area.
Vogtländisch is one of the Ostfränkisch (East Franconian) High German dialects that are transitional between Central and High German. It is spoken in Vogtland in Saxony, and it is also spoken in Austria. It is intelligible with Erzgebirgisch, but not with any other German lects (Goldammer 2009).
Speakers now are mostly elderly, as children have not been raised speaking it for some time now. Still, there are quite a few speakers.
The dialects differ drastically from one another but are nevertheless intelligible with each other. Cities where it is spoken include Plauen and Klingenthal.
There are four dialects.
Middle Vogtländisch (Mittelvogtländisch) is spoken around Mühltroff, Treuen, and Oelsnitz.
Northern or Nether Vogtländisch (Nordvogtländisch) is spoken along a line going from Reichenbach – Mylau – Netzschkau – Elsterberg – Pausa.
Eastern Vogtländisch (Ostvogtländisch) is spoken around Göltzschtal from Frankenstein to Lengenfeld.
Upper Vogtländisch (Obervogtländisch) is spoken south to a line running from Bobenneukirchen – Oelsnitz – Werda – Schöneck.
Eastern Vogtländisch around Klingenthal is regarded as particularly incomprehensible by Standard German speakers.
The Vogtländisch dialects in far southwestern Saxony. Includes some Thuringian lects in Eastern Thuringia.

Erzgebirgisch is an East Middle German language related to East Franconian. Although it is often said to be an Upper Saxon language, the latest thinking is that it is separate from Upper Saxon, and it has little in common linguistically with Upper Saxon. Neither is it intelligible with Upper Saxon.
A good case can be made that it is an East Franconian language. It has high intelligibility with Vogtländisch and some of the easternmost varieties of East Franconian.
It is spoken on the border with the former Sudetenland region of Czechoslovakia in Saxony, especially in the area of the Erzgebirge or Ore Mountains. There are other dialects spoken far away in the Harz Mountains in Lower Saxony.
It is losing ground to Upper Saxon, and many speakers are emigrating out of the area, hence the language is declining, but it still has 500,000 speakers. It is also close to Bavarian.
There has been little research on this language since 1929, and even that dealt only with Upper Harz. Historically, this language was created during the 1100’s and 1200’s as East Franconian speakers from the West moved into the Ore Mountains and either displaced or assimilated Slavic speakers.
It has at least seven dialects, Upper Erzgebirgisch, Fore Erzgebirgisch (Vorerzgebirgisch), East Erzgebirgisch, West Erzgebirgisch, Clausthal-Zellerfeld Erzgebirgisch, Upper Harz Erzgebirgisch, and North Erzgebirgisch. Fore Erzgebirgisch is transitional to East Franconian. All dialects are intelligible.
Upper Harz Erzgebirgisch is located far away from the rest of the dialects to the north in the Upper Harz Mountains (see map below) of Lower Saxony. This dialect is nearly extinct. It still has many elderly speakers, but it is probably not being passed on to children. This dialect is the most different of all, but it is still intelligible with the rest of Erzgebirgisch (Goldammer 2009).
Clausthal-Zellerfeld Erzgebirgisch is spoken in the Upper Harz Mountains around the city of Clausthal-Zellerfeld in Lower Saxony. It is heavily influenced by Ostfälisch and is nearly extinct. This lect is different mostly in that it is very archaic.
West Erzgebirgisch (Westerzgebirgisch) is an Erzgebirgisch dialect spoken around Schneeberg, Marienberg, and Annaberg. This language is not intelligible with Standard German and is regarded as being one of the toughest dialects for Standard German speakers to understand. Intelligibility with Standard German is surely below 40%.
As of 20 years ago, this language was still the primary means of communication in the area, and it still has many speakers. This dialect is transitional to East Franconian. There is a lot of East Franconian influence in this dialect.
The differences between West Erzgebirgisch and East Erzgebirgisch are considerable, but the two are nevertheless intelligible. This lect has a lot of Upper Franconian elements, along with a lot of influence from Eastern Meissenish. West Erzgebirgisch is also similar to Vogtländisch.
Osterzgebirgisch or East Erzgebirgisch, an Erzgebirgisch dialect, represents a transitional dialect between West Erzgebirgisch and Upper Saxon. This dialect has very heavy Upper Saxon influence, and it is losing speakers. This lect is close to Meissenish.
The Erzgebirgisch dialects located in Saxony near Chemnitz.


Anonymous. Saarländisch native speaker. Personal communication, Yosemite National Park, California. March 2014.
Bowie, David. 1997. Was Mir Wisse: A Review of the Literature on the Languages of the Pennsylvania Germans. In Current Work in Linguistics, ed. Dimitriadis, Alexis; Lee, Hikyoung; Siegel, Laura; Surek-Clark, Clarissa, and Williams, Alexander, 4 (3):1-18. Philadelphia: University of Pennsylvania Working Papers in Linguistics.
Costin, Paul. Siebenbürger Sachsen native speaker. Personal communication. April 2014.
Denison, Norman. 1971. Some Observations on Language Variety and Plurilingualism, chapter 7 in Ardener, Edwin. Social Anthropology and Language. London: Tavistock Publications.
Giesser, Diane. Danube Swabian native speaker. Personal communication. December 2014.
Goldammer, Thomas. Erzgebirgisch and Upper Saxon native speaker. Personal communication. August 2009.
Guion, S. 1996. The Death of Texas German in Gillespie County. In P.S. Ureland and I. Clarkson (eds.), Language Contact Across the North Atlantic. Tübingen: Max Niemeyer Verlag. 443-463.
Hughes, Stephanie. 2005. Bilingualism in North-East France With Specific Reference to Rhenish Franconian Spoken by Moselle Cross-border (or Frontier) Workers. In Preisler, Bent, et al., eds. The Consequences of Mobility: Linguistic and Sociocultural Contact Zones. Roskilde, Denmark: Roskilde Universitetscenter Institut for Sprog og Kultur.
Jeep, John M., editor. 2001. Medieval Germany: An Encyclopedia. New York and London: Garland.
Kirmaier, Andrea. Oberpfälzisch North Bavarian native speaker. Personal communication. March 2009.
Liesenberg, Friedrich. 1890. Die Stieger Mundart: ein Idiom des Unterharzes, besonders hinsichtlich der Lautlehre dargestellt, pp. 178. Charleston, SC: Bibliolife.
Myhill, John. 2006. Language, Religion and National Identity in Europe and the Middle East: A Historical Study. Amsterdam: John Benjamins Publishing Company.
Pützer, Manfred. 1997. Zu Transkriptionskonventionen Bei Plosiven im Übergangsgebiet Zwischen Moselfränkischen und Rheinfränkischen Dialekten im Germanophonen Lothringen (Frankreich). Phonus 3:25-60.
Ross, Charles. 1989. The Dialects of Modern German: A Linguistics Survey. London: Routledge.
Smith, Norval. Personal communication. March 2009.
Wahl, Petra. Schlitz East Hessian native speaker. Personal communication. March 2009.
Wahl, Petra. Schlitz East Hessian native speaker. Personal communication. May 2009.
Wahl, Petra. Schlitz East Hessian native speaker. Personal communication. February 2010.
Wahl, Petra. Schlitz East Hessian native speaker. Personal communication. April 2014.
Wahl, Petra. Schlitz East Hessian native speaker. Personal communication. July 2014.


A Reworking of German Language Classification, Part 1: Low German

Updated June 28, 2016. This post will be regularly updated for some time. Warning! This essay is very long; it runs to 66 pages on the Internet. This is part 1 of the essay: Low German. Part 2 deals with Central German and Part 3 deals with High German.
This classification splits Low German from 10 languages into 12 languages using the criterion of >90% intelligibility = dialect and <90% intelligibility = language.
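As a rough illustration, the 90% criterion can be written out as a simple rule. This is only a sketch: the function name `classify` is made up for the example, and since the criterion as stated does not say what happens at exactly 90%, the sketch assumes that boundary case counts as a dialect. The two sample figures are intelligibility estimates with Standard German quoted elsewhere in this essay.

```python
DIALECT_THRESHOLD = 90.0  # percent, from the criterion stated above


def classify(intelligibility_percent: float) -> str:
    """Label a lect relative to a reference lect using the 90% criterion."""
    if not 0.0 <= intelligibility_percent <= 100.0:
        raise ValueError("intelligibility must be between 0 and 100")
    # Assumption: the text does not cover exactly 90%,
    # so this sketch treats that boundary case as a dialect.
    if intelligibility_percent >= DIALECT_THRESHOLD:
        return "dialect"
    return "separate language"


# Texas German is reported at 95% with Standard German.
print(classify(95))  # dialect
# Purer forms of Low Saxon are reported at about 40% with Standard German.
print(classify(40))  # separate language
```

The rule only labels a pair of lects. Deciding how many languages a whole group contains still requires intelligibility figures between the lects within the group, which are often missing.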
Low German or Low Saxon is a group of far northern German dialects. Dialects up by Hamburg and Friesland sound like English. According to an older edition of Ethnologue, there are 20-30 Low German lects which are all mutually unintelligible. None of the Low German languages are intelligible with Standard German. Low German differs from region to region and even from village to village.
Ethnologue says that there are only 1,000 speakers of Low German, but that 10 million can understand the language. This is completely wrong.
According to a survey conducted in the former West Germany 25 years ago, 20%, or 3.5 million, said that they could speak Low German very well. Another 36%, or 6.5 million, said they had anywhere from some to good Low German speaking abilities. Fully 89%, or 16 million, had some reasonable amount of understanding of the language.
This means that 25 years ago, there were 10 million people in West Germany alone who could speak Low German to one degree or another.
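The arithmetic behind the 10 million figure can be checked quickly. The sketch below uses only the survey numbers quoted above. The variable names are mine, and the "implied population" values are back-calculations from the rounded figures, not numbers reported by the survey.

```python
# Survey figures as quoted above: (millions of people, stated share of those surveyed).
speak_very_well = (3.5, 0.20)
speak_some_to_good = (6.5, 0.36)
understand = (16.0, 0.89)

# Speakers of any ability = the two speaking groups added together.
total_speakers = speak_very_well[0] + speak_some_to_good[0]
print(total_speakers)  # 10.0

# Back-calculate the population each figure implies (millions).
# If the quoted numbers are consistent, these should roughly agree.
implied = [round(count / share, 1)
           for count, share in (speak_very_well, speak_some_to_good, understand)]
print(implied)
```

The three implied populations come out close to one another, at roughly 18 million, so the quoted percentages and head counts are at least consistent with each other.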
Until fairly recently, most Low German speakers could only speak their Low German language and nothing else.

Low Saxon dialects extending from Netherlands to Lithuania.

Low German has over 4,000 different dialects in it. As late as 1960, a situation existed from Hamburg west to the Netherlands border in Germany whereby, while most spoke Standard German, most also spoke regional dialects. In general, one village could understand the next couple of villages over, but beyond that, things got dicey. So even 50 years ago, you had many de facto separate languages of Low German. The question is how many of these separate languages still exist.
Another view of Low Saxon dialects

Even today, more pure forms of Low Saxon have about 40% intelligibility with Standard German.
Recent writings suggest that all Low Saxon speakers can communicate adequately in any of these disparate lects. This flies in the face of SIL’s earlier statement about 20-30 inherently unintelligible languages. Therefore, there needs to be an assessment on the ground of what existing Low Saxon lects look like and how intelligible they are with each other.
Some descriptions describe the type of intelligibility within Low Saxon as akin to that between the Scandinavian languages. However, recent findings seem to indicate that the mutual intelligibility of the Scandinavian languages is much exaggerated.
Having seen transcriptions of translations of a single short story into different Low Saxon lects, it seems clear to me that they differ dramatically. Ethnologue has already split off Westphalian and East Frisian Low Saxon, so the matter is settled as far as those two go. All of the rest are lumped into Low German as some possibly dubious macrolect. The situation regarding intelligibility within German Low Saxon remains very confused.
North Low Saxon is a Low Saxon language spoken in the north of Germany. It is understood across a wide region. There is a standard version based broadly on the Hamburg dialect that is widely used on TV and in the media. Dialects include Holsteinisch, Schleswigsch, Bremen, Hamburgs, Emsland and Oldenburg. All of these dialects are apparently mutually intelligible, though transcribed versions of them are often quite divergent.
Schleswig-Holstein is a North Low Saxon language spoken in the Schleswig-Holstein region of Germany. Holsteinisch North Low Saxon and Schleswigsch North Low Saxon appear to be mutually intelligible, but some speakers of Schleswig-Holstein have difficulty understanding North Low Saxon from Lower Saxony to the south.
Holsteinisch (Holsatian) is a North Low Saxon dialect spoken in Holstein around Kiel. It has similarities with the Old Low German language and to Bremen Westphalian and Heide Westphalian. Holsteinisch is still holding up pretty well.
This is the area from which the Angles and Saxons originated before leaving for Britain. Holsteinisch may well be intelligible with Hamburgs, Oldenburg and Schleswigsch. Subdialects include Heikendorf, Dithmarscher, Central Holsteinisch (Zentral Holsteinisch), Stormarner, East Holsteinisch (Ostholsteinisch) and Kiel.
Schleswigsch (Schleswickian) is a North Low Saxon dialect spoken in Schleswig. It has a lot of influence from North Frisian and Low Danish. This lect is not intelligible at all with Standard German. People from around Berlin even say that they can’t understand a word of this lect in its pure form. In the city of Flensburg near the Danish border, the whole city speaks Schleswigsch. Very close to the border with Denmark, people speak a form of Schleswigsch that is intelligible with South Jutnish, a highly divergent form of Danish spoken in the far south of Denmark that is actually a separate language.
This dialect is doing very well compared to the rest of Low Saxon, especially in the west area of the zone.
Angel German is a North Low Saxon language spoken in the Angeln region of far northern Germany. This region stretches from the Baltic Sea in the north and east to the Dannevirke and the Slie in the south and the moors in the west, a triangle of 350 square miles. This is the homeland of the Angles, who invaded England and gave their name to both the country and the language. Angel German to this day sounds somewhat like English. Angel German is poorly understood even by other Low German speakers because it still has many Danish words and grammatical features (Bock 1933).
Most of the grammar is German, but most of the vocabulary is Danish with some German and Low German words mixed in (Gosch 1861). It sounds a lot more like Danish than German. There are certain Danish sounds such as the z, soft s and sch that no German can pronounce without special training. One theory is that this was formerly Danish or more properly South Jutnish, a dialect so divergent that it can be seen as a separate language from Danish, that simply became Germanized over time due to instruction in the German language.
At one time, fishermen around Lübeck spoke a form of Low Saxon koine that could be understood by sailors and fishermen from any of the Baltic Sea nations.
Hamburgisch is a North Low Saxon dialect that serves as something like the official form of Low Saxon, a koine, or Standard Low Saxon, in Germany. It is widely understood across the Low Saxon speaking region. The language itself is spoken in and around Hamburg.
Ollands, spoken in Ollands, a fruit and vegetable growing region of northern Germany on the Lower Elbe, is a subdialect of Hamburgisch. Other subdialects include Finkwarder, Kirchwerder, Harburg, Olwarder, Veerlanner (with many sub-subdialects), and Barmbeker. Kirchwerder is spoken 12 miles southeast of Hamburg.
There are still some middle-aged speakers of Hamburgisch, and the language is doing better than most Low Saxon lects. In addition, there are also some young speakers of Hamburgisch, especially on islands off the coast of the mouth of the Elbe River.
Bremen is a North Low Saxon dialect that is spoken in the area from about Bremen east. It has traces of both the Frisian and the Oldenburg languages and is related to both the Holstein and the Heide languages. It may be intelligible with Oldenburg, Hamburgs, Schleswigsch and Holsatian. Bremen has only 57% intelligibility to speakers of Gronings-East Frisian Low Saxon.
Oldenburg is a North Low Saxon dialect that is spoken just west of Bremen in what used to be the state of Oldenburg. This language is holding up better than a lot of the other Low Saxon lects.
It has been influenced somewhat in the north by East Frisian, and there is an inland version of East Frisian called Saterland Frisian that is spoken right around this area. To the south, it has been influenced by Munsterland Westphalian, and to the east, by Bremen Westphalian and Heide Westphalian.
Subdialects include North Oldenburg (Nordoldenburger). It may be intelligible with Holsatian, Hamburgs, Schleswigsch and Bremen.
Emsland includes the southern part of the former Weser-Ems district (the area around Osnabrück in Lower Saxony and Emsdetten in far Northern Rhine-Westphalia). This dialect has very heavy East Frisian, Dutch and Groningen influences. It is still doing fairly well. Weser-Trave is a subdialect.
West Low Saxon is a group of languages and dialects spoken in the Netherlands and Germany. For the West Low Saxon languages spoken in the Netherlands, see here. Dialects spoken in Germany include South Emsland (Südemsländisch), Hümmlinger, South Oldenburg (Südoldenburgisch), North Osnabrück (Nordosnabrückisch), and West Diepholzer. South Oldenburg is spoken in Oldenburger Münsterland.
Dialects in North Rhine-Westphalia. North Low Saxon, Westphalian, Eastphalian, Hessian, Ripuarian, Low Franconian and Moselle Franconian are shown.

Westphalian (Westfäölsk) is a West Low German language spoken in Westphalia in the northeastern part of North Rhine-Westphalia, but not in Siegerland and Wittgenstein. It is a separate language and is definitely not intelligible with other forms of Low German.
It is mostly spoken by older people now. Westphalian is doing fairly well, but not great, compared to other Low Saxon lects. There are still a tiny number of speakers in Iowa in the Waterloo and Cedar Falls area of Blackhawk and Bremer Counties. Dialects include West Munsterland (Westmünsterländisch), South Westphalian (Südwestfälisch), and Bentheimisch.
Münsterlandish is a Westphalian language spoken in Westphalia around Munster. It is very hard for Northern Low Saxon speakers to understand, harder to understand than the rest of Westphalian. It is mostly spoken by older people now. Intelligibility testing with this language and the rest of Westphalian is indicated.
Steinfurt is a Munsterlandish dialect spoken in Westphalia around Munster. It is quite different from Munster Westphalian proper. This language is transitional between North Saxon, Eastphalian and Westphalian.
This seems to be the lect that is often described as Grafschafter Platt (County Language). The term “county” refers to the fact that this region was one of the few in Germany that was ruled by a count (a feudal figure like a duke). Grafschafter Platt seems to be spoken in the region between Osnabrück (Emsland) and Munsterland and over to the Dutch border where it looks like it borders on Twents.
There are a tremendous number of dialects in this language, especially over by the Dutch border. It’s not really correct to say that each village has its own dialect, but there is definitely a dialect continuum with new dialects every few villages or so.
The five main dialects are Gildehaus, Upper Grafschaft, Nordhorn, Lower Grafschaft, and Wietermarschen Group.
Wietermarschen Group is spoken in Wietermarschen, Drievorden, and Engden.
Nordhorn is spoken in the city of that name.
Lower Grafschaft is spoken around the towns of Emlichheim, Laar, and Hoogstede. Lower Grafschaft has heavy Dutch influence, more than any other West Low German language spoken in Germany. This language has undergone a serious decline in the past 50 years. It is now spoken by 25% of the population and understood by 50%.
East Westphalian (Ostwestfälisch) is a series of lects that are spoken in the eastern parts of the Westphalian speaking zone.
Osnabrück is an East Westphalian dialect spoken in the area around Osnabrück in southern Lower Saxony.
Lübbecke is an East Westphalian language that is spoken in and around Lübbecke and to the north. The variety from around the towns of Stemwede and Oppenwehe has poor intelligibility with the Osnabrück Westphalian spoken around Bad Iburg south of Osnabrück only 25 miles to the southwest. The region is heavily forested.
Ravensbergish-Lippish is an East Westphalian dialect that is spoken in the north of Northern Rhine-Westphalia near Lippstadt, Steinhagen, and Rheda-Wiedenbruck.
Paderborner is an East Westphalian dialect spoken around Paderborn in northeast Northern Rhine-Westphalia near the border with Lower Saxony.
Soester Westphalian is an East Westphalian dialect spoken in and around the city of Soest east of the Ruhr region.
Sauerland is an East Westphalian language, definitely a separate language, spoken in Westphalia in the Sauerland, which is in southeast North Rhine-Westphalia. Although it is related to Westphalian, it is a separate language and is not intelligible with other forms of Westphalian or Low German. This language is definitely still spoken. There are other languages spoken in the Ruhr-Sauerland region.
Balve is an East Westphalian dialect spoken in and around Balve, near Dortmund in the Sauerland region of North Rhine-Westphalia. This area has tremendous dialect diversity, and there is a new language every couple dozen miles or so. Knowing this and comparing Balve with Lüdenscheid, Balve may be a separate language; however, until we get specific data, we can’t split it off.
Lüdenscheid is an East Westphalian language spoken in and around the city of Lüdenscheid in North Rhine-Westphalia. Speakers report that the Low German in this region is incredibly varied, with new languages every couple dozen miles or so. Thus, Lüdenscheid may be a separate language; however, until we get specific data, we can’t split it off.
Gladbeck is an East Westphalian language spoken in the town of Gladbeck. Gladbeck is a town located in the Ruhr between Gelsenkirchen and Bottrop and north of Essen.
Eastphalian (Ostfälisch) is a West Low German language spoken east of the Weser River in southern parts of Lower Saxony and western parts of Saxony-Anhalt, in cities such as Hanover, Braunschweig, Hildesheim, Göttingen, and Magdeburg in Eastphalia. It is completely unintelligible with the rest of Low Saxon. This language is barely holding on, with very low activity and only a few speakers. Eastphalian is best seen as transitional between Low German and Middle German.
Eastphalian lects include Solling, Braunschweiger, Bode (Bode Ostfälisch), Calenberger, Elbe (Elbostfälisch), Göttingisch-Grubenhagensch, Heide (Heideostfälisch), Hildesheimer, Holzland (Holzland Ostfälisch), Huy (Huy Ostfälisch), North Eastphalian (Nord Ostfälisch), Oker (Oker Ostfälisch), East Eastphalian (Ostostfälisch), and Papenteicher.
Solling is an extremely divergent lect of Eastphalian that is spoken in the Solling Forest of Lower Saxony. It is dying out, but the pure form of it is still spoken by the elderly. It is very strange and is said to sound like the Frisian language. It is quite possible that this is a separate language, as it is said to be quite distant from Eastphalian proper.
Elbostfälisch is an Eastphalian language spoken around Oschersleben and Haldensleben in the Magdeburger Börde, which is between Helmstedt and Magdeburg. It is spoken on the west side of the Elbe River from Magdeburg west to the Harz Mountains in Saxony-Anhalt. This language has heavy influence from East Low German and is actually transitional between East and West Low German. It is still in very good shape and is widely spoken in Magdeburg at least.
Göttingisch-Grubenhagensch is a dialect of Eastphalian spoken around Göttingen, Northeim, and Osterode am Harz. It is mostly spoken by older people now.
Heideostfälisch is a dialect of Eastphalian spoken around Celle that has some Northern Low Saxon elements. It is spoken between Hamburg, Bremen and Hanover. This language has many words that look like English. This suggests that Heide was close to one of the original Old Low German languages that gave rise to Old English. Heide means “pagan” in this language, and this group resisted feudalism and Christianity longer than most other groups in the area. This in part explains the ancient nature of their tongue. This is the area from which the Angles and Saxons originated before leaving for Britain.
Central Eastphalian is a dialect of Eastphalian spoken in a large area surrounding Braunschweig and Hanover. It is mostly spoken by older people now. This dialect is in particularly poor shape.
Papenteicher is an Eastphalian dialect spoken just north of Brunswick. There are only about 300 speakers left. It is no longer taught to children. There are many more who know individual words and phrases. The language is almost never heard in the region anymore. This dialect has some interesting sounds that are only heard in Friesland and Jutland. It is thought that the region was originally settled by people from those regions 1,500 years ago.
East Low German is a group of languages spoken in Mecklenburg – West Pomerania and Brandenburg and surrounding areas, including over into Poland. The two main branches are Mecklenburgish-Pomeranian and Markish. Low Prussian is not included in this grouping – it is a separate group. These are more recent lects, created from West Low German lects with Russian and Standard German admixture.
Mecklenburgisch-Vorpommersch is an East Low German language group. Lects in this group include Wendländisch, Mecklenburgish, West Pomeranian (Westpommersch), and Strelitzisch. Strelitzisch is transitional to Mittelpommersch. There are still a considerable number of people who speak and understand this language.
Mecklenburgisch is an East Low Saxon dialect spoken in Mecklenburg-West Pomerania. It has very high to excellent intelligibility with North Low Saxon, so it may only be a dialect of Low German. It is spoken in far northeastern Germany around Stralsund and Rostock. This area was once Slavic, but Charlemagne moved Germans into this area. These Germans spoke Low German. Many of these immigrants also spoke Frisian, so there is an element of that too.
However, in the 1700’s, High German became such a strong force in the area that this dialect began to be mixed with High German. The dialect remains very archaic, as the region is very resistant to change in general. This dialect is doing ok, but not great, compared to other Low Saxon lects. Although it is officially an East Low German dialect, it is actually on the border between East Low German and West Low German. Before 1945, this dialect was the main means of communication in the villages.
Vorpommern or West Pomeranian is an East Low Saxon language spoken in Western Pomerania, Vorpommern or Hither Pomerania, however you want to translate the term. This is a part of far northeast Germany on the border with Poland. The northern border is the Baltic Sea. This language is doing ok, but not great, compared to other Low Saxon lects.
As part of a dialect continuum, Pomeranian is clearly not intelligible with East Frisian Low Saxon, but it may be intelligible with Mittelpommersch or Mecklenburgisch.
Markish (Märkisch)* is a group of East Low German lects spoken below Middle Pomeranian through Brandenburg down to Berlin and south and east of it, and over to the eastern parts of Saxony-Anhalt. It is little known. It is a major high level German dialect division. Markish is not intelligible with Upper Saxon. This suggests that Markish may well be a separate language. There are many Dutch words in this language.
There are nine lects of Markish.
Pomeranian or East Pomeranian is a Markish language spoken in Poland. This area was Slavic in the 600’s. The Danes laid waste to the area in the 1000’s. The destruction was so severe that the rulers invited German farmers to the area to rehabilitate the land, and this interesting language developed.
Speakers are all elderly and scattered, and the language is moribund. This language was also decimated by the expulsion of Germans from Poland after WW2.
There are five major dialects:
West Prussian (Westpreußisch) was formerly spoken in West Prussia.
Western Further Pomeranian (Westhinterpommersch) was formerly spoken in western Further Pomerania.
Eastern Further Pomeranian (Osthinterpommersch) was formerly spoken in eastern Further Pomerania.
Bublitzisch was formerly spoken in Bublitz (now Bobolice), Poland.
Pomerelian (Pommerellisch) was formerly spoken in a region called Pomerelia.
Pomeranian is not intelligible with Low Prussian or other Low German languages. There is still a significant Pomeranian community in Brazil, and there are some Pomeranian speakers in the US too.
North Margravian or Mittelpommersch is a Markish dialect spoken in the northern part of Brandenburg State around Prenzlau and Wittenberg and in Mecklenburg-West Pomerania around Pasewalk-Ueckermünde.
This dialect has or had two subdialects, West Middle Pomeranian (Westmittelpommersch) and East Middle Pomeranian (Ostmittelpommersch).
East Middle Pomeranian was spoken in the Lower Oder River region of Poland around Stettin and Stargard and may well be extinct since 1945.
West Middle Pomeranian is still alive and is spoken in the areas discussed above. There is some activity to keep this language going, and there are still some speakers left. Before 1945, this language was still the main medium of communication in almost all of the villages in the area.
North Markish (Nordmärkisch or Altmärkisch) is a Markish dialect spoken in Salzwedel, Gardelegen and Stendal in far northern Saxony-Anhalt. This dialect has Eastphalian influences.
Westprignitzisch is a Markish dialect spoken in Perleberg, Pritzwalk, and Wittstock in far northwestern Brandenburg.
Ostprignitzisch is a Markish dialect spoken in Löwenberg, Templin, Zehdenick, and Fürstenberg in far northern Brandenburg. The Prignitz dialects show less Dutch influence than other dialects in the area. They are also close to Mecklenburgish.
New Markish (Neumärkisch) is a Markish dialect spoken in Angermünde and Schwedt/Oder in northeastern Brandenburg.
Flämingisch is a Markish dialect spoken in Jüterbog and Buchenwald in Brandenburg south of Berlin near the border with Saxony-Anhalt and in Saxony-Anhalt in areas north of Wittenberg. Flämingisch is transitional between Low German and Middle German. It is little spoken anymore except by the elderly who are partial speakers. Persons composing a dictionary have only been able to come up with about 1,500 words. It is clearly dying out.
Havelländisch is a Markish dialect spoken in Rathenow, Premnitz and Nauen in Brandenburg west of Berlin.
Brandenburgish or Central Margravian is an East Low German dialect spoken west of Berlin in Staaken, Potsdam and Brandenburg an der Havel.
It has been influenced heavily by Dutch from guest workers who came in the 1700’s, and it also has a Westphalian influence. South Brandenburgish (Südbrandenburgisch) and Eberswalder are dialects of Brandenburgish.
There are suggestions that this language is nearly extinct, and may even be extinct, and that speakers for the most part have reverted to a Berlinisch sort of dialect of German. However, this new lect (or whatever lect they are speaking) is unintelligible with Standard German. This language, whatever form it is taking, is still going very strong as of three years ago.
New Mecklenburgish is spoken north and northwest of Berlin around Oranienburg and Neuruppin. It is not in good shape and is under heavy pressure from Berlinisch and Standard German. In fact, it may well be de facto extinct. Investigation is needed to determine if this dialect even exists anymore, or has reverted to some sort of Berlinisch dialect.
Low Prussian (Niederpreußisch) is a separate branch of Low German spoken in eastern Poland. It is spoken in the region where the Slavic Kashubian language is spoken, so it received some influence from that language. This area was Slavic in the 1200’s and became German in the 1700’s.
This is a full language, not intelligible with Standard German or with any other German language. It used to have many speakers, but now it is moribund. There are a few elderly speakers left, but no language community.
It has 11 major dialects.
Low Prussian-East Pomeranian (Übergangsmundart zum Ostpommerschen) was a transitional dialect with East Pomeranian.
Vistula Delta (Weichselmündungsgebietes) was spoken around Danzig (Gdansk) at the mouth of the Vistula River.
Frischen-Danzig Spit (Frischen-Nehrung Danziger-Nehrung) was spoken around the Vistula Lagoon.
Elbing Heights (Elbinger Höhe) was spoken around Elbing (Elblag).
Kürzungs was spoken around Braunsberg (Braniewo).
West Käslausch (Weskäslausch) was spoken around Mehlsack (Pieniezno).
East Käslausch (Ostkäslausch) was spoken around Rößel (Reszel).
Natangian-Bartish (Natangisch-Bartisch) was spoken around Bartenstein (Bartoszyce).
West Sambian (Westsamländisch) was spoken around Pillau (Baltiysk).
East Sambian (Ostsamländisch) was spoken around Königsberg (Kaliningrad), Labiau (Polessk) and Wehlau (Znamensk).
Eastern (Ostgebietes) was spoken around Insterburg (Chernyakhovsk), Memel (Klaipeda) and Tilsit (Sovetsk).
Another dialect was Haff (Haff Niederpreußisch). It is not known where this dialect was spoken.
Dialect intelligibility is not known.
It became moribund due to the expulsion of Prussian speakers from Poland after WW2. It is not intelligible with other Low German languages or with Pomeranian. There are apparently still some speakers in Wisconsin.


Auer, Peter. 2005. The Construction Of Linguistic Borders And The Linguistic Construction Of Borders. In Filppula, Markku, Palander, Marjatta and Penttilä, Esa (eds.) Dialects Across Borders: Selected Papers From the 11th International Conference on Methods in Dialectology (Methods XI), Joensuu, August 2002. Current Issues in Linguistic Theory 273. Amsterdam: John Benjamins Publishing Company.
Bock, Karl Nielson. 1933. Niederdeutsch auf dänischem Substrat. Studien zur Dialektgeographie Südostschleswigs. Mit 51 Abbildungen und einer Karte. Kph.
Denison, Norman. 1971. Some Observations on Language Variety and Plurilingualism, Chapter 7 in Ardener, Edwin. Social Anthropology and Language. London: Tavistock Publications.
Gosch, Christian Carl August. 1861. The Nationality of Slesvig. London: Chapman and Hall.
Harms, Biggi. German and Düsseldorferisch Bergisch native speaker. March 2009.
Jeep, John M., ed. 2001. Medieval Germany: An Encyclopedia. New York and London: Garland.
Myhill, John. 2006. Language, Religion and National Identity in Europe and the Middle East: A Historical Study. Amsterdam: John Benjamins Publishing Company.
Ross, Charles. 1989. The Dialects of Modern German: A Linguistics Survey. London: Routledge.
Wiggers, Heiko. 2006. Reevaluating Diglossia: Data from Low German. PhD dissertation. Austin, TX: University of Texas at Austin.


North Sea Fisherman Patois

Interesting quote from the Free Republic:

I used to work with a Dutch guy who said that he was at some sort of North Sea confab, and when people spoke in their national languages they couldn’t understand one another but when they spoke their local dialects, they found them mutually intelligible. Who knows?

I have heard this before. A fisherman who lived in Heikendorf on the Eastern coast of Germany on the border between Schleswig and Holstein said that when he spoke his particular North Low Saxon dialect to any fisherman in the North Sea region, they could all understand it.
It was as if all of the fishermen, and only the fishermen, of the North Sea, all spoke a common language (in Linguistics we often call this a “jargon”) that they could all understand. The fishermen must have been from northern Germany, western Denmark, southern Norway, northern Netherlands, northern Belgium and the east coast of England and Scotland.
Far northern Low German looks like Danish, English and Dutch. Flemish, a Dutch language, is spoken in coastal Belgium. Frisian is close to English, Low German, Scots and Danish and is spoken on the Netherlands coast. Dialects of southern Norway look like Danish. Scots sounds similar to Frisian, English, Low German, and Danish. The English and Scots dialects of the eastern coast of the UK received major Scandinavian input. West Danish languages like Jutish look like Scots.
Somehow, all those fishermen just learned to talk to each other. Why? Maybe because they had to.
Jargons are interesting. We call them trade languages in Linguistics.
Chinook Jargon is a famous one. This was a mixed language made up of I think English, French and many Indian languages that was spoken by White and Indian traders in the Pacific Northwest.
Jargons seem to be full-fledged languages, unlike pidgins, which really are just poor excuses for languages. The reason for this is that the jargons are made up of the first languages of many speakers and pidgins are based on the inferior second language acquisition of adult language learners, who never really get the language right. I will discuss pidgins and hopefully creoles and koines in another post.

Questions About the German Language Classification

Updated March 28. In response to some criticisms, I agreed with Professor den Besten in part by removing Limburgs and South Guelderish from Macro-German. I put them in Macro-Dutch, which I will hopefully do a post on in the future. With languages like these, it’s pretty hard to tell where Dutch ends and German begins. I also split Veluws into East Veluws, which stayed in German, and West Veluws, which moved over to Dutch.

In response to the German language post, Hans den Besten, a top Dutch Germanist linguist, first agrees broadly with my classification, but then makes a critique about the scope of Macro-German:

But what is lacking is a definition of what counts as a German dialect. This may differ from case to case. The Low Saxon dialects of the Netherlands may have been included because they are an extension of Low Saxon Niederdeutsch, even though they are under the roof of Schrift-Dutch rather than Schrift-Deutsch. That Limburgish has been included may be due to a couple of High German Low German isoglosses running through that area. But if even South Guelderish and Veluws are included (in so far as I know Veluws is Franconian but for the eastern strip along the Overijssel) I don’t see any reason why Brabantian-East Flemish, West Flemish, Zealandic and Hollandic should be excluded. Furthermore, the exclusion of West and North Frisian also needs some justification. Referring to Anglo-Frisian isoglosses will not be enough because unlike English these languages share a lot of syntax with the rest of Continental West Germanic, since they are SOV cum Verb Second. And if we try to set the two Frisians apart by referring to the idiosyncrasies of the verbal cluster (no Ersatz-Infinitiv, strict head-final order but for the Frisian “kortstaarten” litt. ’short tails’) then — maybe — Gronings should be taken out of your list of German dialects/languages because it shares a lot of verbal cluster syntax with West Frisian.

Hans poses a number of interesting questions about the borders of Macro-German. Let’s look at them one by one.
The Dutch dialects and languages are on this page. As Brabantian-East Flemish, West Flemish, Zealandic and Hollandic are listed as Dutch on that page, I am treating them as Macro-Dutch and not Macro-German.
As far as Gronings goes, I am treating it as Macro-German due to Ethnologue’s grouping here. As you can see, Low Saxon is treated as separate from Macro-Dutch there and also in my treatment. To me, it is better to see Macro-Dutch as something equal to Low Franconian and to put Low Saxon in with Macro-German.
In terms of Frisian, Ethnologue places it outside of Macro-German altogether along with English in the West Germanic Family. Keep in mind that Germanic and German are not synonymous. After all, Swedish is Germanic but not German.
As far as Veluws, Ethnologue sees it as Low Saxon and not as Low Franconian.
Ethnologue has no listing for South Guelderish or for anything similar. South Guelderish is a very confusing classification, that, if anything, looks like a sister language to Limburgs, if not a part of Limburgs itself.
After conferring with a Dutch linguistics professor, I have now decided to remove Limburgs, South Guelderish and the related Low Rhenish lects spoken across the border in Germany from Macro-German.
I have put them in Macro-Dutch, and hopefully will redo the classification of Dutch soon in a separate post. As far as two languages, one called Southeast Limburgs/Aachen, and another called Low Dietsch, they have stayed in Macro-German, because my professor friend described at least the SE Limburgs lects as Ripuarian, with Ripuarian automatically going to Macro-German.
The languages described above are collectively known as Meuse-Rhenish, and to be honest they are transitional between Low Franconian (Dutch) and Low Saxon (German).
We run into a situation like what one finds in Alsace-Lorraine, where curious travelers said that, “Some people speak German, some people speak French, and others seem to speak languages that are neither French nor German.” Substitute “Dutch” for “French” in the above situation and you have a pretty good portrayal of the confusing language situation on the Dutch-German border.
These languages are confusing because really they are transitional languages between Dutch and German.
Hans also raises some questions about Dutch Low Saxon. I have decided to throw all of Dutch Low Saxon into Macro-German, as this seems to be the consensus these days. Hans says that Veluws is Franconian, but I am not so sure. My professor said Veluws is regarded as marginally Low Saxon. I am going to stick to my guns and keep Gronings in Low Saxon. A friend of mine remarked today that in some ways, “Dutch” in terms of linguistics is almost a political construct.
A very tricky language to classify was East Frisian Low Saxon. It’s clearly not Low Franconian (Dutch) but neither is it Low Saxon (Low German). So what is it? It’s really in its own category, which is something like Friso-Saxon, a Low German language with a heavy Frisian base. I put it in Low German because that is where most folks seem to be tossing it.
There have also been criticisms that my treatment was overbroad in scope. If anything, it is conservative.
There may be up to 40 separate languages within Swiss German. The Ripuarian lects are so diverse that 150 of them are different enough that they have had separate dictionaries written for them. Of those 150, about 120 of those have serious differences in lexicon, phonology and morphology. Speakers of Ripuarian frequently refer to “150 Ripuarian languages.” There are probably a number of separate languages within Tyrolean South Bavarian.
However, barring solid documentation for these apparent separate languages, it’s not reasonable to split them off yet.
The question comes up about where you split a dialect chain. Indeed, this is one of the trying questions of Linguistics. Once you get to the point where there are some dialects in Lect A that cannot talk to some dialects in Lect B, you have yourself a dialect chain. Hence, Czech and Slovak are split even though Eastern Czech can understand Western Slovak, because Western Czech can’t understand Eastern Slovak.
A commenter points out that the problem of doing this is you are going to end up with separate languages that have communicable dialects. Indeed this is true, but it’s the case in many world languages that they have dialects that communicate with dialects of neighboring tongues.
There is a dialect chain running from Belgium to Austria where each village can talk to the next. There is another dialect chain running from Portugal to Sicily. There is yet another running from about Turkey way over to Siberia.
A greater problem with dialect chains is refusing to split them at some point into separate tongues, because then you have one language with noncommunicative dialects which makes less sense than separate languages with communicative dialects.
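The chain logic can be made concrete with a minimal Python sketch. The lect names follow the Czech and Slovak example above, but every score in the table is invented for illustration, and the 90% cutoff is an assumption carried over from the rest of this discussion.

```python
from itertools import combinations

# Four lects ordered west to east along a hypothetical chain.
CHAIN = ["Western Czech", "Eastern Czech", "Western Slovak", "Eastern Slovak"]

# Invented pairwise intelligibility scores (percent). These are NOT survey data.
SCORES = {
    ("Western Czech", "Eastern Czech"): 97,
    ("Eastern Czech", "Western Slovak"): 93,
    ("Western Slovak", "Eastern Slovak"): 96,
    ("Western Czech", "Western Slovak"): 88,
    ("Eastern Czech", "Eastern Slovak"): 89,
    ("Western Czech", "Eastern Slovak"): 80,
}

THRESHOLD = 90  # assumed cutoff between "dialect" and "language"


def score(a, b):
    """Look up a score regardless of the order the pair was stored in."""
    if a == b:
        return 100
    return SCORES.get((a, b), SCORES.get((b, a)))


def neighbors_all_intelligible(chain):
    """True if every lect can talk to the next one along the chain."""
    return all(score(a, b) >= THRESHOLD for a, b in zip(chain, chain[1:]))


def unintelligible_pairs(chain):
    """All pairs, adjacent or not, that fall below the cutoff."""
    return [(a, b) for a, b in combinations(chain, 2) if score(a, b) < THRESHOLD]


print(neighbors_all_intelligible(CHAIN))
for pair in unintelligible_pairs(CHAIN):
    print(pair)
```

With these made-up numbers every neighbor can talk to the next, yet three of the six pairs fall below the cutoff. Refusing to split the chain would therefore leave one "language" containing lects that cannot communicate.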
I use the term “lect” to mean something that may be either a dialect or a language, or some speech form that we can’t figure out if it is a dialect or a language.
Finally, the perennial question of intelligibility came up. You often read that this or that lect can easily communicate with some other lect, that they are mutually intelligible, more or less mutually intelligible, etc.
It is commonly noted that, for instance, Dutch and Afrikaans are highly mutually intelligible. In fact, my investigation revealed that Dutch speakers say that they have ~80% intelligibility of Afrikaans. That’s probably about what it is.
80% is not mutually intelligible. It means separate languages. The problem with 80% intelligibility is that it is just enough lack of communication to cause what I would call “significant disruption in communication.” This gets more important as we discuss more high-level things. It is almost impossible to discuss complicated and important topics well with less than 90% intelligibility. That’s just enough disruption to throw a serious monkey wrench into things.
At the other end of the spectrum, when we are discussing, say, the weather, much lower levels of intelligibility may be tolerated and we are still able to get our point across.
Some of these determinations were made simply by intuition. On this page, you can look at many different translations, often in Low German, of a single text. Looking at that text in different lects, it becomes clear which are dialects and which are so different that they may be languages.
Let us take a look here: Hamburgisch, Ollands and Oldenburg, three Low Saxon lects:
Quick observation shows us that Hamburgs and Ollands obviously must be dialects of one tongue. Yet Oldenburg seems so different that it seems dubious that Oldenburg speakers can converse with the others at 90%+ intelligibility.
Conclusion by “simple observation” (I prefer to call it “direct observation”) was criticized as somehow unscientific. However, direct observation is a well-known scientific technique involved in the hypothesis – testing – conclusion dance of the empirical method.
Keep in mind that much of science rests simply on observation, hunches, intuition, etc. According to a popular but unconfirmed story, Francis Crick visualized the double helix structure of DNA via sheer intuition while tripping on LSD.
Sir William Jones’s famous discovery of Indo-European certainly was simply observational and intuitive also.

A Reworking of German Language Classification

Updated June 27, 2016. This post will be regularly updated for some time. Warning! This essay is very long; combining all 3 parts together, it runs to 255 pages.
This post has been broken into three parts: Part 1: Low German, Part 2: Middle German and Part 3: High German.
This classification expands German from 20 languages to 137 languages based on the criterion of intelligibility. >90% intelligibility = dialect, <90% intelligibility = language.
German is not a single language but, like Chinese and Italian, a family of languages, and the German languages have been in need of a good reclassification for some time now. Ethnologue has done an excellent job, dividing German into 20 separate languages.
However, Ethnologue’s treatment does not go nearly far enough, as they themselves admit in the entry for Standard German: “Our present treatment in this edition is incomplete.” The entry for Low German itself formerly stated that LG is made up of 20-30 separate mutually unintelligible dialects, although this has been revised to “differing intelligibility, depending on distance” in the latest edition.
Hence, this treatment will attempt to expand the 20 German languages listed at present into a higher number. Many of the language divisions noted below are arbitrary, and admittedly based on more intuition than hard evidence. In many cases, intelligibility testing could clear up a lot of confusion. This treatment, like my prior treatment of Chinese, is best seen as a series of often very tentative hypotheses rather than a set of conclusions.
The classification scheme (for instance, the decision to include Low Saxon as a part of Macro-German rather than a part of Macro-Dutch) is fairly arbitrary and is not the purpose of this treatment, which deals mainly with intelligibility. This treatment makes no statements about classification, generally just following Wikipedia and Ethnologue. There are others doing major work on classification, and I will leave that up to them.

The Frankish Kingdom and West Europe in general, probably around 700-900 or so. The Frankish Kingdom gave rise to the German, Dutch and French languages.

So far, this classification expands German from 20 separate languages to 142 separate languages. It is incomplete, and it is also a pilot study intended to spur further research, analysis and especially evidence-based criticism.
Criticism is welcome, as long as it is rational and evidence-based. Keep in mind that we have valid intelligibility data for a lot of these languages, so wild claims of widespread intelligibility are likely to be ignored. Further splitting is certainly warranted, and lumping may be too. Both will require evidence before proceeding.
Method: Literature and reports were examined to determine the intelligibility of the various dialects of German. Native speakers of various lects were also interviewed, and the results of scientific intelligibility testing were examined. There was an appeal to authority – if states or the ISO recognized that a lect was a language, this determination was simply accepted.
>90% intelligibility was considered to be a dialect of German. <90% intelligibility was considered to be a separate language from Standard German.
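The rule above amounts to a single cutoff, which can be sketched in a few lines of Python. The function name is made up for this illustration, and since the rule as stated does not say what to do with a score of exactly 90%, the sketch assumes that 90% counts as a dialect.

```python
THRESHOLD = 90.0  # percent; the cutoff used in this classification


def classify(intelligibility_pct: float) -> str:
    """Label a lect by its intelligibility with Standard German.

    Assumption: a score of exactly 90% is treated as a dialect,
    because the stated rule leaves that boundary case open.
    """
    if not 0 <= intelligibility_pct <= 100:
        raise ValueError("intelligibility must be a percentage between 0 and 100")
    return "dialect" if intelligibility_pct >= THRESHOLD else "separate language"


print(classify(95))  # dialect
print(classify(80))  # separate language
```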
The emphasis was on intelligibility rather than structural factors. Certain sociolinguistic factors also went into the calculation, but their use was minimized. Overtly political argumentation was ignored.
This piece may be seen as a companion piece to my other similar pieces. A reclassification of Chinese expands Chinese from 14 languages to 343 languages. A reclassification of Catalan expands it from 1 language to 2 languages. A reclassification of Occitan expands it from 6 languages to 12 languages. A reclassification of Dutch expands it from 15 to 30 languages.
As far as my qualifications for writing this, I have a Masters Degree in Linguistics, and I have been employed as a linguist for an American Indian tribe where I created an alphabet, ran the language program, worked on a dictionary and phrasebook and did fieldwork with native speakers.
German, like Chinese, is a pluricentric language, with a standard version and many typically mutually unintelligible major dialects surrounding it. Interdialectal comprehension is achieved via the use of Standard German.
Hence, the intelligibility estimates by the ignorant are going to be biased. What these people mean when they say that everyone in Germany can understand each other is that they can when they speak Standard German to each other. However, there are still a few older folks in Germany who cannot speak Standard German and can only speak their regional form of German.
There are 27 main German dialect families, and all are considered to be separate languages.
The German dialects exist as a dialect chain where dialects are normally intelligible to the dialect regions next door, but not to those more distant. At the same time, it is frequently stated that the major German dialects are not mutually intelligible. This makes delineating languages from dialects quite difficult and is why intelligibility testing is needed.
Most German “dialects” have low intelligibility (below 90%) to speakers of Standard German, because they are quite divergent and hence hard for a Standard German speaker to understand. There is a strong suggestion that all of the strong forms of the regional lects are not intelligible to a speaker of Standard German.
Germany is awash in dialects. There are over 4000 (!) different dialect groups within Low German alone, and there are 150 dialects in Ripuarian Franconian that were different enough to have dictionaries written for them.
In addition to not being intelligible with Standard German, the major German dialects are in general not mutually intelligible with each other either. Inside of that, there is the even more alarming suggestion that many of the major dialects are so diverse that they are not even completely intelligible among themselves.
A graph of the major German languages is here, and an even better one is here.
A map of the major German dialects by their German names.

Separate languages or suspected separate languages are bolded below. Dialects or extinct languages are generally italicized. Macrolanguages that do not deserve separate language designation are generally printed in normal typeface. All languages and dialects are spoken in Germany unless otherwise noted.
Languages or dialects marked by an asterisk were definitely full languages 50 years ago, but whether they still are today is less certain. Some may still be languages, others may have dwindled to dialects and others may have disappeared. 50 years ago, those languages were still alive and well and probably even being taught to children. Most or all still have speakers, though the youngest speakers may be over 50 in some cases. These starred lects are very tentative additions to this classification.
The German dialects as they existed a while back, possibly about 150 years ago. This situation was still extant around World War 2. Note all of the isolated German speakers in the Slavic lands to the east.

This research takes a lot of time, and I do not get paid anything for it. If you think this website is valuable to you, please consider a contribution to support more of this valuable research.

Response To Mike Campbell on Chinese Language Classification

An autodidact named Mike Campbell has issued a long critique of my Chinese language classification.
There are problems with his analysis.
First of all, Campbell says we need to defer to the Chinese on what is a dialect and what is a language. But top Sinologists in the West are saying that the Chinese are falling down on the job and not working according to the modern scientific definition of what is a language and what is a dialect.
The Chinese linguists operate, like Chinese medicine, according to a completely different format that is pretty much at odds with the one used in the West and in much of the rest of the world.
One element of this format is the fangyan. A fangyan has many meanings, but in Chinese it tends to mean “dialect,” or better yet, “topolect.” It also tends to mean the speech form of a given county. But the Chinese definition of the word “dialect” differs radically from the definition used by linguists elsewhere in the world. For one thing, questions of intelligibility with other lects are left out of the definition of fangyan.
Chinese linguists also use hua, which means something like “speech.” This tends to be more expansive than fangyan, but at the same time it can occur down to the level of dialect. Examples include Putonghua, Shanghaihua, Beijinghua, etc., but also Pinghua and Tuhua. It tends to be geographically based: the speech of a particular geographical location, though that location can be expansive or very restricted. But this is not the case with Putonghua, which is just “average speech” and is spoken all over China.
The third category is yu. Yu is probably the category that Western linguists would most commonly associate with “language” or even “language family.” Yu only refers to separate languages within Chinese. Outside Chinese, the word wen tends to be used. Examples are Wuyu, Minyu, Huiyu, etc.
No one seems to quite know exactly what the Chinese classification is at any given time.
According to Campbell, we must not do anything until the Chinese act first, but they only make a new language maybe once every few years, and they are failing even at that.
Campbell states that Scots and Bavarian are dialects, not languages. He says that Scots is a dialect of English and Bavarian is a dialect of German. However, Ethnologue says that Scots is a separate language and so is Bavarian. The intelligibility of Bavarian and German is only 40%. I lack figures for Scots, but clearly intelligibility is lower than 90%.
Ethnologue is run by SIL. SIL has been granted the task of assigning all of the new ISO numbers. An ISO number means that a lect has been officially recognized by the world linguistic community as a separate language. So SIL are the linguistic scientists whom the world community has given the task of deciding what is a language and what is not. Campbell is saying that SIL does not know what they are talking about.
Campbell states that mutual intelligibility cannot be determined by talking to speakers and simply asking them whether or not they can understand “those people over there.”
According to Campbell, this is inaccurate. He says the only way to determine intelligibility is through scientific testing methods that measure percentage similarity in phonology, lexicon, morphology, syntax, etc. He also says that tonal differences are irrelevant for Chinese, because differences in tones do not impede communication, but I would beg to differ on that. Chinese speakers have told me that closely related lects with much different tones can be very difficult to understand, at least at first.
On Ethnologue’s Mexico page, extensive tests have been done on various lects spoken in small villages determining intelligibility between one lect and another. Intelligibility testing is commonly done by simply sitting a speaker of Lect A down in front of a recorded corpus of Lect B and seeing how much they can understand.
Campbell says that intelligibility testing on human informants is inherently erroneous because as speakers of Close Lect A hear more and more of Close Lect B, they can understand it over a period of time (the exposure factor). This is the problem of interdialectal learning.
Interdialectal learning (the tendency of closely related lects to hear each others’ lects and quickly learn to speak them and hence muddy the waters of intelligibility), trumpeted by Campbell as a reason that intelligibility testing cannot be done on human informants, is regarded by SIL as different from inherent intelligibility. Inherent intelligibility is best regarded as a test of the ability to use the mother tongue.
In other words, when two lects are said to be “inherently unintelligible” this appears to be referring to “virgin” speakers who have not yet had the opportunity to learn each other’s dialects.
Similarly, members of Lect A may simply be bilingual in Lect B, which also invalidates intelligibility testing. However, measures have already been developed to determine bilingualism and the degree of it. A favorite one is SLOPE. SRT is also used in bilingualism testing. Like other intelligibility testing instruments, they have been subjected to tests for reliability and validity over the years.
Further, testing has evolved to the point where we can begin to ferret out bilingualism from inherent intelligibility. In Casad 1974 the author describes testing done on speakers of Mazatec, a Mexican Indian language.
Intelligibility testing was done to see how well they understood Huautla, a related language. Three female speakers had scores in the 50-60% range, and three males had scores in the 90-100% range. Huautla is a local market language that is learned as a second language by many non-Huautla in the surrounding area. I would gather that 55% represents true inherent intelligibility and the 95% speakers represent practiced bilinguals.
At any rate, in the survey, the figures were averaged together so that Mazatec speakers had 76% intelligibility with Huautla and Mazatec and Huautla were said to be separate languages.
Campbell also throws out a red herring in the notion that certain members of a group may simply refuse to hear the language of another group and insist that they do not understand it. Although existent, this problem has little relevance in intelligibility testing. SIL does testing with cross sections of communities.
Furthermore, SIL notes that intelligibility is typically distributed evenly across a community with regard to sex, class and age.
The SD’s for inherent intelligibility in a community are narrow, less than 15%, whereas the SD’s for bilingualism are much higher. This is because in the case of bilingualism, communities differ. Some feel a strong need to learn the other language, others feel no need at all. Further, members differ in their access to an opportunity to learn the other language, even though they may wish to learn it.
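The point about spread can be illustrated with a short Python sketch. The scores below are invented to mimic the shape of the Huautla example (one cluster in the 50s, one in the 90s); they are not Casad's data. The 15% cutoff comes from the figure quoted above, and treating it as a hard flag is an assumption made only for this illustration.

```python
from statistics import mean, stdev

SD_CUTOFF = 15.0  # percent; spread above this is taken to hint at bilingualism


def summarize(scores):
    """Return the mean, the sample standard deviation, and a bilingualism flag."""
    sd = stdev(scores)
    return {
        "mean": mean(scores),
        "sd": sd,
        "suspect_bilingualism": sd >= SD_CUTOFF,
    }


# Hypothetical community with two clusters of test scores.
mixed = [50, 55, 60, 90, 95, 100]
# Hypothetical community where everyone scores about the same.
uniform = [52, 55, 58, 54, 56, 55]

print(summarize(mixed))
print(summarize(uniform))
```

The average of the mixed group lands in the mid-70s, which describes none of the actual speakers. The wide spread is what signals that two different things, inherent intelligibility and learned bilingualism, have been averaged together.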
This should throw out the notion that females, the aged, the young or the old, the wealthy or the poor, will automatically give us false data on intelligibility.
Campbell hints that intelligibility is poorly defined. However, SIL has listed a hierarchy of intelligibility. SIL says that intelligibility below 70% is “unintelligible” and intelligibility over 90% is “adequately intelligible” (this usually conforms to our ideas of a dialect). Between 71-89% is what SIL calls “marginally intelligible.” Lately, SIL throws most lects with under 90% intelligibility into separate languages.
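As a rough sketch, the three bands described above can be written as a small Python function. The band labels are the ones quoted above; the function name is hypothetical, and the handling of scores of exactly 70% and 90%, which the stated ranges leave unassigned, is an assumption.

```python
def sil_band(intelligibility_pct: float) -> str:
    """Map a percentage onto the three bands described in the text.

    Assumptions: exactly 90% is counted as adequately intelligible and
    exactly 70% as marginally intelligible, since the quoted ranges
    ("below 70", "71-89", "over 90") do not cover those two values.
    """
    if not 0 <= intelligibility_pct <= 100:
        raise ValueError("intelligibility must be a percentage between 0 and 100")
    if intelligibility_pct >= 90:
        return "adequately intelligible"
    if intelligibility_pct < 70:
        return "unintelligible"
    return "marginally intelligible"


for pct in (95, 80, 55):
    print(pct, sil_band(pct))
```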
Campbell recommends throwing out all intelligibility testing with informants as inherently inaccurate and focusing instead on measures of language similarity.
However, SIL notes that linguistic similarity is not an adequate single predictor of intelligibility. For instance, testing in the Philippines revealed pairs of lects with vocabulary similarity of 52, 66, 72 and 74% which had over 90% intelligibility (were inherently intelligible). Over 80% vocabulary similarity for lect pairs resulted in several cases of inherent intelligibility. So lexical similarity is not an adequate measure at all for measuring intelligibility.
In testing of Polynesian, Siouan and Buang, it was found that the higher the level of lexical similarity up to a certain point, the lower the intelligibility scores were. This is counterintuitive, but it shows once again that lexical similarity is a poor measure.
Morris Swadesh was the founder of lexicostatistics, the study of lexical similarity. Lexicostatistics has its uses, but distinguishing between closely related languages and dialects is apparently not one of them.
This myth seems to be dying a hard death. Robert Longacre and Sarah Gudschinsky were involved in long debates with Swadesh about the validity of lexical similarity measures, and they seem to have been proven right. The latest findings calculate that any study that uses lexical similarity alone to determine intelligibility of lects has a 4.5-1 chance of failing to do so with any reliability.
Word lists still have their uses. Where word lists show similarities between lects below 60%, odds are that we are dealing with two separate languages, and there is no need to do any further intelligibility testing. And they have obvious uses in historical linguistics and in determining genetic relationships between languages.
Vocabulary similarity below 67%, though, typically reveals intelligibility estimates below 60%. Intelligibility below 60% is inadequate for all but the very simplest communication. Before any kind of even slightly complex or revealing messages can be conveyed, intelligibility usually needs to be over 85%. Casad found that 90% intelligibility on a narrative test was necessary before one could move to more complex kinds of communication. Here once again we get into the dialects.
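The screening role of word lists can be sketched in Python as follows. The function names are made up for this illustration, and the input is assumed to be a list of yes/no judgments, one per word-list item, recording whether the two lects' words were judged similar. Only the 60% cutoff comes from the discussion above.

```python
def lexical_similarity(judgments):
    """Percentage of word-list items judged similar between two lects."""
    if not judgments:
        raise ValueError("the word list is empty")
    return 100 * sum(judgments) / len(judgments)


def triage(similarity_pct):
    """Use word-list similarity only as a first screen, never as the final answer."""
    if similarity_pct < 60:
        return "separate languages; no intelligibility testing needed"
    return "undetermined; intelligibility testing needed"


# Hypothetical 100-item word list with 55 items judged similar.
low = lexical_similarity([True] * 55 + [False] * 45)
print(low, triage(low))

# A higher figure settles nothing either way, so testing is still required.
print(74, triage(74))
```

The asymmetry is deliberate: a low figure is enough to call two lects separate languages, while a high figure proves nothing about intelligibility.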
Intelligibility is usually asymmetrical. In other words, Lect A can understand 80% of Lect B, but Lect B can only understand 70% of Lect A. There are arguments about the reasons for this, but one suggestion is that higher figures result from some sort of bilingual learning.
Campbell also points out that it is not uncommon that people speaking the same language cannot always understand each other. He asks how often we have heard a fellow English speaker of the same dialect say something and we did not catch what they were saying for some reason or other. The implication is that we need to throw out all testing with informants due to this.
SIL has actually examined this, and they often include a test called “home-town” in which people are presented with narratives within their own dialect and an intelligibility score is given for that. It is true that sometimes this is lower than 100%, but it is typically not much lower. Nevertheless, using the “home-town factors” of Lects A and B as controls in factor analysis helps greatly when moving on to actual intelligibility between Lect A and Lect B.
One thing to do is to throw out all sentences or questions that score less than 100% on home-town, since if the speakers can’t even understand these sentences well when their own people speak them, how can we measure how well they understand them when speakers of other lects speak them?
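Here is a minimal Python sketch of that filtering step. The item labels and all of the scores are hypothetical, and the function name is invented for this illustration; the only thing taken from the text is the idea of dropping items that score under 100% on the home-town test.

```python
from statistics import mean


def filtered_intelligibility(hometown, crosslect):
    """Average cross-lect scores over items understood perfectly at home.

    hometown:  item -> percent of speakers who understood it in their own lect
    crosslect: item -> percent of the same speakers who understood it in the other lect
    """
    kept = [item for item, pct in hometown.items() if pct == 100]
    if not kept:
        raise ValueError("no item scored 100% on the home-town test")
    return mean(crosslect[item] for item in kept)


# Hypothetical test with four questions.
hometown = {"q1": 100, "q2": 100, "q3": 80, "q4": 100}
crosslect = {"q1": 90, "q2": 70, "q3": 40, "q4": 80}

print(mean(crosslect.values()))                      # raw average over all items
print(filtered_intelligibility(hometown, crosslect))  # average after dropping q3
```

In this made-up case the unfiltered average is dragged down by a question that the speakers could not fully follow even in their own lect.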
Campbell suggests that there are no tests available to use on human informants that pass the smell test of empiricism. This is not the case.
One test, the Sentence Repetition Test (SRT), has been used for decades, subjected to many papers and studies, and criticized and modified in many ways.
In the case of SRT, testing group members individually has been shown to be superior to testing them in groups. The reason is that when you do intelligibility testing in a group of, say, eight people, you can run into a strong personality or high-ranking male in that group who might say he understands much more than he really does for some reason or another, possibly to show off. The other less dominant group members then follow his lead and give falsely high readings on the intelligibility test.
Many linguists, led by SIL, have been leading the way in intelligibility testing for decades now. Some of the top figures in this subfield are the couple Joseph and Barbara Grimes of SIL. Joseph Grimes is a retired linguistics professor from Cornell.
In addition, a number of computer programs have been created that help the researcher to test intelligibility.
Another charge, that intelligibility testing lacks adequate controls, has been shown to be false. Bias in both experimenter and subject has been shown to be a problem, as is the case in most or all science, and measures have been undertaken to deal with it.
The notion that this subfield of Linguistics, intelligibility testing, is unscientific should be laid to rest.
Ethnologue seems to place tremendous importance on mutual intelligibility, however defined. Mutually unintelligible lects are assumed to be separate languages by Ethnologue. Their criterion for splitting dialects off into languages seems to be 90%. Below 90%, separate languages. Above 90%, dialects of a single language.
In conclusion, Mr. Campbell’s principal contentions in his critique are all incorrect.
First, he suggests that the very concept of mutual intelligibility between lects is impossible to define or prove. SIL has shown that the concept can be defined and tested by reliable instruments.
Second, he says that the use of human informants in mutual intelligibility testing is so prone to error that it cannot guarantee satisfactory results. This is not the case. SIL has proven, through decades of testing, that mutual intelligibility is best measured, or possibly can only be reliably measured, through intelligibility tests with human informants.
Third, he throws up a number of red herrings that supposedly prove the inherent unreliability of human informants in intelligibility testing. All of these are shown to be the very red herrings that I claim they are. It is true that unrecognized bilingualism is a problem, but it can often be ferreted out.
Fourth, he says that the only way to reliably test for intelligibility is to compare lects via tones, phonology, morphology, syntax and lexicon. This is an extremely complicated process utilizing math and computer programs and can only be undertaken by practiced linguists. In truth, such elaborate testing, while interesting, is entirely unnecessary.
Fifth, he suggests that any Western reformulations of Chinese language classification need to first defer to the Chinese. The problem here is that the Chinese have completely fallen down on the job. We cannot defer to the Chinese without upsetting our entire system of language classification. The Chinese are entitled to their system, but it is at odds with that used by the rest of the world.


Casad, Eugene H. 1974. Dialect Intelligibility Testing. Summer Institute of Linguistics Publications in Linguistics and Related Fields, 38. Norman, OK: Summer Institute of Linguistics of the University of Oklahoma.
Casad, Eugene H. 1992. “State of the Art: Dialect Survey Fifteen Years Later.” In Eugene H. Casad (ed.), Windows on Bilingualism, 147-58. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Barbara F. 1992. “Notes on Oral Proficiency Testing (SLOPE).” In Eugene H. Casad (ed.), Windows on Bilingualism, 53-60. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Joseph E. 1992. “Calibrating Sentence Repetition Tests.” In Eugene H. Casad (ed.), Windows on Bilingualism, 73-85. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.
Grimes, Joseph E. 1992. “Correlations Between Vocabulary Similarity and Intelligibility.” In Eugene H. Casad (ed.), Windows on Bilingualism, 17-32. Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110. Dallas, TX: Summer Institute of Linguistics and the University of Texas at Arlington.

The Classification of the Vietnamese Language

One of the reasons that I am doing this post is that one of my commenters asked me a while back to do a post on the theories of long-range comparison like Joseph Greenberg’s and how well they hold up. That will have to wait for another day, but for now, I can at least show you how some principles of Historical Linguistics, a subfield that I know a thing or two about, work in practice. I will keep this post pretty non-technical, so most of you ought to be able to figure out what is going on.
Let us begin by looking at some proposals about the classification of Vietnamese.
The Vietnamese language has been subject to a great deal of speculation regarding its classification. At the moment, it is in the Mon-Khmer or Austroasiatic family with Khmer, Mon, Muong, Wa, Palaung, Nicobarese, Khmu, Munda, Santali, Pnar, Khasi, Temiar, and some others. The family ranges through Vietnam, Cambodia, Laos, Thailand, Malaysia, Burma, China, and over into Northeastern India.
It is traditionally divided into Mon-Khmer and Munda branches. Here is Ethnologue’s split, and here are some other ways of dividing up the family.
The homeland of the Austroasiatics was probably in China, in Yunnan, Southwest China. They moved down from China probably around 5,000 years ago. Some of the most ancient Austroasiatics are probably the Senoi people, who came down from China into Malaysia about 4,000 years ago. Others put the time frame at about 4-8,000 YBP (years before present).
A major fraud has been perpetrated lately based on Senoi Dream Therapy. I discussed it on the old blog, and you can Google it if you are interested. In Anthropology classes we learned all about these fascinating Senoi people, who based their lives around their dreams. Turns out most of the fieldwork was poor to fraudulent like Margaret Mead’s unfortunate sojourn in the South Pacific.
The Senoi resemble Veddas of India, so it is probably true that they are ancient people.  Also, their skulls have Australoid features. In hair, they mostly have wavy hair (like Veddoids), a few have straight hair (like Mongoloids) and a scattering have woolly hair (like Negritos). Bottom line is that ancient Austroasiatics were probably Australoid types who resembled what the Senoi look like today.
There has long been a line arguing that the Vietnamese language is related to Sino-Tibetan (the family that Chinese is a part of). Even those who deny this acknowledge that there is a tremendous amount of borrowing from Chinese (especially Cantonese) to Vietnamese. This level of borrowing so long ago makes historical linguistics a difficult field.
Here is an excellent piece by a man who has done a tremendous amount of work detailing his case for Vietnamese as a Sino-Tibetan language. It’s not for the amateur, but if you want to dip into it, go ahead. I spent some time there, and after a while, I was convinced that Vietnamese was indeed a Sino-Tibetan language. One of the things that convinced me is that if borrowing was involved, seldom have I seen such a case for such a huge amount of borrowing, in particular of basic vocabulary. I figured the  case was sealed.
Not so fast now.
Looking again, and reading some of Joseph Greenberg’s work on the subject, I am now convinced otherwise. There is a serious problem with the cognates between Vietnamese and Chinese, of which there are a tremendous number.
This problem is somewhat complex, but I will try to simplify it. Briefly, if Vietnamese is indeed related to Sino-Tibetan, its cognates should be not only with Chinese, but with other members of Sino-Tibetan also. In other words, we should find cognates with Tibetan, Naga, Naxi, Tujia, Karen, Lolo, Kuki, Nung, Jingpho, Chin, Lepcha, etc. We should also find cognates with those languages, where we do not find them in Chinese. That’s a little complicated, so I will let you think about it a bit.
Further, the comparisons between Chinese and Vietnamese should be variable. Some should look quite close, while others should look much more distant.
So there’s a problem with the Vietnamese as ST theory.
The cognates look like Chinese.
Problem is, they look too much like Chinese. They look more like Chinese than they should in a genetic relationship. Further, they look like Chinese and only Chinese. Looking for relationships in S-T outside of Chinese, we find few if any.
That’s a dead ringer for borrowing from Chinese to Vietnamese. If it’s not clear to you how that is, think about it a bit.
Looking at Mon-Khmer, the case is not so open and shut. There seem to be more cognates with Chinese than with Mon-Khmer. So many more that the case for Vietnamese as AA looks almost silly, and you wonder how anyone came up with it.
But let us look again. The cognates between AA and Vietnamese are not just with its immediate neighbors like Cambodian and Khmu but with languages way off in Eastern India like Munda and Santali. There are words that are found only in the Munda branch in one or two obscure languages that somehow show up again as cognates in Vietnamese.
Now tell me how Vietnamese borrowed ancient basic vocabulary from some obscure Munda tongue way over in Northeast India? It did not. How did those words end up in some unheard of NE Indian tongue and also in Vietnamese? Simple. They both descended long ago from a common ancestor. This is Historical Linguistics.
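For readers who like to see reasoning spelled out, the argument of the last several paragraphs can be reduced to a toy Python sketch. Everything in it is a simplification: the counts are invented placeholders rather than real cognate tallies, the grouping of Sino-Tibetan into "Chinese" and "other Sino-Tibetan" is a coarse label chosen for this illustration, and real work weighs the kind of vocabulary involved, not just where it turns up.

```python
def groups_with_matches(match_counts, group_of, minimum=1):
    """Return the set of groups in which at least `minimum` matches were found."""
    return {group_of[lang] for lang, n in match_counts.items() if n >= minimum}


def pattern(match_counts, group_of):
    """Crude heuristic: matches confined to one group point toward borrowing."""
    groups = groups_with_matches(match_counts, group_of)
    if not groups:
        return "no evidence either way"
    if len(groups) == 1:
        return "consistent with borrowing from a single source"
    return "consistent with inheritance from a common ancestor"


# Invented placeholder counts of lookalike words shared with Vietnamese.
st_counts = {"Chinese": 200, "Tibetan": 0, "Karen": 0, "Jingpho": 0}
st_groups = {
    "Chinese": "Chinese",
    "Tibetan": "other Sino-Tibetan",
    "Karen": "other Sino-Tibetan",
    "Jingpho": "other Sino-Tibetan",
}

aa_counts = {"Khmer": 40, "Khmu": 25, "Santali": 10}
aa_groups = {"Khmer": "Mon-Khmer", "Khmu": "Mon-Khmer", "Santali": "Munda"}

print("Sino-Tibetan:", pattern(st_counts, st_groups))
print("Austroasiatic:", pattern(aa_counts, aa_groups))
```

Note that the larger raw count on the Chinese side does not change the verdict. What matters in this sketch, as in the argument above, is how widely the matches are distributed across the family.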
The concepts I have dealt with here are not easy for the non-specialist to figure out, but most smart people can probably get a grasp on them.
A different subject is the deep relationships of AA. Is AA related to any other languages? I leave that as an open question now, though there does appear to be a good case for AA being related to Austronesian.
One good piece of evidence is the obscure AA languages found in the Nicobar Islands off the coast of Thailand. Somehow, we see quite a few cognates in Nicobarese with Austronesian. We do not see them in any other branches of AA, only in Nicobarese. This seems odd, and it’s hard to make a case for borrowing. On the other hand, why cognates in Nicobarese and only in Nicobarese?
Truth is there are some cognates outside of Nicobarese but not a whole lot. In historical linguistics, one thing we look at is morphology. Those are parts of words, like the -s plural ending in English.
In both AA and Austronesian, we have funny particles called infixes. Those are what in English we might call prefixes or suffixes, except they are stuck in the middle of the word instead of at the end or the beginning. So, in English, we have pre- as a prefix meaning “before” and -er as a suffix meaning “thing that does the action of the verb.” So pre-destination means that our lives are figured out before we are even born. Comput-er and print-er are two objects, one that computes and the other that prints.
If we had infixes instead, pre-destination would look something like destin-pre-ation and comput-er and print-er would look something like com-er-pute and prin-er-t.
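For readers who like to see the mechanics, here is a minimal sketch in Python of what infixing does to a word. The function name `add_infix` and the character positions are my own illustrative choices, not anything from the linguistic literature. The English forms are the made-up ones from above; the last line uses the real Tagalog infix -um- (sulat "write" becomes sumulat).

```python
def add_infix(stem: str, affix: str, position: int) -> str:
    """Insert `affix` inside `stem` after the first `position` characters.

    Hypothetical helper for illustration only: real infixation is governed
    by the sound structure of the word, not by a raw character count.
    """
    if not 0 < position < len(stem):
        raise ValueError("an infix must land strictly inside the stem")
    return stem[:position] + affix + stem[position:]


# The invented English forms from the text (hyphens kept for readability).
print(add_infix("destination", "-pre-", 6))  # destin-pre-ation
print(add_infix("compute", "-er-", 3))       # com-er-pute
print(add_infix("print", "-er-", 4))         # prin-er-t

# A real Austronesian example: Tagalog -um- goes after the first consonant.
print(add_infix("sulat", "um", 1))           # sumulat
```

The point of the sketch is only that an infix lands inside the stem, which is part of why such elements are so rarely borrowed from one language to another.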
Anyway, there are some fairly obscure infixes that show up not only in some isolated languages in AA but also in far-flung Austronesian languages in, say, the Philippines. Ever heard of the borrowing of an infix? Neither have I. So were those infixes borrowed, and what are they doing in languages as far apart as Thailand and the Philippines, with none in between? Because they got borrowed? When? How? Forget it.
Bottom line is that said borrowing did not happen. So what are those infix cognates doing there? Probably ancient particles left over from a common language that derived both Austronesian and AA, probably spoken somewhere in SW China maybe 9,000 years ago or more.
Why is this sort of long-range comparison so hard? For one thing, because after 9,000 years or more, there are hardly any cognates left anymore, due to the fact of language change. Languages change and tend to change at a certain rate.
After X thousand years, so much change has taken place that even if two languages were once “sprung from a common source,” in the famous words of Sir William Jones in his epochal lecture to the Asiatic Society in Calcutta on February 2, 1786, there is almost nothing, or actually nothing, left to show of that relationship. Any common words have become so mangled by time that they don’t look much or anything alike anymore.
So are AA and Austronesian related? I think so, but I suppose it’s best to say that it has not been proven yet. This thesis is part of a larger long-range concept known as “Austric.” Paul Benedict, a great scholar, was one of the champions of this. Austric is normally made up of AA, Austronesian, Tai-Kadai (the Thai language and its relatives) and Hmong-Mien (the Hmong and Mien languages). Based on genetics, the depth of Austric may be as deep as 30,000 years, so proving it is going to be a tall order indeed.
What do I think?
I think Tai-Kadai and Austronesian are proven to be related (more on that later). AA and Austronesian seem to be related also, with a lesser depth of proof. Hmong-Mien seems to be related to Sino-Tibetan, not Austric.
The case for Vietnamese being related to S-T is still very interesting, and I still have an open mind about it.
All of these discussions are hotly controversial, and mentioning them in linguistics circles is likely to set tempers flaring.


Author and date unknown, What Makes Vietnamese So Chinese? An Introduction to Sinitic-Vietnamese Studies.

The Place of Mandarin in Sinitic

In the comments, James Schipper suggests that Mandarin is to Sinitic what German and Russian are to Germanic and Slavic. He also offers that most Sinitic speakers also speak Mandarin and makes a comparison with Welsh and English and Frisian and Dutch, where every Welsh speaker speaks English and every Frisian speaker speaks Dutch, and each one would rather write in English or Dutch than in Welsh or Frisian.
My comments:
English and German have 60% lexical similarity. English and French have about 25% and English has about 29% with Russian (I need to check on that one!). I need to look at some charts here.
It’s not uncommon for Chinese lects to have 5-30% lexical similarity. Further, there are deep differences in tones, and even grammar and structure. Even the pronouns can differ. But clearly they are all related to one another, and they all derived from Chinese.
So yes, your analogy with Russian and German as super-languages on top of their families is correct, but it is important to note the vast differences in the lects. It was said that no one could understand Chairman Mao’s dialect, Xiang Nan (Mandarin dialect). Apparently his secretary could understand him, but few others could. I’m not sure how he got his points across.
Further, at this point most speakers of the Sinitic languages probably speak Putonghua, which is the Standard Mandarin. It’s a standard the same way that High German is Standard German and Standard Italian is the standard for that language. However, overseas, many do not speak Putonghua, and in the Cantonese area, I believe many still do not speak Putonghua.
English is a Germanic language. Look at the vocabulary – closest language is Frisian with 64%. Dutch is 62% and German is 60%. French is 25%. English is clearly a Germanic language. There are cases similar to the Latin layering in English among the Chinese languages. Some of them have heavy layers of non-Sinitic tongues like Zhuang or Hmong.
Besides Putonghua, you are correct that the vast majority of Sinitic speakers are native speakers of some kind of Mandarin.
I believe that a lot of the older folks do not have very good Mandarin and may be monolinguals of their Sinitic tongue, but I’m not sure. The government has been pushing Putonghua very hard for the past decade or so, almost too hard. It’s been killing the smaller tongues. So it’s not quite the same way with Frisian and Welsh yet. I believe it’s pretty common in the South to find Cantonese speakers who don’t speak Mandarin, and it’s for sure the case overseas.
As far as writing, I don’t believe it’s a problem. An ideographic system was perfect for Chinese as it was the one way that all of the speakers of the various Chinese lects could communicate. My father was in China in 1946 and he said that the rickshaw drivers often could not understand each other, but they could all write Chinese, so they would communicate by writing notes.
All Chinese can write to each other, no matter what language they speak, assuming they are literate. A decade ago in a college in Henan, a professor said that the students would come to the college from all over the province and for the first month would communicate by writing notes to each other, so they all wrote a common language. In that province, every county has its own language, and there are even separate languages within counties. It took them about a month or so before they could start working out each other’s languages.
Some comment that the Chinese languages are like a Cockney accent of English. On a website, a commenter said that that’s not true. He said he can understand Cockney, but they had a speaker of an Anhui Mandarin lect as a professor at the university and no one could understand what he was talking about. So it’s quite common for the various Chinese lects to be pretty much incomprehensible to each other.
There are other comments around the Net that say that the Chinese lects are close enough to pick them up if you spend a bit of time there. That’s not really true. The Chinese lects are often as different from each other as English and German. Now suppose you are an English speaker and you go to Germany. Are you going to “just pick up” German really fast? Forget it. I mean, if you stay there 3 years, maybe. Maybe! Someone else compared the differences between Chinese lects to the gulf between English and Irish. That may be too distant, but it may also be correct.
Differences between the lects encompass tones, grammar and lexicon. All of them boil down to intelligibility. The major Chinese lects regularly score around 50-60% intelligibility. That is pretty bad and certainly does not qualify them as dialects. A dialect should have 90% intelligibility or more.
This is especially true in the center and south of the country. In Anhui, Fujian, Henan, Hunan, Jiangsu and Zhejiang there is an incredible diversity of tongues. It is said in Fujian that every 3 miles the culture changes and every 6 miles the language changes.
In these parts of China, there are lots of mountains and it is very rural. Many people never left their home village to go over the mountain to talk to the people over there, so a multitude of tongues arose. I understand that in this part of China there are even incomprehensible tongues inside major cities where the downtowners can’t understand the suburbs.

A Reworking of Chinese Language Classification

Updated December 3, 2014. This post runs to 112 pages so far. On March 6, 2011, Sinologist Victor Mair took on the question of Mutual Intelligibility of Sinitic Languages.
The Chinese languages have undergone a lot of reclassification lately (Mair 1991), from one Chinese language a couple of decades ago up to 14 Chinese languages today according to the latest Ethnologue.
However, Jerry Norman, one of the world’s top experts on Chinese, says that based on mutual intelligibility, there are 350-400 separate languages within Chinese (Mair 1991). According to Gong Xun, a Sichuan Mandarin speaker in Deyang, China, by my criteria of distinguishing between language and dialect, there would be 300-400 separate languages in Fujian alone.
So far, 2,500 dialects of the Chinese language have been identified, and a number of them are separate languages.
I have been doing research on this issue recently. Based on the criteria of mutual intelligibility, I have expanded the 14 Chinese languages into 365 separate languages.
There are different ways of doing mutual intelligibility. I decided to put it at 90%, with >90% being dialect and <90% being a separate language. This is based on what appears to be Ethnologue‘s criteria for establishing the line between a dialect and a language.
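The rule above can be sketched in a few lines of Python. This is only an illustration of the cutoff, not a tool I actually used. The function name `classify` is made up, and treating exactly 90% as a dialect is an assumption on my part, since the rule as stated only covers the cases above and below 90%. The sample figures are ones quoted elsewhere in this post.

```python
DIALECT_THRESHOLD = 90.0  # percent; the cutoff stated in the text


def classify(intelligibility_pct: float) -> str:
    """Label a pair of lects from a single intelligibility score.

    Assumption: a score of exactly 90% counts as "dialect".
    """
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage from 0 to 100")
    if intelligibility_pct >= DIALECT_THRESHOLD:
        return "dialect"
    return "separate language"


# Scores quoted elsewhere in this post.
reported = {
    ("Xiamen", "Teochew"): 51,
    ("Fuzhou", "Fuqing"): 65,
    ("Jianyang", "Jian'ou"): 75,
    ("Longyan", "Taiwanese"): 85,
}

for (a, b), pct in reported.items():
    print(f"{a} / {b}: {pct}% -> {classify(pct)}")
```

All four pairs come out as separate languages under this cutoff. A single number per pair is a simplification, since intelligibility can be better in one direction than in the other.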
In the cases below where I had intelligibility data available, a number of Chinese languages had no more than 65% intelligibility between them (Cheng 1991).
Intelligibility is hard to determine. I am not interested in typological studies of lects involving either lexicon, phonology or tones, unless this can be quantified in terms of intelligibility in a scientific way (see Cheng 1991). For the most part, what I am interested in is, “Can they understand each other?”
Reasonable, fair-minded and professional comments, additions, criticisms, elaborations, presentations of evidence, etc. are highly encouraged, as long as politics and emotions are left out of it. The purpose of the classification below is more to stimulate academic interest and sprout new thinking and theory. It is not intended to be an end-all or be-all statement on the subject. In fact, it is quite the opposite.
Interested scholars, observers or speakers of Chinese languages are encouraged to contribute any knowledge that they may have to add to or criticize this data below. So far as I know, this is the first real attempt to split Chinese beyond the 14 languages elucidated by Ethnologue.
There are lapses in the data below. I mean to present this data in outline form to make it more readable.
There are also problems with the data below. In many cases, “separate language” just means that the lect is not intelligible with Putonghua. Unfortunately, I currently lack intelligibility data within the major language groups such as Gan, Xiang, Wu and the branches of Mandarin. There is probably quite a bit of lumping still to be done below. Where lects are mutually intelligible below, I have tried to lump them into one language with various dialects.
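The lumping step can be pictured as grouping lects into clusters wherever a pair clears the 90% line. Below is a rough Python sketch of that idea using a union-find structure. The lect names A through D and their scores are invented purely for illustration, and the function name `lump` is hypothetical. This is not the procedure behind the classification below, which relied on reports and judgment rather than a full table of scores.

```python
from collections import defaultdict


def lump(lects, scores, threshold=90.0):
    """Group lects into 'languages' by linking any pair at or above threshold.

    `scores` maps (lect_a, lect_b) to a percentage. Intelligibility is
    treated as symmetric here, which is a simplifying assumption.
    """
    parent = {lect: lect for lect in lects}

    def find(x):
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), pct in scores.items():
        if pct >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for lect in lects:
        groups[find(lect)].append(lect)
    return sorted(sorted(group) for group in groups.values())


# Invented example data, not real measurements.
lects = ["A", "B", "C", "D"]
scores = {("A", "B"): 95, ("B", "C"): 92, ("A", "C"): 80, ("C", "D"): 60}

print(lump(lects, scores))  # [['A', 'B', 'C'], ['D']]
```

Note that A and C end up in the same language even though they score only 80% with each other, because both are linked through B. That is the familiar dialect chain problem, and it is one reason real intelligibility testing is needed before any of these groupings can be called settled.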
It is reasonable to ask what background and expertise I have to write such a post. I have a Masters Degree in Linguistics and have been employed as a salaried linguist for a US Indian tribe. I also sit on a peer review board for a linguistics journal and will soon publish my first work in book form, a chapter in a forthcoming book on Turkic languages.
I assume it will be controversial. Keep in mind that this work is extremely tentative and should not be taken as the last word on the subject by a long shot. Some have asserted that this study claims to be “accurate and precise.”
In truth, it claims nothing of the sort. Initial studies, which is what this is, are de facto never “accurate and precise,” and you can take an extreme argument from scientific philosophy that no science is really “accurate and precise” but is simply “correct for now” or “correct until proven otherwise.”
Gan is a separate language, already identified as such. Many individual Gan lects are unintelligible to other Gan lects. In fact, it is possible that all Gan lects are unintelligible with each other, but that remains to be proven.
Outside of Gan Proper, Leping, while very diverse, is nevertheless intelligible with nearby Gan lects and with Nanchang (Campbell January 2009).
Nanchang and Anyi are apparently separate languages within Gan based on a 200 word Swadesh test (Ben Hamed 2005). Nanchang has a great deal of dialectal diversity, with several dialects covering different cities and the rural areas. Intelligibility is not known.
Jiangyu, spoken in Hubei, is very strange and at least unintelligible to Putonghua speakers, as is Huarong (evidence). Huarong is surely a separate language.
Similarly, Wanzai must surely be a separate language, as must Yichun, Ji’an, Wanan, Fuzhou, Yingtan, Leiyang, Huaining and Dongkou.
Nanchang and Anyi are within the Changjing Group of Gan, which has 15 different lects. Yingtan and Leping are members of the Yingyi Group of Gan, which has 12 lects. Jiangyu and Huarong are members of the Datong Group of Gan, which has 13 lects. Yichun is a member of the Yiliu Group of Gan, which has 11 lects. Wanzai is a member of the Yiping Group of Gan, of which it is the only member.
Leiyang is a member of the Leizi Group of Gan, which has 5 lects. Wanan is a member of the Jilian Group of Gan, of which it is the only member. Ji’an is a member of the Jicha Group of Gan, which has 15 lects. Huaining is a member of the Huaiyue Group of Gan, which has 9 lects. Fuzhou is a member of the Fuguang Group of Gan, which has 15 lects. Dongkou is a member of the Dongsui Group of Gan, which has 5 lects.
Gan has 102 separate lects in it. There are 30 million speakers of the Gan languages.
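As a quick check on that total, the group sizes given above can be added up. The short Python snippet below simply restates the figures from the preceding paragraphs. The dictionary name `gan_groups` is my own.

```python
# Lect counts per Gan group, as listed in the paragraphs above.
gan_groups = {
    "Changjing": 15,
    "Yingyi": 12,
    "Datong": 13,
    "Yiliu": 11,
    "Yiping": 1,
    "Leizi": 5,
    "Jilian": 1,
    "Jicha": 15,
    "Huaiyue": 9,
    "Fuguang": 15,
    "Dongsui": 5,
}

total_lects = sum(gan_groups.values())
print(len(gan_groups), "groups,", total_lects, "lects")  # 11 groups, 102 lects
```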
Within the Min group, Northern Min (Min Bei) and Central Min (Sanminghua) have already been identified as separate languages. There are 50 million speakers of all of the Min languages (Olson 1998). Northern Min has only 0-20% intelligibility with Min Nan.
Central Min has three lects, Shaxian, Sanming and Yongan, but we don’t know if there are languages among them. Central Min has 3.5 million speakers.
Northern Min is said to be a single language, although it has 9 separate lects. Most dialects are said to be mutually intelligible, but Jianyang and Jian’ou have only about 75% intelligibility. Northern Min has 10 million speakers.
The standard dialect of Min Dong or Eastern Min is Fuzhou.
Eastern Min has only 0-20% intelligibility with Min Nan.
Chengguan, Yangzhong and Zhongxian are separate languages, all spoken in Youxi County (Zheng 2008).
Beyond that, Eastern Min is reported to have several other mutually unintelligible languages. One of them is Fuqing, located near Fuzhou but not intelligible with it, according to Wikipedia, but others say the two are mutually intelligible, although speakers are divided on the question.
It appears that possibly Fuzhou speakers can understand Fuqing speakers better than the other way around. Fuzhou and Fuqing are about 65% intelligible in practice, and it is about the same with the rest of the Houguan Group (Ngù 2009).
Ningde, Fuding and Nanping are probably other languages in this family (evidence). Of these three, Ningde is definitely a separate language. According to George Ngù, a passionate proponent of Fuzhou, “Fuzhou is not intelligible even within its many varieties.”
It’s not clear if that applies to all of Eastern Min, but it appears that it does. Therefore, Changle, Gutian, Lianjiang, Luoyuan, Minhou, Minqing, Pingnan, Pingtan, Yongtai, Fuan, Fuding, Shouning, Xiapu, Zherong and Zhouning are all separate languages.
There are two other lects lumped in with Eastern Min. Manjiang is spoken in the central part of Taishun County, and Manhua is spoken in the eastern part of Cangnan County. Both of these names mean “barbarian speech.”
Both are probably mixtures of Southern Wu (Wenzhou etc.), Eastern Min, Northern Min, and maybe even pre-Sinitic languages. Manhua and Manjiang are not intelligible with Fuzhou. However, Manjiang has affinity with Shouning in phonology, vocabulary and grammar. Whether or not it is intelligible with Shouning is not known. Min Nan speakers who have looked at Manjiang data say that it doesn’t even look like a Sinitic language.
Manhua is best dealt with as a form of Wu. I discuss it further below under Wu.
Fuding, Fuan, Shouning, Xiapu, Zherong and Zhouning are in the Funing Group of Eastern Min, which has 6 lects.
Fuzhou, Fuqing, Chengguan, Yangzhong, Zhongxian, Ningde, Changle, Gutian, Lianjiang, Luoyuan, Minhou, Minqing, Pingnan, Pingtan, Yongtai and Nanping are in the Houguan Group of Eastern Min, which has 16 lects.
Eastern Min contains 23 separate lects.
Within Min Nan, Xiamen and Teochew are separate languages (evidence). There is even a proposal to split Xiamen, Qiongwen and Teochew into three separate languages before SIL.
Amoy, Taiwanese, Jinjiang, Zhangzhou, Tainan, Taipei, Yilan, Taichung, Quanzhou and Lufeng are part of the Xiamen group.
Jinmen is apparently a separate language, as it has poor intelligibility with Taiwanese.
A much better name for Xiamen according to the Chinese literature is Quanzhang (Campbell January 2009).
Quanzhang is a combination of Quanzhou and Zhangzhou, two of the most important dialects in the language. Xiamen has only 51% intelligibility with Teochew. Whether or not Zhangzhou and Quanzhou are intelligible in China itself is still somewhat of an open question.
Nevertheless, Quanzhou speakers in Singapore can no longer understand Taiwanese or Xiamen well, though they have partial understanding of them. They have only 30-40% intelligibility with Yilan. Nevertheless, they have good understanding of Zhangzhou. This implies that much of the understanding between at least some of the Xiamen lects was due to bilingual learning.
The Yilan dialect on Taiwan is so different that it alone has posed serious problems for the task of standardizing Taiwanese Min Nan, yet it is intelligible with the rest of Taiwanese (Campbell January 2009). Lugang is also very different but is also intelligible with Taiwanese (Campbell 2009).
There are some communication problems for Tainan speakers hearing Taipei, but it appears that they are still intelligible with each other (Campbell January 2009).
Jieyang, Raoping, Chaoyang, Shantou (Swatow) and Hailok’hong (Haklau) are lects in the Teochew Group (evidence) of Teochew. Teochew (Chaozhou) is the prestige version of Teochew. Chaoyang speakers can understand Jieyang, Raoping (evidence) and Shantou, but intelligibility is difficult with Haifeng and Lufeng. Shantou, Raoping, and Jieyang are then dialects of Chaoyang.
Zhangzhou and Quanzhou have marginal intelligibility with Teochew varieties. They are both spoken in Taipei, Taiwan. After all, Taiwanese itself is just a mixture between Zhangzhou and Quanzhou. The situation in Taipei was interesting. The dialects of the city were a mix of Zhangzhou and Quanzhou. The dialect of the center of the city was mixed between the two, with a slight Quanzhou lean to it. In Sulim (Shilin), people spoke with a dialect that heavily favored Zhangzhou. Other districts spoke a Tang’oann-type dialect, which is just Quanzhou mixed with a bit of Zhangzhou.
All these conditions are more common with the older generation because the younger generation either favors the mixed Zhangzhou-leaning “Southern” style used in the media or does not speak the language at all. Hailok’hong (Haklau) is spoken down the coast between the Teochew zone and the Hong Kong area. It has marginal intelligibility with other Teochew lects. Nevertheless, Taiwanese speakers can no longer understand the pure Quanzhou spoken in the Chinese city of that name.
On the other hand, Chaoyang itself is unintelligible to some other Teochew lects. Shantou speakers cannot understand some of the other Teochew lects, and speakers of other lects often find Shantou hard to understand.
Sources report that Teochew lects can vary greatly in the pronunciation of even single words, and the tones can be quite different too.
There are claims that Teochew is intelligible with Zhangzhou and Quanzhou, but these claims appear to be incorrect (see above). That might make some sense, as Teochew speakers are a group of Min speakers who broke off from Zhangzhou Min about 600-1,100 years ago. They moved down to northeast Guangdong, and after hundreds of years, a heavy dose of Cantonese went in, producing modern Teochew.
[Image: Chinese language map]
Teochew has only 51% intelligibility with Xiamen.
Haifeng and Shanwei are members of the Luhai Teochew subgroup of Teochew, which differs markedly from Teochew and may be a separate language. Luhai is said to be halfway between Teochew and Zhangzhou. Luhai probably represents a later move from Zhangzhou towards northeast Guangdong by the same group that formed Teochew. This move may have occurred around 400 years ago.
Lufeng is said to have over 90% intelligibility with Xiamen, but if Luhai is really halfway between Teochew, which has 51% intelligibility with Xiamen, and Zhangzhou, then Lufeng should have about 75% intelligibility with Xiamen. Intelligibility testing may be needed.
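The 75% figure is just a midpoint estimate, which can be worked out as follows. The sketch assumes that Zhangzhou, as a core dialect of the Xiamen group, is treated as fully (100%) intelligible with Xiamen. That 100% is my assumption for the sake of the arithmetic, not a measured value, and intelligibility need not actually scale in a straight line between two lects.

```python
def midpoint(a: float, b: float) -> float:
    """Value halfway between two percentages."""
    return (a + b) / 2


teochew_vs_xiamen = 51.0     # reported figure from this post
zhangzhou_vs_xiamen = 100.0  # assumption: treated as fully intelligible

expected_lufeng = midpoint(teochew_vs_xiamen, zhangzhou_vs_xiamen)
print(expected_lufeng)  # 75.5
```

A reported score above 90% is well over this rough expectation, which is why the two claims sit awkwardly together.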
The Teochew spoken in Indochina – in particular, in Vietnam and Cambodia (Indochinese Teochew) – may be a separate language. Some Indochinese Teochew speakers who have returned to their family villages say they could only understand 70% of the speech there.
Furthermore, intelligibility is difficult between Malay Teochew and other Teochew, such as SE Asian Teochew and Teochew on the mainland. Malay Teochew is spoken in Malaysia, Singapore and Indonesia.
The Teochew variant spoken in Malaysia is composed of many highly variant lects. Whether or not they are mutually intelligible with each other is not known. The variety spoken in Medan, Indonesia is particularly interesting. It has heavy Malay and Cantonese influence and cannot be understood by other Teochew speakers. Teochew has 10 million speakers.
Zhangping, though close to Xiamen, is a separate language according to a 200 word Swadesh test (Ben Hamed 2005).
Sanjiang appears to be a separate language.
Datian, in Fujian, is also a separate language.
A version of Hokkien called Malay Hokkien is spoken in Malaysia and in Indonesia in Sumatra and Kalimantan. In Indonesia, it is spoken in the city of Medan, the state of Riau, the city of Bagansiapiapi on Sumatra and in a few places on Kalimantan, such as Kuching and especially in Brunei. Malay Hokkien is heavily laced with Teochew.
Northern Malay Hokkien is spoken from Taiping along the coast formerly all the way to Phuket but now only to Pedang in Malaysia and in Indonesia in the city of Medan, the state of Riau, the city of Bagansiapiapi on Sumatra and in a few places on Kalimantan, such as Kuching and especially in Brunei. Speakers of Northern Malay Hokkien have a hard time understanding the Southern Malay Hokkien (see Singapore Hokkien below) spoken in Kelang, Malacca and Singapore. Northern Malay Hokkien is creolized, with Malay and Thai embedded deeply in the language.
Southern Malay Hokkien is less creolized, if at all. Singapore Hokkien lies between Northern Malay Hokkien and Taiwanese on the continuum. A very pure variety of Hokkien is spoken in the Indonesian city of Bagansiapiapi. It has avoided the Mandarinization of Hokkien that is occurring elsewhere. They speak like the Hokkien speakers of Tang’oann (Tong’an), China.
Kelantan Hokkien is spoken in the Malay state of Kelantan. It is wildly creolized with Malay and is probably not intelligible with any other form of Hokkien.
The version of Hokkien spoken in the Philippines, often called Binamhue, Banlamhue or Minanhua (Philippines Hokkien) by speakers, derives from a dialect on the outskirts of Quanzhou, and it may have drifted into a separate language. At present, it is sometimes not intelligible with Quanzhou or Xiamen. That is, some Philippines Hokkien speakers claim that they can only understand about 70% of Taiwanese television.
The version of Min Nan, Singapore Hokkien (Southern Malay Hokkien), spoken in Singapore, Kelang and Malacca is similar to that spoken in Taiwan, but many Singapore Hokkien speakers have a hard time understanding Taiwanese Hokkien, while others can understand it just fine. Older Singapore Hokkien speakers can understand Taiwanese Hokkien better than younger ones. This is due to bilingual learning more than anything else because younger Singapore Hokkien speakers are no longer good at understanding other Min Nan dialects due to lack of exposure to them.
The reason that Taiwanese speakers can seem to communicate well with Singapore Hokkien speakers is that they are using a simpler vocabulary. A Singapore Hokkien speaker, if immersed in Taiwan, could pick up Taiwanese fairly quickly, within say 3 months.
An umbrella term covering Malay Hokkien, Singapore Hokkien and Philippines Hokkien may be Nusantaran Hokkien.
Another language in the same group is best called Wan’an, comprising a number of dialects and possibly languages in Wan’an County of Fujian (Branner 2008).
Zhaoan, Pinghe and Yunxiao, also of Fujian, are separate languages.
Wan’an and Longyan are not mutually intelligible (Branner 2008). Longyan seems to have about 85% intelligibility with Taiwanese. Koongfu and Shizhong are apparently dialects of Longyan Min and are probably intelligible with it. Koongfu is spoken in Kanshi Township in Yongding County. Shizhong is spoken in southern Longyan County.
There are many varieties of Southern Min spoken in Western Fujian that may or may not be independent languages.
Liancheng Gutyan Junbao, Longyan Wan’an Wuzhai, Longyan Wan’an Songyang, Longyan Wan’an Tutuan, Longyan Baisha Youshui, Shiahtsuen Buhyun Liling, Shanghang Buhyun Liling, Liancheng Xuanhe Shengxing, Shanghang Gutian Laifang, Liancheng Xinquan Linguo, Liancheng Xinquan Lelian, Liancheng Pengkou Wangcheng, Liancheng Miaoqian Zhixi, Liancheng Gechuan Zhuyu, Liancheng Miaoqian Jiangshe, Liancheng Sibao Shangjian Zhenbian, Liancheng Juxi Gaoding, Liancheng Tangqian Dikeng, Liancheng Wencheng Hengming, Liancheng Xinquan Dongnancun, Liancheng Quxi Puxi Dongxiduan, Liancheng Quxi Qiaotou and Liancheng Liwu Nanban Zhangwu are spoken in Western Fujian. Shiahtsuen is spoken in Laiyuan Township in southeastern Liancheng County. (Branner 2000).
Whether or not these lects are dialects or separate languages is difficult to say. With many of these lects, they don’t understand each other at first, but after they talk to each other for a while, they start to figure out the other lect. (Branner 2008). Intelligibility testing needs to be done for these lects.
Quanzhou, Zhangzhou, Singapore Hokkien, Philippines Hokkien, Xiamen, Amoy, Yilan, Tainan, Taipei, Taichung, Taiwanese, Jinjiang, Lufeng, Lugang, Jinmen, Zhangping, Koongfu, Shizhong, Nanjing, Zhaoan, Pinghe, Yunxiao, Longyan, Wan’an, Liancheng Gutyan Junbao, Longyan Wan’an Wuzhai, Longyan Wan’an Songyang, Longyan Wan’an Tutuan, Longyan Baisha Youshui, Shiahtsuen, Shanghang Buhyun Liling, Liancheng Xuanhe Shengxing, Shanghang Gutian Laifang, Liancheng Xinquan Linguo, Liancheng Xinquan Lelian, Liancheng Pengkou Wangcheng, Liancheng Miaoqian Zhixi, Liancheng Gechuan Zhuyu, Liancheng Miaoqian Jiangshe, Liancheng Sibao Shangjian Zhenbian, Liancheng Juxi Gaoding, Liancheng Tangqian Dikeng, Liancheng Wencheng Hengming, Liancheng Xinquan Dongnancun, Liancheng Quxi Puxi Dongxiduan, Liancheng Quxi Qiaotou and Liancheng Liwu Nanban Zhangwu are all members of the Quanzhang Group of Min Nan, which has 50 lects.
Teochew, Shantou, Lufeng, Haifeng, Chaoyang, Jieyang, SE Asian Teochew and Malaysian Teochew are members of the Chaoshan Group of Min Nan, which has 12 lects.
Datian is in its own group in Min Nan.
Min Nan consists of 68 separate lects. Clearly, the dialectal relationships of Min Nan are confusing, as many of the lects are very closely related, if not fully intelligible. Intelligibility testing may be needed to sort out some of these issues. There are 30 million speakers of Southern Min.
Zhenan Min, spoken in Zhejiang Province around Pingyang and Cangnan and in the Zhoushan Islands, is a separate language. Zhenan Min contains 4 lects, Pingyang, Cangnan, Dongtou and Yuhuan, which may or may not be languages. Zhenan Min has 574,000 speakers. Zhenan Min is influenced by Eastern and Northern Min.
Qiongwen (Hainanese) is a separate language with 8 million speakers. It has the lowest intelligibility with the rest of Southern Min of any Min Nan lect. Qiongwen itself has 14 separate lects, all spoken on Hainan. Whether or not any of them are separate languages is not known.
Longyan (Branner 2008) is a separate language, apart from Southern Min. It is spoken in Longyan City’s Xinluo District and Zhangping City and has 740,000 speakers. It has heavy Hakka influence due to the large number of Hakka speakers in the surrounding areas.
Another split in Min is Leizhou. Leizhou Min is a separate language and is now recognized by some as a separate branch of Min altogether, along the lines of Southern and Northern Min. Leizhou consists of 7 different lects. Haikang appears to be a dialect of Leizhou.
However, at least some of the other 6 Leizhou lects are very different in phonology and lexicon. Intelligibility data is not known, but they may be mutually intelligible. Leizhou Min, with 4 million speakers, has low intelligibility with Min Nan lects and has only 50% intelligibility with Hainanese.
Shaojiang Min, or Min Gan, is said to be a completely separate high-level division of the Min language like Leizhou Min. It has four lects – Shaowu, Guangze, Jiangle and Shunchang – that are said to be mutually intelligible. There are subdialects within these larger lects. The substratum of Shaojiang is not Min, Gan or Hakka – instead, it is the ancient Baiyue language.
Puxian Min has already been identified as a separate language. Puxian has 3 separate lects. There are minor differences between these lects.
However, there is a form of Puxian Min spoken in Singapore, Hinghwa, which presently lacks full intelligibility with Puxian Min proper. Puxian speakers are a minority in Singapore, and their language has mixed a lot with Singapore Hokkien, Malay, English and other languages spoken in Singapore, resulting in a separate language.
A Min language called Longdu, located in Guangdong, is not only a separate language (evidence here and here) but seems to be in another Min category from Southern Min. It is spoken in the southwest corner of Zhongshan City in Shaxi and Dayong.
In Guangdong Province, there are other divergent lects of Min Nan. Two of these, Nanlang (also spoken in Zhongshan) and Sanxiang, are also separate languages. Nanlang is spoken 10 miles southeast of Zhongshan in Cuiheng. It is also spoken in Nanlang and Zhangjiabian. Sanxiang is spoken to the south of Zhongshan in the hilly rural areas.
In Chinese, Longdu, Nanlang and Sanxiang are referred to as All-Lung, South Gourd and Three Rural, respectively. Sources give Longdu and Nanlang 100,000 speakers and Sanxiang 30,000 speakers. 14% of the population of Zhongshan speaks Min. Nanlang now has mostly elderly speakers.
All of these seem to be in the same group, Zhongshan Min, and all are spoken in the Pearl River Delta near Hong Kong. Zhongshan Min has 150,000 speakers.
This group is possibly a Northern or Eastern Min group stranded way down in Guangdong. They are sometimes referred to in old literature as “Northeastern Min”. That’s not really a category. It often means Northern Min, but sometimes it means Eastern Min. These languages have all borrowed extensively from the type of Cantonese spoken in the Pearl River Delta.
Looking at the whole picture, it appears that various immigrants speaking Puxian Min, Northern Min and Southern Min all settled around Zhongshan. These various Min elements, along with a hefty dose of Cantonese, have gone into the creation of Zhongshan Min.
Sanxiang, Nanlang and Longdu are apparently not mutually intelligible, although Nanlang is close to Longdu. Sanxiang is more divergent. Further, there are more dialects within these three languages, and dialectal divergence is considerable, with possible communication difficulties among them.
Sanxiang has at least two dialects, Phao and Tiopou. Phao is fairly uniform across a number of villages, but Tiopou is quite different. Nevertheless, there is near-full intelligibility between Phao and Tiopou. For now, we will just list Sanxiang, Nanlang and Longdu as separate languages, with possible dialects Phao and Tiopou (Sanxiang); Nanlang A and Nanlang B; and Longdu A and Longdu B, among them.
A very strange lect is spoken by the She people in Zhejiang, Fujian and Guangdong. The She language was originally Hmong-Mien, then added a Cantonese layer, then a Hakka layer, next a Min layer, and in Zhejiang, a Wu layer. It is best described as a Hmong-Mien language that has been Sinicized. There are probably 200,000 speakers of this language.
There is also an original She language that is non-Sinitic (Hmong-Mien) and is spoken by only about 1,000 people in Guangdong.
In Eastern Guangdong, the She speak the Chaoshan She language. They live in the Phoenix Mountains in Chao’an County in Chaozhou prefecture. It has had heavy contact with the Chaoshan (Teochew) Min group. This is probably a separate language, unintelligible with other She languages and also with Chaoshan Min.
Within Hakka, besides Hakka Proper (Meixian), Tingzhou is a separate language (evidence). Wuhua Hakka is intelligible with Meixian.
Fangcheng and Dabu are close to Meixian, but intelligibility data is lacking. Fangcheng has five different lects within it, but intelligibility data is not known. Hong Kong Hakka is not intelligible with the Hakka spoken on Taiwan, nor with Dabu.
Dongguan, spoken near Hong Kong, can understand Meixian, but Meixian cannot understand Dongguan.
Taipu or Taipo is spoken in the village of the same name in Hong Kong and is not intelligible with Meixian, nor is Wakia, also spoken in Hong Kong.
A variety of Hakka spoken in a part of Hong Kong called Shataukok is called variously Satdiugok, Sathewkok, Shataukok or Satdiukok. It is said to be different from other Hakka, and evidence indicates that Shataukok may indeed be a separate language. Shataukok has dialects within it and they are different, but they are generally mutually intelligible.
All three of these are dialects of a more or less intelligible language called Hong Kong Hakka.
Located near Hong Kong, Shenzhen/Bao’an is a separate language.
Haifeng and Lufeng, located near each other in Guangdong, appear to be dialects of a separate language called Hailufeng.
Longchuan in northeastern Guangdong is a separate language (evidence), with poor intelligibility with other Hakka lects. Longchuan has four different dialects, Huangbu, Sidu, Chetian and Tuocheng. Sidu and Tuocheng are close and are probably dialects of Longchuan. Sidu Longchuan has 18,000 speakers.
Boluo and Heyuan are separate languages, not mutually intelligible.
Longchuan, Boluo and Heyuan are quite distant from other Hakka. Heyuan is spoken in central Guangdong.
Huizhou is mutually intelligible with Longchuan and also with Meixian and Dabu.
Sanxiang, spoken in Zhongshan prefecture, is different from all other Hakka, but intelligibility data is lacking.
It is possible that in northern Guangdong, there may be many different Hakka languages, since dialects tend to differ from village to village, and in many cases, communication is difficult.
The Hakka spoken in Kuching, Sarawak, in Malaysia, known as Ho Po Hak, is a separate language.
It is very different from the Hakka spoken in Sabah, Malaysia, and it is similar to Hopo, spoken in Hopo, near Meizhou. Hopo is not intelligible with Dabu, Hailu or Meixian. Hopo appears to be a dialect of Jiaoling. Hopo has deep influence from Teochew Min, because it is located right next to the Teochew area.
The Gannan Group (or Ninglong Group) from Southern Jiangxi, Mingxi from Western Fujian, and the Yuemin Group from Southern Fujian and Southeastern Guangdong are separate languages.
The Gannan Group contains multiple lects. One of them is Xingguo, spoken in Xingguo County in Ganzhou Prefecture (evidence).
The Gannan Group is extremely diverse compared to the Hakka of Guangdong and Fujian. Gannan lects differ even from village to village.
With Gannan Hakka, we may be dealing with a situation of many different languages, as with Wu, Hui, Tuhua and Xiang. In fact, it is quite possible that in Jiangxi Hakka, every Hakka lect is a separate language, but that remains to be proven.
In Fujian Province, there is the wildly diverse Tingzhou Hakka Group mentioned above. Even within this group, there are separate languages, including Yongding, Liancheng, Changting, Xinquan, Qingliu, Mingxi, Ninghua and Shanghang (evidence). Gucheng is probably also a member of Tingzhou.
Sources say that each Hakka village in Fujian speaks its own lect, and that the lects are far enough apart to make communication from village to village very difficult.
Therefore, in addition to the above, we will add Wuping, Longyan, Zhaoan, Yunxiao, Shangsixiang, Fuding, Fuan, Gucheng and Nanjing Qujiang.
Luoyuan She Hakka is spoken in Fujian. It is an extremely diverse form of Hakka that differs from all other Hakka. It must surely be a separate language.
Chengdu is spoken in Chengdu, Sichuan. It is quite different from other forms of Hakka and has poor intelligibility with other forms.
On Taiwan, the Miaoli (Four Counties), Dongshi (Dapu) and Xinzhu (Hailu) lects are not mutually intelligible, nor is the mixed Gaoxiong lect created in order that these three lects could communicate with each other.
Kunbei (Zhaoan) is very different and must be a separate language. Raoping may well be a separate language, but intelligibility data is lacking. In general, speakers of other kinds of Hakka find Taiwan Hakka to be hard to understand, possibly due to Southern Min influence.
Bangka Island Indonesian Hakka, spoken on Bangka Island in Indonesia, has diverged so radically with its tones that it is now a separate language. That is, speakers of other Indonesian Hakka lects say that they cannot understand Bangka Island speakers. It’s actually said to be a Hakka creole more than anything else.
In Indonesia, two other Hakka languages are spoken, Kun Dian Indonesian Hakka, spoken in Borneo, and Belitung (Ngion Voi) Indonesian Hakka. Kun Dian Hakka is the largest Hakka group in Indonesia. Most live at Pontianak and Singkawang, where they speak two different mutually intelligible lects, but they have spread all over Indonesia. Kun Dian Hakka is a dialect of Meixian.
Belitung Hakka is spoken mostly on Sumatra and Borneo, and is characterized by a soft way of speaking. Belitung Hakka and Bangka Hakka speakers say they cannot understand Kun Dian Hakka, but Kun Dian speakers say they can understand the other two for the most part. East Timor Hakka is a dialect of Meixian.
Jiexi is spoken in southeast Guangdong. Dayu is spoken in southern Guangxi. Liannan is spoken in northwest Guangdong. Dongguan Qingxi is spoken in south-central Guangdong. Wengyuan is spoken in northern Guangdong. Ningdu is spoken in Jiangxi. Mengshan Xihe is spoken in eastern Guangxi. Hong Kong Hakka is spoken in Hong Kong.
Zhaoan Xiuzhuan is spoken in southern Fujian.
Shanghang Pengxin, Basel Mission and Shanghang Guanzhuang Shangzhuo are spoken in West Fujian (Branner 2000).
Dayu, spoken in Jiangxi, is a separate language, not intelligible at least to Central, or Meixian, Hakka speakers.
Meixian, Wuhua and Bao’an are members of the Yuetai Group of Hakka, which has 23 lects. Within Yuetai, Wuhua and Dabu are members of the Xinghua subgroup, which has 5 lects. Xinghua has 3.4 million speakers. Bao’an and Lufeng are in the Xinhui subgroup of Yuetai, which has 9 lects. Xinhui has 2.4 million speakers.
Gaoxiong, Xinzhu, Dongshi and Miaoli are members of the Jiaying Group of Hakka, which has 7 lects.
Tingzhou, Yongding, Liancheng, Changting, Xinquan, Shanghang, Basel Mission, Shanghang Pengxin, Wuping, Ninghua, Qingliu and Mingxi are all part of the diverse Tingzhou Group of Hakka. All told, Tingzhou has 12 lects, all of which are separate languages.
Longchuan, Boluo and Heyuan are members of the Yuezhong Group of Hakka, which has 5 lects.
Huizhou is in its own subgroup of Hakka.
Xingguo and Ningdu are in the Ninglong Group of Hakka, which has 13 lects. This group is said to be very diverse, with lects differing from village to village.
Liannan and Wengyuan are members of the Yuebei Group of Hakka, which has 11 lects and must surely be a separate language.
Dayu is a member of the Yugui Group of Hakka, which has 43 lects.
Ho Po Hak, Bangka Island, Nanjing Qujiang, Jiexi, Dayu, Hong Kong, Mengshan Xihe, Zhaoan Xiuzhuan, Fuan, Fuding and Haifeng are unclassified.
There are 12 major Hakka lects and 210 Hakka lects altogether. Others claim that there are over 1000 Hakka lects spoken in China. There are 30 million speakers of the various Hakka languages. The dialect situation with Hakka, as with Min Nan, is quite confused and somewhat contradictory. Intelligibility testing could clear up some of the confusion. Some speakers report adequate intelligibility between lects, while others report difficulty.
Putonghua is Standard Mandarin, based on the Beijing dialect as of 1949, but it has since diverged wildly and many Putonghua speakers today cannot understand Beijing. Putonghua is being promoted as the national language of China. In addition to Putonghua, there are 1,500 other dialects of Mandarin spoken in China. In general, other Mandarin dialects are not intelligible to Putonghua speakers (Campbell April 2009).
However, the Northeastern dialects and the dialects around Beijing may be more intelligible than the Mandarin dialects in the rest of the country. The implication is that there may be as many as 1,500 Mandarin languages in China. However, many of these Mandarin dialects are intelligible with at least some other Mandarin dialects. Hence, despite the lack of intelligibility with Putonghua, there is a lot of potential lumping within Mandarin.
The degree to which Mandarin dialects are intelligible to each other is very much an open question and in general is poorly investigated.
Within Mandarin, besides Putonghua, the main branch, Jinan (New Jinan), Beijing and Tianjin (evidence and here) are not intelligible with Putonghua. Tianjin may be intelligible with Beijing; on the other hand, Tianjin is looking more and more like a separate language.
For one thing, Tianjin’s tones are quite different from Putonghua’s, and its tone sandhi is much more complicated. It is also more closely related to lects 150-500 miles away, since Tianjin speakers originally came from Anhui (Lee 2002). Some reports say that Tianjin is intelligible with Putonghua, so intelligibility testing may be needed.
Jinan is not intelligible with Putonghua, but may be learned over a period of weeks to possibly months, as it is close enough. Jinan is only 65% intelligible with Beijing.
Since Beijing, Tianjin, Nanjing City, Hebei and all of NE Mandarin may be intelligible, I am just going to make a language called Northeast Mandarin and call Beijing, Tianjin, Hebei and Nanjing City dialects of NE Mandarin for now. Beijing has low intelligibility with other branches of Mandarin: 72% intelligible with Southwest Mandarin, 64% intelligible with Jilu Mandarin and Zhongyuan Mandarin and 55% intelligible with Jiaoliao Mandarin.
However, many Putonghua speakers claim that Beijinghua is not inherently intelligible with Putonghua. Complaints about unintelligible taxi drivers in Beijing are legendary. At the very least, competing views of the intelligibility of Beijinghua and Putonghua deserve investigation.
On the other hand, Beijinghua may be intelligible with Hebei and Nanjing City. I think that Hebei is clearly a dialect of Beijing. The lect of Beijing’s hutongs and taxi drivers is legendary for being hard to understand. It would be interesting to see whether Tianjin and Hebei speakers can understand it. Tianjin may be a separate language, since it is not intelligible with Beijinghua.
What probably happened was that Beijinghua and Putonghua have taken separate trajectories. This has also occurred in Italian, where, though Standard Italian was based on Tuscan, Standard Italian and Tuscan have taken separate trajectories since. It is said that if you see old Tuscan men on TV in Italy, a speaker of Standard Italian from southern Italy would need subtitles to understand them, but one from northern Italy would not.
Others say that Putonghua was based on the language of the Beijing suburbs, not the city itself.
For whatever reason, Beijinghua often seems to have less than 90% intelligibility with Putonghua, though the question needs further research. Beijinghua, in its pure and least mutually intelligible form, seems to be spoken mostly in the innermost hutongs and among taxi drivers and other low income and working class people. The lect of people with more education and money is probably a lot more comprehensible.
I would describe the real, pure, Putonghua as “CCTV speech”, the lect you hear on Chinese state television. Evidence that Beijinghua lacks full intelligibility with Putonghua is here, here, here, here, here, here, here and here.
The question of whether or not Beijinghua is a separate language from Putonghua is sure to be highly controversial. Perhaps intelligibility testing could settle the question.
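As a rough illustration of how percentage figures like the ones quoted in this article could be turned into a language-versus-dialect call, here is a minimal Python sketch. The 90% cutoff is an assumption taken from the "less than 90% intelligibility" remark above, not a rule the article states outright, and the function name `classify` is hypothetical. The percentages themselves are the ones given in the text.

```python
# Assumed cutoff: the article only hints at 90% ("less than 90% intelligibility").
THRESHOLD = 90

# Intelligibility figures quoted in this article, in percent.
reported = {
    ("Leizhou Min", "Hainanese"): 50,
    ("Jinan", "Beijing"): 65,
    ("Beijing", "Southwest Mandarin"): 72,
    ("Beijing", "Jilu Mandarin"): 64,
    ("Beijing", "Zhongyuan Mandarin"): 64,
    ("Beijing", "Jiaoliao Mandarin"): 55,
}


def classify(percent, threshold=THRESHOLD):
    """Label a pair 'dialects' at or above the cutoff, else 'separate languages'."""
    return "dialects" if percent >= threshold else "separate languages"


# Apply the same cutoff to every reported pair.
results = {pair: classify(percent) for pair, percent in reported.items()}

for (a, b), label in results.items():
    print(f"{a} / {b}: {reported[(a, b)]}% -> {label}")
```

Under this assumed cutoff, every pair listed falls on the "separate languages" side. Impressionistic reports, such as the complaints about Beijing taxi drivers, cannot be scored this way, which is why formal intelligibility testing keeps coming up.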
Beijing is in a group all of its own called the Beijing Group. It contains 43 separate lects, and may contain more than one language.
We should also note here that even Putonghua, the language that was meant to tie the nation together, seems to be evolving into regional languages.
Guangdong Putonghua is not fully intelligible to speakers of the Putonghuas of Northern China and hence is probably a separate language.
There are also varieties of Putonghua that are spoken in Singapore and Taiwan. Taiwanese Mandarin is about 80-85% intelligible with Putonghua and is a separate language. Claims that Taiwan Mandarin is fully intelligible with Putonghua are incorrect.
Shanghai Putonghua is often not intelligible with Putonghua from other regions. It has heavy interference from Shanghaihua, which seriously affects the Putonghua accent. Even after four years of exposure to it, Standard Putonghua speakers often have problems with it.
In addition, Jianghuai Mandarin Putonghua and Zhengcao Mandarin Putonghua are not intelligible with Putonghua from other areas (Campbell April 2009). These varieties of Mandarin cause a particular interference with Putonghua Mandarin that results in a severe dialectal disturbance in their Putonghua.
These Putonghuas are spoken in the regions native to the Jianghuai and Zhengcao dialects of Mandarin. Jianghuai is spoken in Anhui, Jiangsu, Hubei and to a much lesser extent Zhejiang Provinces. Zhengcao is spoken in Anhui, Henan, Shandong and Jiangsu, with one dialect spoken in Hebei.
Although it is different, Singapore Putonghua is still intelligible with Putonghua. Malay Mandarin is said to be quite different but nevertheless intelligible. However, Malay Mandarin speakers say they have to make speech adjustments with Chinese speakers; otherwise, their speech is poorly intelligible. This implies that Malay Mandarin is indeed a separate language.
Yunnan Putonghua is intelligible with Putonghua from other regions (Campbell January 2009).
Cangzhou, spoken in southeastern Hebei, is a separate language. It is only partly intelligible with Putonghua. Renqiu, Huanghua, Hejian, Cangxian, Qingxian, Xianxian, Dongguang, Haixing, Yanshan, Suning, Nanpi, Wuqiao and Mengcun, all spoken in Cangzhou prefecture, are all dialects of Cangzhou.
Cangzhou shares some similarities with Tianjin, but it is only partly intelligible with it.
Jinan is a member of the Liaotai Group of the larger Jilu Group, which has 37 lects.
The Baotang Group of Jilu has 52 lects. Tianjin forms its own subgroup within Baotang. Cangzhou, Renqiu, Huanghua, Hejian, Cangxian, Qingxian, Xianxian, Dongguang, Haixing, Yanshan, Suning, Nanpi, Wuqiao, and Mengcun are members of the Huangle subgroup of Baotang, which has 25 lects.
Jilu itself consists of 170 lects.
Taiwanese Mandarin, while different from Putonghua, is intelligible with it. Singapore Mandarin has fewer differences than Taiwanese. Both are dialects of Putonghua.
Luoyang, Kaifeng, Changyuan and Zhengzhou, all in Henan Province, are not intelligible with Putonghua. However, all four are mutually intelligible with each other, so they are dialects of a single language, Henan Mandarin.
Xinyang, also spoken in Henan, is a separate language and cannot be understood by Luoyang speakers.
Nanyang has high but not complete intelligibility with Luoyang. After a few weeks of close contact, Luoyang speakers can understand Nanyang, but initially, comprehension is poor due to different tones. Nanyang has 15 million speakers.
Luoyang and Gushi are unintelligible with Putonghua. In addition, Gushi is different from Nanyang and may not be intelligible with it. Intelligibility between Xinyang, Gushi and Nanyang is not known. In general, intelligibility between many lects in Henan is not good, but after a week or two of close contact, they can start to understand each other.
In Shaanxi, Yanan, Xian, Huxian (evidence), Zhouzhi (evidence), and Hanzhong are not intelligible with Putonghua. Let us call this language Shaanxi Mandarin. Xi’an, for instance, is about 65% intelligible with other Mandarin groups.
Xining, spoken in Qinghai, seems to be very different from other Shaanxi lects, and is probably a separate language altogether (evidence here and here).
In Gansu Province, Tongwei is not intelligible with Putonghua, and Gansu Mandarin seems to be very different from other forms of Mandarin. Gansu Mandarin appears to be a separate language.
However, within Gansu, there are divergent lects, such as Sale, which are unintelligible with other Gansu lects.
Bozhou (evidence), Yingshang (evidence), and Fuyang (evidence), spoken in Anhui, are at least unintelligible with Putonghua. Fuyang is very different. The lect spoken 300 km south of Jinan, around Mengcheng in rural Anhui, is said to be completely unintelligible with Putonghua, Tianjin and Beijinghua. For the time being, we will refer to this as one language, Anhui Mandarin. Intelligibility between lects of Anhui Mandarin is not known.
Anhui Mandarin Putonghua has poor intelligibility with Standard Putonghua due to its phonology. Therefore, it is a separate language.
Xian, Huxian and Zhouzhi are members of the Guanzhong Group of Zhongyuan, which has 45 lects.
Yanan, Hanzhong and Xining are members of the Qinlong Group of Zhongyuan, which has 67 lects.
Luoyang is a member of the Luoxu Group of Zhongyuan, which has 28 lects.
Kaifeng, Nanyang, Zhengzhou, Changyuan, and Bozhou are members of the Zhengcao Group of Zhongyuan. The Zhengcao Group has 93 lects.
Xinyang and Gushi are in the Xinbeng subgroup of Zhongyuan, which has 20 lects.
Tongwei and Sale are part of the Longzhong Group of Zhongyuan, which has 25 lects.
Yingshang is a member of the Cailu Group of Zhongyuan, which has 30 lects.
The Mandarin spoken in Qinghai is very different from that spoken in Gansu, but it’s not known if it is a separate language. Both are usually classified as types of Zhongyuan Mandarin.
Zhongyuan has a shocking 388 lects. Zhongyuan Mandarin is not fully intelligible with Putonghua. Zhongyuan Mandarin has 130 million speakers (Olson 1998).
Yichang (evidence), Longchang (evidence), Chengdu, Chongqing (evidence), Guilin, Nanping (spoken near Mt. Wuyi; evidence), Longcheng (evidence), Luocheng (evidence), Luzhou (evidence here and here), Lingui (evidence), Jiuzhaigou (evidence), Xindu, Wenshan (evidence), Mianzhu (evidence here and here), Yangshuo (evidence), Wuhan (evidence), and Leshan (evidence) are all unintelligible with Putonghua.
Guilin is not intelligible with general Southwest Mandarin speech. Wenshan at least is not intelligible with other Southwestern varieties (Johnson 2010).
Chengdu is part of a Sichuan Mandarin koine that is spoken in many of the larger cities in Yunnan. It includes Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang, Deyang and Guiyang and is broadly intelligible (Xun 2009). Ziyang is intelligible with the koine but has a heavy accent (Xun 2009). Leshan is unintelligible with the koine, but it can be learned in a few weeks of exposure (Xun 2009).
Dali is also not intelligible with Putonghua, but that is because Tibetan Mandarin has heavy Tibetan admixture.
Chongqing speakers cannot understand Chengdu or Luzhou speakers. The many small lects around Mt. Emei are not intelligible with Chengdu, appear to be very different, and may be one or more separate languages.
Wuhan is not intelligible to speakers of Southwest Mandarin from other provinces; for instance, it is only 80% intelligible with Chengdu. The intelligibility of Wuhan and Yichang is not known.
Dahua, spoken in and around Dahua village on the Puduhe River near Dongchuan in Yunnan Province, is apparently a separate language.
Lanping may be a separate language. Kunming is not intelligible with Tuoyuan, so Tuoyuan may be a separate language also. The language spoken in Kunming is part of the Sichuan Mandarin koine that includes Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang, Deyang and Guiyang.
Chuanlan is a little-known language spoken by the Tunbao people of Guangxi Province.
Yingshan is a separate language based on a 200 word Swadesh test (Ben Hamed 2005).
Menghai (evidence) may well be a completely separate language. The mutual intelligibility of Menghai, Guiyang and Kunming is not known. Guiyang is at least not intelligible with Putonghua. Guiyang is evolving into the Sichuan Mandarin koine, which is broadly intelligible with Kunming, Bazhong, Dazhou, Neijiang, Zigong, Yibin, Luzhou, Chengdu, Mianyang and Deyang.
Shaoshan, apparently Mao Zedong’s lect, spoken in Hunan Province, is a separate language. It was said although Mao had a secretary who could understand him well, not many others could.
Another language spoken in Hunan, in Zhangjiajie County, is called Zhangjiajie Maoxi. The Maoxi are a tribal group there that speak a strange variety of Mandarin.
Tuoyuan in Hunan is not fully intelligible with other Southwest Mandarin lects, or at least not with Kunming.
Junhua, or military language, is a language spoken by an ethnic group on Hainan in the city of Zonghe. It is said to be “Old Mandarin” and is probably not intelligible with other lects. It is a form of Southwest Mandarin known as the Junhua Group, which contains 4 lects.
Guilin, Luocheng, Yangshuo, Liuzhou and Lingui are members of the Guiliu Group of Southwest Mandarin, which has 57 lects. Guiliu Southwest Mandarin is at least not comprehensible with Putonghua or Chengyu Southwest Mandarin.
Leshan and Longchang are members of the Guanchi Group of Southwest Mandarin, which has 85 lects. Within Guanchi, Longchang is a member of the Renfu Group, which has 13 lects.
Yichang, Chengdu, Chongqing and Yingshan are members of the Chengyu Group of Southwest Mandarin, which has 113 lects. Chengyu Southwest Mandarin is not comprehensible with Putonghua or Guiliu Southwest Mandarin.
Menghai, Kunming, Wenshan and Guiyang are members of the Kungui Group of Southwest Mandarin. The Kungui Group itself has an incredible 95 lects.
Lanping is in the Dianxi Group of Southwest Mandarin, which has 36 lects. Within Dianxi, it is a member of the Baolu subgroup, which has 21 lects.
Taoyuan is in the Changhe Group of Southwest Mandarin, which has 14 lects.
Wuhan is a member of the Wutian Group of Southwest Mandarin, which has 9 lects.
Dali is a member of the Dianxi Group of Mandarin, which has 36 members. Within Dianxi, Dali is a member of the Yaoli Group, which has 15 members.
Nanping, Chuanlan, Shaoshan, Jiuzhaigou, Zhangjiajie Maoxi and Dahua are unclassified.
Southwest Mandarin itself has a stunning 519 lects and is not fully intelligible with Putonghua. There are 240 million speakers of Southwest Mandarin (Olson 1998).
Jianghuai Mandarin is a separate language.
Yangzhou is considered to be a separate language by a 200 word Swadesh test (Ben Hamed 2005). Yangzhou has about 52% intelligibility with the other branches of Mandarin.
Nanjing (evidence and here) is also a separate language – now mostly spoken in the suburbs, as city speech is not a separate language anymore. The city language is said to be intelligible with the general northeastern China lect spoken in Beijing and Hebei.
So I will call Nanjing Suburbs a separate language.
Lianyungang is a separate language, as are Yancheng and Huaian (evidence for both).
Nantong, a very strange variety of Mandarin on the border of Wu and Mandarin that shares many features with Wu languages, is a separate language, as is its sister language, Tongdong. Jinsha is a dialect of Nantong.
Rugao, next to Nantong, is also a separate language.
Also within Jianghuai, Hefei is considered to be a separate language by a 200 word Swadesh list (Ben Hamed 2005).
Rudong is at least not intelligible with Putonghua.
Anqing, in Anhui Province, is also not intelligible with Putonghua.
In 1933, there were three different languages spoken in Tongcheng, Anhui – East Tongcheng, West Tongcheng and Tongcheng Wenli. Tongcheng Wenli was the classical-based language spoken by the educated elite of the city. Whether these three languages still exist is not known, but surely some of the speakers in 1933 are still alive.
Chuzhou, spoken in Anhui, is not intelligible with Putonghua, although it is said to be close to Nanjing. Dangtu, also spoken in Anhui, is not intelligible with Putonghua.
Dongtai is a separate language (evidence).
The lects spoken in Dafeng, Taizhou, Xinghua and Haian are said to be similar to Dongtai, so for the time being, we will list them as dialects of Dongtai.
Jiujiang, spoken in Jiangxi Province, is a separate language, as is Xingzi, located close by.
Intelligibility between Rudong, Anqing, Chuzhou, Dafeng, Taizhou, Xinghua, Haian and Dangtu is not known.
Yangzhou, Lianyungang, Yancheng, Huaian, Nanjing, Hefei, Anqing, the Tongchengs, Chuzhou, and Dangtu are in the Hongchao Group of Jianghuai, which has 82 lects.
Dongtai, Dafeng, Taizhou, Haian, Xinghua, Jinsha, Nantong, Tongdong, Rudong, and Rugao are in the Tairu Group of Jianghuai. Tairu has 11 different lects.
Jiujiang and probably Xingzi are members of the Huangxiao Group of Jianghuai, which has 20 lects.
Jianghuai is composed of an incredible 120 lects and is not fully intelligible with Putonghua. Some suggest that all of the lects of Jianghuai are mutually unintelligible, but that remains to be proven. Jianghuai Mandarin has 65 million speakers (Olson 1998).
Northeastern (Dongbei) Mandarin is a separate language. Within Northeast, Shenyang is a separate language according to a 200 word Swadesh list (Ben Hamed 2005). Harbin is often listed as intelligible with Putonghua, but some Putonghua speakers can barely understand a word of it. Harbin may be a separate language. That classification is sure to be controversial, so intelligibility testing may be required to sort it out.
Shenyang is a member of the Jishen Group of Northeastern Mandarin, which has 44 dialects. Within Jishen, Shenyang is a member of the Tongxi Group, which has 24 dialects.
Harbin is a member of the Hafu Group of Northeastern Mandarin, which has 64 lects. Within Hafu, it is a member of the Zhaofu Group, which has 18 lects.
Lanyin Mandarin in the far northwest is also a separate language (Campbell 2004). Though Lanyin is said to be intelligible with Putonghua, that does not appear to be the case. Minqin (evidence) and Lanzhou (evidence) in Gansu are not fully intelligible with Putonghua, nor is Yinchuan (evidence) in Ningxia.
Intelligibility within Lanyin is not known, but Jiuquan at least appears to be a completely separate language inside Lanyin.
Jiuquan is a member of the Hexi Group of Lanyin, which has 18 lects.
Yinchuan is a member of the Yinwu Group of Lanyin, which has 12 lects.
Lanzhou is a member of the Jincheng Group of Lanyin, which has 4 lects.
Lanyin is composed of 57 separate lects. Lanyin Mandarin has 9 million speakers (Olson 1998).
The Jiaoliao Mandarin spoken in Shandong contains lects such as Qingdao (evidence here and here) and Weihai (evidence) which are not fully intelligible with Putonghua. Dalian is quite different from Putonghua. Intelligibility between Qingdao, Weihai and Dalian is not known.
Weihai and Dalian are members of the Denglian Group of Jiaoliao, which has 23 lects.
Qingdao is a member of the Qingzhou Group of Jiaoliao, which has 16 lects.
Jiaoliao is composed of 45 lects. Jiaoliao is not fully intelligible with Putonghua. Intelligibility inside of Jiaoliao is not known, but there may be multiple languages inside of it, because some Shandong Peninsula lects sound very strange even to speakers used to hearing Shandong Mandarin.
Karamay is an unclassified Mandarin language spoken in Xinjiang.
The Mandarin spoken around Tiantai in Zhejiang is not intelligible with Putonghua and may be a separate language. It is also unclassified.
Mandarin has 873 million speakers. There are an incredible 1,526 lects of Mandarin.
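As a quick arithmetic check on the 1,526 figure, the short Python sketch below adds up the branch totals that this article gives explicitly. Treating the Beijing Group as its own top-level entry is an assumption made only for the tally. The text gives no overall total for Northeastern Mandarin (only its Jishen and Hafu groups, at 44 and 64) and no count for the unclassified lects, so a remainder is expected.

```python
# Branch totals as stated in this article.
# Assumption: the Beijing Group is counted as its own top-level entry.
branch_lects = {
    "Beijing Group": 43,
    "Jilu": 170,
    "Zhongyuan": 388,
    "Southwest": 519,
    "Jianghuai": 120,
    "Lanyin": 57,
    "Jiaoliao": 45,
}

STATED_TOTAL = 1526  # overall Mandarin lect count given in the text

# Sum of the branches that have an explicit total.
itemized = sum(branch_lects.values())

# Lects not covered by an explicit branch total
# (Northeastern Mandarin and the unclassified lects).
remainder = STATED_TOTAL - itemized

print(f"Itemized branches: {itemized}")
print(f"Not itemized:      {remainder}")
```

The itemized branches account for 1,342 lects, leaving 184 for Northeastern Mandarin and the unclassified lects.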
Although it is related to Mandarin, Jin is a completely separate language. Besides the Main Jin branch, Hohhot and Baotou are apparently separate languages (evidence), as is possibly Taiyuan (evidence).
Within Hohhot Jin, there are two separate languages.
One is Hohhot Xincheng Jin, a combination of Hebei Jin, Northeastern Mandarin and the Manchu language.
The other is Jiucheng Hohhot Jin, spoken by the Muslim Hui minority in the city. It is related to other forms of Jin in Shanxi Province.
Yuci is a separate language from Taiyuan on a 200 word Swadesh test (Ben Hamed 2005).
Fenyang, the language used in Chinese director Jia Zhangke’s movie Xiao Shan Going Home, is not intelligible with Putonghua.
Jingbian, in Shanxi, is a separate language.
Yulin is also a separate language.
Hohhot is a member of the Zhanghu Group of Jin, which has 29 lects.
Baotou and Yulin are members of the Dabao Group of Jin, which has 29 lects.
Taiyuan and Yuci are members of the Bingzhou Group of Jin, which has 16 lects.
Fenyang is a member of the Luliang Group of Jin, which has 17 lects.
Jingbian is a member of the Wutai Group of Jin, which has 30 lects.
Jin is composed of 171 lects, and some of them are separate languages. Jin has 48 million speakers (Olson 1998).
Besides Xiang Proper, assuming there even is such a thing, Shuangfeng and Changsha are separate languages, having only 47% intelligibility.
In fact, Changsha itself is divided into multiple languages in the city itself. We do not know how many there are, but we know that they exist. For the moment, we shall just add one lect to Changsha, and divide it into Changsha A and Changsha B, but there may be more. Furthermore, there are significant differences within the Changsha spoken in Changsha City and in the surrounding countryside.
Shuangfeng is also very different within itself, as the vocabulary changes every 10 miles or so. Intelligibility data is lacking.
Mao Zedong spoke Xiangtan, a notoriously difficult Xiang language in Hunan, about which it is said, “No one can understand it.” Xiangtan itself is internally diverse, with differences between the dialect of the city and rural areas, but intelligibility data is lacking.
Hengyang is apparently a separate language, as is Jishou (evidence). There is significant dialectal diversity in Hengyang, but intelligibility data is lacking.
Liuyang is a separate language, actually a macrolanguage, spoken in Liuyang county-level city in Changsha prefecture in Hunan. Liuyang is split into 5 divisions – Liuyang North, Liuyang South, Liuyang West, Liuyang East and Liuyang City.
Liuyang South and Liuyang East are separate languages, mutually unintelligible with the others. Liuyang City has recently arisen as a sort of a Liuyang “Putonghua” that is understandable to speakers of all Liuyang lects. So within Liuyang, we have three dialects – Liuyang City, Liuyang North and Liuyang West. Outside of Liuyang Proper, there are also two separate languages – Liuyang South and Liuyang East. None of the three Liuyang languages is intelligible with Changsha.
Even within this classification, each of the 5 Liuyang lects has multiple dialects. Each village is said to have its own lect in Liuyang.
Hengshan (evidence) is a separate language with vast dialectal divergence divided by Mount Hengshan.
There are two Xiang Hengshan lects on either side of the mountain – Qianshan and Houshan – that are very different and must be separate languages. Huayuan (evidence) is at least not intelligible with Putonghua.
In the city of Yiyang, Henan Province, 3 lects are spoken. One is a Yiyang Changyi Xiang lect, another is a Yiyang Luoshao Xiang lect, and a third is Luoyang Southwest Mandarin, a dialect of Henan Mandarin, described above. All appear to be separate languages.
We will call the two Xiang lects Yiyang Changyi and Yiyang Luoshao.
Baojing at least is not intelligible with Putonghua, yet it is said to be intelligible with Chengdu Southwest Mandarin.
Lingshuijiang, also spoken in Hunan by 300,000 people, may well be a separate language.
Ningxiang is said to be very different from Changsha. Given the dramatic divergence present even as background in Xiang, this must mean that Ningxiang is at least not intelligible with Changsha.
According to good sources, there is a tremendous amount of lect diversity in Western Hunan, and most of it probably involves Xiang lects, while most or all of these lects are not mutually intelligible. But until we get more data, we cannot carve any languages out of this mess yet.
Shuangfeng and Lingshuijiang are members of the Luoshao Group of Xiang, which has 21 lects.
The Changshas, Hengyang, Xiangtan, Hengshan, Ningxiang and the Liuyangs are members of the Changyi Group of Xiang, which has 32 lects.
Baojing, Jishou and Huayuan are members of the Jixu Group of Xiang, which has 8 lects.
Xiang is composed of 74 lects. Many, or possibly all, of them are separate languages. The various languages of Xiang have 50 million speakers.
Wu is a major group of diverse Chinese languages that is often divided into Northern Wu and Southern Wu. Northern Wu and Southern Wu are definitely mutually unintelligible languages. Southern Wu has 18 million speakers. In general, the list below just lists Wu lects that are utterly unintelligible with Putonghua. My opinion is that in general, the Wu lects are mostly separate languages; however, some are merely dialects of other Wu lects.
A good general rule for Zhejiang lects is that people say they can sort of understand the next city over, but two cities away is incomprehensible. For instance, in the Taizhou prefecture region, there are 4-5 unintelligible dialects across a 12 mile area. In Zhejiang, the mountains go all the way down to the sea, so there are few flat areas where language can spread out and become comprehensible.
Suzhou, Shanghaiese, Wuxi (evidence), Huzhou (evidence), Changzhou (evidence), Xiaoshan (evidence), Songjiang (evidence), Jiaxing, Hangzhou (evidence), Kunshan (evidence), Ningbo and Yixing (evidence) are separate languages.
Tongxiang also appears to be a separate language, as do Yuyao (evidence) and Zhoushan.
Qidong, spoken in the city of Qidong, is a separate language.
Lvsi, Qisi or Tongdong, spoken in the nearby town of Qisi, is a separate language from Qidong. Qidong is said to be very close to Chongming, so for the time being, we will list Chongming as a dialect of Qidong.
Haimen also appears to be a dialect of Qidong. However, there are 2 lects spoken in Haimen, and they are apparently not mutually intelligible. We will leave Haimen A as a dialect of Qidong, while we will set Haimen B as a separate language as it is not intelligible with Haimen A.
There are differences between Chongming and Haimen A, but the degree of them is not known. Changyinsha is very similar to Haimen, Chongming and Qidong, so it is probably a dialect of Qidong also. Another name for Qidong is Qihai, which refers to the speech of Qidong, Haimen and Tongzhou. For the time being, we will list Haimen A, Changyinsha and Chongming as dialects of Qidong. Chongming, and hence Qidong, is not intelligible with Shanghaiese.
Zhangjiagang, Changshu and Kunshan may be intelligible with Suzhou, but data is lacking. Suzhou is only 43% intelligible with Wenzhou. None of these lects is intelligible with Shanghaiese.
Ningbo has good intelligibility with Shanghaiese, but not vice versa.
Reports vary on the intelligibility of Shanghaiese and Suzhou. Some say they understand each other well, but that is probably not the case at first due to serious differences in tones. Intelligibility testing is needed.
Pudong, the older form of the Shanghai language, is still spoken in the Pudong District of the city, but it is dying out. There is a question of whether or not it is mutually intelligible with Shanghaiese, but Shanghaiese speakers seem to feel it is not mutually intelligible (Gilliland 2006).
Several lects are spoken in the suburbs of Shanghai. Reports vary, but Shanghai residents generally report that these lects are not mutually intelligible with Shanghaiese (Gilliland 2006).
They are Baoshan, Fengxian, Nanhui, Jiading, Jinshan, Pudong (or Chuansha) and Qingpu.
Hangzhou is reportedly much different from the lects of Shanghaiese, Ningbo, etc. to the northeast, and is not intelligible with Shanghaiese, nor with Suzhou. Hangzhou has 1.2 million speakers.
Changzhou and Wuxi are not intelligible with Shanghaiese or Suzhou. Changzhou and Wuxi have high, but not full, intelligibility. Changzhou and Wuxi are part of a dialect chain in which eastern Changzhou speakers can communicate with western Wuxi speakers, but as one moves further west into Wuxi or east into Changzhou, intelligibility drops off. As with Czech and Slovak, it is best to split Wuxi and Changzhou into separate languages.
Changzhou itself has considerable dialectal divergence, though apparently all dialects are intelligible. Changzhou has 3 million speakers.
Yixing, near Changzhou, is not intelligible with Shanghaiese.
Jiangyin is spoken in Jiangyin city. It is related to Changzhou and has high intelligibility with Changzhou and Wuxi.
All of the above are in the Taihu Group.
Taizhou, centered around the city of Taizhou in eastern Zhejiang, is composed of 11 lects, all of which are separate languages: Huangyan (evidence), Jiaojiang, Linhai, Sanmen, Tiantai (evidence), Wenling (evidence), Ninghai (evidence), Xianju, Leqing (evidence), Yubei and Yuhuan (evidence). (Evidence for all).
A single subgroup of Wuzhou, Yiwu, contains 18 separate languages, all mutually unintelligible. We will call them Yiwu A, Yiwu B, Yiwu C, Yiwu D, Yiwu E, Yiwu F, Yiwu G, Yiwu H, Yiwu I, Yiwu J, Yiwu K, Yiwu L, Yiwu M, Yiwu N, Yiwu O, Yiwu P, Yiwu Q and Yiwu R for the time being.
Pucheng is a separate language. Pucheng has 2 dialects, Nampo and North Dabei. Intelligibility data is not known. Pucheng is so diverse that some say it is a language isolate and is not even a part of Wu (Norman 1988).
There are two groups of Southern Wu which are said to be both highly divergent and to have very low intelligibility internally. These groups are sometimes called Jinqu and Shangli.
Jinqu consists of at least 30 languages: Jinhua, Jinhua Xiaohuang, Tangxi, Lanxi, Pujiang, Yiwus A-R, Dongyang, Pan’an, Yongkang (evidence), Wuyi (evidence), Quzhou (evidence), Longyou and Jinyun. Lanxi has 660,000 speakers (Rickard 2006). Quzhou is apparently not intelligible with Wenzhou. Jinqu is roughly equivalent to the Wuzhou Group.
Shangli contains at least 18 languages: Shangrao City, Shangrao County, Guangfeng, Yushan, Kaihua, Changshan, Jiangshan, Lishui (evidence), Suichang, Songyang, Xuanping, Qingtian (evidence here and here), Yunhe, Jingning, Longquan, Qingyuan, Taishun and Pucheng.
This group is roughly equivalent to the Longqu and Chuzhou Groups of Chuqu. Some members of this group extend beyond Zhejiang and into northeastern Jiangxi and northern Fujian.
We are going to cautiously classify all of these lects as separate languages since they are said to be much more divergent and much less mutually intelligible than Taihu, and Taihu itself seems to have pretty low internal intelligibility.
Wenzhou (evidence) is a separate language.
Ouhai, Yongjia and Ruian appear to be dialects of Wenzhou, but all of them are probably separate languages, since if you go 5 miles in any direction in Wenzhou, there’s a new dialect, and it’s hard to understand people.
Wenzhou is 43% intelligible with Suzhou.
Wencheng (evidence) appears to be a separate language.
Wenxi is a separate language within Oujiang, not intelligible with Wenzhou. It is spoken in one town in Qingtian County.
Jinxiang also has its own Wu lect, with Mandarin influences. This is a Taihu (Northern Wu) outlier.
In addition, in Taishun County, there is also an aberrant Wu lect spoken in the town of Luoyang, influenced by both Manjiang and Oujiang Wu.
There is another Wu lect similar to Manjiang Eastern Min spoken in the town of Hedi in Qingyuan County in Lishui.
Manhua is quite different. There is a controversy over whether or not Manhua is Macro-Min or Macro-Wu. It is probably Macro-Wu based on phonology and it also shares some similar Min-like traits with other Wu lects such as those in the Chuqu group.
Within Manhua, there is a northern group spoken in the town of Yishan and a southern group spoken in the towns of Qianku and Jinxiang. Qianku is the standard for Manhua. The northern/southern divide may impede intelligibility, but we have no information yet.
Wuhu is a separate language, unintelligible with Shanghaihua.
Nanjing Wu is a separate language.
Jiaxing, Shanghaiese, Suzhou, Wuxi, Songjiang, Tongxiang, Qidong, Lvsi, Yunhe and Kunshan are all in the Hujia Group of Taihu. The Hujia Group contains 32 lects.
Changzhou, Yixing, Jiangyin and Haimen are in the Piling Group of Taihu. Piling has 12 lects. Piling has 8 million speakers.
Wenzhou, Ouhai, Yongjia, Ruian and Wencheng are in the Oujiang Group of Taihu, which also contains 12 lects.
Hangzhou has its own group, the Hangzhou Group of Taihu.
Shaoxing, Fuyang, Xiaoshan, Linan, Yuyao and Zhuji are in the Linshao Group of Taihu which also contains 12 lects.
Fenghua and Zhoushan are in the Yongjiang Group of Taihu. The Yongjiang Group contains 11 lects and has 4 million speakers.
Changxing is in the Tiaoxi Group of Taihu, which has 5 lects.
The Taihu Group is composed of 75 separate lects, many or all of which are separate languages. Taihu has 47 million speakers.
Lishui, Qingyuan, Jingning, Jinyun and Taishun are in the Chuzhou group of Chuqu, which contains 9 lects. Chuzhou has 1.5 million speakers. Chuqu itself contains 35 separate lects.
Pucheng, Shangrao County, Shangrao City, Jiangshan, Songyang, Guangfeng, Longquan, Kaihua, Changshan, Suichang, Longyou, Yushan and Quzhou are members of the Longqu Group of Chuqu, which has 14 lects and 5 million speakers (Olson 1998).
The Yiwu languages, Dongyang, Jinhua, Jinhua Xiaohuang, Lanxi, Tangxi, Wuyi, Pan’an, Pujiang and Yongkang are all members of the Wuzhou Group, which contains 27 separate languages. Wuzhou has 4 million speakers.
Nanjing Wu is unclassified.
The various Wu languages have 85 million speakers.
Within Hui, there are at least six separate languages (Hirata 1998). Actually, there are many more.
Xidi, spoken in a village at the foot of Huangshan Mountain, is a separate language. Xidi is unintelligible even to villages a few miles away.
Tunxi, Wuyuan and Xiuning are separate languages. The first two are spoken in Anhui, but Xiuning is spoken in Jiangxi Province.
Within the Jingzhan Group of Hui, Jingde, Ningguo, Qimen, Chilingkou (spoken in Chiling, Qimen County), Meixi Xiang, and Shitai are separate languages.
Within Qimen County itself, there are 6 different Hui lects, with low intelligibility between them. It is quite possible that we are talking about 6 different languages here. One of them appears to be Chilingkou above. The others we will just call Qimen A, Qimen B, Qimen C, Qimen D and Qimen E. All except Meixi are spoken in Anhui Province. Meixi is spoken in Meixi, Jiangxi.
Jixi, Hongmen and Shexian are separate languages.
Within Shexian, there are two different languages that we will only call Shexian A and Shexian B for now. Jixi and the Shexian languages are spoken in Anhui.
Dexing and Dongzhi are separate languages, the first spoken in Jiangxi and the second spoken in Anhui.
In the Yanzhou Group of Hui, Jiande and Chunan are separate languages.
There are two other lects in the group, Suian and Shouchang. Chunan and Suian are very diverse and are in all probability separate languages. Shouchang is also extremely diverse, and Jiande has some differences with Shouchang.
The Yanzhou languages are interesting because there is controversy whether they are Wu or Hui languages. Careful examination reveals that they cannot be subsumed under Southern Wu due to their great divergence, despite having some similarities with Wu. Some authors feel that they are Hui-Wu merged lects, and their similarity with both is given as a reason for merging Wu and Hui into a supergroup.
While it is best to classify them as Hui, they are much different from most Hui lects. All are spoken in western Zhejiang. The Yanzhou Group has four languages. Discussion here.
Huangshan, Tunxi, Wuyuan and Xiuning are members of the Xiuyi Group of Hui, which has 6 lects.
Meixi, the Qimens, Chilingkou, Shitai, Ningguo and Jingde are members of the Jingzhan Group of Hui. Jingzhan has 12 lects, all of which are separate languages.
Jixi, Hongmen and the Shexians are members of the Jishe Group of Hui. The Jishe Group has 6 lects.
Dexing and Dongzhi are members of the Qide Group of Hui. The Qide Group has 5 lects.
Xidi is unclassified.
The various Hui languages have 3.2 million speakers. There are 34 different Hui lects, at least 24 of which are separate languages. There is a possibility that all Hui lects are separate languages, but that remains to be proven.
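The figure of 34 Hui lects can be checked against the group counts given above. The following is only a rough tally, not the author's own calculation. It assumes that the "four languages" of the Yanzhou Group are that group's four lects (Jiande, Chunan, Suian and Shouchang) and that unclassified Xidi counts as one additional lect. The variable names are made up for illustration.

```python
# Lect counts per Hui group, as stated in the text above.
hui_groups = {
    "Xiuyi": 6,
    "Jingzhan": 12,
    "Jishe": 6,
    "Qide": 5,
    "Yanzhou": 4,        # assumption: the group's "four languages" are its four lects
    "unclassified": 1,   # assumption: Xidi counted as a single lect
}

# Sum the per-group counts to compare with the stated total for Hui.
total_lects = sum(hui_groups.values())
print(total_lects)
```

Under those two assumptions, the group counts add up to the total stated for Hui.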
Cantonese is a major language spoken in the south of China. Its speakers are said to be a mix between the Yue people and the Han. They have great pride in their speech, which appears to be closer to ancient Chinese than Mandarin is. When Sun Yat-Sen was President of Republican China, a vote was held on which language to base Standard Chinese on. Cantonese lost to Mandarin by only one vote.
Some Cantonese activists denounce Mandarin as a pidgin language spoken by Manchu and Mongol invaders and glommed onto the Chinese of the people they conquered.
Attempts to determine intelligibility through the use of complex lexical, tonal, grammatical and phonological formulae produce results that are excessively high in terms of percentage of intelligibility. A better method is presented in Szeto 2000, in which sentences in other lects are played to speakers of Lect A, and speakers of Lect A are asked to give the basic meaning of the sentences played to them. A sentence is recorded as correct if the basic meaning was ascertained.
By this better method, Standard Cantonese has only 31.3% intelligibility of Siyi, 7.2% of Hakka, 2.7% of Teochew and 2.5% of Xiamen. This paper also highlights the very important role morphological and syntactic differences play in intelligibility, even apart from phonology and other factors.
In contrast, the more complex method not relying on actual informants gives false positives. By this method, Cantonese has 54.7% intelligibility of Hakka, 47.4% of Xiamen and 43.5% of Teochew. This method falsely overestimates the intelligibility of Hakka by 7.6 X, of Teochew by 16.1 X and of Xiamen by 19 X.
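The overestimation factors quoted above are simply the formula-based score divided by the informant-tested score for each lect. A minimal sketch of that arithmetic follows, using only the percentages given in the text. The dictionary and function names are made up for illustration and do not come from Szeto 2000.

```python
# Intelligibility of other lects to Standard Cantonese speakers, in percent.
# All figures are the ones quoted in the text above.
sentence_test = {"Hakka": 7.2, "Teochew": 2.7, "Xiamen": 2.5}    # informant-based method
formula_based = {"Hakka": 54.7, "Teochew": 43.5, "Xiamen": 47.4}  # formula-based method


def overestimation_factors(formula, tested):
    """Return the formula score divided by the tested score for each lect."""
    return {lect: formula[lect] / tested[lect] for lect in tested}


factors = overestimation_factors(formula_based, sentence_test)
for lect, factor in factors.items():
    print(f"{lect}: {factor:.1f}x")
```

Rounded to one decimal place, the ratios come out to 7.6 for Hakka, 16.1 for Teochew and 19.0 for Xiamen, matching the figures in the text.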
Cantonese is traditionally said to have nine tones, but phonemically, there are only six tones, since the last three are just three of the first six with a voiceless stop consonant on the end. These are often called entering tones in traditional Chinese scholarship.
Entering tones have disappeared from most Mandarin lects, probably about 800 years ago due to the influence of invading Mongols speaking Turkic languages, but are still present in Cantonese, Hakka and Min. The original entering tones of Middle Chinese have merged into one or another of Mandarin’s four tones.
Traditional Chinese tones or contour tones end in a vowel or a nasal. However, in Cantonese, the entering tone has retained its original short and sharp character from Middle Chinese, so in a sense, it has a different sound quality.
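The nine-versus-six tone count described above can be made concrete with a small sketch. It assumes the numbering used in the Jyutping romanization, where the three entering tones (traditionally numbered 7, 8 and 9) are written with the tone numbers 1, 3 and 6. That specific mapping is not given in the text, and the names below are made up for illustration.

```python
# Traditional nine-tone numbering collapsed to six phonemic tones.
# Assumption (not from the text): Jyutping-style numbering, in which the
# entering tones 7, 8 and 9 are written as tones 1, 3 and 6 respectively.
ENTERING_TO_PHONEMIC = {7: 1, 8: 3, 9: 6}


def phonemic_tone(traditional_tone: int) -> int:
    """Map a traditional tone number (1-9) to its phonemic tone (1-6)."""
    if not 1 <= traditional_tone <= 9:
        raise ValueError("traditional tone must be between 1 and 9")
    # Tones 1-6 map to themselves; entering tones fold into 1, 3 or 6.
    return ENTERING_TO_PHONEMIC.get(traditional_tone, traditional_tone)


# Collapsing all nine traditional tones leaves six distinct phonemic tones.
phonemic_inventory = sorted({phonemic_tone(t) for t in range(1, 10)})
print(phonemic_inventory)
```

On this view, the entering tones are not extra pitch categories. They are existing tones that happen to occur on syllables closed by a stop consonant.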
Besides Standard Cantonese (the Guangzhou lect in the Yuehai Group), there is Siyi, or Sze Yup, a separate language. Siyi has 8 dialects; however, there are reports that there are intelligibility problems within the Siyi lects.
In particular, Enping speakers cannot understand some other dialects. Therefore, Enping is a separate language.
Kaiping, or Chikan, is not fully intelligible with Enping until they get used to each others’ sounds. Kaiping is so different from Taishan that it is hard to imagine how they can communicate well, though there is partial intelligibility.
In Xinhui, there is a dialect called Hetang that is very divergent and has many strange features not found in other dialects. Doubtless it is less than fully intelligible with other Siyi lects.
Actually, there seem to be many more than 8 dialects of Siyi. In Taishan County alone, there are 20 townships, and there may be a different lect in each one. For certain, there are at least three distinct dialects of Taishan: Taishan A, Taishan B and Taishan C. Even the lects in Taishan County can be quite different. However, all lects in Taishan County appear to be mutually intelligible.
Xinhui is somewhat different from Taishan, but appears to be intelligible. Heshan is said to be intelligible with Xinhui and Taishan.
Nevertheless, there are calls from Taishan speakers to split their lect off from the rest of Siyi. If Taishanese is unintelligible with the rest of Siyi, this would make sense, but that does not appear to be the case.
150 years ago, there was less, but still significant, difference between Siyi and Sanyi (Standard Cantonese), but Siyi was disparaged as a “hill dialect” of poor farmers, while Sanyi was elevated as the prestige lect of the cultured and cosmopolitan. This is why Sanyi became the Standard Cantonese lect. The Siyi incorporated this negative view into their self-image even to the point where they held overseas meetings in Sanyi speech.
There are 3.6 million speakers of Siyi.
Vietnamese Cantonese is quite different from Standard Cantonese, but it is nevertheless intelligible with it. Malay Cantonese is also quite different from Standard Cantonese. Intelligibility data between Malay Cantonese and Standard Cantonese is not known. Both are dialects of Cantonese.
Hong Kong is a dialect of Guangzhou. Foshan and Nanhai are close to Guangzhou and may be intelligible with it. Nanhai and Shunde are mutually intelligible.
Some say that Shunde and Zhongshan are intelligible with Standard Cantonese, but others disagree. This requires further study, as they are obviously close. However, both are said to at the same time be quite different from Standard Cantonese.
Even within Yuehai, Panyu is said to be a separate language (Chan 1981).
Namlong, a poorly understood lect from the Pearl River area, is also a separate language, or at least it was one in 1949. Whether it still exists is not certain, but speakers must still be alive. Yuehai itself has 31 separate lects.
Danija, the Cantonese lect of the Tanka fisherpeople who live on boats off the coast of Guangdong, Guangxi and Hainan, may well be a separate language.
In Hong Kong, another Cantonese language, Gashiau, is spoken by a group of fisherpeople related to the Danija. This language is related to Danija but apparently not intelligible with it.
Maihua, a Cantonese lect spoken on Hainan, may well be a separate language also.
Nanning is a dialect of Cantonese, easily understandable by a Standard Cantonese speaker.
However, Lizhou is a separate language, with difficult intelligibility with Standard Cantonese.
Dongguan and Zhanjiang (evidence) are separate languages.
Shiqi, spoken in Guangdong, is a separate language. Speakers of Standard Cantonese cannot necessarily understand Shiqi, but Shiqi people can understand Guangzhou. Shiqi is spoken in the urban part of Zhongshan City.
Huazhou is a very divergent Cantonese lect that is very hard even for other Cantonese speakers to understand. It is surely a separate language (evidence here and here).
Maoming is an extremely diverse Cantonese lect that must also be a separate language.
Beihai and Hepu are reported to be very different, but intelligibility data is not known, nor is it known to what extent these two lects differ from other Cantonese.
But the Qinlian Group of which they are members must surely be a separate language.
One division holds that the Standard Cantonese (Guangzhou), Siyi, Zhongshan, Gaoyang and Guangfu groups are mutually unintelligible groups.
The Goulou Group of Cantonese appears to be a separate language from all of the rest of Cantonese, and is probably in a group of its own away from the rest of Cantonese, and linked with Pinghua and Tuhua. Yulin is a representative lect in Goulou, and is said to be the present form of Chinese that is closest to Old Chinese.
Siyi has at least 11 dialects, including the famous Taishanese (which includes Taishan A, Taishan B and Taishan C), along with Heshan, Jiangmen, Siqian, Doumen, Xinhui, Enping and Kaiping.
Nanning is in the Yongxun Group of Cantonese, which has 12 lects.
Zhanjiang and Maoming are members of the Gaoyang Group of Cantonese, which has 10 lects. Gaoyang has 5.4 million speakers.
Dongguan, Shunde, Foshan, Zhongshan, Nanhai, Panyu and Hong Kong are members of the Guangfu Group of Cantonese, which has 31 lects. Guangfu has 13 million speakers.
Shiqi is a member of the Zhongshan Group of Cantonese, which contains at least 3 lects.
Huazhou is a member of the Wuhua Group of Cantonese, which has 2 lects.
Beihai and Hepu are members of the Qinlian Group of Cantonese, which has 6 lects.
Namlong is unclassified.
There are 100 lects of Cantonese, and Cantonese has 64 million speakers.
Pinghua, now recognized as a major split off from Cantonese, is composed of Guinan and Guibei, which are separate languages. The Guibei lects are very different, but we don’t have any intelligibility data.
Guinan has 22 lects, and Guibei has 8 lects.
There is one Pinghua lect that is unclassified.
Pinghua has