A few months ago I finally decided that I'd had enough of repeatedly forgetting most of the Japanese I knew and then spending time getting back to an intermediate comprehension level, and that what I ought to aim for once and for all was to attain native-level fluency, at which point I'd then be able to rely on reading newspapers and listening to radio and TV broadcasts to maintain my newfound proficiency. This effort has thus far been surprisingly successful, in that I can now follow most TV dramas without much difficulty and can now read Japanese newspapers well enough that I regularly see stories in them that are only translated days later if at all, but one fairly obvious point my efforts to reach this level have alerted me to is that by far the biggest obstacle adults face in learning languages like Japanese which have very regular grammars and which are unrelated to the ones they already know isn't mastering grammatical rules but vocabulary: reasonable languages* have at most a few hundred rules of grammar to master, but in order to comprehend most of what people are talking about in a language without continuously reaching for a dictionary, one has to be able to effortlessly recall several thousand terms at the very least.
With languages like German, Spanish and French which are closely related to English, the burden of vocabulary acquisition is not so noticeable due to the huge numbers of terms shared in common, but when one no longer has the crutch of a Germanic and/or Latin inheritance to rely on, the immensity of this task becomes much more obvious. If we define a "word" to mean not just the root term but also all its inflections and derivatives - so that "vary", "variant", "variable", "invariant", "invariably" and "unvarying" all count as the same "word" - the average adult speaking any language probably actively uses something on the order of 5,000 to 10,000 such words in speech or writing, while passively knowing the meaning of perhaps 2 or 3 times as many. Considering that native speakers of a language have had their entire lives to accumulate this store of knowledge, it's hardly surprising that adult language acquisition is so difficult; even if it were not true that children have an innate facility for languages that adults lack, they'd still have an easier time of picking up new tongues due to the fact that no one expects them to possess fully stocked vocabularies, while the vocabulary gap between them and their peers to whom the new language is native is much smaller than it is with grownups. Adult language learners, on the other hand, must go from being able to use terms like "aesthetics", "itinerant", "peroration" and "luminary" to struggling to learn new ways to say mundane things like "pothole", "engine", "leaky" and so forth.
Now, this is, as I've said, all rather obvious once you think about it, but even so it's still far too vague for my liking: what I want to do is quantify all of the above in at least a half-way rigorous fashion, so that I can try to give an answer to the following question: how many words does one really need to know to be able to call oneself "fluent" in a language, and given the number "X", what does this imply for the number of new terms adult learners must master per day? Obviously this answer will depend in part on what time frame one is looking at for getting up to speed in a language, but there are realistic limits to how many new words anyone can learn in a day well enough to remember them over the longer term, and this in turn will set some hard limits on how quickly one can hope to gain fluency even under the most intensive training regimen.
In planning my "mastering Japanese" program I set myself a base target of 20 new words a day, though I have exceeded this number pretty much every single day, and am averaging about 30-35; as I didn't start from a baseline of zero, this has been sufficient to move me from almost total incomprehension of teenage dramas to a passable understanding of even exceedingly formal NHK news broadcasts, but I'll only know I've reached my goal when it becomes impossible for me to meet my target of 20 new terms a day without resorting to reading highly technical or academic treatises, and it's precisely to estimate how far I am from this end goal that I need the kinds of numbers I'm searching for. Japanese being a highly inflected (or rather, agglutinative) language, it doesn't require of its speakers nearly as many base "words" as the highly analytic English does, but if this English vocabulary size estimator** were indicative, I'd guess I'm still perhaps 3-4,000 new terms short of where I'd like to be.
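To make the arithmetic concrete, here is a minimal Python sketch of how a vocabulary gap translates into study time at a fixed daily rate. The function name `days_to_close_gap` is hypothetical, the gap figures are taken from the rough 3-4,000 estimate above, and the model assumes that every word learnt is retained, which is optimistic.

```python
import math


def days_to_close_gap(words_remaining: int, words_per_day: float) -> int:
    """Return the number of whole days needed to learn `words_remaining`
    new terms at a constant rate of `words_per_day` (no forgetting assumed)."""
    if words_per_day <= 0:
        raise ValueError("words_per_day must be positive")
    return math.ceil(words_remaining / words_per_day)


if __name__ == "__main__":
    # Gap estimates and daily rates mentioned in the post; purely illustrative.
    for gap in (3000, 4000):
        for rate in (20, 30, 35):
            days = days_to_close_gap(gap, rate)
            print(f"{gap} words at {rate}/day: about {days} days")
```

At the base target of 20 words a day, a 4,000-word gap works out to 200 days under these assumptions; any forgetting along the way would stretch that further.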
*And here I count English as being among the least reasonable languages in widespread use; nearly every possible grammatical "rule" seems to have some exception or other, and usually there are several.
**Native English speakers who are prone to belittling foreigners for having an imperfect command of their language should consider just how basic many of the terms at the 12,000 to 18,000 word level on this test are: if one learnt English terms in their order of frequency, one would still not know the meaning of "adaptable", "capsule", "justify" and "liberate" even after mastering some 12,000 terms! Then there's the hell that is English grammar to consider ...
Full caveat: After about six years without practice my Korean has mostly gone to shit.
That said, what I found useful in Korean when learning the vocab was, if learning a sino-Korean word to learn what the hanja meant (the sound if not the character). The result was that once I had a decent enough store of those in my head, I was often able to use them in the same manner that an Anglophone may not know Latin but still has a grasp of the Latinate roots of the language to help him understand words he's never seen.
Of course, Japanese is something of a different kettle of fish (since as I understand it even many kanji have both a pure Japanese and a Sino-Japanese pronunciation), but I wonder if a similar sort of approach might be helpful.
Oh, and I think you might have inspired me to try and return my Korean to a functional level.
Posted by: Andrew Reeves | February 16, 2006 at 12:53 PM
"what I found useful in Korean when learning the vocab was, if learning a sino-Korean word to learn what the hanja meant (the sound if not the character)."
This is a useful suggestion, and one I do try to make use of; unfortunately, the sheer abundance of homonyms in Japanese, as well as the multiplicity of possible readings characters may have (as you've noted yourself), makes this rather less useful in practice than one might think, especially when trying to go from spoken -> written Japanese.
"Oh, and I think you might have inspired me to try and return my Korean to a functional level."
I expect this should be rather less painful than you might expect; it's amazing how quickly what one thought one had forgotten comes back to one; it's the genuinely new stuff that bogs me down.
Posted by: Abiola | February 16, 2006 at 01:50 PM
The Asian character-based languages have another disadvantage for Westerners, of course, since memorizing the sight and sound of words are two different things. I am learning Chinese (unfortunately Mandarin, not Cantonese), which adds another layer of tonality.
I have heard that it takes kids in China more effort to become fluent in Chinese than most other countries/languages. Do you think this might be true?
Posted by: Andy | February 16, 2006 at 02:31 PM
"I am learning Chinese (unfortunately Mandarin, not Cantonese), which adds another layer of tonality."
Why unfortunate? At least you have "only" 4 tones instead of 6/9...
"I have heard that it takes kids in China more effort to become fluent in Chinese than most other countries/languages. Do you think this might be true?"
This might be because many kids in China learn Mandarin as a second language (their mother tongue being the local "dialect," i.e. language that is mutually incomprehensible with Mandarin).
Posted by: Andrew | February 16, 2006 at 02:50 PM
First you have to remember that the distinction between "vocabulary" and "grammar" is only provisional. In many languages the distance between the two is great enough that it looks absolute, but in others there is no real separation between the set of rules that governs the derivation of words and the set that governs the inflection of words so that you can use them in sentences.
And this has obvious implications for your main question. A language with overt or explicit rules is going to be a lot simpler for an adult learner. Probably the number of adult learners a language has influences this. A guy who had had to learn Turkish said that it was almost creepy how easy it was to deduce what this or that word would be to express some concept - very transparent. You would expect that in a language that took on a huge number of foreign adult learners fairly recently in its history. The other extreme is a language like Irish, where the grammar is not only inordinately complex, especially in the morphophonetic part of the grammar, but also contains a huge mass of exceptions. This doesn't make Irish somehow less reasonable than Turkish, it just means that Irish as a cultural subsystem is optimized for the opposite result - it serves to exclude foreign learners. It is very efficiently parochial.
A further implication is that a language like Chinese, which has only a few extremely broad rules for word derivation, has an especially difficult vocabulary. The lack of detail in the rules leaves you guessing or just defeated a lot of the time. This is pretty much the problem in English, which has a decent stock of word-formatives, but not much standardization in the semantic results they yield.
Again, this doesn't mean that languages with highly transparent vocabularies are somehow more reasonable than the others, just that the two types have developed in opposite directions. Highly transparent vocabulary is good only for transmitting bare facts. It is useless for word play, or any kind of evocative use of the language where a certain degree of vagueness is crucial for achieving the desired effect. An individual person may prefer one or the other, but in language individual preferences don't count for much; the point is that vague and inconsistent structures in vocabularies serve an obviously important social purpose and are more common than not.
Bottom line - there is not going to be one answer to how many discrete words a person is going to need. It is language dependent.
Posted by: Jim | February 16, 2006 at 09:19 PM
Okay, so one could learn 12000 vocabulary words before coming up with something like "justify"--but I would guess that many fewer words would have to be learned before one could come up with "show or prove that something is right."
My point is that vocabulary is only one of many aspects of fluency. A similar case could be made for grammar. One can know all the rules of grammar but still not be fluent. Why? Because fluency is ultimately about the ability to communicate, and communication is, yes, vocabulary and grammar, but it is also about personality and creativity and confidence.
I have, for example, a Japanese student, a medical researcher with an enormous and sophisticated vocabulary from years of reading scientific journals. He could come up with words like "aesthetics", "itinerant", "peroration." (I had to look up "peroration" by the way.) He has the vocabulary and he knows the grammar and his listening is excellent. Is he fluent? No. Why? Because a conversation with him is excruciating. Oh, he can craft a perfect sentence given half a minute or more--a time which is too long to make a native speaker of English wait for a reply during a conversation.
Another point is that as a Westerner in Japan, I can wow the locals with a few words strung together with very bad grammar. Why? That has to do, I believe, with the monocultural nature of Japan. The reverse is not true in multicultural America. The point here is that Americans at least judge fluency with a different standard because we are accustomed to contact with people of all ethnic backgrounds who are fluent in English. In America, someone who looks Japanese (because their parents emigrated from Japan) will speak English as well as or better than I do, because they grew up in America.
Fluency is not necessarily quantifiable. Knowing X number of words and every grammar rule in the book doesn't equal fluency. Fluency is about grammar and vocabulary, personality, creativity, manners and customs, cultural knowledge and understanding. True fluency in English is very difficult to attain and retain.
Posted by: brenda Garcia | February 16, 2006 at 10:05 PM
Brenda,
there is a difference between fluency and accuracy and range and a whole lot of other measures of language competence. Fluency is not some specialized term for competence.
A three-year-old can be completely fluent - like a broken pipe.
Hah! There's an example of the value of non-transparent vocabulary.
Posted by: Jim | February 16, 2006 at 10:51 PM
"Highly transparent vocabulary is good only for transmitting bare facts. It is useless for word play, or any kind of evocative use of the language where a certain degree of vagueness is crucal for achieving the desired effect."
It isn't clear what you mean here by "transparent vocabulary", but if it's what I think it is - that the language is agglutinative, as with Turkish, Finnish and Japanese - then I can quite definitely say that you could not possibly be more wrong, at least where Japanese is concerned: the language is rife with vagueness, evocative wordplay and subtle allusion, as anyone familiar with its poetry will tell you.
"Bottom line - there is not going to be one answer to how many discrete words a person is going to need. It is language dependent."
Who's said anything about giving one answer? If anything, what I had to say about the analytic nature of English making for more base terms ought to have tipped you off that I'm already aware that there's no such number. What I'm looking for is to be able to say "Given language X, you'll need to master the Y most common terms in it to be able to read, say, 95% of a typical broadsheet newspaper without needing to reach for a dictionary, and at that point you'll only encounter terms you don't understand on TV once or twice per week."
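A rough sketch of how such a figure Y could be estimated from a corpus, assuming one already has a list of tokens that has been segmented and reduced to base forms (for Japanese this would need a morphological analyser, which is not shown here). The function name and the toy corpus are hypothetical; this is just one plausible approach, not necessarily the method alluded to in this thread.

```python
from collections import Counter
from typing import Iterable


def words_needed_for_coverage(tokens: Iterable[str], target: float = 0.95) -> int:
    """Return how many of the most frequent distinct words are needed so that
    they account for at least `target` (a fraction between 0 and 1) of all
    running tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus is empty")
    covered = 0
    # Walk down the frequency list, accumulating the share of text covered.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= target:
            return rank
    return len(counts)


if __name__ == "__main__":
    # Toy corpus standing in for a segmented newspaper archive (an assumption).
    corpus = "the cat sat on the mat and the dog sat on the cat".split()
    print(words_needed_for_coverage(corpus, target=0.75))
```

Run against a large newspaper corpus in language X, the returned rank would be an estimate of Y for a given coverage target; the result depends heavily on how inflected forms are grouped into "words".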
"My point is that vocabulary is only one many aspects of fluency. A similar case could be made for grammar."
The thing is, nowhere have I made the claim that mastery of either is *sufficient* for fluency, only that they are *necessary*, and that for English-speakers learning Japanese, vocabulary acquisition is a much more daunting task than getting the grammar down: as languages go, Japanese, with a sum total of 3 irregular verbs, is one of the most grammatically sensible on this Earth.
"The point here is that Americans at least judge fluency with a different standard because we are accustomed to contact with people of all ethnic backgrounds who are fluent in English."
No, what is happening here is that you're mistaking the erroneous but common Japanese belief that theirs is an impossibly difficult language (along with their tendency to butter up ignorant foreigners for politeness' sake) for the idea that they think you're fluent, when nothing could be further from the truth.
What you're describing is the linguistic equivalent of hearing "ohashi jouzu desu ne" and taking it at face value: the reality is that the more fluent you actually are, the more ready native Japanese speakers will be to tell you just how far you have to go. Japanese people are, if anything, *less* forgiving in terms of how they judge fluency than English-speakers are, for the simple reason that they aren't used to dealing with individuals who don't have a firm command of 標準語.
"Fluency is not necessarily quanifiable. Knowing X number of words and every grammar rule in the book doesn't equal fluency."
I don't say that it does: what I *do* say is that there are certain prerequisites one must have in order to be fluent, and a large enough vocabulary is one of them. As such, it makes perfect sense to wish to quantify just how large that vocabulary must be, as indeed linguists already try to do; in fact, I even have a rather good idea of how I'd go about doing so myself.
Posted by: Abiola | February 17, 2006 at 12:38 AM
[...ohashi jouzu desu ne...]
Which of course immediately invokes a stereotype threat - and I guess this happens linguistically also.
Posted by: Chuckles | February 17, 2006 at 01:07 AM
I think that these are interesting points, although I'd be more interested in how you get to a decent level of a language - i.e. focus on the following 5 tenses..., learn these 150 verbs, 100 adjectives and 200 nouns, learn these 50 phrases, know how to use each of 'who, what, where, when, whose, why', learn 'if, can, have to/must, try', then just practise, practise, practise.
The point that I'm trying to make is that it'd be nice to quantify a core syllabus and have confidence that mastery of this will lead to genuine competency.
Posted by: Kendo | July 18, 2006 at 03:22 PM
(Old thread, I know, but...) I had this same revelation in my Japanese learning several months ago as well. Despite how simple the "mathematics" of needed vocab is, I find it is a point that is often overlooked in language training.
Yes, as was discussed, there is more to fluency than vocabulary, but you cannot be fluent without a much larger one than most language learners realize, I think.
Posted by: Jason Braswell | December 13, 2009 at 04:12 PM
I know this message line is waaay out of date, but maybe a few people are still following it...
I wanted to add that there is an additional element to fluency that is not mentioned in any of the posts: listening comprehension.
I am going from 0 to living in Japan and trying to learn the language as quickly as possible, and listening comprehension is the biggest hurdle for me so far. I have finished all the introductory textbooks and I think I'm at about 3000 spoken/hiragana words (4 months in), but the listening comprehension is just killing me; so many times I ask people to write down an undecipherable phrase only to find out I knew every word they were saying! I suppose this is a side effect of trying to learn the language so quickly...
It's quite frustrating though, because the only thing I can think to do about it is keep listening to tapes and stuff with subtitles and pray that it somehow gets better.
For the record, everyone has their own definition of fluency. I doubt I will ever stop learning English words (I had to look up all of the words except "aesthetics" that the original post gave as examples...)
Posted by: Nicolaus Schmandt | December 17, 2009 at 05:03 AM