The Hexacoto

Listening to the sound of one hand clapping

Category: Academic

The persistence of comprehension

[Image: chinesefup]

Some time ago, Instagram user jumppingjack posted the above image of a note she left for her mum. She said that her brother had secretly added extra strokes to the characters in the note. The result is interesting: even with the extra strokes, the note is still readable to most competent Chinese readers. This phenomenon is very similar to one noted not too long ago in English and dubbed “Typoglycemia,” a portmanteau of “typo” and “glycemia” and a pun on “hypoglycaemia”: as long as the first and last letters of a word are preserved, the middle can be scrambled and the word is still understandable.

This is an interesting case in what I call persistence of comprehension, where comprehension of words persists despite efforts to thwart it.

The Preamble

Unlike English, which uses an alphabetic system where each letter roughly represents a phoneme, or Japanese, whose kana form a syllabary where each character is a mora, Chinese uses a logographic system, using “pictures,” or logographs, to represent words. So unlike the other two systems, where there are units to scramble, it is hard to “scramble” a picture; scrambling a picture amounts to adding or subtracting strokes from a character, which is what jumppingjack’s brother did.

Before I go further, let me type out what the note intends to say:

妈妈,明天的午餐因为
人数不够,他们把这个
活动换到下一个星期
(可是下个星期六中午我的
公司有个午餐)

谢谢 🙂  (我明天应该有吃午餐)

Mum, for tomorrow’s lunch because
there aren’t enough people, they have
changed this activity to next week.
(But next Saturday at noon my
company has a lunch event)
Thanks 🙂  (I should be eating lunch tomorrow)

So how does persistence of comprehension occur in Chinese? I shall illustrate some of the characters that are easily understood despite the scrambling, and the ones that threw me (and my friends) off the most. (Also, note that jumppingjack herself mis-wrote the character 期, switching the 月 and 其 around; that was not her brother’s doing. But the brother added an extra radical to it as well.)

[Image: chinesemessy]

The image above sorts some of the words in the note in order of persistence of comprehensibility from top to bottom, with top being easiest to understand despite scrambling and the bottom being the hardest. The scrambled word is on the left and the proper word is on the right. Note that the bottom four scrambled words are all actual Chinese words, which I will talk about shortly.

The scrambling of the 的 character is one of the easiest to understand because, despite the additional stroke, it still mostly resembles its original character and does not resemble any other word in the language. The added stroke is not a radical (a graphical component of a character that is often semantic), unlike in the scrambling of the character 明 (tomorrow). Similarly for 因, the added stroke turns the 大 in the 因 into a 太, but overall the result is not a real word and mostly resembles its original.

Now we look at the addition of a stroke in 明, turning the 日 (sun) radical, usually used for weather-related words, into a 目 (eye) radical, usually used for vision-related words. The resultant scrambled word is still not a word, but the morphing of a semantically-relevant radical into another makes one pause when reading the sentence. Also, the addition of a stroke to the 月 (moon) component turns it into a 用 (use) character, making comprehension even more difficult.

One step after the 明 character is the 们 character, where not a stroke but an entire 中 (middle) word has been inserted in the middle (haha) of 们. Some of my friends disagree that it is harder than the scrambling of 明, and I’m inclined to agree with them, so I’d call it a toss-up between the two. However, I feel that inserting an entire word, as opposed to a stroke or radical, morphs the character to the point that it becomes too alien to resemble its original, yet it does not resemble any other word in Chinese either.

Lastly, the final four words, 公, 午, 个, and 下, have had strokes and/or word components added to them such that they actually resemble other words in the language: 翁 (old man), 牛 (cow), 伞 (umbrella), and 卡 (card). With such resemblance to real words, it is little wonder people have difficulty understanding them as they read.

The Analysis

How is it that we are able to understand the note with little difficulty?

In the English “Typoglycemia,” it has been suggested that we identify words not solely by letter position in a word, but by context, the shape of the word, and the position of the word in the sentence. I would go so far as to suggest that, in English, seeing the individual letters of a scrambled word draws upon our stored memory of the word, further aiding comprehension. Compare:

  1. Adcnirocg to rrasceeh at a ptaruilacr ureitnvisy
  2. Aoincdrg to rcseerh at a plaaicutr uesvtniiy
  3. Aroindg to rearech at a pluiraacr utrisveiy

Example 1 is classic “typoglycemia,” where persistence of comprehension is strong. Example 2 removes one non-essential letter from each word, and persistence of comprehension is still relatively strong. Example 3 removes what I consider an essential component of the memory of the word; these components are usually consonants and not vowels. Take this example:

  • I cnt blve u dd tt!

In English, vowels can be removed quite easily and the word can still be understood. This suggests that consonants play a slightly more important part in the reading of words. In that respect, comprehension of written English has some similarity to comprehension of written Arabic or Hebrew, where vowels are typically not included in the writing (not in the way English includes them, anyway). Thus, it is harder to understand “aroindg” as “according” because

  1. An essential component has been removed (3 syllables; essential components: a-cc-r-dng). This might be so because consonant representations are tied up with their phonetic properties. This is why “cc” has to be removed from “according,” as opposed to just “c,” for comprehension to fail: “cc” in the word correlates to the /k/ sound in /əˈkɔː(ɹ)dɪŋ/, and even a single “c” is sufficient to clue us in that there might be a /k/ sound in the scrambled word.
  2. It is the first in the sequence of essential components, suggesting that perhaps we process essential components sequentially in our head. It could be that when we see the word “according,” we could be drawing upon the idea that “according” has the components “a-cc-d-ng” in that order. Hence removing the first component “cc” impedes comprehension as it cannot give the subsequent components context of what the word might be (compare understanding: “aroindg” (“cc” removed) with “aocrcdg” (“in” removed)).
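The three kinds of manipulation used in the examples above (scrambling the interior of a word, dropping its vowels, and removing a letter group such as “cc”) can be reproduced with a minimal Python sketch. The function names are my own and purely illustrative, and the shuffle is random, so it will not reproduce the exact scrambles quoted above.

```python
import random

VOWELS = set("aeiou")


def scramble_inner(word, rng):
    """Shuffle the interior letters, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle in very short words
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]


def drop_inner_vowels(word):
    """Remove vowels from the interior of a word, keeping first and last letters."""
    if len(word) <= 2:
        return word
    inner = "".join(ch for ch in word[1:-1] if ch.lower() not in VOWELS)
    return word[0] + inner + word[-1]


def drop_component(word, component):
    """Remove the first occurrence of a letter group such as "cc"."""
    return word.replace(component, "", 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same scrambles
    for w in ["according", "research", "particular", "university"]:
        print(w, scramble_inner(w, rng), drop_inner_vowels(w))
    # Remove the hypothesised essential component first, then scramble.
    print(scramble_inner(drop_component("according", "cc"), rng))
```

Dropping only interior vowels turns “believe” into “blve” and “according” into “accrdng,” which is close to how people abbreviate by hand; it is a rough stand-in for that habit, not a model of it.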

How does this relate to Chinese? If we can say that we draw upon essential sequences of components in the comprehension of written English, perhaps there is an equivalent of that in the comprehension of Chinese. I believe that in reading Chinese, there is a stored visual memory of what the character looks like in general, and also an idea of what strokes the character should contain (“legal strokes”), and what it should not (“illegal strokes”).

First, we address whether modifying a Chinese character sets off alarm bells to the reader. Adding legal strokes to scrambled characters should stand out less to the reader, causing him to accept the character as a real word visually. We look at the following example where this is demonstrated:

[Image: chinesemessy2]

In the note, a floating shuzhe (vertical-bend) stroke is added to the 够 character, and in the Chinese language there is no such occurrence of a floating shuzhe; they are always attached to other strokes, such as in 喝 (with some exceptions, like 断, which may or may not be attached). Being visually alerted that there is something wrong with the character, we immediately visually discount the scrambled 够, and are able to extract the original word. In the example of 他, the pie (leftward-slant) stroke is added on top of the 亻radical, creating a 彳(step) radical, which exists. Thus when reading the scrambled word of 他, it does not jump out at the reader visually as the shuzhe stroke in 够 does, and we are likely to gloss over it and accept it as it appears to us and are less likely to question whether the character is out of place contextually or not.

Next, adding a legal stroke to scrambled words causes more confusion when the stroke turns the original word into a semantically different word. There is extra confusion when the meaning of the new word does not fit in the context of the sentence, especially when the word has been accepted as it is, as explained in the previous paragraph. These can be seen in the following examples:

[Image: chinesemessy3]

If my premises are right, in example 1, readers should be able to identify the error most easily and yet still read the sentence in its original context. In example 2, they should gloss over the wrong character, and since it still closely resembles the original and is not a real word at all, persistence of comprehension should still be strong. Example 3 is where comprehension begins to be thwarted: 他能够卡去吃午餐 (He is able to card go eat lunch) and 他能够下去吃牛餐 (He is able to go down and eat cow meal) don’t make any sense, as the scrambled words both have legal strokes and are real words, and their meanings are contextually out of place in the sentence.
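The three cases can be written down as a small decision rule. This is only a sketch of the hypothesis as stated in this post, not a tested model; the class, its fields, and the outcome labels are names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlteredCharacter:
    intended: str                  # the character the writer meant
    has_illegal_stroke: bool       # e.g. a floating shuzhe
    resembles_real: Optional[str]  # a real character it now looks like, if any


def predicted_effect(ch):
    """Return the reading outcome the hypothesis predicts for an altered character."""
    if ch.has_illegal_stroke:
        # An illegal stroke stands out visually, so the reader discounts it.
        return "noticed and discounted"
    if ch.resembles_real is None:
        # Legal strokes, but no real character results: the reader glosses over it.
        return "glossed over"
    # Legal strokes and a real character: accepted as is, and the sentence breaks.
    return "misread as " + ch.resembles_real


if __name__ == "__main__":
    samples = [
        AlteredCharacter("够", True, None),   # floating shuzhe added
        AlteredCharacter("他", False, None),  # 亻 turned into 彳
        AlteredCharacter("下", False, "卡"),
        AlteredCharacter("午", False, "牛"),
    ]
    for ch in samples:
        print(ch.intended, "->", predicted_effect(ch))
```

The rule is deliberately binary; in practice how closely the altered character resembles its original presumably matters too, as the ordering of 的, 明 and 们 earlier suggests.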

The Conclusion

What I have called “the persistence of comprehension” seems to be a little-researched area in English, much less Chinese. I offered the following explanation for the persistence of comprehension in English “typoglycemia”: we are able to read scrambled English through a combination of context, length of word, letter position, shape of word, word position in a sentence, and (as I have illustrated with examples) identification of the letters, which draw upon phonetic representations of the word in our head.

In the more interesting case of Chinese, which is logographic, I posited that there are legal and illegal strokes which can be added to a character. Legal strokes are less likely to be noticed than illegal ones. If the scrambled word is itself a real word, the legal strokes mask the fact that the word has been scrambled, and when we read it, the sentence doesn’t make sense because we do not suspect that a character has been tampered with.

All in all, more extensive research must be done than this blog can provide. I don’t know if I will be able to do so, but if anyone wants to hear my notes on this topic, feel free to reach out to me at ws672[at]nyu[dot]edu.

Note: In the original version of this post, I wrote that jumppingjack was male, when she is female. Corrections have been made.

Speaking fake English, or any other fake language

What qualifies the English language to sound “English” enough? Very often, people in the English-speaking world have impressions of what foreign languages sound like. To many English speakers, Chinese (excluding stereotypical “ching-chong” variants) sounds like “Xie shi hao ni jing ling ping dao,” replete with its tonality; French has its uvular R’s and lots of Z’s and nasalities, as in “Le beton est un plus morraise il a son telle fusontique des mon”; Italian has its inflections on certain syllables; and so forth.

What about fake English? Were a foreigner to make fun of what English sounds like to them, how would they reconstruct it?

Turns out faking a language requires at least a basic knowledge of the morphemic and phonetic structure of that language. Why do people at the least go “ching-chong” when talking about Chinese, and rattle their throats and noses trying to speak fake French? It’s because they know these languages feature these consonant and vowel relationships.

Knowing the phonetic map is only one part of speaking a fake language. The other part, which makes the fake language sound convincing, is knowing how the sounds fit together to form words.

The video above speaks fake Chinese, and as a Chinese speaker, I find it very far off, simply because he does not understand the tonal system of Chinese, nor can he reproduce certain syllables.

The video below shows a somewhat convincing fake English, as it imagines what English would sound like to a foreign person who does not speak the language.

Any English speaker would realise that the clip actually uses a lot of real English words but is for the most part unintelligible, yet still sounds distinctively English. I feel that the writers of the script relied too much on real words and simply garbled the rest, when they could have pushed further, changing up more words using English phonomorphemic rules to create a convincing and clear fake English conversation.

I wrote previously that we can extract semantic meaning from nonsense words, through parallel sounds and the morphemes attached to them. Likewise, for fake English to sound most convincing, we need to preserve morphemes, because for some reason English morphemes are very English to any English speaker. So much so that we attach them to foreign words when we attempt to Anglicise them. For example, we can say a person “kamikaze’d,” or that perhaps something could be “taco-licious.” What that means exactly, I’m not sure, but we often attach English affixes to foreign words to make them fit into our language.

Likewise, if we were to create a nonsensical, fake English conversation, we must preserve these affixes, for they give words their purposes. For example, we use “-tion” to turn something into a process, such as “crown” to “coronation,” or “investigate” to “investigation.” If I used a word like “hakilimation,” chances are a competent English speaker can infer that the root word would be “hakilimate.” If I said a person is “taffing,” the root verb is probably “to taff.”
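The inference a listener makes here can be mimicked with a couple of suffix rules. Below is a minimal Python sketch, assuming just two hand-picked rules (“-ation” and “-ing”); the rule table and the function name are illustrative and nowhere near a real morphological analyser.

```python
# Each rule maps a suffix to what replaces it when guessing the root.
SUFFIX_RULES = [
    ("ation", "ate"),  # investigation -> investigate
    ("ing", ""),       # taffing -> taff
]


def guess_root(word):
    """Guess the root of a (possibly nonsense) word from its English suffix."""
    for suffix, replacement in SUFFIX_RULES:
        # Require at least two letters of stem so "king" is not cut down to "k".
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)] + replacement
    return word  # no known suffix: leave the word alone


if __name__ == "__main__":
    for w in ["investigation", "hakilimation", "taffing", "coronation"]:
        print(w, "->", guess_root(w))
```

Note that the sketch turns “coronation” into “coronate” rather than “crown”: the affix is regular even where the root is not, which is the reason a nonsense word carrying a real affix still feels English.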

Here’s my attempt at speaking fake English, using the rules I have highlighted. I think if someone weren’t paying close attention and heard this in the background, it could pass for real English. Also included are fake Chinese and Japanese that, in my opinion, sound a lot more legit than attempts by those without knowledge of how the languages are structured.

Here’s an example of a Microsoft ad that uses fake Chinese convincingly. Granted, a lot of the words are slurred, given its more conversational nature, but to those who know the language, some actual Chinese can be teased out from that blur of words.

Linguistic superiority is bunk

Someone once said to me, “哎呀,你的中文那么不标准!”

That basically meant: Aiya, your Chinese is really substandard!

And that was in response to me telling them my Chinese name. That someone was from Beijing, China, and I am from Singapore. We both speak Chinese, but upon hearing my pronunciation of certain words different from the way they do it, they denounced it as being substandard, for not being the “Beijing standard.”

Thus, they claim linguistic superiority for Beijing Chinese over any other regional variety.

It’s not even the way Cockney differs from RP in England, or African American Vernacular differs from General American English. On Wikipedia, the Chinese spoken in Singapore and in China are both called “Standard Chinese,” but inevitably there are phonological differences that even Wikipedia cannot capture.

A very basic example is the way my name is pronounced.

A character in my name, 俊, is transcribed in pinyin as “jùn”. As many of my friends from China would pronounce it, and the way Wikipedia transcribes it, they say:

/t͡ɕyn/

with a /voiceless alveolo-palatal affricate + high front rounded vowel + alveolar nasal/. There is a very audible “tch” sound at the onset of the word.

In Singapore, that character in my name would be pronounced:

/d͡ʒyn/

with a /voiced palato-alveolar affricate + high front rounded vowel + alveolar nasal/. That means that the initial “j” sound in Singaporean Chinese is similar to the way “judge,” “gee,” and “job” are pronounced in English. There is no “ch” sound audible at the onset of the word.

Another difference would be the character 需, xū, as in “to need.” In China, it would be pronounced:

/ɕy/

with a /voiceless alveolo-palatal sibilant + high front rounded vowel/. There is a very audible, thin “sss” sound at the onset of the word.

In Singapore, that character would be pronounced:

/ʃy/

with a /voiceless palato-alveolar sibilant + high front rounded vowel/. It is almost indistinguishable from the way “she” is pronounced in English.

Here is an example of how Standard Chinese sounds are generally pronounced by people from Mainland China:

Note the “j” “q” “x” sounds at the 41 second mark.

Compare with this Singaporean Chinese news clip:

Note that at the 23 second mark, the news broadcaster even says a name that has my 俊 “jun” character in it, and the initial “j” is a lot softer than in the Chinese from Mainland China. Also, the Chinese spoken by the interviewee immediately afterwards is closer to how most Singaporeans speak Chinese, with consonants closer to Taiwanese Chinese than to Mainland China Chinese.

Another video clip of Singaporean Chinese, as spoken by kids, with a lot of usage of the “xue” word. Note that they all say /ʃyœ/ (sh-ü-eh).

A very simple reason for that difference is the way we learn Chinese. Those in China learn Chinese via the “bopomofo” method (see video embedded above), where there is an emphasis on preserving the initial sounds (“ji-yu=ju” “xi-yu=xu”). In Singapore, Chinese is taught via the hanyu pinyin system, where its English letters are used as a springboard to understanding the sounds of Chinese. That makes sense in Singapore, given that its bilingual education system begins even in kindergarten, whereas in China English is only introduced at a much later age, in elementary school.

As such, there is some overlap between the consonants of English and Chinese in Singapore, where “xu” is pronounced “she” and “you” is pronounced “you/yew” (as in English), rather than “yo-uu” (as Mainland Chinese people would).

Furthermore, given the influence of Chinese dialects such as Hokkien (Southern Min/Min-nan), Teochew, Cantonese, and Hakka, all of which are southern Chinese dialects, you get pronunciations of certain consonants that mimic Taiwanese or Hong Kong Chinese, such as sometimes interchanging “chi” “shi” “zhi” with “ci” “si” “zi” in casual speech (that is, among people who are not broadcasters or taking exams). An example would be the first video of Singapore Chinese I embedded (about the iPhone 4), where the guy said “zè” instead of “zhè” (这).

Does this make any of the Chinese spoken in Singapore, Malaysia, Taiwan, and other parts of the world less “standard” than the Beijing standard?

Were that so, then wouldn’t British English alone, and not even American English, be the gold standard of English in the world? Languages change and adapt to the locale, and to insist that only one variety of a language is proper and the rest substandard is linguistic arrogance.

Extracting meaning in nonsense

Image credit to Wikipedia

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!”

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One, two! One, two! and through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

“And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!”
He chortled in his joy.

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

— Lewis Carroll, “Jabberwocky”, 1871

This is one of the most well-known nonsense poems in the English language, and yet, as Alice says in Carroll’s Through the Looking-Glass:

‘It seems very pretty,’ she said when she had finished it, ‘but it’s rather hard to understand!’ (You see she didn’t like to confess, even to herself, that she couldn’t make it out at all.) ‘Somehow it seems to fill my head with ideas—only I don’t exactly know what they are! However, somebody killed something: that’s clear, at any rate’

Even though the words are nonsensical, we still get a distinct sense of their meaning. How is that achieved? What components of the words in this poem contribute to their meaning? Wikipedia says, “The poem relies on a distortion of sense rather than ‘non-sense’, allowing the reader to infer meaning and therefore engage with narrative while lexical allusions swim under the surface of the poem.” What that means is that when we see the words and hear their sounds, the components draw upon our existing knowledge, letting us find parallels to words and meanings we already know and extrapolate that meaning onto the poem.

Thus, the frications, the hisses and lullings of the tongue bring about certain images and parallels to words we already know. A modern example would be the name:

Professor Severus Snape

from the Harry Potter books. It’s a very simple use of visual and audio clues as to the kind of person a character with that name might be. From “Severus,” we can break it down phonologically: the repeated sibilant “s” draws upon hissing sounds, starting and ending with an “s” makes the word sound harsher, and the labio-dental “v” draws the speaker’s mouth into an involuntary snarl in order to pronounce it. Orthographically, “Severus” looks like the word “severe,” and the “-us” suffix lends it the gravitas of faux-Latin, adding a touch of snobbery and sombreness. Similarly, “Snape” leads phonologically with an “s” sibilant, and the “sn” consonant cluster makes the reader involuntarily sneer. Ending the word with the plosive “p,” and a released, aspirated one at that, adds to the idea of a curt, no-nonsense character. One can plausibly imagine Severus Snape (with Alan Rickman as him, of course) saying the words “Get. Up.” with an extra hard release of the final “p” sound. Orthographically, “Snape” looks like “snake” and contains the word “snap,” and words that begin with “sn” usually have a slightly negative connotation to them (snide, sneer, snap, snore, sneak, snoot, snarl, sniffle, snark).

So we’re able to draw an incredible number of allusions just from a person’s name, via its sounds and its sights. Now imagine extending that to the entire Jabberwocky poem. Let’s just examine the first stanza of the poem:

  1. ‘Twas brillig, and the slithy toves
  2. Did gyre and gimble in the wabe;
  3. All mimsy were the borogoves,
  4. And the mome raths outgrabe.

And see if we can annotate it with relevant information that we know.

  1. It was brillig [N? Time of the day? ADJ? Brilliant?], and the slithy [Definitely ADJ. Slithering and lithe] toves [Definitely N, because of following line]
  2. Did gyre [V. Plural subject-verb agreement (“toves gyre”). Gyroscope] and gimble [V. Gyrate and tumble? Rotating movement] in the wabe [N. Wet, plus extra wet connotations from “slithy”]
  3. All mimsy [Adj. Whimsy? Whimper? Miserable?] were the borogoves [N. Borrow-dove? A bird? Mangrove?]
  4. And the mome raths [ADJ-N, because of the following V. Home? Mope? Moan? Wrath? Rats? Moths?] outgrabe [out-grab+PAST? Gripe+PAST?]

Wikipedia compiles a possible interpretation of the words, to which mine seems pretty close.

The human mind is incredibly capable, almost desirous, of pulling meaning out of words, such that people arguing about semantics when they disagree with the words used by other people seem almost silly. Previously, I have written about how the grammaticality we’re obsessed with contributes little to the understanding of meaning, and how people who advocate and insist on a gold standard of grammar are quite misguided. Similarly, we see here that even semantic correctness seems secondary: the words used cannot be semantically distinguished from one another, because they are not words in the lexicon in the first place, yet they still carry content and meaning.

Does it matter if I say, “The amalgamation of hydrogen and oxygen atoms yields water,” or “The combination of hydrogen and oxygen atoms yields water”? There will be semantic purists who insist that the act of amalgamation is subtly different from a mere combination; that perhaps amalgamation is more nuanced.

Of course, I don’t deny that there are certainly words that are more nuanced than others. There is certainly a difference between the words “happy,” “delighted,” “glad,” and “ecstatic”: they align differently on the superlative scale, where one might construe “glad” to be the slightest and “delighted” and “ecstatic” to be on the other end. But even between these words, how is one to distinguish the semantic difference between “delighted” and “ecstatic,” where one is full of delight and the other full of ecstasy, and say that one is more superlative than the other? Does ecstasy trump delight?

As such, insisting on absolutism for certain terms is an imposition of one’s views on another. Splitting hairs semantically, like grammar-nazism, contributes nothing to the discussion if the intent of the speech is clear.

To end off, I’ll try my hand at “nonsense prose,” to see if I could, without using lexical words, tell a story.

“You seem morried,” Alex said, as he kriched up a klatch, and lit his smube. He took a long wheg. “Is everything milly-willy? Surely nothing fellish happened?”

“I’m afraid I’m a little tatchet,” I said, my shoulders smished, my haiths swanged.

Alex poff-poffed, for he whegged one too big. “Sorry about that. Come on, tich your bin up, kellyvale everything.”

I hished my feet, “You know what my pairrows are; they have viddied not an inch. Every burrise I wake, the same ol’ nubs, the same ol’ tracherns. I am still without work, and my time here is plivered. If I don’t get a job immish, I’m fanade I’m going to go wallyfaloo.”

“Surely it’s not that sapper,” Alex kippered, “You have your tumms around you, being snorm and glideful. Surely that clappas your situation?”

“I’m grateful for my tumms, yes,” I said, “But they can only clappas por piti. It’s been four yardas already, Alex, and the best I’ve bainaged was this mopstep.”

“I can’t movome back, Alex. That finta is unbelfortasible to me; I didn’t swarvvy thousands of loons and cross ninan lashes to come here, only to have to gallivog home. There is no syfe for me there, Alex. Although I have tumms and revelas back home, to have to be washorled by all that sikthorn and snurling pekvork will beshoy me. I’ll sooner slax myself than movome.”

“What are you going to do then?” Alex said.

“I can only prish it will be wingwag, Alex. I can only pope.”