Returning Time to the Reader
On naYana, and why English spelling is the largest unacknowledged tax on human attention

Scope. This is naYana for English (v0.1) — specifically General American, since that's the dialect with the most reliable open phonetic data (CMUdict). Other English varieties (RP, Indian English) and other languages will ship as parallel projects sharing the same writing system; their phonetic spellings will differ because their pronunciations differ.

About this version. This essay performs its own argument. Substitution rules are introduced in the text and then applied to the text itself. From the moment a rule is named, the writing changes. By the final paragraph, you are reading a fully phonetic script called Latin naYana.

Hover over any underlined word to see its original spelling. The vowels carry a subtle baseline stroke throughout: that is the naYana font's vowel marker, the zero-cost first phase, which is always on. The phase indicator at the top of the page shows where in the substitution arc you currently are.

Honest disclaimer: the substitutions you see were produced by hand to demonstrate what the naYana engine is designed to produce. Some rules (like ti → sh and full schwa marking) are still in development. Treat this as a preview, not as engine output. A companion plain-English version of this essay, with full citations, is available alongside this file.

An eight-year-old girl I know writes Hindi in the Roman alphabet. She has no training in transliteration. She has no teacher correcting her spelling. What she has is an ear, and a set of letters she learned in school. The letters represent sounds, the sounds represent words, and so she writes. Her spellings are her own, and they are perfectly consistent. Mausam, paani, khushi. If you can read English letters and you know Hindi, you can read everything she writes.

A few weeks ago she asked me what I do for a living. I told her I work in philosophy. She wrote it down in her journal: filosafi.

She was, of course, wrong about the spelling. She was also, just as clearly, right about everything else. She had heard a word, identified its sounds, and assigned each sound the nearest letter she knew. This is what every literate child in the world does, except for one population. Children learning to read English do it once or twice in kindergarten and then spend the next several years learning that what they did was wrong. The word is philosophy. The reasons are historical. The teacher will not explain them, because there is no time, and because the teacher does not know them either.

This essay is about what that time costs, why every previous attempt to recover it has failed, and why we believe that, after a century of failures, a different approach is now possible.

I. The cost

English spelling is the single largest learnable inconsistency in human literacy. The numbers are not subtle.

Children learning to read in Finnish, Italian, Spanish, Korean, or Indonesian reach reading fluency in roughly one to two years. Children learning English take three times as long. This is not a difference of effort or pedagogy or intelligence; the gap holds across socioeconomic backgrounds, educational systems, and teaching methods. It tracks one variable: the regularity of the orthography — what linguists call orthographic depth.

Finnish has roughly one spelling per sound and one sound per spelling. Indonesian was rationalized in the twentieth century and is similarly regular. Spanish has small irregularities at the edges (the silent h, the b/v merger), and a child can decode 95% of words by the end of first grade. Hindi in Devanagari is fully phonetic; once a child knows the consonants and the vowel marks, every new word is readable on first sight.

English is none of these. The letter a represents at least seven distinct vowel sounds (cat, father, care, about, late, all, any). The sound /f/ is spelled at least four ways (fish, phone, enough, off). The letter sequence ough is pronounced seven different ways (though, through, tough, cough, bough, thorough, hiccough). There are no rules that resolve these. The child must memorize each word. A literate English speaker has spent years of childhood building a mental lookup table.

The cost of that lookup table is not just instructional. It is opportunity.

Consider what a Finnish child does in the two years that an English child is still learning to spell. She reads stories. She writes her own. She participates in the literate culture of her community as a full member at the age of seven. Her English counterpart, at the same age, is still being told that write and right are different words even though they sound identical, that knight contains a silent letter she has no way to predict, that island and isle contain an s that nobody pronounces because of a sixteenth-century Latin misattribution.

By age eleven, the Finnish child has read perhaps three hundred books. The English child has read fewer. By age eighteen, the gap has narrowed only because the English child has grimly memorized the lookup table. But the years of reading lost cannot be reclaimed. The cultural participation she could have had — the early entry into the conversations of literate humans, the formation of identity through text — was deferred. For some children, who could not bear the opacity of English spelling and were diagnosed as dyslexic or simply gave up, it was deferred permanently.

The numbers are catastrophic. English is the first language of about 400 million people and the second language of perhaps another 1.5 billion. If even one year of childhood is lost per child to spelling mastery (a conservative estimate), and that year contains roughly 1,000 hours of school instruction, the cumulative cost across everyone alive who has learned English is on the order of two trillion hours (1.9 billion learners × 1,000 hours each). The dollar cost, valued at the prevailing wage of teaching time alone, runs into the hundreds of billions per year. The energy cost, valued in the electricity to light the classrooms and the food to fuel the children, is its own catastrophe, multiplied across decades.
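
The arithmetic behind the hours figure can be laid out explicitly. The sketch below uses only the figures quoted in the paragraph above; the variable names are illustrative, and it does nothing more than multiply the quoted head count by the quoted hours.

```python
# Back-of-envelope check using only the figures quoted in the paragraph above.
first_language_speakers = 400_000_000      # "about 400 million"
second_language_speakers = 1_500_000_000   # "perhaps another 1.5 billion"
hours_lost_per_learner = 1_000             # one school year, "roughly 1,000 hours"

learners = first_language_speakers + second_language_speakers
total_hours = learners * hours_lost_per_learner

# Prints the head count, the hours per learner, and their product.
print(f"{learners:,} learners x {hours_lost_per_learner:,} hours = {total_hours:,} hours")
```

The product is 1.9 trillion hours, which is where "on the order of two trillion" comes from.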

But the cost that matters is none of these. The cost that matters is the creative time — the time a child of seven could be spending on stories, on music, on building things, on absorbing the literate culture she was born into — that she instead spends decoding the spelling history of a language that has neglected its own writing system for five hundred years.

We are asking children to pay, in childhood, for the failures of dead typesetters and pedants. The children pay. We do not even count.

II. Why every previous attempt has failed

This problem is not new. It has been seen, named, and attacked repeatedly. Each attempt has failed, and the reasons remain instructive.

George Bernard Shaw, in his will, left a substantial sum for the design of a new English alphabet — what became the Shavian alphabet, published in 1962. Shavian was technically sound: a phonetic script with distinct, simple shapes, designed by a typographer who knew what he was doing. It was used to print exactly one book: a parallel-text edition of Androcles and the Lion. It was never adopted by anyone. It had no place to live. The technology of 1962 could not deliver a typeface to a reader without printing presses, and printing presses do not adopt new alphabets on a whim.

The Deseret alphabet of the nineteenth-century Mormon settlers was a similar effort, similarly doomed by its dependence on infrastructure that would not be repurposed. Type was cast. Books were printed. A generation of children was taught to read it. Then the church moved on, the type was melted, and the alphabet went extinct within fifty years.

The Initial Teaching Alphabet (ITA) of 1960s Britain came closer. ITA was used in primary schools to teach early reading: forty-four characters, each representing one sound. Children learned to read ITA quickly — much faster than traditional spelling — and then were transitioned to standard English. The program ran for years and was eventually abandoned. The problem turned out to be the transition. Children who learned to read ITA had a harder time, not easier, switching to standard English than children who had been taught traditionally from the start. The crutch became an obstacle. The new alphabet had not failed to teach reading; it had failed to let go.

More recent efforts — SoundSpel, Cut Spelling, SaypU, Unifon — have all foundered on the same rock: the network effect. Any new spelling exists in a world where every existing book, sign, contract, website, and trained reader is in the old spelling. To adopt the new script is to lose access to everything written before, and to be unreadable to everyone who has not adopted it. The benefit is small and individual; the cost is large and immediate. Rational individuals do not adopt. The reform dies.

There is a deeper pattern across these failures. They all asked the reader to choose between two scripts at a moment in time — the old or the new. They all required commitment before the reform proved itself. They all underestimated how much existing literacy was a sunk cost the reader had no incentive to abandon. And they all operated in a physical world of paper and type, where every reform required the cooperation of printers, publishers, schools, and governments — all conservative by structure.

These are the failures we are trying not to repeat.

III. What is different now

Four things have changed since the last serious attempt. None of them is small, and together they are decisive.

The substrate is now digital. A reader's text is no longer captured on paper. It is rendered by a font, served by a browser, or displayed by an application. The same underlying text — the same sequence of Unicode characters — can be displayed differently on different devices, for different readers, at different times, with no change to the source. A reform of the rendering layer does not require the cooperation of any printer, publisher, school, or government. It requires only that the reader install a font.

This is the change that makes everything else possible. Shaw could not have done what we can do, because Shaw lived in a world where the appearance of a text was fixed at the moment of printing. We do not live in that world. We live in a world where every text is remade, pixel by pixel, every time it is read. The remaking is under software control. Software is malleable.

The reform can be phased. No previous attempt allowed a reader to move gradually from the old script to the new, learning one substitution rule at a time, with the previous month's progress preserved and the next month's introduced only when the reader is ready. Every previous attempt asked the reader to commit fully or not at all. The phased approach inverts this: at each stage, the reader has learned a small, named, well-defined thing. After one phase, ph becomes f. After another, ck becomes k. After many phases, the reader is reading a phonetically faithful script without ever having confronted a wall of unfamiliar text.

Phase 1 turns on. From this point in the essay forward, the rule ph → f applies. Hover over any underlined word to confirm the original spelling.

This solves the cold-start problem that killed every previous reform. There is no moment at which the reader cannot read. The script the reader sees at every stage is a script the reader can already mostly read, with one small new rule applied.
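
As a minimal sketch of the staged idea, assume each rule is a plain letter-pattern rewrite. The real engine is described as dictionary-driven, so this is an illustration and not its implementation. The names RULES and render are invented for the example, and only the two rules named above are listed. A slider value selects how many rules, in order, are applied.

```python
import re

# Hypothetical ordered rule list; only the two rules named in the essay are shown.
# A purely orthographic match like this would also rewrite words such as "uphill",
# which is one reason a real engine would need pronunciation data.
RULES = [
    ("ph -> f", re.compile(r"ph", re.IGNORECASE), "f"),
    ("ck -> k", re.compile(r"ck", re.IGNORECASE), "k"),
]

def _match_case(original: str, replacement: str) -> str:
    """Capitalize the replacement when the matched text began with a capital."""
    return replacement.capitalize() if original[0].isupper() else replacement

def render(text: str, level: int) -> str:
    """Apply the first `level` rules in order; level 0 returns the text unchanged."""
    for _name, pattern, replacement in RULES[:level]:
        text = pattern.sub(lambda m, r=replacement: _match_case(m.group(0), r), text)
    return text

# Show the same sentence at every slider position.
for level in range(len(RULES) + 1):
    print(level, render("Philosophy on the back of a truck", level))
```

At level 0 the sentence is untouched; each further level applies one more rule on top of the previous ones.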

The reform is reversible. A reader who has installed the font and the engine can, at any moment, turn it off. The original text is preserved underneath. The new spelling sits as a rendering of the original, not a replacement of it. This means there is no risk to the reader of being cut off from existing literature. Every book ever written remains available in its original form by toggling a switch. The reader keeps full backward compatibility with five centuries of English text.

This is the answer to the ITA problem. ITA failed because the transition to standard spelling was hard. Our system has no such transition to fail at, because the original is never erased. It is one click away at all times.
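
A minimal sketch of this property, under the assumption that the engine is a pure function from stored text to displayed text. The Token type and the respell stand-in are invented for the example, and only one rule is modelled. Each displayed word keeps its source spelling beside it, so switching the rendering off, or hovering over a word, needs no reverse conversion.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    original: str   # what is stored; never modified
    shown: str      # what the reader sees

def respell(word: str) -> str:
    """Stand-in for the engine: only the ph -> f rule, lowercase only."""
    return word.replace("ph", "f")

def render(source: str, enabled: bool) -> list[Token]:
    """Split into words and non-words, respelling words only when enabled."""
    tokens = []
    for piece in re.findall(r"\w+|\W+", source):
        shown = respell(piece) if enabled and piece.isalpha() else piece
        tokens.append(Token(original=piece, shown=shown))
    return tokens

source = "a phonetic alphabet"
on = render(source, enabled=True)
off = render(source, enabled=False)

print("".join(t.shown for t in on))                 # respelled view
print("".join(t.shown for t in off))                # rendering switched off
print("".join(t.original for t in on) == source)    # source survives rendering
```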

The encoding is IPA. Under the surface, the script the reader eventually arrives at is not a new invention. It is the International Fonetic Alfabet — the standard used by every linguist, every language-learning textbook, every speech-recognition system, every text-to-speech engine on earth. The Unicode codepoints we use are not ours. They are the codepoints assigned by Unicode to IPA, and they have been stable for decades and will be stable for decades more.

This means three things. It means that a reader who learns naYana can also read IPA — that the script they learn is not a private language but a passport to every dictionary and language resource in the world. It means that text written in naYana is automatically compatible with speech synthesizers and language tools, because those tools already accept IPA. And it means that the script can be extended to other languages, because IPA is universal — any sound in any language is already represented in it.

The novelty in naYana is not the encoding. The encoding is borrowed from a hundred years of careful linguistic work. The novelty is the glyph design: the shapes that the IPA characters take when rendered. IPA was designed by linguists for transcription, not by educators for learning. Its shapes are scientifically precise but cognitively expensive — mirror-image pairs, near-duplicates, marks that depend on tiny distinctions. naYana redraws each IPA character with a shape chosen for learnability: distinct from every other shape, easy to write by hand, easy to recognize at small sizes, free of mirror confusion. The underlying Unicode is IPA. The visible script is naYana.

IV. What this is for

Phases 2 and 3 turn on. Two more rules now apply: c → k or s (depending on pronunciation — the dictionary decides), and qu → kw. The letter c begins to disappear from the text, replaced by whichever letter matches the sound it was making.
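
One way to picture "the dictionary decides" is a lookup of each word's listed sounds. The sketch below is an illustration under stated assumptions, not the engine: PRONUNCIATIONS is a three-word, hand-written stand-in that uses CMUdict-style symbols with stress digits omitted, and respell_c and respell_qu are hypothetical names.

```python
# Tiny hand-written stand-in for a pronouncing dictionary (illustrative entries).
PRONUNCIATIONS = {
    "cat": ["K", "AE", "T"],
    "city": ["S", "IH", "T", "IY"],
    "cycle": ["S", "AY", "K", "AH", "L"],
}

def respell_c(word: str) -> str:
    """Replace each letter c with k or s according to the word's listed sounds.

    Simplifying assumption: the word's K and S sounds come only from its
    letters c, in order. If the counts disagree (or the word is not listed),
    the word is returned unchanged.
    """
    sounds = [p.lower() for p in PRONUNCIATIONS.get(word, []) if p in ("K", "S")]
    if len(sounds) != word.count("c"):
        return word
    remaining = iter(sounds)
    return "".join(next(remaining) if ch == "c" else ch for ch in word)

def respell_qu(word: str) -> str:
    """Purely orthographic rule: qu -> kw (lowercase only)."""
    return word.replace("qu", "kw")

for word in ["cat", "city", "cycle", "queen"]:
    print(word, "->", respell_qu(respell_c(word)))
```

The fallback to the unchanged word is a deliberate choice for the sketch: when the lookup cannot decide, leaving the familiar spelling in place seems safer than guessing.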

We have built the first fazes of this system. They work. A kild can install a font on her browser, slide a kontrol from 0 to 1, and see filosofy appear in plase of philosophy. Slide it again, and cat bekomes kat, city bekomes sity. Each slide is small. Each is named. Each is reversible. Over months, the slide can move from 0 to 20, and by the end, the kild is reading a fonetic skript in wich every sound has eksaktly one spelling and every spelling has eksaktly one sound. The skript she is reading is not English. It is also not a new invention. It is the akchual sounds of English, written down at last with the konsistency that every other major language in the world has long enjoyed.

We do not believe this will be easy. We do not believe everyone will adopt it. We do not even believe most people will adopt it. What we believe is that the option should exist: that a kild should not be required, in the twenty-first sentury, to spend five years memorizing the etymolojikal aksidents of reseive and believe when she could have been reading stories. The choise should be hers, or her parents', or her teacher's. The kost of not offering the choise is the kontinued, unmeasured loss of billions of hours of human kildhood every year, forever, with no end and no audit.

That is the kase for naYana. It is not a manifesto. It is a proposal. The proposal is: let us build the rendering layer that makes fonetic English available to anyone who wants it, without asking them to give up English; let us see whether kildren take to it; and let us measure whether the time saved is real.

If it works, the saved time is the answer to a kweschun humanity has been ignoring for five hundred years. If it does not, we will have spent some years and some money trying, wich is the smallest expense we could have made on a problem this size.

V. The longer ark

Phases 4, 5, and 6 turn on. Silent letters in kn-, wr-, -mb, and -gh are dropped (since they correspond to no sound). The letter s becomes z wherever it is pronounced /z/ (plurals, is, was). And the letter x becomes ks.

The name naYana iz a Sanskrit word meaning "eye" or "guidance." It echoz the Nyāya skool of Indian lojik, founded on the prinsipal that careful seeing precedes korrekt reasoning. The skript iz meant to guide the eye, gently, from the spelling it noz to the spelling that matchez the world.

The name iz also a fonetic palindrome, syllable by syllable. Read it forward: na-Ya-na. Read it backward: na-Ya-na. The syllable with the kapitalized Y iz the aksis of symmetry. This waz not an aksident. A skript that works for English must, in prinsipal, also work for languages that read rite-to-left like Arabic and Hebrew, or top-to-bottom like klassikal Chineze and Japaneze. The palindrome marks the skript's intent: riting iz a recording of speech, and speech does not have a preferred direction. The orthography should not impose one.

We are starting with English because English iz the language with the largest mismatch between its sounds and its riting — and because English, as the global second language, eksports that mismatch to every country where it iz taught. A kild in Maharashtra learning English does not just learn English; she learns to spend years of her kildhood on the same fossilized irregularitiz that English-speaking kildren spend years on. The kost iz global. The fiks would be global.

Phase 12 turns on: schwa marking. The unstressed vowel sound /ə/ — the most common sound in English — is now made visible as ə. It appears in nearly every multi-syllable word. The letter you see is the actual IPA character, the first non-Latin shape to enter the script. The becomes thə. About becomes əbout. Children becomes kildrən. The IPA has arrived, and it is welcome.

But thər iz no rezən thə skript needz tə stop at English. Thə IPA encoding undərneeth naYana iz universəl. Eny language ever transcribed in IPA (French, Mandarin, Arabic, Yoruba, Tamil) can be rendered in naYana shapez. A kild who learnz naYana fər English has also, without knowing it, learned thə skript fər every other language she might encounter. Thə literacy iz transfərabəl thə way no orthography in history has been.

This iz thə larger vision, stated as carefully as we can state it: that thə time stolen from billyənz of kildrən by inconsistent riting systems iz rekuvərabəl; that thə rekuvəry rekwirez no new laws, no government adoption, no schoolboard approval, only ə font, ən engine, and thə consent of one reader at ə time; and that thə human kəpasity for creative work, for cultural participation, for early entry into thə literate world, can be restored to its natural age of seven instead of being deferred to fourteen.

What that early entry would mean, at scale, we do not know. We have never had it. Every adult alive today who reads English was educated under thə old system. Whole generations of kildrən, in every English-speaking country, have spent thə most plastic years of thair lives on ə problem we could have removed.

It iz time to remove it.

The closing passage is rendered in full Latin naYana. All twelve substitution rules apply. The script you are reading uses Latin letters wherever Latin can be unambiguous, and one IPA character (ə) for the schwa. This is the planned milestone at which the reader can read every English word phonetically without ambiguity — using a script that took roughly fifteen minutes of exposure to learn.

naYana iz ən open projekt. Thə code, thə font, and thə engine are publik. Thə skript itself iz built on IPA and iz unowned. We invite contributors — typografərz, linguists, educators, software developers, parents, and kildrən — tə use it, test it, and improve it. Partikyulərly kildrən. They will tell us, faster than eny study, whether what we have built iz worth using.

If ə kild can rite filosafi and be understood, we have done nothing for her. If ə kild can rite filosafi and be rite, we have given her back her kildhood.

— for the gnowledge lab project, naYana for universal literacy

A note on this version. The substitutions you encountered were introduced in twelve named phases, drawn from the project's planned arc. Some are already shipped in the naYana engine; others are still in development. The complete demonstration requires features that have not yet been built, but the principle, as you have just verified for yourself, works.

The script continues beyond what you have read. After Latin naYana come IPA characters for sounds Latin cannot represent unambiguously: θ for the th in think, ð for the th in this, ʃ for the sh in ship, ʒ for the middle sound of vision, ŋ for the ng in sing. Each is one new shape, one clear sound. They come in later phases, and they are where the naYana font's redesigned glyphs replace the IPA defaults with shapes drawn for learners rather than for linguists.
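
These characters already exist as ordinary Unicode codepoints, which is what allows a font to redraw them without altering the stored text. As a small illustration using Python's standard unicodedata module (the dictionary name is hypothetical), the five can be listed with their codepoints and official names:

```python
import unicodedata

# The five characters named in the paragraph above, with its example words.
LATER_PHASE_CHARACTERS = {
    "θ": "think",
    "ð": "this",
    "ʃ": "ship",
    "ʒ": "vision",
    "ŋ": "sing",
}

for char, example in LATER_PHASE_CHARACTERS.items():
    codepoint = f"U+{ord(char):04X}"
    # unicodedata.name returns the character's official Unicode name.
    print(char, codepoint, unicodedata.name(char), f"(as in {example})")
```

A font that draws a different shape for any of these codepoints changes only what the reader sees; the codepoint, and so the text, stays the same.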

You arrived here by reading. The next reader could too.

A plain-English version of this essay, with full citations, is available as a companion document. The naYana project is open and hosted at gnowledge.org/projects/naYana.