Breandan's Blog

The End of Illiteracy

Reading. It’s something we must all do, and millions of hours are spent each day passing on that ability to our children: reciting nursery rhymes, reading children’s books, sounding out each and every syllable for the untrained eye and ear. It’s not something we often think about unless we have children of our own, yet it is an inseparable part of our daily life. It is how we learn to write, share, and relive. Some have called it a path to immortality itself, a way of inheriting and sharing just a bit of our inner world with the one outside us. Yet some children have no one to teach them, and many more can barely read at a functional level, lacking the education, resources and time. Even among those of us lucky enough to receive a college education, most can read only one language. All of that is about to change.

In the next few years, there will be a breakthrough in speech recognition technology. A quiet revolution is already taking place in the speech recognition community, one that will change the way we learn to read and communicate forever. It is an engineering solution years in the making, on its way to tremendous fruition. With recent breakthroughs in deep learning using restricted Boltzmann machines, the availability of enormous data sets, and a few more steps of Moore’s Law, we will have real-time, native-level speech recognition for the English language. What does this all mean? It means that on my smartphone, or even my wristwatch, with no supercomputer and no internet connection whatsoever, I will be able to hold a conversation with a deaf person using just my voice.

Some say that with the advent of speech recognition and machine translation, we will no longer need to learn new languages, and that they will all eventually converge into one. I’m not so sure. What I do know is that we will never be able to upload a language directly into grey matter. Natural language is acquired; reading is not. Reading requires effort and constant practice - there is no Broca’s area for reading, at least not yet. Here is where machine learning comes in - we have all the necessary tools to eradicate illiteracy in the next century. And I do not mean reduce illiteracy to ten or one percent, I mean eradicate it, like smallpox - where every single person with the functional ability to read can read Tolstoy and Hemingway. Where every person above the age of fifteen can learn to become an astronaut or test the speed of light, write stories and teach new ideas to one another. And all this is possible (here is where you will object) even for an orphan without access to school or even a single adult.

How is this possible, you may ask? With (1) spaced repetition, (2) speech synthesis, (3) speech verification, and (4) ubiquitous closed captioning, in precisely increasing order of difficulty. The first is easy, but not as easy as you might think. At its heart is a simple scheduling algorithm, which can be found in plenty of existing flashcard systems, including Anki and Mnemosyne. To do it well, we need a few simple machine learning techniques to customize the schedule for each learner. The second is more difficult in general, but can be faked for pre-recorded learning materials, such as audio books. The third is a boundary value problem, but entirely possible at the moment, and several orders of magnitude simpler than speech recognition. If the second is done well, we can achieve the objective of the third without needing perfect accuracy. Finally, the fourth depends on automatic speech recognition, which is the capstone - the crowning achievement. But if we are ever even remotely successful at speech recognition, speech verification is securely within our reach. We only need the first three.
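To make the "simple scheduling algorithm" a little more concrete, here is a minimal sketch in Python of an SM-2-style scheduler, the family of algorithms that, to my knowledge, the Anki and Mnemosyne schedulers descend from. The names `Card` and `review`, the 0 to 5 grading scale, and the constants are assumptions chosen for illustration. They are not taken from either program's source, and real systems differ in the details.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """One item to learn, e.g. a word paired with its recording (hypothetical)."""
    prompt: str
    ease: float = 2.5       # multiplier applied to the interval after a success
    interval_days: int = 0  # days to wait before the next review
    streak: int = 0         # consecutive successful reviews


def review(card: Card, quality: int) -> Card:
    """Reschedule a card after a review graded 0 (no recall) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: reset the streak and show the card again tomorrow.
        card.streak = 0
        card.interval_days = 1
    else:
        if card.streak == 0:
            card.interval_days = 1
        elif card.streak == 1:
            card.interval_days = 6
        else:
            # Later successes stretch the gap by the card's ease factor.
            card.interval_days = round(card.interval_days * card.ease)
        card.streak += 1

    # Assumption of this sketch: ease is adjusted after every review,
    # rising for easy answers, falling for hard ones, never below 1.3.
    miss = 5 - quality
    card.ease = max(1.3, card.ease + 0.1 - miss * (0.08 + miss * 0.02))
    return card


if __name__ == "__main__":
    card = Card("cat")
    for grade in (5, 5, 5, 1):
        review(card, grade)
        print(grade, card.interval_days, round(card.ease, 2))
```

The ease factor here is a hand-tuned, per-card adjustment. The machine learning mentioned above would go further, for example by fitting the intervals to each learner's review history, which this sketch does not attempt.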