Art by Moebius
"Once upon a time, I, Zhuang Zhou, dreamt that I was a butterfly, fluttering hither and thither, to all intents and purposes, a butterfly. I was conscious of only my happiness as a butterfly, unaware that I am Zhou".
I am writing this post today with a clouded head, filled with despair and frustration in equal measure. The world seems to be a muddled morass. The largest economy (and military power) in the world has elected a singularity for its president. By a series of executive orders, this singularity seems bent on annihilating the last bit of logic from this world. How did we get here?
We are living in the age of supercomputers in everybody's pockets. By now, we should have evolved as a species to put an end to war and poverty. We should be facing the stars and the greatest challenges that we can attempt for the million years to come. So why are we killing each other, trashing the planet and torturing all forms of plant and animal life? How do we fix this paradox of extremely powerful technology with extremely stupid people?
I think the answer lies in really bad interfaces. Today, we have awful interfaces not only for our technological systems, but also for our political systems. With a lot of pain and effort, we may nudge these systems a little towards what we want. But more often than not, we fail in this. Our human bodies and brains have no reasonable interfaces to control the nightmare unleashed on us. The internet has become home to a growing mound of mind-viruses, whose DNA is made up as much of computer code as of psychological weaknesses of people. We cannot fight this game in the monster's arena. We need to build our own narratives. We need stories of beauty and compassion. We should be able to express our dreams, which sprout from the deep pacific of our own inner minds. We humans are far bigger than the entropy around us. We need to realize that we are still in the infancy of our time as a species.
In my tiny brain, bounded by my own limited human journey, I have a few quiet corners of joy. These are my dreams, beyond our reach as humans today, but which I wish to see realized before I die. All of these dreams deal with novel human interfaces for complex systems. In this blog, I will try to give some glimpses of these ideas. To provide some context, I add a few notes below with links to technical papers and my own journey in scientific research. Playing with these interfaces should be a bliss in itself, like how an expert musician enters a trance while playing the instrument. The psychologist Mihaly Csikszentmihalyi calls such a mental state flow. In ancient China, this is known as wu-wei, or "non-doing", a state of effortless ease where you lose track of time. In Indian philosophy, this is known as ananda or supreme bliss. As human beings, we are born with the physical and mental hardware capable of achieving such a state. The Chinese philosopher Zhuang Zi narrates the story of the butcher Ding, who achieves wu-wei through the mundane act of carving the meat of an ox. Owing to our unique experiences in life, each of us needs a different activity and instrument to achieve this state. Some achieve this by doing a sport, some by cooking, some by coding. But I believe there is a method of universal applicability that reaches everyone: music.
Musical instruments are possibly the greatest aid for achieving the state of joy. The greatest of these instruments is undoubtedly the human voice box, running through the windpipe and played by the tongue - the most exquisite muscle of our body. In the Sanskrit tradition of India, this voice box is analyzed as an instrument for achieving the supreme meditative state of Ananda. The architecture of the human body provides half of the user interface for this instrument. The other half is provided by the Sanskrit alphabet, and to a greater degree, by the Raaga system of music. My goal is to trace the inspiration from this Sanskrit tradition of Shiksha (literally, instruction) towards designing computational user interfaces. In this tradition, the task of user-interface design is intimately coupled with the method of training for using the interface. Neither stands in isolation. By narrating stories of the Sanskrit alphabet, Mantric chanting and Indian music, I hope to build a general theory of computational design of user interfaces. I will do this in my following blog post. Before I trace these inspirations and build a general framework for user-interface design, I want to relate my dreams. I have chosen five such dreams, ranked in a hierarchy of increasing challenge. They describe five shades of bliss in how an individual relates to the universe around him.
The first is Zhuang Zi's butterfly. Can a human being control a "butterfly" - a miniaturized drone - such that he effectively becomes the butterfly, forgetting his own body? In order to achieve this, his natural human senses of vision and hearing have to be mapped perfectly to the sensors on the drone, which needs to be in soundless flight. His human means of action, such as his fingers and the vestibular system, have to be mapped to the flight controls. The mapping can be performed by high-precision tracking and wireless relay. No important human sense should be left unmapped, as that will break the illusion. Feedback from active senses, i.e., motion and manipulation, is more important than feedback from passive senses. We need to trace the tree of human evolution to see where the underlying neural system in the human body can be mapped to flight. The closest flying relatives of humans are bats, which are also mammals. Indeed, they have the most exquisite control of flight through their wings. The skeleton beneath the wings has a direct mapping to the human fingers, which become elongated in bats as the upper and lower arms shrink. There is undoubtedly some shared neural circuitry between humans and bats. The rest has to be improvised. The interface should be as natural for a human as moving his hands. With such an interface, even a novice will be able to fly. But with training, he or she will be able to achieve the state of immersion, as related by Zhuang Zi.
My second dream is crazier than the first, in that it removes the use of technology altogether. By their very design, human bodies are not capable of flight, but they are capable of swimming. Yet many humans cannot swim, and some even suffer from a fear of water. Is it possible to design a series of exercises such that by doing them, a human will be able to swim automatically, the very first time he sets foot in water? In order to do this, we have to hack the human nervous system. Specifically, there are autonomous programs in the neurons of the body which produce periodic motion. These motions are beyond the conscious control of the brain, but they can be shaped by training. If there is an interface that is optimized for training these neural units, the human will be able to move them without conscious control. For example, this training can be performed through an interface for playing music. The played music will be relayed instantly to the ear through earphones. Now, when the human is plunged into water, he just has to play a song and he will be off swimming.
My third dream is about fixing a broken human body. Due to degenerative diseases like Alzheimer's or Parkinson's, people in old age lose their capacity to walk or perform fine-scale motor control. This loss of outer performance is symptomatic of an underlying neuronal loss in the brain. This neuronal loss is in turn triggered by debilitating changes in the endocrine system and in the immune response of the body, often due to bad diet or lifestyle. But there is hope. We know today that many of the severe symptoms of degenerative diseases can be relieved using neural regeneration. Many such interesting cases are related by Norman Doidge in his book "The Brain's Way of Healing". This miraculous rehabilitation taps into the neuroplasticity of the brain, which grows new neurons and makes new connections. Certain types of neuronal losses are irredeemable, but work-arounds can be devised for the others. At present, we do not have an accepted theory of rehabilitative procedures in medicine that can achieve this. The patient also needs a lot of motivation and repetitive training in the face of hardship to overcome this. We do not know what the optimal interface is for a human to perform this training. But if we hack the neuronal circuitry in the body, this interface can be optimized. Little by little, new neuronal circuits can be developed, starting from the hippocampus and the entorhinal cortex, which are the ground zero for degenerative diseases. Going back to my example, the human can be tricked into using this interface as if playing a musical instrument. By playing this virtual music, he will progressively move his limbs and achieve fine-scale muscular control.
My fourth dream is about fixing the extended human body, i.e., a person's home and living environments. Soon, we will have thousands of sensors embedded in every device in a home, each of which relays junk to the internet. Broadly, this junk is known as the "internet of things", and it is optimized to spy on the person and make him buy more junk on the market. But what if these sensors are instead optimized to be extended senses for the human body? In addition to passive sensors for relaying temperature, the contents of the fridge etc., there can be active robotic units. These robotic devices can be stationary, for example, controlling the doors of the house. Or they can be mobile, attending to the garden or manipulating items in the kitchen. Going back to my example of Zhuang Zi's butterfly, can these sensors be so deeply intertwined with the human experience that they are indistinguishable from the physical body of the human? This is a far greater problem to solve than the flight control problem I mentioned earlier, because we do not have any guidelines for the mapping. Here, the mapping has to be general, with the architecture being capable of adding new sensors and devices on demand. But the ultimate goal is to heighten the conscious awareness of the person about all that is going on in his home. This home need not be just a person's physical home, but also other cherished places, such as the homes of his friends or family, or even natural ecosystems in the wild. This vision is crystallized into a concrete application when we build technologies for independent living of elderly people. Social isolation is the main killer in old age. By using technology, we can relieve this and embed a person's conscious experience in caring environments, either in human society or in natural environments.
My fifth dream is about fixing society. Can technology help us live consciously, such that we are aware of the ecological impact of all our decisions on the market? This will let us optimize our decisions on what to buy and where to buy, such that the hard ecological limits on fresh water, mining etc. are respected at the local level. In addition to physical limits, we can optimize for greater goals such as compassion in how we treat other people and animals. This is the most complex dream because it needs to address multiple people. It needs to be cognizant of the social and political systems, and their legacy hardware that is often broken. But at its core, this is also a computational problem that can be fixed by an interface that maps to the human body. Imagine we have such an evolved conscious society, where every human act is optimized for delivering the greater good. This is not a Utopian dream, but a call for incremental betterment. Our current politics is broken. Our current economic system has become dangerously out of control. It is time for a brand new framework for solving these problems, which incorporates human consciousness at the core. In other words, we need to devise these social and political systems as computer-human interfaces. These interfaces have to be optimized for joy in the Zhuang-Zi sense of the word.
These are all crazy dreams of midsummer, charmed upon by Shakespeare's fairies. But we need them in the middle of winter. We need inspiration from the ancient sages of China and India. We are living through the noise of the modern age. But our human story is very long and old. We need to summon our best dreams and inspirations, as we face the wild exploding entropy of technology.
(I hope these notes give some context to what I am talking about. It is actually easier to write technical articles. Writing blogs about half-baked ideas is pretty hard, but researchers need to put some effort into communicating their dream-like ideas and their connections to the general public.)
1. To find my dreams of childlike innocence, unmarred by the horror of the Snowden revelations and the dystopic picture of the real world they painted for me, I have to go back to 2013 and earlier. I am deeply grateful to my colleagues, friends and mentors who sparked my imagination in those years.
2. "Why are computer interfaces not developed as solutions to optimization problems?" asked Antti Oulasvirta, who had recently joined as a researcher in our computer graphics department at MPI Saarbrücken. This talk made a remarkable impression on me. I was doing a postdoc there at that time, working in the group of Christian Theobalt. Antti is now an Associate Professor at Aalto University in Finland. He has an excellent journal paper on this topic, "Can computers design interaction?".
3. Two examples from Antti stood out. Using trained ballet dancers, he showed how precisely a movement can be replicated by the human body. Based on this precision, an information-theoretic bandwidth can be assigned to the movements of the human body. He also showed how quickly a person can play a musical instrument (the example was a guitar) and calculated the speed of information transmission through Fitts's law.
4. Antti and I offered a doctoral seminar on human biomechanics for applications in HCI and computer vision. There were several great presentations by students, where we discussed research ideas, as well as gossip in the news. One of the key participants was Antti's student, Myroslav Bachynskyi, who evaluated the efficiency of touch interfaces and pointing gestures using optical motion-capture systems and the OpenSim software for biomechanical simulation.
5. The crazy idea of teaching somebody how to swim without ever setting foot in water is from this seminar. We discussed several crazy ideas. Once we had a debate about the power of the unconscious mind for arithmetic calculation. I related the incident from the book of Oliver Sacks, where two autistic children playing with marbles are observed by a neuroscientist. Suddenly, the marbles fall on the floor and instantly, one child says "111". The other instantly factorizes it as three times the prime 37, saying "37, 37, 37". Since I worked in computer vision, I wondered what brain circuits would be capable of instant object recognition and counting. They are definitely sub-conscious, as there is no time for conscious reflection. There is some credence in neuroscience to the idea that all humans are capable of doing this in our brain, but we choose to "forget" the calculations. Otherwise, we would go insane from information overload. But this "forgetting" is disrupted in autistic people and to some degree, in trained mental athletes. It is possible that some drugs can also inhibit this.
6. In another session, we discussed the Japanese game of Flash Anzan, where several numbers are rapidly flashed on a screen and the participants have to instantly sum them up. They do this by visualizing a mental abacus in the head and moving its beads. We wondered what other imaginary instruments can be simulated in the brain for amplifying other cognitive capacities.
7. We also discussed the memory palace technique (the method of loci) for remembering long strings of information. Such memory-enhancing techniques were commonplace in education worldwide, but were discarded in the modern era. I will discuss them in my next blog.
8. At that time, I was working with Thomas Neumann, who was visiting our group from Dresden. The project was about capturing skin and muscle deformations in high detail using multiple synchronized and calibrated cameras. When reading for the biomechanics seminar, we realized that the previous methods for motion capture in biomechanics were of much lower quality. This high-resolution capture using new types of sensors and computer vision methods will revolutionize biomechanics. We had a guest lecture from a trainer of Paralympic athletes and discussed how to apply these methods for developing better gear. Unfortunately, this project didn't proceed, as we had many other interesting ideas.
9. I worked with Helge Rhodin in the initial stages of his PhD in the same group. We wondered whether we could learn a mapping between the movement spaces of two arbitrary motions. The application was real-time control of a non-humanoid avatar by a human being. We realized that this problem can be solved in a purely learning framework, with rather simple models. The example of Zhuang Zi's butterfly is an idea inspired by this project. Helge did several excellent works afterwards.
10. I continued my collaboration with Christian's group after I finished my postdoc. One of the cool projects we did was mapping the facial expressions and lip movements of people across two different languages. The application was visual dubbing of movies across different languages. The main investigator here was Pablo Garrido, who developed a detailed 3D facial performance capture system from monocular video. One of the contributors to this project was Ingmar Steiner, who works on speech synthesis. He showed us the data from current state-of-the-art systems for the capture of the vocal diaphragm through fMRI. In the end, we used a parametric model for mapping the lip movements, learned from a carefully aligned data set of 3D meshes. But ultimately, with enough data, this can also be solved as a learning problem from the images themselves, as some new deep-learning methods are demonstrating.
11. After I left MPI, I worked at Technicolor research for improving the tools of visual effects artists. Manipulating 3D facial expression is a fascinating topic because we are visually so sensitive to it. The 3D artists who work in this field have an evolved vocabulary for describing certain grimaces and muscle movements. While trying to improve their tools, I understood the central nature of human artistic experience. The learning problem cannot be divorced from this.
12. The ideas related to neural regeneration for combating degenerative diseases are inspired by several interesting talks I attended at the iScan workshop. I hope to work on these technologies in the future.
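
As a small aside to note 3, here is a rough sketch of how Fitts's law turns a pointing movement into an information rate. It uses the common Shannon formulation, where the index of difficulty is log2(D/W + 1) bits for a target of width W at distance D, and the movement time is modelled as a + b times that index. This is not Antti's actual analysis: the function names, the constants a and b, and the fret distances are all hypothetical values chosen only for illustration.

```python
import math


def index_of_difficulty(distance, width):
    """Index of difficulty in bits (Shannon formulation of Fitts's law)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1.0)


def movement_time(distance, width, a, b):
    """Predicted movement time in seconds: MT = a + b * ID.

    a (seconds) and b (seconds per bit) are empirical constants that must be
    fitted per person and per device; any values passed here are assumptions.
    """
    return a + b * index_of_difficulty(distance, width)


def throughput(distance, width, observed_time):
    """Information rate in bits per second for one observed movement."""
    if observed_time <= 0:
        raise ValueError("observed_time must be > 0")
    return index_of_difficulty(distance, width) / observed_time


if __name__ == "__main__":
    # Hypothetical example: a finger jumps 7 cm to a fret position 1 cm wide.
    bits = index_of_difficulty(7.0, 1.0)
    print("index of difficulty (bits):", bits)
    print("predicted time (s):", movement_time(7.0, 1.0, a=0.1, b=0.2))
    print("throughput (bits/s):", throughput(7.0, 1.0, observed_time=0.5))
```

A farther or narrower target carries more bits, so a player who hits it in the same time is transmitting information faster. That is the sense in which a bandwidth can be assigned to a movement.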