Thursday, November 24, 2016

Computational science by Sanskrit



 Saraswati, the goddess of speech, is the personification of  Sanskrit. She is also an ancient river that supported the Indian civilization for thousands of years. (Art by Abhishek Singh)  

Crisis in our understanding of computation: 

Computers are now in everybody's pockets, but most people are unaware of what they are doing. Soon, computers will be integrated into people's clothes and into their own bodies. But for most people, they remain a black box. The renowned journalist Kevin Kelly once stated on Twitter, 

"People mistakenly believe an egg is simple. Nice smooth-rounded corner case. No buttons. But simple needs complexity". 

Imagining a computer as an egg is a horrible analogy, although it sounds profound at first glance. Mr. Kelly is a smart writer with informed opinions. So this makes him an especially good candidate to debunk - what we may call a support vector in machine learning parlance. The fact is people are already panicking about how opaque computers have become. Social media like Facebook and Twitter have precipitated awful political decisions, with nobody having a clue about how to find the truth amidst fake news, innuendo and subtle manipulative messaging. I have previously argued that the web resembles an octopus-like monster that enslaves the user, instead of being a personalized tool like a bicycle. With the growing power of artificial intelligence (AI), this opaqueness of computation is reaching scarier proportions. Even for technically literate people, the analysis of algorithms is getting more and more complex, especially with data-driven machine learning. How does one prove the convergence and optimality of an algorithm that uses a ton of data and encodes it in a multi-layered neural network? When these neural networks are used to trade financial assets, write insurance policies or evaluate employees, they exhibit complex biases that are much harder to debug than traditional computer programs. But the fundamental problem remains not AI, but our own ignorance. How can we as a society arrive at political decisions about the use of AI, when most people are completely clueless about computers?

Along with this crisis lies a huge opportunity. Computer algorithms are getting increasingly proficient in understanding human languages and interpreting visual images like humans do. With these advances, the borders between programming languages and human languages are getting blurred. New paradigms of computer programming are emerging that imagine computers as partners that speak with humans, and solve problems through mutual dialogue. However, we as a society still do not see computer programs as natural language dialogue and literature, capable of literary beauty and emotive content. Equivalently, we do not see literature as a computational object with hyperlinks between various literary references (apart from rare visionaries like Ted Nelson who see it thus). We do not use computer programs to speak to each other or to investigate our internal psyches. To the extent that we use them, we do it unconsciously. A powerful demonstration of this is the failure of the Semantic Web, which was supposed to provide a rich, context-specific layer of information on the web that is pertinent to the user. Instead of this, we got a million bots and cookies that spy on users without their knowledge. Now with the internet of things, this networked espionage on users has penetrated household objects. The principal cause of this dystopia is the lack of human communication between users and computers, capable of encoding computational procedures.

Along with the specter of ignorance about AI lies the risk of technological unemployment. The skills of most people are becoming obsolete in an economy driven by AI bots. This is particularly worrisome for India, which houses the largest pool of working human population in the world. So the disenfranchising of users with respect to machines has an economic exploitation angle to it, that disproportionately affects India and other poor countries. But this might be a situation where Indian cultural experience may be particularly well-suited to build a humanistic vision about AI.
  
Long before AI and computers, we have a historical example where computational thinking has profoundly shaped literature and arts: the Sanskrit language. In the following, I will give different types of motivation why people working in computers and AI need to study Sanskrit: obtaining an alternate perspective, the technical superiority of Sanskrit grammar, the immense heritage of scientific works in Sanskrit, and the virtues of Sanskrit culture towards promoting biodiversity. 

Alternative imagination through linguistic diversity: 

A couple of friends asked me, what would be an Indian perspective about AI? In fact, this is a deep question that needs a lifetime of an answer. When we think with logic, we are inherently bound to the very language in which those logical categories are formulated. So each language offers us a unique perspective about the world. As an Indian who speaks English, French and German, I know this personally. My entire personality changes in response to the language I am speaking. Even within India, there are thousands of languages. For example, in my mother tongue Telugu, there are two words to denote the inclusive and exclusive "we". This facility doesn't exist in English or in Hindi, so it always remains ambiguous when I say in a group of people "You can come for dinner. We will make food." The noted poet Shatavadhani Ganesh once remarked how in Indian languages we never say "I don't feel good", but rather "My body doesn't feel good".  These subtle differences in language make a huge impact in one's consciousness in how one's mind reasons about a given situation.  Apart from the words, there is the poetic aspect to language in how the very sounds are pronounced.  So the same concept might sound differently to the consciousness when expressed in a different language. So it is useful to have multiple linguistic and cultural perspectives about the problem of AI.

It is important to have an immersive cultural experience to trigger the mind to think in new categories. Merely using a few words in a new language will not impact one’s consciousness and yank it out of existing categories. For this reason, Confucians ordered their daily lives in a precise framework of rituals and ethical values. As Sinologist Edward Slingerland puts it, ritual is a method for hard-wiring the cold calculating processes of the logical brain (Broca’s area, Anterior Cingulate Cortex and so on) into the hot thinking of the subconscious brain (circuits in the brain stem). Until one achieves this hot thinking, one’s conscious experience will not change. This awareness has long been part of Indian culture, which introduced many festivals, rituals, mantric chanting and immersive networked relationships between daily activities. In terms of language learning, such immersive experience is far better for picking up a new language than studying through grammar and dictionaries. This is validated by automatic machine translation systems, where even computers learn better when taught through examples. However, human translators still fare better, because they can make conscious associations between words and contexts. It is important that we as a species preserve this biodiversity in human consciousness. By this, I mean we need to preserve not only the languages, but also the daily living experiences of a large group of people. Bernie Krause wrote about how animal vocalizations are dramatically altered when the biophonies of their ecosystems are destroyed by motor vehicles and industrial sounds. Similarly, human vocalizations and conscious experience are destroyed when their supporting cultural ecosystems are damaged. This is particularly a problem for poor countries and vulnerable tribal communities. The destruction of their languages has a dramatic impact on the physical and mental health of people. When rituals and songs from childhood are preserved, they help maintain health and cognitive capacity well into old age.

The scientific enterprise is a global effort, with people from many countries collaborating with each other. However, there is a dominant narrative and it is written in the language of English. The western universities, especially those based in USA and UK, set the global narrative and the scientific categories in which we think. This is worrisome because English is a notoriously fickle language with no precise grammar. George Orwell described how the English language can be distorted to mean arbitrary things. We have seen this in the political sphere when words such as "insurgent", "surge" and now "alt-right" are introduced. But a similar corruption happens in the scientific sphere, based on how the academic establishment encourages certain scientific terms to become popular. The worst form of bondage is when we are completely unaware of how we are bounded by our own thought processes. This makes slavery through controlling the language one of the subtlest and wickedest forms of slavery. In this regard, we have to see how colonial powers systematically abducted native American and Australian children from their parents and enrolled them in boarding schools, where they were punished every time they spoke in their mother tongues. Such boarding schools existed in India where the elite bureaucrats were trained during the British Raj. But the greater problem today is how economic opportunity is denied to Indians who don’t speak English. In this way, Indians are being alienated from their own mother tongues and forced to think in English – an imported language, where they are not recognized as an authority. For example, despite being fluent speakers, Indians are not allowed to teach English in many countries. This is a subtle form of economic bondage where Indians remain consumers of English products, but do not have any brand recognition when they are the producers. 
If the English language is imagined as a computer, Indians do not have the root privilege and can only use a limited set of commands.

In computer science, such chains of slavery are exposed in the programming environments we use, and in how much control a programming language gives to the user. This is allegorically narrated in the classic movie Tron, where the wicked operating system tries to curb the freedom of the user. In today's world, we can replace this operating system with Facebook, or the NSA, or the manufacturers of mobile devices. Our online and offline lives depend on these services, but how much awareness do we have of the personal data that we leave behind? Without awareness, there is no freedom. Most people do not have a conscious awareness of their data because the computing environments do not interact with them in a language and graphics that they can relate to, although it is now possible to build such interfaces.

Degree of transparency in language:

There is a remarkable difference between human languages in how opaque they can be rendered to their speakers (or equivalently, how they can be manipulated from a position of authority). Borrowing the analogy from Kevin Kelly, the English language is like a giant egg. This problem is apparent in the very words "artificial intelligence". What exactly do we mean by "intelligence"? What do we mean by "artificial"? Do we expect an AI to be masculine or feminine? Do we expect it to be compassionate or selfish? Do we treat it as a person, capable of suffering? More importantly, does humanity have a creative agency with AI? Does the principle of strict causality in physics even permit humans to have a creative agency? How we answer these questions is simply a matter of convention and social custom - these are decided by positions of authority. For example, the Oxford dictionary gives guidelines on how to interpret the words. Academic authority is consulted for the interpretation of scientific terms. But this is not the case in Sanskrit, which has many distinct words for intelligence: buddhi, manas, citta, jnana, vijnana, prajna, ahamkara and so on. Which of these words should we use to translate artificial intelligence? The meaning of these Sanskrit words is not given by convention or authority, but is clear from their very etymology. In this blog and later, I will argue that we should not merely translate from English categories, but build a complete scientific framework from the ground up using Sanskrit terminology.

Sanskrit is unique, because unlike any other human language, there is no dictionary needed for Sanskrit. Instead, it possesses a generative grammar of computational rules. The number of Sanskrit words is potentially infinite. Even if we restrict ourselves to words less than 5 syllables in length, there are hundreds of thousands of words. Each word in Sanskrit is akin to a self-explanatory computer program that can be parsed into individual syllables (phonemes) from which its meaning can be derived. Thus, an infinite number of new words can be generated whose meaning will be unambiguous to a Sanskrit speaker. The magic of Sanskrit grammar is that you can have multiple ways of breaking a word and putting it together again, leading to multiple angles of meaning, all of which converge on the denoted object. Certain words have ten or more derivations with distinct contextual associations, which reflect the meaning like the facets of a diamond reflect light in many directions within. This is unparalleled by any other language. The reason for this lies in two computational processes called Sandhi and Samasa, which specify what to do when words are put together.
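To give a feel for what "computational rules for putting words together" can look like, here is a minimal sketch in Python of a few vowel Sandhi rules. It is only an illustration under stated assumptions: the rule table covers a handful of well-known vowel mergers (a + a gives long a, a + i gives e, a + u gives o), the function name `sandhi_join` is hypothetical, and the transliteration uses capital letters for long vowels (A for long a, U for long u). The actual grammar of Panini is far more extensive than this toy table.

```python
# A toy table of vowel Sandhi rules (an illustrative subset, not Panini's full rule set).
# Convention assumed here: "A" stands for long a, lowercase "a" for short a.
VOWEL_SANDHI = {
    ("a", "a"): "A", ("a", "A"): "A", ("A", "a"): "A", ("A", "A"): "A",
    ("a", "i"): "e", ("A", "i"): "e",
    ("a", "u"): "o", ("A", "u"): "o",
}


def sandhi_join(first: str, second: str) -> str:
    """Join two words, merging the boundary vowels when a rule applies."""
    if not first or not second:
        return first + second
    merged = VOWEL_SANDHI.get((first[-1], second[0]))
    if merged is None:
        # No rule in this toy table: simply concatenate.
        return first + second
    return first[:-1] + merged + second[1:]


if __name__ == "__main__":
    print(sandhi_join("deva", "Alaya"))   # devAlaya  (temple: abode of a god)
    print(sandhi_join("mahA", "indra"))   # mahendra  (great Indra)
    print(sandhi_join("sUrya", "udaya"))  # sUryodaya (sunrise)
```

The point of the sketch is that the join is a deterministic rule lookup at the word boundary, so a listener who knows the rules can also run them in reverse to split a compound back into its members.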

I will illustrate this with a sentence containing just two words, from a mathematical text in Sanskrit called Yuktibhasa.
 JyaSamvargam CapaDwayaYogaViyogArthaJyaVargAntaram.
This translates into the following equation: 
sin(a)sin(b) = sin^2((a+b)/2)-sin^2((a-b)/2) 
How is this possible? The first word is the expression on the left. The second word is the expression on the right. The equals-to sign is omitted, because it is implicit from an aspect of Sanskrit grammar known as Vibhakti. The left expression is translatable into English as Sine-Product. The right expression can be translated into a Lisp like syntax as follows. 
Difference(Squares(Sines(Halves(Sum&Difference(Pair(Arcs))))))
The word order in Sanskrit is reversed, but otherwise this is what it says. The key thing to note is that the symbols () and & are omitted in Sanskrit. The parse tree of how the parentheses close with each other is implicitly determined by the rules of Sandhi and Samasa. These computational rules enable on-the-fly generation of complex words that precisely describe any given semantic context. Such words are used not only for mathematical formulae but also in poetry and regular parlance. For example, the hero Arjuna is addressed in the epic Mahabharata as Savyasachi (the one who can shoot arrows from both hands), Pandava, Partha etc., depending on the context in which the other person is referring to him. For any given context, the names themselves are not important for understanding people or concepts, but rather the relationships between them. We can see this influence in all Indian languages, where family relatives are addressed with words that denote the precise kinship in the family tree. But unlike other languages where the meaning of words can change over time, the etymology of Sanskrit words retains its purity.
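The reading of the compound as a nested expression can be checked with a short Python sketch. It rests on an assumed reading of the members of CapaDwayaYogaViyogArthaJyaVargAntaram, in order: arc, pair, sum and difference, half, sine, square, difference. The names `PIPELINE`, `evaluate_compound` and `sine_product` are hypothetical helpers introduced only for this illustration. Each member becomes one step applied to the result of the previous one, which mirrors the Sanskrit word order (innermost operation first).

```python
import math

# Each member of the compound, in Sanskrit word order, as one processing step.
# The glosses are an assumed reading, used here only for illustration.
PIPELINE = [
    ("YogaViyoga", lambda p: (p[0] + p[1], p[0] - p[1])),   # sum and difference
    ("Artha", lambda v: tuple(x / 2 for x in v)),           # halves
    ("Jya", lambda v: tuple(math.sin(x) for x in v)),       # sines
    ("Varga", lambda v: tuple(x * x for x in v)),           # squares
    ("Antaram", lambda v: v[0] - v[1]),                     # difference
]


def evaluate_compound(a: float, b: float) -> float:
    """Right-hand side: apply the members left to right to the pair of arcs."""
    value = (a, b)  # CapaDwaya: the pair of arcs
    for _name, step in PIPELINE:
        value = step(value)
    return value


def sine_product(a: float, b: float) -> float:
    """Left-hand side: JyaSamvargam, the product of the two sines."""
    return math.sin(a) * math.sin(b)


if __name__ == "__main__":
    a, b = 0.7, 0.3
    # The two sides agree up to floating-point rounding.
    print(math.isclose(sine_product(a, b), evaluate_compound(a, b), abs_tol=1e-12))
```

Written as a pipeline, no parentheses are needed at all: the order of the members alone fixes the nesting, which is the property the Sanskrit compound exploits.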

The key to understanding this is to recognize the nature of "pollution" in a language that can obscure meaning. Indian philosophers understood the universe in terms of five subtle elements, named space, air, fire, water and earth. These elements have a distinct philosophical meaning (which should not be confused with the meaning of the words in English). In this order, these elements are defined as those cumulatively accessible to the senses of sound, touch, sight, taste and smell. Thus, the grossest element is named "earth", which is accessible to all the five senses, whereas the most subtle element is named "space", which is accessible merely to the sense of sound. The grosser elements are considered prone to pollution, whereas the most subtle element "space" and its associated property of "sound" are free from it. Any natural language is considered a "Prakrit" (literally meaning "natural"), which has all the five constituent elements. A "Prakrit" can be polluted, just as earth, water and air can be polluted. When a language (Prakrit) becomes polluted, it is termed Vikrit. When it is mixed with polluting materials, it becomes "Bhramsa". When the nature of the polluting materials overtakes the very nature of the language, it becomes "Apabhramsa". The greater the amount of pollution in a language, the greater the harm done to the ecology of concepts defined in that language, in how they relate to the perceiving mind.

Amongst modern languages, English is an Apabhramsa and its grammar is that of an Apabhramsa - needing an infinite number of exceptions. But at its core, English is a "prakrit" whose exact nature is often not known to its speakers. Other languages are also Prakrits - each with their own charm and sweetness, when realized in their unpolluted forms. However, at the core of any Prakrit is the fundamental vibration of "sound", which is free from any type of pollution. Indian philosophers argue that there is a language that is present only in the most subtle aspect of "sound" - Sanskrit. Thus, Sanskrit is not an ordinary language (Prakrit). It is considered beyond any type of pollution. The closest analogy in the western tradition is the language of pure mathematics. However, there is a significant difference: Sanskrit is a spoken language. It cannot be represented in symbolic form - with whichever alphabet - without losing its purity. There are two other differences. Unlike mathematical/programming languages, Sanskrit can be used for poetry and aesthetics as well as for science. Unlike mathematical/programming languages, Sanskrit is based on a continuous living tradition and is indeed the direct ancestor for modern mathematical/programming languages, which only preserved parts of its aspects. 

To understand what is missing, I will give a few examples. The nature of a computer program in the UNIX programming environment is not evident from the name of its command, unlike the vocabulary of Sanskrit. One has to look up the man page of the command to understand what the program does. One can also look in the source-code of the program, but that can be obscure in itself. Typically, this source-code or its associated data structures are not even available for other computing applications, such as popular web-services like Google and Facebook. Another comparison is with the Lisp programming language, whose compiler is a short program written in Lisp itself. This makes Lisp a very flexible language where new programming syntax can be invented on the fly. Such is also the case for Sanskrit language, which has a very compact grammar given by Panini. However, a blind person cannot comprehend the syntax of the Lisp language, but can follow the diction in Sanskrit. This is also the case for mathematical equations and formulae, which were traditionally represented as poems that can be sung in Sanskrit. It is also important to note that mathematical notation has certain limits. Many applications enabled by modern computing hardware like deep neural networks, differentiation of discontinuous functions etc.  cannot be adequately analyzed by the current mathematical notation. This may be a fundamental limitation.

 There are certain key differences between modern axiomatic mathematics and traditional Indian Ganita expressed in Sanskrit. From a very early point on, Sanskrit tradition was conscious of the limitations of 2-sided logic (I explained this in a different blog), so did not accept the notion of proofs by contradiction and used them only sparingly. Instead, the onus is placed on experimental observation, like in physics. Further, the notion of the conscious observer is very carefully defined,  with respect to different layers of consciousness. Sanskrit language can be seen as a method of encoding low entropy through its technical terms and grammar. By repeated use, they refine one’s consciousness and make one see certain things that are not seen on first glance. As Sanskrit is a spoken language, these words can be chanted or heard, with eyes closed in meditation. This can be a source of mathematical inspiration.

The heritage of scientific works in Sanskrit: 

The computer scientist Alan Kay once argued that humans are very myopic in how we think about the future - our eyes are dazzled by the present and we can imagine the future only in terms of finite modifications of the present situation. However, our culture and history are vast, with many ideas forgotten by the wayside before they reach their maturity. In computer science, we can see this every passing year, as brilliant ideas lie forgotten for several years, until new powerful computers or better user interfaces suddenly make them popular again.

But the true history of computation (and science in general) is not known to most people. Its origins lie in the ancient history of India, of which Europe has, for the most part, been given only brief glimpses. The central figure of this story is the Sanskrit grammarian Panini, who was born before the Buddha, and wrote the first algebraic system, the first formal system, and indeed, the very first computational system. Panini stands midway between us today and the earliest Sanskrit sages 5000 years ago. With Paninian grammar and associated shastras (scientific texts), India had a head start of about 2000 years over Europe (and most other countries) when it comes to computational thinking. This computational mindset has penetrated into a million art forms, cultural and religious practices and, of course, scientific investigation in India. For example, the sutra tradition of Panini is the direct ancestor of the modern mathematical notation that we use in science today. The Vayu Purana of 500 BC explicitly defines a Sutra as having the following characteristics:
  1. Alpaksharam: with the fewest letters possible
  2. Asandigdham: unambiguous
  3. Saaravat: meaningful, with the capacity to generate new sutras
  4. Viswatomukham: applicable to the external world 
  5. Astobham: containing no pauses and gaps
  6. Anavadhyam: irrefutable from perception and other means of knowledge (pramanas)
The Paninian rules of grammar developed in this Sutra tradition and took it to its highest pinnacle. If you look for this in Wikipedia, or in textbooks, or in trendy TED talks, you will not find it. The reason is simple: India was colonized and its history was suppressed. For the most part, historians of science ignore the contributions from the whole world beyond Europe. They perpetuate colonial stereotypes about the superiority of Europeans (Greeks) which sound comical in this 21st century. The fact is Europe was a scientific backwater right up to the dawn of the industrial revolution. Greeks and Romans had very few tangible achievements in science. They were quite superstitious and had appalling arithmetic, inaccurate time-keeping by calendars, poor navigational tools and medicine that barely worked. This evidence cannot be rubbed away by magically attributing scientific merit to random Greeks from antiquity, by citing secondary literary sources. I will quote from the book “The Upright Thinkers” by physicist Leonard Mlodinow, who says point-blank that systematic scientific investigation and intellectual discourse never happened in the Arab world, China or in India.

Thinkers who were critical of the intellectual status quo and who attempted to develop and systematize the intellectual tools necessary to push the life of the mind forward were strongly discouraged, as was the use of data as a means of advancing knowledge.

This is ridiculous, considering all the “data” and astronomical tables that Europe actually used to make their calendars came from the Arab world and India. The best rebuttal is given by Prof. C.K.Raju who traced out the gradual historical development of calculus in India starting from Aryabhata and contrasted this with the dramatic appearance in Europe. 

In India, too, a Hindu establishment focused on caste structure insisted on stability at the expense of intellectual advance. 

Actually, India never had heresy. Not only were controversial issues discussed openly, but there is a well-documented tradition of philosophical debate. Unlike other cultures, the various types of personal bias are explicitly listed through the system of Purva Paksha. This enabled thinkers of competing schools to admit that their authority on truth is only partial, and not complete without the knowledge of other schools. There is no example in India comparable to the trial of Socrates. The rishis of India always stressed personal experience of knowledge instead of adhering to the letter of tradition. When we talk about social inequalities, how can we ignore slavery? Greeks and Romans had atrocious slave societies, which continued until modern times in European colonies. As attested by the Greeks, slavery was altogether absent in India. In fact, the archeological evidence from India shows it to be one of the most egalitarian societies in the world and materially quite well-off, right until the Islamic invasion. This is not to say that inequalities did not exist, but they have to be judged relative to other societies. In any case, this is a lazy argument and does not suffice for a wanton dismissal of India.

In contrast to Europe, the scientific tradition in India is continuous and shows consistent material artifacts throughout time. Indeed, the literary corpus of Indian manuscripts dwarfs that of every other civilization, with hundreds of thousands of Sanskrit works still lying untranslated. The scientific superiority of India was acknowledged by every civilization, including by the Greeks. The Arabs, who studied knowledge from all over the world, stated explicitly that Indians were the first race to pursue science. But in today’s western-controlled academia, Indian contributions are acknowledged only grudgingly. Wikipedia, which admits citations only from western sources, is a great example of this bias. When we look up the entry for “thesaurus”, say, the first reference is to one Philo of Byblos. Does anybody have a copy of this book? Has it ever been seen by western scholarship except through extrapolations from secondary sources? Is there a living cultural tradition of using this book? None. This can be contrasted with the secondary mention of the Sanskrit “Amarakosha”, which is a real book. It is far larger and has been in continuous use till today. If the history of science is a train, Indians are eligible to travel only in third class compartments. Greeks can travel for free. There are many levels of control: western academic scholars and their “peer”-reviewed journals, popular books and magazines written by western-certified academics, online portals, and finally censorship on social websites like Wikipedia or Facebook. It is a good exercise for the reader to check how much the sections on the history of science and philosophy in his / her favorite science website (Aeon, Nautilus, Conversation, Edge) draw on India or China. The share hardly reflects the actual numbers of scientists of Indian and Chinese origin, even those working in the west! It is an open question how long this academic cartel of western superiority can be maintained against the growing economic power of India and China, as well as the relatively egalitarian structure of online communication.

This systematic eclipsing of Indian achievements gives a very distorted view of history, making these conclusions useless. For example, the grand tome of Steven Pinker “Better angels of our nature” on the history of violence in the world has scant data from India. It is unforgivable because India had the longest historical record of civilization as well as the largest human population in the world. The data from India is also inconvenient to  Pinker’s thesis, which argues that historically violence has only fallen down in time. In fact, India had a very peaceful civilization for thousands of years along the Indus and Saraswati valleys with no warfare. A similar distortion occurs in the book “Guns, Germs and Steel” by Jared Diamond due to the complete absence of India.  Diamond argues that state power from agricultural states has forcibly penetrated tribal peoples living in the peripheries and forests. But this did not happen in India, where the tribal languages and customs survived to the present day. Finally, I have to mention the book “The information” by James Gleick which gives a broad history of information theory and computer science, but completely overlooks India. As I will demonstrate through this series of blogs, it is absurd to overlook the country of Panini.


Sanskrit as a promoter of biodiversity:  
Extent of the fertile cultivated lands around the Saraswati river in the Vedic period

For about ten thousand years, the Indian subcontinent was not only the most populous area but also the most technologically and economically advanced civilization in the world. But despite this, the region preserved its biological diversity. The forests of India housed vast numbers of tigers and other wild animals, whose numbers started to decline only during the colonial era. The same is true for linguistic and cultural diversity in human societies. One can contrast how Irish and other Celtic languages got exterminated from the British Isles with how Dravidian and south east Asian languages thrived despite the dominance of Sanskrit. India is the only civilization in the world where tribal languages and customs are preserved, despite being in close contact with literate societies. Apart from protecting economic and lifestyle niches, religious beliefs and practices were also protected. Many external religions such as Zoroastrianism, Judaism, Syriac Christianity and the Baha'i Faith have sought and found refuge in India. This case of India is all the more surprising when we note that the aggressive European civilizations were but cousins to India, sharing a common linguistic and mythological ancestry. So what did its cousins lack that made India tolerant?

The answer may be in the computational nature of the Sanskrit language and the sciences nourished by it. Taken together, they are a means to amplifying the consciousness of a person, making him aware of every single aspect of life and his conduct to it.  This reinforcement of consciousness is the key to avoiding environmental catastrophe in any age. Often, humans destroy living ecosystems through sheer ignorance and not paying attention. Greed is a big factor, but stupidity results in greater violence in the long term. 

The languages and the belief systems that we think in are Prakrits - applicable to a specific place and context. A certain type of fish might survive in a certain type of water, but other fish may die. Such is the case with Prakrits: they cannot claim to be universal. Further, if they become polluted (becoming Apabhramsas), they cause suffering to the very creatures that used to live there earlier. The greatest cause of suffering is the ego nurtured by the polluted mind. For example, after they conquered Bengal, the British systematically scorched the region with famine to break the morale of the people. The Americans exterminated the bison so that they could starve the Native American tribes that depended on it. It is hard to fathom the depravities of such egotism, which continues to cause ecological destruction today. Even if we were completely selfish people (which I believe is a mischaracterization of us humans), we should be aware of the pitfalls of short-term greed. There is an important lesson to be learned from human civilizations that survived for a long time without ecological collapse, like in India (at least until today's age). The lesson to learn is the open computational grammar of Sanskrit, which makes it modifiable to suit specific local contexts in space and time, such that the human mind pays attention to the changing constraints of nature. Like the pure waters of an unpolluted river, it can be enjoyed by all living beings. In a more general sense, we can say the same for open-source software if it achieves political and economic awareness amongst people.

When we compare the Indus-Saraswati valleys with other ancient civilizations like Mesopotamia and Egypt, the first thing that pops out is the sheer difference in size. North-western India was the largest alluvial plain to have been cultivated by early humans, and it was nourished by the gigantic melting glaciers of the Himalayas after the ice age. This was the most fertile territory for human settlement, as it had every single mineral and ecological resource. Because of the sheer size of this area, the rest of the world experienced a huge cultural and genetic influx from here. In contrast, there is little evidence for an inflow of people into India until 2000 years ago. There is no genetic, archeological or literary evidence for an invasion/migration of Aryans into India. Hitler was a complete idiot and so was Hegel. Modern racists are equally stupid, besides being awful people. The case should have been closed, but a problem still remains: why are European languages similar to Sanskrit if they were not both sired by the same rampaging invaders? Considering the very ancient dates for agriculture and civilization in India, an enormous amount of vocabulary might have been borrowed from India along with the spread of agriculture. Words for numbers, agricultural tools and settled village life could have been borrowed from Indian practices. Even genetic evidence shows a significant migration from India and central Asia towards Europe. These cultures could have evolved distinct new languages over thousands of years. Even the ancient European mythologies are a partial reflection of the more extensive Indian works. Using the relatively simplistic tree model for language evolution, the Greek Indologist Nicholas Kazanas argues that European ancestors speaking Indo-European languages spread from the Saraswati valley through a northern route, with a stop-over in the Amu Darya basin, and further into Russia. 
There is genetic evidence that Europeans evolved the ability to digest milk protein in this region, which probably gave these tribes an advantage over previous settlers in Europe (along with agricultural knowledge). More broadly, the science of historical linguistics will need to evolve better models for linguistic evolution than simple hierarchical specialization from a common tree. How languages evolve may be far more complex, with technical words and phrases spreading as in fluid dynamics.

India may have been the ancestral home of European tribes, but this agricultural civilization did not spread along with a simultaneous awareness of ecological consciousness and respect for nature. This can be contrasted with the parallel spread of civilization towards the south of India. Unlike the frigid north (just recovering from the ice age), South India was already in a more advanced settlement phase, so its languages did not borrow vocabulary for numbers and settled village life as those in Europe did. However, the scientific and cultural influence of Sanskrit is tremendous in all Indian languages. For example, Telugu shares 90% of its vocabulary with Sanskrit, despite being a Dravidian language. Speaking Telugu provides a significant advantage in learning Sanskrit, as I discovered, not just in vocabulary but in all aspects of grammar. So this arbitrary grouping of South Indian languages into a separate family (based on corrupt models of language evolution that consider only simple words and verb inflection patterns, but not the language in its entirety) needs to be questioned. Unlike European languages, South Indian languages didn't borrow such simple word structures from the Saraswati valley because they already had a working vocabulary for them, but they borrowed more technical words and computational grammars, which are infinitely more enriching. In this regard, it is illuminating to compare with the Finno-Ugric languages in Europe, which also borrowed certain terminology for agriculture (these words resemble ancient Indo-Iranian roots more than those of the later Germanic languages, despite the geographic proximity). The final clinching evidence is that the names of all Indian rivers (including many in South India) can be traced to Sanskrit etymologically, but very few European rivers have Indo-European roots.

Throughout the cultural history of India, all great poets and writers in regional languages studied Sanskrit and were equally proficient in it. The  power of Sanskrit in word formation and grammar has penetrated all Indian languages. In fact, the first writers of any regional language (Tamil, Telugu, Malayalam etc.) wrote a technical Paninian-style grammar for their language before composing any literary work. This is because they understood the importance of grammar in imparting consciousness to the literary tradition. This is in stark contrast to European languages whose grammars were woefully inadequate until Sanskrit was discovered in colonial times. The imperfect alphabets, spelling, and word formation of European languages resist change to this day. 

It is time to reclaim the word “Aryan” back into Sanskrit, where it has a precise etymology. The root word refers to agriculture. Aryan simply meant a noble person belonging to a settled agricultural civilization. “Aryavarta” referred to the large arable land between the Indus and Gangetic plains that was suitable for agriculture. The computer scientist Alan Kay mentioned that we as technologists have not yet discovered a practice of “agriculture”: a stable, settled community that respects nature and evolves a set of civilized norms. I think this is exactly what is needed in computer science and AI: I term this “Arya Prajna” in Sanskrit – roughly translated as civilized intelligence, or Aryan Intelligence for AI. Idiots and Nazis may be damned – they do not control how I speak words of my own heritage. I could as well call it Indian Intelligence or Hindu Intelligence, but that would do a disservice to the many tribal communities who still live their traditional nomadic lifestyles, and who are equally part of the Indian / Hindu fabric. The Aryan ethic is a subset of Indian culture, but it was successfully exported as a universal value system, both to the west and to the east. I don't mind other people using this word, as long as they use it in the right context and meaning. In this regard, it is important to note that Buddha described himself as adhering to the Arya Dharma. In my opinion, the best examples of the Aryan ethic in today’s world are actually people who still live in the erstwhile plains of the Saraswati river – now turned into the Thar desert. These are the Bishnois, a vegetarian community living sustainably, who are passionate protectors of wildlife in the region. These people are my inspiration in how we should use technology to live intelligently and consciously. This comes only through understanding nature’s rhythm (Rta) and the mutual interrelationships between all objects (Indrajala), including one’s own mind. 
This is known as Rtambara Prajna in Patanjali’s Yoga Sutra. We can equivalently call it Arya Prajna.
                          
Sanskrit categories of knowledge:
   
In ancient times, the sage Vyasa organized the Indian knowledge corpus into the Vedas (Veda literally means knowledge, and Vyasa literally means compiler). Many people believe that Vyasa is a common moniker used by an entire group of scientists and poets who compiled these texts. These texts are broadly divided into the internal spiritual sciences and the external objective sciences. Unlike in the western and Abrahamic faiths, it is the external sciences that are used as an entry point for the inner sciences, which are always shaped in their image. Thus in the Indian tradition, seeking the inner spiritual truth is a series of recursive steps of understanding the external world and creating a subtler internal universe in that image. The inner sciences are dealt with by the Vedas, which are four in number. Each of them is paired with an outer science; these four are known as the Upavedas. This is the grouping I will adopt in my series of blogs. The pairs are as follows. 
  1. Rig Veda (hymns in verse form) paired with Ayur Veda (science of health)
  2. Sama Veda (hymns in song form) paired with Gandharva Veda (science of arts and music)
  3. Yajur Veda (hymns in prose form) paired with Artha Veda (science of ethics and economics)
  4. Atharva Veda (hymns in composite form) paired with Sthapatya Veda (science of engineering)

In this series of blogs, I will deal with them in the order 4-3-1-2. In each blog, I will try to provide a computational perspective on these subjects, grounded in a historical narrative from India. I will try to raise some new questions, as well as introduce certain Sanskrit words that encode unique perspectives to think about the existing issues.

I will connect language with physics, and argue that grammar is a better model for doing physics than the Newtonian analogy of "law". This will require building a complete framework for computational science through the Vedangas, which I explain below. On ethics, I will describe the limitations of language and logic in expressing ethical dilemmas and how that relates to economics. In modern times, these dilemmas relate to the ethics of AI in autonomous cars, finance and robotics, as well as technological unemployment. I will substantiate this with discussion from the Indian Dharmashastras, the epics and the Artha Shastra of Chanakya. On health, I will argue that we need to understand the human body as a holistic ecological system and vice versa. I will discuss the holistic theory of healing in Ayurveda that combines pharmacology with ritual, arts and mental discipline. I will connect this with related evidence in neuroplasticity, gut bacteria, endocrinology, psychology and the broader ecological health of the environment. On art, I will argue that creativity is connected to gender and sexuality, using the theory of Samkhya. I will describe the principles of the ancient text of Natyashastra, as well as the computational arts of Avadhanam and the 64 kalas traditionally practiced in India. I will try to provide certain directions on how to extend these art-forms through AI, virtual reality and cybernetic systems. These connections may look weird to a novice, so far greater elaboration is needed. But I believe these are essential issues which cannot be sidestepped while thinking about AI. 

On the first topic of the science of engineering, we need to understand the central role of computation in all engineering sciences today. In the Indian tradition, the Vedas are supported by six fields of supporting study – known as the Vedangas (limbs of the Vedas). These are the external body to the internal spirit of the Veda, echoing the Indian practice of equating the internal world with the external world. As the external world is easier to study and analyze, the Vedangas are studied rigorously by the students before the Vedas. I describe them in the following. Each of them merits a separate blog with references, on how traditional Indian knowledge can be used to interpret computational sciences. 

1.     Shiksha (literally “instruction”): I translate this into today’s language as user-interface design. Usually, this is translated as phonetics, as the original texts mostly dealt with accurate pronunciation. However, this is misleading because phonetics and alphabet are far more precise in India than in Europe or in the Middle East. Further, Shiksha also dealt with hand mudras that encode alphabets. The very word Shiksha (instruction) refers to the fact that user-interface design is not merely about providing the alphabet of interaction, but actually about instructing the users on how to achieve proficiency with this alphabet. Traditionally, Shiksha is visualized as the nose of the Veda (knowledge).
2.     Chandas (literally “structure”): I translate this into today’s language as combinatorics and the theory of rhythm, because this is precisely what this field describes. Pingala, who wrote one of the earliest texts on Chandas, described the binomial theorem, the Fibonacci series and other combinatoric devices. Usually, Chandas is translated as “prosody”. This is misleading because Sanskrit prosody is far more extensive than the meters in other languages. What Sanskrit prosody analyzes is the systematic division of time, which ultimately leads to how music can be composed. This musical rhythm controls the movements of a person and sets them to the right tempo in order to perform any action. In this regard, it is a central element of sensory-motor loops in robotics, which can benefit from this inspiration. Traditionally, Chandas is visualized as the feet of the Veda.
3.     Vyakarana (literally “grammar”): I translate this into today’s language as programming language theory. This is because Paninian grammar is very different from how “grammars” of other languages are treated. It is in fact comparable to the Backus-Naur Form in which the syntax of programming languages is specified. It also contains a huge ontology of semantic concepts that is especially relevant for AI programming. Traditionally, Vyakarana is visualized as the mouth of the Veda.
4.     Nirukta (literally “etymology”): I translate this into today’s language as semantics. This is because Sanskrit tradition has an extensive theory known as “sphota” on how meaning can be derived from etymology. It needs to be always used in conjunction with Vyakarana. Traditionally, Nirukta is visualized as the ears of the Veda.
5.     Kalpa (literally “making”): I translate this into today’s language as geometry. This is because geometric context is the essential framework in which a ritual is formulated. The texts of Kalpa deal with rituals pertaining to specific situations. Any type of abstract ritual context can be encoded into a geometric context. Further, the texts of Kalpa contain “sulba-sutras” (literally “string-rules”) that set out the earliest known geometric constructions and proofs, along with the so-called Pythagoras theorem (which should rightfully be called the Sulba theorem). Traditionally, Kalpa is visualized as the hands of the Veda.
6.     Jyotisha (literally “seeing”): I translate this into today’s language as harmonic and data analysis. This is because time-keeping is the essential ingredient in seeing things in the external world. The texts of Jyotisha described observing astronomical events and precisely calculating time. They produced sophisticated works of trigonometry and calculus, which form the basis for the modern mathematical theory of analysis. However, the philosophy of Aryabhata and other scientists of Jyotisha is very different from that of axiomatic mathematicians working on analysis. It is more in line with modern techniques of data interpretation (observing the external world). So this is where this inspiration needs to be deployed. Jyotisha has often been misused to make "predictions of the future", just as data analysis is being misused today. Traditionally, Jyotisha is visualized as the eyes of the Veda.
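
To make the combinatorics mentioned under Chandas a little more concrete, here is a minimal Python sketch of the two kinds of counting referred to there. It is only an illustration under stated assumptions, not a reconstruction of Pingala's own procedures: the function names are made up for this example, a short syllable (laghu, written L) is assumed to last one time unit and a long syllable (guru, written G) two. Counting the meters of n syllables by their number of gurus gives the binomial coefficients, and counting the patterns that fill a fixed duration gives the Fibonacci numbers.

```python
from itertools import product


def syllable_patterns(n):
    """All meters of exactly n syllables, each either L (laghu) or G (guru)."""
    return ["".join(p) for p in product("LG", repeat=n)]


def count_by_gurus(n):
    """counts[k] = number of n-syllable meters containing exactly k gurus."""
    counts = [0] * (n + 1)
    for pattern in syllable_patterns(n):
        counts[pattern.count("G")] += 1
    return counts


def duration_patterns(m):
    """All patterns lasting exactly m time units (assumption: L = 1, G = 2)."""
    if m == 0:
        return [""]
    if m == 1:
        return ["L"]
    # A pattern starts either with a laghu (m - 1 units remain)
    # or with a guru (m - 2 units remain).
    return (["L" + p for p in duration_patterns(m - 1)]
            + ["G" + p for p in duration_patterns(m - 2)])


if __name__ == "__main__":
    print(count_by_gurus(4))
    print([len(duration_patterns(m)) for m in range(1, 7)])
```

The first printed list is a row of the binomial triangle, and the second is the start of the Fibonacci sequence. The recursion in `duration_patterns` is the whole argument: every rhythm of a given length is a shorter rhythm with one more syllable placed in front.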

In the upcoming blogs, I will try to summarize the Indian contributions towards each of these fields to build a holistic framework for computational sciences. I will try to build certain perspectives on how these inspirations can be deployed today for solving open problems and for looking ahead. These issues need to be thought through again and again, by scores of people, so that a coherent Indian perspective can be built about AI. At present, Indian scientists rarely use Indian categories and terminology. One significant disadvantage is the lack of Indian language translations for cutting-edge scientific topics. In this regard, a stark contrast can be observed with Chinese, Japanese and various European scientific communities. This needs to change.

I will give a few examples to indicate the kind of perspective I would like to build. In Sanskrit,  Mantra and yantra signify a cybernetic environment for the user and can be used as new models for software and hardware respectively. The etymology of these words and their existing associations with Indian psychology may create a new humanistic focus on computing, where the effect of software on user's mental state is placed central to its evaluation. From the  Indian perspective, the physical analogy for an algorithm is not a mechanical clock, but a constantly flowing river that nourishes people. This river is Saraswati  on the banks of which Indian civilization flourished, and who was later glorified as the goddess of speech. In the Indian tradition, this river is supposed to flow through all the other rivers, blending at sacred spots of confluence. When Indians make pilgrimages, they carry small pots of  water from the rivers of their places of origin to the sacred Ganges and mix them in. This is a way of acknowledging the commonality of all the rivers. Interpreting this tradition with computers and algorithms, we should encourage interoperability of all computing systems, by periodically blending in the waters of computation with each other. This is necessary to prevent the chaos of codecs, ports and standards that we experience today. This is a gentler method of ensuring interoperability than setting the agenda from the top down. Thinking of algorithms and computer programs as rivers also requires us to maintain them free of pollution. Various types of pollution in terms of data-structures,  security, network infrastructure etc. need to be addressed in a similar manner to how we address pollution  in ecology. I will discuss these aspects in conjunction with Ayurveda.  

Learning Sanskrit for Poetry and Spirituality:

My blog will definitely annoy many people, for whom there need be no better reason to learn Sanskrit than to read the Upanishads or to enjoy the poetry of Kalidas. Indeed, sublime works of Sanskrit such as Abhijnana Shakuntalam had a huge influence in the European renaissance, influencing the likes of Goethe and Coleridge. In a public discourse, Shatavadhani Ganesh once chided people championing scientific applications of Sanskrit as idiots who miss the true beauty of this language. Point taken. Personally, I do agree that these poetic works are the highest treasure of Sanskrit, but this will always be a subjective personal judgement. These works of poetry and spirituality are complementary to scientific practice, in how we analyze the external world. My argument is that our modern tools for science are imperfect and need to be overhauled through Sanskrit. The modern methods of academic instruction, mathematics and science have been disconnected from the actual historical heritage in Sanskrit works. This disconnect has produced so much pollution and strife in this world that people cannot even find their inner world of poetry any more. In the Sanskrit tradition, the contrast could not have been stronger. Great mathematicians like Bhaskara were also highly skilled poets. All the great Sanskrit poets and musicians used computational thinking that would make a scientist proud. These bridges have to be rebuilt today, not only for the sake of lovers of Sanskrit, but for the whole world. The mainstream narrative from western media is okay with letting Indians have their naked mystics, but is not as open about acknowledging the full extent of their scientific contributions. So mysticism and poetry are not my bone of contention. But anybody who tries to confine the applicability of Sanskrit to just these realms is an enemy, not of Sanskrit, but of science.

What to do next?

I have a tough problem writing these blogs and I need help from the readers. Firstly, if you like my blog, please copy it and annotate it with your own impressions and ideas. You may use everything written in my blogs under the most liberal Creative Commons license. If you are an editor, please edit my arguments and cite them with references, wherever needed. You can cut specific portions of the text and distribute them to an audience that may not be interested in the other aspects. At this stage, I need to sharpen my ideas through discussion and criticism. If you have money, please support my research :) If you are from CDAC or another such scientific institute working on Sanskrit, please be aware that digitizing old Sanskrit texts or art-forms is not the only goal within your mandate. It is high time that a brand new vision for computation and human interfaces is built from Sanskrit. If you are the Indian government, please evolve a funding scheme that adequately supports scientists working in these fields. You have no more excuses for lack of monies. An ecosystem of funding and support is needed not only for scientific literature, but also for the technical development of computational systems based on these ideas. It is better to have many sources of funding, so that they cannot be controlled by vested interests in the establishment. I think a blog is neither the right encoding (a book may be better, and I might write one when my ideas become more mature) nor the right method of communicating to the public (who have short attention spans, but I don't care about the average reader). The right method of creating online literature has not been developed yet. Ted Nelson had several great ideas in his Xanadu system, but even he is unaware of the Sanskrit tradition of etymology, sutras and bhashyas (no fault of his). A new world of computational literature will be built in my own lifetime: I would like to shape it with inspiration from the Sanskrit tradition. 
After all, the future is not settled yet.
 
I will try the age-old Indian trick of using stories and mythologies to get my point across. In essence, our situation today is similar to the Kishkindakanda in the Ramayana, where Rama requests the help of Sugriva to find his lost Sita (Indian scientific inspiration). Sita is kidnapped by Ravana and imprisoned in an ivory tower in Lanka (western academia) and we need to build a bridge to get there. My job here is to be like Angada, the son of Vali, convincing the king Sugriva that he should keep his promise to Rama. I should have no ego. I should let go of the fact that my own father Vali has been killed by Rama, as this was done for the cause of Dharma. For me personally, Vali is the altar of European "enlightenment", of which I am definitely a product. But how can Vali not equally be my own personal Indian heritage? Actually, I cannot be sure. So I have to keep evaluating the relevance of Sanskrit in a dispassionate manner: on issues such as technical transparency, biodiversity and so on. Only this context determines who is Rama and who are Sugriva and Vali. The cause of the Indian claim for scientific heritage is important, but cannot be the sole arbiter of my work here. This is also why I ignore the aspect of poetic beauty and spirituality in the relevance of Sanskrit for computation. These are subjective values and prone to misinterpretation. After the Vanara army is ready, I have to show where the ocean is shallow and how to build a bridge to Lanka. But I need help from everybody to pick up their stones and lay them down. When the time comes, I should be ready to serve as a messenger to the court of Ravana. When I place my foot down on a topic, no Rakshasa should be able to lift it up. This is hard work and I am not sure if I am monkey enough to be up for it (please excuse my pun from the Ramayana). There are many others like me, and most of them are not even Indian. 
But learning Sanskrit for the purpose of redeeming science can be as joyful an endeavor as living the Ramayana in our own lives. In this sense, there is a Rasa to this drama and it is not mere mechanics. 

References: 

If you actually got until here, it means you have patience for more reading. There are many references to be added here and this section will (may) be updated later.  I will give specific references in the upcoming blogs when I discuss individual topics in greater detail. 



[1] Rajiv Malhotra's book "Being Different" : I highly recommend this book to everyone, especially his passages on Hegel and German Indology. He also supported a volume of books on Indian scientific contributions. He also organizes a series of Indian Indology conferences
 

[2] C.K. Raju's articles on the history of Indian Ganita (calculus): 1, 2, 3 ... His book on the various theories of time critiques Newtonian and Einsteinian models of physics. See also his lecture on the superiority of Indian methods of time-keeping. But most importantly, know about the real history of calculus. 




[3] The video lectures on the contributions of Indian mathematics by IIT Bombay. 
 

[3.141...] See reference [3.1415...] By the way, what time is it ? Never enough time for perfection. 

[4] The lectures of Shatavadhani Ganesh on Indian culture: they are a treasure-trove of erudition with many references. 

 

[5] The videos on Sanskrit learning by Advaita Academy:  Panini's Ashtadhyayi,  Sanskrit instruction

 

[6] The videos on Indian culture and arts by Shaale
 
[7] Michel Danino's lectures on the Aryan Invasion Theory

[8] Michel Danino's lectures on the Indian civilization.
 

[9] Edward Slingerland's book "Trying not to try" 
  
[10] Leonard Mlodinow's book "The upright thinkers" 

[11] Steven Pinker's book "The better angels of our nature" 

[12] Jared Diamond's book "Guns, germs and steel" 

[13] Bernie Krause's book "The great animal orchestra" 

[14] Alan Kay's lectures on the history of computing ideas: 1, 2, 3, ...

[15] Ted Nelson's videos: 1, 2, ..
 

[16] John Kadvany's article on Paninian grammar and modern computing

[17] Subhash Kak's article on the application of Paninian grammar to computing

 

[18] Rick Briggs on the possibility of using Sanskrit for knowledge representation. Sadly this article is very badly misinterpreted by jingoists. I will write more about knowledge representation and semantics when I discuss Vyakarana and Nirukta.
 
[19] Introduction to the flexibility of Sanskrit 


[20] My article on  the four-sided negation in Indian logic

[21] My article on technological unemployment 

[22] My article on the trend of monopolization in computing applications

[23] PRI's World in Words is one of my favorite podcasts: this is their episode on endangered languages. See also an Ainu tribal woman of Japan singing in old age, Native American tribal languages destroyed by colonization, and their biased episode on Sanskrit revitalization (where Sanskrit is portrayed as a liturgical language connected with religious nationalism - quite a contrast, for example, with Ainu).

[24] Sanskrit Ghana Pati singing in old age: Just like the Ainu woman above, but with a lot more computational complexity (I will write about the error correcting codes of Ghana recitation in my blog on Shiksha).

[25] Greek Indologist Nicholas Kazanas has written many books, and videos of his lectures are available on Youtube. 

[26] American Indologist David Frawley disputes the hierarchical tree model for the evolution of Indo-European languages and proposes a diffusion-based approach. 

[27] Teman Cooke argues that the myth of the scientific method is a lie and this is not how science is conducted in the real world. This lie was invented to make way for European appropriation of scientific discoveries from the rest of the world. See [28].

[28] CK Raju's interviews with Claude Alvares, where he argues that the paradigm shift theory of Kuhn is a lie. He also gives a broad historiography for scientific discoveries in India as well as in the Arab world. He serially debunks the myths of Copernicus, Euclid and Claudius Ptolemy. In my opinion, the myth of Claudius Ptolemy is the most important one to debunk (and it is also the easiest, as it rests on ridiculously flimsy grounds). 

[29] Here is an informal blog that recounts various instances of how Indian Ganita is digested into western mathematics. This summarizes CK Raju's scholarly work along with examples and illustrations from other writers. 

[30] Sankrant Sanu is an activist for the cause of using Indian languages in higher education and computing: His book  Bhasha Neeti has a nice web portal. 

[31] There is an ecosystem of web portals explaining Indian culture, philosophy and customs from the insider perspective: Prekshaa, Pragyata, IndicPortal, IndiaFacts, CreativeIndiaMag and so on. They are very good and not at all jingoistic, contrary to how the mainstream media portrays them. (You might run into hotheads and idiots on social media though.) Just as with any other web-based education / journalism organization, it is unclear how these endeavors make any profit. As profits get depleted from journalism, the only journalism that will survive is a fake one, with support from the (financial) establishment. The same can be said about education. This will have a deleterious effect on our democracy. The business model for web education and journalism needs to be redefined, and Indian culture may provide psychological insights on how to do this (I will write more about this when I write on the Artha Veda). 

[32] Loads of references about AI, technological unemployment, ethics in AI, autonomous cars, how data-driven learning messes up algorithmic convergence... are you kidding me? Use Google, man. Or get a PhD. Or check out the Ethical Machines podcast. Or the FATML conference. One of the organizers is an expert Sambhar cook. Check out his geomblog.

[33] People who have read my blog before would know that I severely reject parochialism. As a reference to new readers who might come across this post, I am sharing two previous essays I wrote: How the European renaissance was driven by advances in the science of anatomy, how parochial superiority complex coupled with victimhood is a deadly murderous mix that leads to terrorism