Friday, November 10, 2006

Conjugate Gradient Method

When a person believes that he is doing something for the right cause, he becomes indistinguishable from a robot. Most great things happen because of this type of person. Most of the nonsense that we see in the world is also committed by the same kind of person. So where is the distinction?

The distinction lies in the vision of the people - in what they see.

The problems of the world can be visualized collectively as a function which forms a peak over several dimensions. The solution for these problems lies at the top of the peak and it has to be reached somehow.

In mathematics, one technique used to solve this problem is called the conjugate gradient method. When there are thousands of dimensions, we first have to choose a direction to travel. Even the simplest answer - go in the direction where the slope is steepest (called the steepest descent method, or steepest ascent when we are climbing a peak) - is ignored by the junta of the world.

Every person has his own direction in which he thinks he'll reach the top. In a sense, he is right. Each one of these directions yields an improvement. But the funny thing is that several of these directions contradict each other. Look at the following figure to understand this.

So we see maniacal people (red and blue) fighting with each other about the correct way to reach the top. Sometimes, things get funnier with somebody (yellow) choosing the least important direction for reaching the top. I could demonstrate this on a simple function over 2 dimensions, so you can imagine what happens in real-world problems.

Take the problem of improving the state of traffic in India - the problem lies in several dimensions:
a) Controlling the really large vehicles and routing them away from the city
b) Controlling traffic in peak hours by urban planning
c) Controlling traffic blockades by specifying separate pedestrian districts, parking places etc.
d) Getting riders to wear helmets.

Amongst all these dimensions, the traffic police of Hyderabad demonstrate their zeal in (d). Like madmen, they keep digging in this direction and ignore everything else.

As a second example, take the problem of university education - it needs improvement in several dimensions:
a) Establishing world class research facilities with proper accountability
b) Getting the students to have a real dialogue with the world around
c) Getting the students to attend the classes

Again, it so happens in India that all the zeal is demonstrated in (c).

Now imagine that we have a sane bunch of people in the world. You would imagine that things would be a lot better. People would choose the best direction to solve all the problems (cyan) instead of the stupidest (yellow). But even then, it so happens that steepest descent is not the optimal method for reaching the top. We will eventually get there, but not in an optimal manner, because we might follow a zig-zag path in which two opposing directions take alternate steps, so that it takes a very long time to converge. This is what we would see in elections (assuming each step to be of 5 years' duration) with power alternating between left-wing and right-wing policies.
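The zig-zag is easy to reproduce. Here is a minimal sketch in plain Python, assuming a made-up 2-dimensional quadratic f(x) = 0.5 * x^T A x - b^T x whose valley is ten times steeper in one direction than the other. The code walks down into a valley rather than up a peak, which is the same problem with the sign flipped. The matrix, the starting point and the function names are illustrative choices of mine, not taken from any figure in this post.

```python
# Steepest descent with exact line search on f(x) = 0.5 * x^T A x - b^T x.
# A, b and the starting point are made-up illustrative values.

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Ordinary dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def steepest_descent(A, b, x, tol=1e-8, max_steps=10000):
    """Always step along the negative gradient; return the end point and the path."""
    path = [list(x)]
    for _ in range(max_steps):
        # r = b - A x is the negative gradient, i.e. the steepest downhill direction.
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        if dot(r, r) ** 0.5 < tol:
            break
        # Best possible step length along r for a quadratic function.
        alpha = dot(r, r) / dot(r, matvec(A, r))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
        path.append(list(x))
    return x, path

A = [[1.0, 0.0], [0.0, 10.0]]   # valley is 10 times steeper along the second axis
b = [0.0, 0.0]                  # the bottom of the valley is at the origin
x_final, path = steepest_descent(A, b, [10.0, 1.0])
steps_taken = len(path) - 1

print("steps taken:", steps_taken)
print("first few points:", path[:4])
```

In this sketch the second coordinate flips sign at every step, which is the left-right alternation described above, and the number of steps is far larger than the 2 dimensions of the problem.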

The Conjugate Gradient Method is a way of reaching the peak in just "n" steps, where "n" is the number of dimensions (this holds exactly when the function is quadratic, like a smooth dome or bowl). The logic is simple - once you choose a direction, do the best you can along it in the current step, and never repeat that direction.

It so happens that all these "n" directions become perpendicular to each other in a generalized sense: perpendicular once distances are measured through the matrix that defines the function (in mathematical terminology, they are said to be conjugate, or A-orthogonal, and they form a basis). If you want to understand the logic behind how to choose the right direction at each step, read this document (only for people with a mathematical bent of mind).
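The "n" steps claim can be checked numerically. Below is a minimal sketch of the textbook conjugate gradient iteration in plain Python, assuming a small made-up symmetric positive definite matrix A, so that solving A x = b is the same as finding the bottom of the bowl f(x) = 0.5 * x^T A x - b^T x. The values of A, b and the starting point, and all function names, are my own illustrative choices.

```python
# Conjugate gradient on f(x) = 0.5 * x^T A x - b^T x (equivalently, solving A x = b).
# A must be symmetric positive definite; the numbers below are made up.

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Ordinary dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, x, tol=1e-10):
    """Take at most n steps, each along a fresh direction; return x and the directions."""
    n = len(b)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # negative gradient at the start
    d = list(r)                                       # first direction: steepest descent
    directions = []
    for _ in range(n):
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            break
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                       # do the best you can along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        directions.append(list(d))
        beta = dot(r, r) / rr                         # keeps the next direction conjugate to d
        d = [ri + beta * di for ri, di in zip(r, d)]
    return x, directions

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, directions = conjugate_gradient(A, b, [2.0, 1.0])

d0, d1 = directions
print("solution after", len(directions), "steps:", x)
print("d0 . d1     =", dot(d0, d1))              # ordinary dot product
print("d0 . (A d1) =", dot(d0, matvec(A, d1)))   # dot product measured through A
```

For this 2-dimensional example the loop runs exactly twice. The two directions are not perpendicular in the everyday sense (their ordinary dot product is not zero), but their dot product measured through A is zero up to rounding, which is what "conjugate" means.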

I attended a fascinating mathematics course taught by Prof. Mike Erdmann at Carnegie Mellon. Each of us students had to choose a topic for a write-up, and when my turn came, I jumped at the chance of writing the proofs for the conjugate gradient method.

I think this method gives a lot of intuition about how to organize a debate. When you are debating with another person, you are presenting a view. That is, you take a picture of the 3D function in the above figure, from an angle. Each person has a separate photograph, and you compare these photographs with each other to understand what the true shape of the function in 3D could be.

But most often, a debater fails to see this main purpose of the debate. He keeps insisting that he has the only photo that can capture the 3D shape and keeps digging in that direction. I call this the "root of all the evil" syndrome.

If only we had a smart bunch of leaders, we would end up choosing the right set of directions, and the problems in all "n" dimensions would get solved in just "n" steps.
