2 Month Notice

Well, technically it’s been more than 2 months since I joined my new workplace, but I guess it is high time I gave an update as to what I am up to.

As expected from working at one of the top tech companies in the world, there is a lot of work (and fun), but there is great potential for learning as well. And man, have I learned a lot!! In my first month here, I worked on a Windows 8 Modern UI app and understood the architecture of building a Modern app hands-on. Not only that, but I also had to integrate the app with a web service using JavaScript (which, by the way, is my weakest programming language), and after a lot of fumbling in the dark, I can now bend JS to my absolute will (Evil laughter)!!

In my free time, I got together with a senior of mine, Prakhar Gupta, who works at the same company albeit in the Bangalore office, and quickly coded up a Windows Phone 8 app. The app basically acts as a birthday reminder for all those who, like me, are poor at remembering dates. Expect to see the app in the Windows Marketplace soon! Along the way, I have also been drawn to Cloud Computing, thanks to the amazing Windows Azure (they have tutorials on creating Android apps with an Azure back-end), and I hope to soon gain certifications in Cloud Computing. This, along with some other projects that I really can’t talk about (Non-Disclosure Agreement, you see), has made my life coding bliss!!

Oh, and did I mention that I have also started development on the Leap Motion? Expect to see more on that and Kinect development in my next few posts. That is the practical side; on the knowledge side, I am learning every day. I have learned about good design and best practices while coding in C# and am also re-exploring functional programming with F#. SQL and database querying now come more naturally than ever, and I have also started looking into query execution plans to further optimize my SQL code. I have also been trying to read up on the Common Language Runtime (CLR), which so far looks great in the way it handles managed modules and supports a variety of languages, but with all the work and coding going on, I am having a hard time carving out time for myself to read more. Will have to stretch more on the reading front!

In the pipeline are some more apps (maybe on Android?) and reading papers and texts on NLP (for WishWasher) and Computer Vision (which is still my favoured field). I do seem to be loaded with work, but hopefully, I will keep inventing things and inspiring you to try new things. Keep an eye out for more on this domain.


Birthday Wish NLP Hack

Well, it was my 22nd birthday 11 days back, and while the real world was quite uneventful, I managed to create a small stir in the virtual world.

For this birthday, I decided to do something cool and what is cooler (and a greater sign of laziness) than an AI program that replies to all the birthday wishes on my Facebook wall? This was definitely cool and quite possible given a basic understanding of HTTP and some Artificial Intelligence. After experimenting for 2 days with the Facebook Graph API and FQL, I had all the know-how to create my little bot.

Note: This is from a guy who has never taken a single course on Natural Language Processing and who has next to zero exposure to programming NLP systems. Basically, I am a complete NLP noob, and this hack is something I am really proud of.

But one major problem still remained: how to create an NLP classifier that would classify wall-posts as birthday wishes? I tried looking for a suitable dataset so I could build either a Support-Vector Machine or a Naive Bayes Classifier, but all my search attempts were futile. Even looking for related papers and publications was in vain. That’s when I decided to come up with a little hack of my own. I had read Peter Norvig’s amazing essay on how to write a toy spelling corrector and seen how he had used his intuition to create a classifier when he lacked the necessary training dataset. I decided to follow my intuition as well, and since my code was in Python (a language well suited for NLP tasks), I started off promptly. Here is the code I came up with:

The first thing I do is create a list of keywords one would normally find in a birthday wish, things like “happy”, “birthday” and “returns”. My main intuition was that when wishing someone, people will use at least 2 words in even the simplest wish, e.g. “Happy Birthday”, so any message containing just the word “Happy” can be safely ignored; thus I simply have to check whether at least 2 such keywords exist in the message.

What I do first is remove all the punctuation from the message and convert all the characters to lower case to avoid string mismatches due to case sensitivity. Then I split the message into a list of words, the delimiter being the default whitespace. This is done by:

s = ''.join(c for c in message if c not in string.punctuation and c in string.printable)
t = s.lower().split()

However, I later realized that there exist even lazier people than me, who simply use wishes like “HBD”. This completely throws off my At-Least-2-Words theory, so I add a simple hack to check for these abbreviations and put the expanded form into the message. Thus, I created a dictionary to hold these expansions, and I simply check if the abbreviations are present. If they are, I add the expanded form of the abbreviation to a new list that contains all the other non-abbreviated message words added verbatim [lines 15-20]. Since I never check for the locations of keywords, where I add the expanded forms is irrelevant.

Then the next part is simple, bordering on trivial. I iterate through the list of words in my message, check whether each is one of the keywords, and simply maintain a counter telling me how many of the keywords are present. Python made this much, much easier than C++ or Java would have.

But alas, another problem: some people have another bad habit of using extra characters, e.g. “birthdayyyy” instead of “birthday”, and this again was throwing my classifier off. Yet another quick fix: I go through all the keywords and check if the current word I am examining has the keyword as a substring. This is done easily on Python strings using the count method [lines 31-34].

Finally, I simply apply my At-Least-2-Words theory. I check if my counter has a value of 2 or more and return True if yes, else False, thus completing a two-class classifier in a mere 40 lines of code. In a true sense, this is a hack, and I didn’t expect it to perform very well, but when put to work, it really did a splendid job and managed to flummox a lot of my friends who tried posting messages that they thought could fool the classifier. Safe to say, I had the last laugh.
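Since the original listing isn’t reproduced here, the steps above can be combined into a minimal sketch of the classifier. Note that the keyword list and abbreviation dictionary below are illustrative stand-ins, not my original ones:

```python
import string

# Illustrative keyword and abbreviation lists (stand-ins for the originals)
KEYWORDS = ["happy", "birthday", "returns", "wishes", "bday"]
ABBREVIATIONS = {"hbd": "happy birthday"}

def is_birthday_wish(message):
    # Strip punctuation and non-printable characters, then lower-case and split
    s = ''.join(c for c in message
                if c not in string.punctuation and c in string.printable)
    words = s.lower().split()

    # Expand lazy abbreviations like "hbd" into their full forms
    expanded = []
    for word in words:
        if word in ABBREVIATIONS:
            expanded.extend(ABBREVIATIONS[word].split())
        else:
            expanded.append(word)

    # Count keyword hits; the substring check catches "birthdayyyy"-style words
    count = 0
    for word in expanded:
        for keyword in KEYWORDS:
            if word.count(keyword) > 0:
                count += 1
                break

    # The At-Least-2-Words rule
    return count >= 2

print(is_birthday_wish("Happyyy birthdayyyy!!"))  # True
print(is_birthday_wish("HBD man"))                # True
print(is_birthday_wish("Congrats on the job"))    # False
```

That is the whole trick: roughly 40 lines of preprocessing, expansion and counting stand in for a trained classifier.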

Hope you enjoyed reading this and now have enough intuition to create simple classifiers on your own. If you find any bugs or can provide me with improvements, please mention them in the comments.


Face Detection with OpenCV 2

Finally done with my senior project at college and wow, it was great! I will definitely post about it soon, and hopefully that post will be a good apology for the tardiness of this one.

My project was on Facial Expression Recognition, and for starters, I had to process images to detect the face(s) in them. This is where OpenCV really came in useful, since it has built-in capabilities for detecting faces using Haar wavelets (basically square waves used to apply some digital signal processing to your image) and Cascade Classifiers (a bunch of weighted machine learning classifiers used together).

Now, for its ease and convenience, as well as to maintain forward compatibility with future versions of OpenCV, I decided to ditch the OpenCV 1.0 API in favour of the OpenCV 2+ API. Moreover, I decided to do it in Python, since the numpy library proves very powerful for linear algebra operations and rapid prototyping is easy with Python. But here comes the big problem: OpenCV 1.0 was really popular (I guess because of the Bradski and Kaehler book), and almost all the links on the web show face-detection code in OpenCV 1.0 (which is quite ugly compared to the 2.0 version), be it C or Python. So I jumped right into OpenCV 2’s wonderful documentation, but alas, only code in C++. Not to fear, since I have good proficiency in both languages now, and it was just a matter of translating code (this is where iPython came in really, really useful, with its auto-complete functionality).

So here I will show you how to detect faces in images with OpenCV 2 and Python as succinctly as possible, given the lack of documentation, and hopefully you’ll come up with the next big thing!

import cv2

# Locate face(s) in an image using cv2 functions
def detect_face(image):

    # Specify the trained cascade classifier (adjust the path to your install)
    face_cascade_name = "/usr/local/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml"

    # Create a cascade classifier
    face_cascade = cv2.CascadeClassifier()

    # Load the specified classifier
    if not face_cascade.load(face_cascade_name):
        raise IOError("Could not load classifier: " + face_cascade_name)

    # Preprocess the image: grayscale conversion and histogram equalization
    grayimg = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    grayimg = cv2.equalizeHist(grayimg)

    # Run the classifier
    faces = face_cascade.detectMultiScale(grayimg, scaleFactor=1.1, minNeighbors=2,
                                          flags=cv2.CASCADE_SCALE_IMAGE, minSize=(30, 30))

    print("Faces detected: %d" % len(faces))

    for face in faces:          # For each face in the image

        # Get the origin co-ordinates and the length and width till where the face extends
        x, y, lx, ly = face

        # Draw rectangles around all the faces (the colour tuple is in BGR order)
        cv2.rectangle(image, (x, y), (x + lx, y + ly), (25, 255, 155), 2)

    # Display the image with the faces marked
    cv2.imshow("Detected face", image)
    cv2.waitKey(0)

    return faces

def main():

    # Specify the image to process and pass it to the function
    image = cv2.imread("face.jpg")      # replace with the path to your own image
    detect_face(image)

if __name__ == "__main__":
    main()

And there you go. In just a few dozen lines of code, you have a simple-to-understand and elegant face detector. It definitely works, since I tested it out on plenty of datasets that I had obtained for my project, as well as some personal images! So try it out and let me know your thoughts in the comments.


Artificial Intelligence Vs Computer Security

I encountered a very interesting thing about Google a few days back, and I’ve been itching to blog about it. Since this post is about something that touches almost everybody with an internet connection (namely, Web Search), I’ll try keeping it as low-tech as possible. Also, the issue here is not exactly Security as we know it, but it can be thought of as a branch of it, and as a way for me to bring up this point.

First, a short description of what happened: I was using Google image search to look for images of penthouses, and Google would not return me any images. Instead, what I saw was a message stating that the word “penthouse” had been filtered out of the search query, and since the query was that word only, the resulting query was blank. Only when I deactivated search filtering did I realize what was making Google filter out the word. Apparently, “Penthouse” is the name of a famous adult magazine. You have got to be kidding me.

Google Fail

Now this brings me to the main point. For those of you who are unaware of the way Google Search works, I’ll put it down in one paragraph. There are 2 main components to their search: Natural Language Processing and Web Page Indexing. Natural Language Processing (or NLP) is a sub-field of Artificial Intelligence which tries to give machines the ability to comprehend human-like speech and language. We think understanding human language is easy, since we are humans, but this is a very hard problem for machines due to the subtleties of human language, like ambiguity and sarcasm. Google sends your query to one of its many servers, which runs its NLP programs, tries to make heads or tails of the query, and picks out the important words on which to return results. So a query like “What is WordPress?” has the keyword “what”, indicating the type of sentence/query, and “wordpress”, which gives the servers a hint at what to narrow their search on. There are many sub-algorithms used to achieve this, like stemming, word normalization, etc., and since Google uses Statistical NLP, there is also a lot of Machine Learning involved. The second component, Web Page Indexing, is simply the way Google’s servers store various webpages so as to allow efficient retrieval of matching, relevant webpages. This is where the famous PageRank algorithm comes into play, the algorithm that launched Google from the minds of 2 Stanford students. Now, I may not be an expert in these fields, but I did attend a lecture on this technology by two experts from Microsoft Research at Microsoft, so you can take my word for now if you feel a bit lost.
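To make the query-processing step concrete (and this is a toy illustration, nothing like Google’s actual statistical pipeline), a sketch that splits a query into its question word and content keywords might look like this; the question-word and stop-word lists are made up for the example:

```python
# Toy query parser: separate the question word from the content keywords.
# The word lists below are illustrative only.
QUESTION_WORDS = {"what", "who", "where", "when", "why", "how"}
STOP_WORDS = {"is", "are", "the", "a", "an", "of", "to", "in"}

def parse_query(query):
    words = query.lower().strip("?!. ").split()
    # The question word hints at the type of query
    qtype = next((w for w in words if w in QUESTION_WORDS), None)
    # The remaining content words tell the server what to narrow its search on
    keywords = [w for w in words if w not in QUESTION_WORDS and w not in STOP_WORDS]
    return qtype, keywords

print(parse_query("What is WordPress?"))  # ('what', ['wordpress'])
```

A real engine would stem and normalize those keywords and feed them into statistical models, but the basic split shown here is the intuition.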

So we have AI doing most of the heavy lifting. Where does Security come into play? Well, the filtering of my search results can be construed as the security part. An important part of security is Access Control: governing what people can and cannot do on a system. Google filtered my results in order to protect me from data that I might have found offensive, but in doing so it also removed legitimate search results.

Now the bigger question is: did I fail to get results because the AI failed to understand my intent, or because the security measures put into place were too stringent to allow the results to come through? From the AI perspective, I would say that the system failed because of high bias and a lack of smoothing in Machine Learning. However, from a Security perspective, I can argue that Google should know better than to block all results for the word “penthouse”, and so aggressively at that. So even though they have all these PhDs and other super-smarties working for them, this sure was a dumb mistake. This leads to the debate of whether Computer Security is affecting Artificial Intelligence and vice versa, and in what ways. This is an interesting question for me, as I am personally involved in AI research but, thanks to some friends at Amrita University, have developed a keen interest in security as well. I am sure, with some probing, we can find other such examples of conflict. This little problem of Google’s will (hopefully) be resolved in the next few weeks, but the point of debate may not be resolved so easily.


Atos IT Challenge

A bunch of guys at my college and I decided to take part in the Atos IT Challenge and try to win ourselves a trip to the UK for the 2012 London Olympics, the theme being Smart Mobility.

Our college was one of the lucky ones to have been selected to participate in this contest, and I can see that the competition from my institute alone, let alone the entire world, is pretty nerve-wracking.

My team consists of Arth “Vyarth” Patel, Nimit “Lame-it” Shah, Sunny “Not Funny” Shah, Jigar “The Dafda” Dafda and Yours truly.

Having vetoed the original idea for a mobile disaster management system, I suggested a smart app that can mine and learn from the User’s history, match that with the current trends in the world and filter it with his/her geo-location to give the User some truly meaningful information that is highly relevant to their current situation, with all the data-crunching taking place on the Cloud.

This competition gives us a great opportunity to learn about a new field, Mobile Computing, a new platform, and also a chance to practice some of the AI techniques we’ve picked up in the past few months. But the biggest plan is to possibly Patent our idea. That would be TOTALLY COOL :-D.

Our idea page is here. If you liked our idea, please don’t hesitate to click the Facebook Like button.

Now let’s hope the judges deem our idea worthy of selection so we can make it into the next round and begin the development stage.


Probabilistic Analysis of Algorithms

Probability Theory has always been one area of study that has made more than its fair share of students cringe. However, its importance in the field of Computer Science, especially in Algorithm Design and Artificial Intelligence, cannot be emphasized enough.

Here’s a question I recently encountered as part of my selection process for an internship at Microsoft: consider Randomized QuickSort on a set of ‘n’ numbers. What is the probability that we always get the worst-case scenario for this algorithm?

Now you may want to try this on your own first as it is deceptively simple.

Here’s the solution: by the analysis and design of QuickSort, we get the worst-case scenario when we select either the least or the greatest element in the array of n numbers as the pivot. Thus the probability for n numbers is 2/n. Now we take the remaining n-1 numbers (Divide and Conquer), and similarly, the worst case is obtained when we again select the least or the greatest element. Thus the new probability is 2/(n-1). This continues until we have just 2 elements left, which gives a probability of 2/2.

Here’s the smart part: when we analyze Randomized QuickSort, we notice that the selection of the pivot in the recurrence is independent of the pivot selection in the outer equation. Thus, since the probabilities are independent, we can multiply all our earlier probabilities to give us [ 2/n * 2/(n-1) * 2/(n-2) *…* 2/2 ] = [ 2^(n-1)/n! ] (the exponent is n-1 as we have n-1 factors of 2 in the numerator).

And thus we have the answer!! Pretty simple, huh? All we needed to know was the worst-case criterion for QuickSort and the fact that each pivot selection is independent of the pivot selection in the recurrence.
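The derivation above is easy to sanity-check numerically: multiplying the step-by-step probabilities should match the closed form 2^(n-1)/n!. A quick sketch:

```python
from math import factorial

def worst_case_probability(n):
    # Multiply the per-step probabilities 2/n * 2/(n-1) * ... * 2/2
    p = 1.0
    for k in range(2, n + 1):
        p *= 2.0 / k
    return p

# Compare against the closed form 2^(n-1) / n!
for n in (3, 5, 10):
    closed_form = 2 ** (n - 1) / factorial(n)
    assert abs(worst_case_probability(n) - closed_form) < 1e-12
    print(n, worst_case_probability(n))
```

Even for n = 10 the probability is already down to about 0.00014, which shows just how unlikely the all-worst-case run is.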

Similarly, probability theory plays a big part in Bayes Networks, which are essentially probability distributions over a graph and a core topic in AI. Knowing about total probability, joint probability and Bayes Rule really helps in the inference of Bayes Nets. Combine that with variable independence, the explaining-away effect and enumeration, and you’ve got yourself a Smart System.
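To make the Bayes Rule point concrete, here is a tiny worked example of inference on a hypothetical two-node network (Disease -> Test); the numbers are made up purely for illustration:

```python
# Hypothetical two-node Bayes net: Disease -> Test (all numbers made up)
p_disease = 0.01            # prior P(D)
p_pos_given_d = 0.95        # P(+ | D)
p_pos_given_not_d = 0.05    # P(+ | not D)

# Total probability: P(+) = P(+|D) P(D) + P(+|not D) P(not D)
p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)

# Bayes Rule: P(D | +) = P(+|D) P(D) / P(+)
p_d_given_pos = p_pos_given_d * p_disease / p_pos
print(round(p_d_given_pos, 4))  # the posterior is only about 0.16
```

The same two rules, applied repeatedly with variable independence and enumeration, are what drive inference in larger Bayes Nets.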

Just because all this seems simple doesn’t necessarily mean probability theory is simple. It does tend to get a tad confusing at times, but with ample practice, one should find himself/herself able to predict the future!