Data Visualization For Greater Good

I’ve always been interested in understanding images, from how they are formed to how we can get machines to understand them in much the same way as the average human being. It helps that our visual world is rich and that images capture so much information thanks to their high dimensionality.

The key word here is “visual”. Human beings are visual creatures. We rely on our eyes more than we would like to acknowledge for all kinds of tasks (as the many sight-related idioms attest). As I delve into more areas of Computer Science, the use of data to accomplish superhuman feats keeps growing. Deep Learning, for one, is a young field that has tapped into these enormous collections of data to produce computational models with astounding capabilities. However, to better understand what models can be trained, many researchers recommend visualizing the data first, which brings us back to the first line of this paragraph. So, to marry the two ideas above, I decided to make my life easier by exploring some data visualization and explaining why fundamental data viz is a highly useful and rewarding skill to have.

For my tooling, I use the well-designed and utilitarian Data Driven Documents, or D3, library. Of course, I could have used other libraries such as Bokeh (Python), but D3 has a lot of great features that overshadow the fact that it is written in Javascript (ughh). For one, D3 renders directly to the DOM (SVG and HTML) rather than generating intermediate code or static images, making the visualizations super interactive. Moreover, D3’s idiomatic approach is what won me over: loading, filtering and manipulating the data is done asynchronously in a systematic and declarative fashion, saving me a lot of headache. Finally, D3 has the amazing bl.ocks.org, maintained by D3 creator Mike Bostock (who did his PhD at Stanford working on data visualization, by the way), and Mr. Bostock also writes fantastic tutorials which use D3 and explain its capabilities.

Now for the actual goodness. With so many readily accessible datasets, visualizing them lets us play to our “visual creatures” persona, making it easier to understand the data and to surface its latent information. For example, I created this word cloud using D3, reading in the text of Andrej Karpathy’s excellent article on what it means to get a PhD:

[Word cloud generated from the article text]

All of a sudden, you understand the key themes of his article even if you haven’t read it, not to mention it looks super cool! This is the power of data visualization and a big part of why I believe data viz is a useful and rewarding skill. You can take a look at the code on my block, or fork it and add your own text corpus.

I did mention interactivity, didn’t I? That word cloud may not be interactive, but this plot sure is. It is nothing but a plot of the 1024-dimensional vectors generated by a Convolutional Neural Network on a geolocation dataset, where the idea is to train a model to predict where a photo was taken by looking only at the image itself. If you’re wondering how I managed to plot 1024-D vectors in 2-D space, take a look at the fabulous t-SNE algorithm and the open-source implementation of the faster Barnes-Hut t-SNE available from the inventor himself (if you look closely, you may see my name in the list of contributors 😉 ).
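
If you want to try this yourself, here is a minimal sketch of the idea using scikit-learn’s t-SNE (rather than the Barnes-Hut C++ implementation mentioned above); the features.npy file is just a placeholder for your own saved feature matrix:

# Project high-dimensional CNN features down to 2-D with t-SNE (illustrative sketch)
import numpy as np
from sklearn.manifold import TSNE

features = np.load("features.npy")   # placeholder: a (num_images, 1024) feature matrix
embedding = TSNE(n_components=2, perplexity=30).fit_transform(features)
# embedding has shape (num_images, 2); hand these coordinates to D3 (or matplotlib) to plot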

Another cool example is the choropleth, or heat map as it is commonly known. Mike Bostock has some amazing visualizations using choropleths on statistics as simple as census data, which I highly recommend taking a look at, amidst his other gorgeous visualizations. I personally plan to use choropleths to visualize some of the geolocation datasets I am playing around with.

A good visualization to start off with, if data viz is new to you, is stock market data. You can pick up the data from Yahoo Finance and then, with the power of D3, quickly see how a particular stock has performed over weeks, months, or even years. Kind of nice compared to staring at all those floating point numbers, and without too much trouble either.
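
Even before reaching for D3, you can get a feel for such data with a few lines of Python. This is just a sketch, assuming you have downloaded a CSV with Date and Close columns (the file name is a placeholder):

# Quick look at a stock's closing price over time (pandas + matplotlib sketch)
import pandas as pd
import matplotlib.pyplot as plt

prices = pd.read_csv("AAPL.csv", parse_dates=["Date"])   # placeholder CSV from Yahoo Finance
prices.plot(x="Date", y="Close", title="Closing price over time")
plt.show()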

Overall, I hope that via these simple examples I have demonstrated the inherent power and usefulness of data visualization, and how it can help tackle some of the more challenging problems our society faces. Now go out there and visualize some greater good!


The Right Way To Install Postgres

Ughh, my Ubuntu distro is acting up again, and this time it is for something as stupid as being unable to connect to the PostgreSQL server on my machine. I mean, how hard is it for the OS to report a conflict between versions rather than just give up with a “directory not found” error when it fails to get a response from a Unix socket? Apparently, with the mess that is apt-get and its lethargic rate of package updates, not to mention Canonical’s refusal to support forward compatibility of packages, very hard!

As Prof. Jennifer Widom has said, databases are ubiquitous, and Postgres is an industrial-strength relational DBMS claiming to be the most advanced of its type, hence my preference for it (not going to open the NoSQL can of worms today, sorry). However, as we are well aware, advanced and user-friendly are seldom synonymous, choosing to be strange bedfellows more often. Installing Postgres from binaries is a great idea if you work at a huge corporation and update your systems once every 5 years. However, us little folk wish to receive the latest and greatest as best we can, hence the reliance on package managers such as apt-get. So what’s the catch? There has to be one, since I am making the effort to write this. Well, Postgres 9.6.2 was released last week and apt-get still only shows me 9.5. Two whole patch releases later and apt-get is still coughing up furballs when I look for 9.6. So before I pull out all my hair, let’s get Postgres installed the right way, with Megadeth’s Dystopia aptly playing in the background.

First things first: how do we get set up so that we can easily upgrade Postgres no matter which distro of Linux we use? Here’s where I introduce you to a handy little tool called Linuxbrew. Linuxbrew is a package manager inspired by OSX’s extremely popular Homebrew package manager. What’s great about Linuxbrew is that it uses GitHub repos as its source rather than inaccessible private channels, which ensures we always have access to the latest fixes and updates. Installing Linuxbrew is fairly straightforward:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Linuxbrew/install/master/install)"
PATH="$HOME/.linuxbrew/bin:$PATH"

Don’t forget to add that PATH update to your ~/.(zsh|bash)rc file so that you have access to the brew command. Once you have that done, the next step is super easy:

brew install postgres

This should take a while, since it will fetch the latest tarball, untar it, build it, and install Postgres for you. On my dated machine, it took around 3.5 minutes, enough time for me to get a nice hot cup of mocha.

To verify that the Postgres server is up and running, type psql into your command line. If this opens up the SQL prompt, you’re good to go: stop reading here and go build something awesome. If not, we have a few more steps to go. We have installed Postgres with the magic of Linuxbrew, but Linuxbrew isn’t a service manager, so it doesn’t know that it needs to start the server. Let’s do that ourselves. Type this command into your terminal:

pg_ctl start -D $HOME/.linuxbrew/var/postgres

Some online forums such as StackOverflow may suggest adding the -l flag to define a logfile. I am omitting that since I want Postgres to manage its own logs rather than scatter random logfiles throughout my filesystem. You should see the message server starting followed by a bunch of other status messages. Just hit enter to get your terminal back if you don’t see the prompt already. Now if you type in psql, you should find that your Postgres SQL prompt works perfectly.

As you have seen, with literally 4 lines of code (3 if you’re one of the lucky few), you’ve got a working installation of a super-advanced RDBMS, and one that is easily upgradeable at that. Whenever you wish to update Postgres, just type:

brew upgrade postgres

That’s it! Hope you enjoy using Postgres along with all your sweet NoSQL datastores. Looking forward to seeing what you build.

Eviva!

Solemn OAuth2

Considering all the recent reading I’ve been doing, I decided to tackle a new challenge and read a complete RFC for a change. An RFC (literally Request For Comments) is a document that proposes a new idea for the world to comment on; if it looks good, an organization such as the Internet Engineering Task Force, better known as the IETF, goes ahead and approves it for large-scale use. Reading an RFC can be a lot of fun, since some of the biggest ideas of today, such as TCP/IP and HTTP, started out as humble RFCs. For the same reason, I decided that my first RFC read would be about the ever-so-confusing OAuth2 protocol, described in RFC 6749.

In today’s mobile- and web-driven world, OAuth2 has been the mainstay that allows the whole world to share and access data securely. If it weren’t for OAuth, things like Facebook Login, or third-party apps accessing your Gmail and OneDrive data, would not have been possible. However, OAuth2 can be tricky to get right, not because it is a difficult protocol, but because a layman gets wrapped up more in the jargon than in the actual workings. I hope to review OAuth2 and provide a simple working example for everyone to benefit from.

Let’s get some terminology out of the way. The RFC describes 4 roles. Let’s take a simple example of a Twitter API client (Twitter because of its ubiquity) to create analogies:

  • Resource Owner – This is you, the user, who stores your data (or resource) on the Twitter servers in the form of tweets, likes and other micro-blogging related data.
  • Resource Server – This is the server owned by Twitter that makes sure your data is securely and safely stored.
  • Client – This is the web or mobile app that we want to use to get access to the data from anywhere in the world. Let’s assume it is a mobile app for now.
  • Authorization Server – This is an independent server (again run by Twitter) whose job is to verify that you are the owner of the data or have been granted access by the owner to access the data. Let’s call this the Twitter Auth server.

Now that that’s out of the way, the basic flow of OAuth 2 is:

  1. The API Client asks you, the resource owner, for authorization to access the desired resource (either directly, or preferably indirectly via the Twitter Auth server).
  2. You (or the Twitter Auth server acting on your behalf) give the Client an authorization grant.
  3. The Client then presents the authorization grant to the Auth server, authenticates itself and gets an Access Token in return.
  4. The Client can now use the Access Token to gain access to the desired resource on the Twitter server and perform the function it was designed for.

There you have it! OAuth2 is that easy. The specific implementation/URLs vary, but the general flow is common.
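
To make this concrete, here is a minimal Python sketch of steps 3 and 4 using the requests library. The URLs, client credentials and resource endpoint below are placeholders for illustration, not Twitter’s real endpoints:

# Step 3: exchange the authorization grant (an authorization code here) for an Access Token
import requests

token_resp = requests.post(
    "https://auth.example.com/oauth/token",   # placeholder token endpoint on the Auth server
    data={
        "grant_type": "authorization_code",
        "code": "CODE_FROM_STEP_2",
        "redirect_uri": "https://myapp.example.com/callback",
        "client_id": "MY_CLIENT_ID",          # the client authenticates itself here
        "client_secret": "MY_CLIENT_SECRET",
    },
)
access_token = token_resp.json()["access_token"]

# Step 4: present the Access Token when requesting the protected resource
resource = requests.get(
    "https://api.example.com/me/tweets",      # placeholder resource URL
    headers={"Authorization": "Bearer " + access_token},
)
print(resource.json())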

However, we still haven’t talked about the Authorization Grant: why it is important and what it means for the kind of app you’re developing. So let’s quickly go through that:

  • Authorization Code – Here the client redirects the resource owner (through the browser) to the auth server, which performs the authentication and authorization. Via a ‘redirect_uri’ parameter, the auth server can then send the resource owner back to the desired URL with an authorization code to continue the flow. This is the most common OAuth2 flow you will see.
  • Implicit – Rather than issuing an authorization code, the auth server directly provides the access token, greatly simplifying the flow. This is especially useful for in-browser Javascript apps, where the script can then access the resource directly. Note that the client is not authenticated, since the access token is handed out straight away.
  • Resource Owner Password Credentials – This involves handing your username and password directly to the client, which sends them over the wire as the authorization grant to obtain an access token. This is risky and not recommended unless the client is highly trusted by the resource owner (for example, the service’s own official app), which is almost never the case for third-party apps.
  • Client Credentials – Here the client authenticates with its own credentials, but only for resources that the client itself controls or that have been arranged beforehand, not for arbitrary user resources. This flow is not very common, so I wouldn’t worry too much about it.

Well, that’s it for the salient parts of the RFC. Of course, you can delve deeper into it if you are comfortable with computer security terminology, but I hope that after reading this post you are more comfortable with how OAuth2 functions and how you can leverage it to power your app.

Eviva!

Definition Of Geek

Computing today involves many different kinds of paradigms and many different kinds of talents. I can speak only to people who happen to have grown up with the strange kind of “brain-organization” that I seem to have somehow acquired; for lack of a better word, let me simply say that I’m a “geek.” I haven’t got a good definition or a good litmus test for geekhood, but I definitely know it when I see it; and I see it in about 2% of the world’s population. The main characteristic is an ability to understand many levels of abstraction simultaneously, and to shift effortlessly between in-the-large and in-the-small. A geek knows that, to achieve a certain high-level goal, you need to add one to a certain counter at a certain time.

Dear young person, if you are a geek, the world needs you, and you will never run out of opportunities to apply your talents. I urge you to take a close look at “literate programming”; it’s a way to write programs that makes me incredibly happy, several times each week. My book The Stanford GraphBase contains several dozen short examples of programs written in that style, intended to be read by humans first and machines next.

Dear young person, if you are not a geek, please ask somebody else for advice.

– Donald Ervin Knuth.

Amazingly inspirational words from the legendary Prof. Donald Knuth, featured in People of ACM, dated Thursday, 5th June, 2014.

2 Month Notice

Well, technically it’s been more than 2 months since I joined my new workplace, but I guess it is high time I gave an update as to what I am up to.

As expected from working at one of the top tech companies in the world, there is a lot of work (and fun), but there is great potential for learning as well. And man, have I learned a lot!! In my first month here, I worked on a Windows 8 Modern UI app and got a hands-on understanding of how a Modern app is architected. Not only that, but I also had to integrate the app with a web service using Javascript (which, by the way, is my weakest programming language), and after a lot of fumbling in the dark, I can now bend JS to my absolute will (evil laughter)!!

In my free time, I got together with a senior of mine, Prakhar Gupta, who works at the same company, albeit in the Bangalore office, and quickly coded up a Windows Phone 8 app. The app basically acts as a birthday reminder for all those who, like me, are poor at remembering dates. Expect to see the app in the Windows Marketplace soon! Along the way, I have also been drawn to Cloud Computing, thanks to the amazing Windows Azure (they have tutorials on creating Android apps with an Azure back-end), and I hope to soon gain certifications in Cloud Computing. This, along with some other projects that I really can’t talk about (Non-Disclosure Agreement, you see), has made my life coding bliss!!

Oh, and did I mention that I have also started development on the Leap Motion? Expect to see more on that and on Kinect development in my next few posts. That is the practical standpoint. From a knowledge standpoint, I am learning every day. I have learned about good design and best practices while coding in C#, and I am also re-exploring functional programming with F#. SQL and database querying now come more naturally than ever, and I have started looking into query execution plans to further optimize my SQL code. I have also been trying to read up on the Common Language Runtime (CLR), which so far looks great in the way it handles managed modules and the variety of support it provides for different languages. But with all the work and coding going on, I am having a hard time carving out time to read more. Will have to stretch more on the reading front!

In the pipeline are some more apps (maybe on Android?) and reading papers and texts on NLP (for WishWasher) and Computer Vision (which is still my favoured field). I do seem to be loaded with work, but hopefully, I will keep inventing things and inspiring you to try new things. Keep an eye out for more on this domain.

Eviva!

Birthday Wish NLP Hack

Well, it was my 22nd birthday 11 days back, and while the real world was quite uneventful, I managed to create a small stir in the virtual world.

For this birthday, I decided to do something cool, and what is cooler (and a greater sign of laziness) than an AI program that replies to all the birthday wishes on my Facebook wall? This was definitely cool and quite possible given a basic understanding of HTTP and some Artificial Intelligence. After experimenting for 2 days with the Facebook Graph API and FQL, I had all the know-how needed to create my little bot.

Note: This is from a guy who has never taken a single course on Natural Language Processing and who has next to zero experience writing NLP programs. Basically, I am a complete NLP noob, and this hack is something I am really proud of.

But one major problem still remained: how to create an NLP classifier that would classify wall posts as birthday wishes? I tried looking for a suitable dataset so I could build either a Support Vector Machine or a Naive Bayes classifier, but all my search attempts were futile. Even looking for related papers and publications was in vain. That’s when I decided to come up with a little hack of my own. I had read Peter Norvig’s amazing essay on how to write a toy spelling corrector and seen how he used his intuition to build something useful without the ideal training dataset. I decided to follow my intuition as well, and since my code was in Python (a language well suited for NLP tasks), I started off promptly. Here is the idea behind the code I came up with.

The first thing I do is create a list of keywords one would normally find in a birthday wish, things like “happy”, “birthday” and “returns”. My main intuition was that when wishing someone, people will use at least 2 such words even in the simplest wish, e.g. “Happy Birthday”, so any message containing just the word “happy” can be safely ignored. Thus I simply have to check whether at least 2 of these keywords appear in the message.

What I do first is remove all the punctuation from the message and convert all the characters to lower case, to avoid string mismatches due to case sensitivity. Then I split the message into a list of words, with whitespace (the default) as the delimiter. This is done by:

s = ''.join(c for c in message if c not in string.punctuation and c in string.printable)
t = s.lower().split()

However, I later realized that there exist people even lazier than me, who simply use wishes like “HBD”. This completely throws off my at-least-2-words theory, so I added a simple hack to check for these abbreviations and put their expanded forms into the message. I created a dictionary to hold the expansions, and I simply check whether any abbreviation is present. If it is, I add its expanded form to a new list that also contains all the other, non-abbreviated message words verbatim [lines 15-20]. Since I never check the locations of keywords, where I add the expanded forms is irrelevant.

The next part is simple, bordering on trivial. I iterate through the list of words in my message, check whether each one is a keyword, and maintain a counter telling me how many keywords are present. Python made this much, much easier than C++ or Java would have.

But alas, another problem: some people have another bad habit of adding extra characters, e.g. “birthdayyyy” instead of “birthday”, and this again was throwing my classifier off. Yet another quick fix: for each word I examine, I go through all the keywords and check whether the keyword appears as a substring of the word. This is done easily with Python strings using the count method [lines 31-34].

Finally, I simply apply my at-least-2-words theory: I check whether my counter has a value of 2 or more and return True if so, else False, thus completing a 2-class classifier in a mere 40 lines of code. In a true sense this is a hack, and I didn’t expect it to perform very well, but when put to work it did a splendid job and flummoxed a lot of my friends who tried posting messages that they thought could fool it. Safe to say, I had the last laugh.
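
For reference, here is a minimal reconstruction of the whole classifier based on the description above (not the original script verbatim; the keyword list and abbreviation map are only illustrative):

import string

# Keywords one would normally find in a birthday wish (illustrative list)
KEYWORDS = ["happy", "birthday", "returns", "wishes"]

# Expansions for lazy abbreviations like "HBD" (illustrative map)
ABBREVIATIONS = {"hbd": "happy birthday"}

def is_birthday_wish(message):
    # Strip punctuation and lower-case everything to avoid case-sensitive mismatches
    s = ''.join(c for c in message if c not in string.punctuation and c in string.printable)
    words = s.lower().split()

    # Expand abbreviations; position doesn't matter since keyword locations are never checked
    expanded = []
    for w in words:
        if w in ABBREVIATIONS:
            expanded.extend(ABBREVIATIONS[w].split())
        else:
            expanded.append(w)

    # Count keyword hits, using substring matching so "birthdayyyy" still counts
    count = 0
    for w in expanded:
        for k in KEYWORDS:
            if w.count(k) > 0:
                count += 1
                break

    # At-least-2-words theory: two or more keyword hits means it is a birthday wish
    return count >= 2

print(is_birthday_wish("Happy birthdayyyy dude!"))  # True
print(is_birthday_wish("HBD!"))                     # True
print(is_birthday_wish("Have a happy day"))         # False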

Hope you enjoyed reading this and now have enough intuition to create simple classifiers on your own. If you find any bugs or can provide me with improvements, please mention them in the comments.

Eviva!

Watermarking – Truly Transparent Text

Well, I just finished writing up a new software project. It wasn’t anything really difficult, just a tool to help people watermark multiple images at once, made at the behest of some photographer friends of mine due to the lack of such a tool on the net. While the tool itself was pretty straightforward (and a great exercise in Software Engineering), what was really interesting was the way the watermark is created, which required me to make the text transparent to a certain degree. Of course, I had to search for the right way to do it, but again nothing straightforward cropped up (this is becoming really common now), and while I did find some useful code snippets, they did not do exactly what I wanted. Thankfully, on reading the code, I was able to gather enough information about how a basic watermarking algorithm works, as well as how to manipulate the alpha value of an image to achieve the transparency.

First, some image basics. Every digital image you see is made up of pixels (short for picture elements), and each pixel has 4 values: 3 values that specify how much Red, Green and Blue should be present in that pixel, and a 4th value, alpha, which determines the opacity/transparency of that pixel. RGBA in total. The alpha value is key here, and once I understood how to manipulate it, creating the image processing module was a cinch. For this example, I used Python’s PIL library.
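
As a tiny illustration of the RGBA idea in PIL (the values here are made up):

from PIL import Image

# A 1x1 image whose single pixel is red at roughly 50% opacity (alpha = 128)
px = Image.new("RGBA", (1, 1), (255, 0, 0, 128))
print(px.getpixel((0, 0)))   # prints (255, 0, 0, 128)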

What I first did was declare the colour and transparency of the text to be used as the watermark. This was as simple as specifying the tuple (0, 0, 0, trans), where trans is my transparency value. Next, I create a completely transparent white image the same size as my input image. By specifying the RGB values as 255 each, the image is plain white, but by specifying alpha as 0, the image is truly transparent. Now comes the fun part: PIL has an ImageDraw module which lets you draw text or other shapes onto an image through a Draw object bound to that image. So I get a Draw object on my transparent image and use its .text method to draw the specified text at a particular position. This gives me a transparent image (a canvas, if you will) with just some text on it and nothing else visible. Remember, the image is transparent, so you should not see any white or any other colour, and the text itself is only as opaque as specified by the trans variable. There is a slight catch, though: the alpha value gets ignored when the image is displayed or pasted, which is easily solved with something called masking, as described in the next step. For now, we can still think of our image as truly transparent.

Finally, I use the .paste method of my input image to paste the transparent watermark image onto it. The most important thing in the paste call is the 3rd argument, the mask. The mask specifies which parts of the image being pasted should actually be pasted. The .paste method can use the alpha channel of an image as the mask, and since everything but the text has an alpha value of 0, only the text ends up pasted onto my input image. The result is simply my input image with some text on it, without a whitish box that would ruin your carefully taken photo. And since both images are the same size, the position you chose for the watermark is preserved when pasting.

Here’s the code:


from PIL import Image, ImageFont, ImageDraw

def watermark(img_file, text, wfont, text_pos, trans):

    """
    Watermarks the specified image with the text in the specified font and on the specified point.
    """

    # Open the image file
    img = Image.open(img_file)

    # The Text to be written will be black with trans as the alpha value
    t_color = (0, 0, 0, trans)

    # Specify alpha as 0 to get transparent image on which to write
    watermark = Image.new("RGBA", img.size, (255, 255, 255, 0))

    # Get a Draw object on my transparent image from the ImageDraw module
    waterdraw = ImageDraw.Draw(watermark, "RGBA")

    # Draw the text onto the transparent image
    waterdraw.text(text_pos, text, fill=t_color, font=wfont)

    # Paste the watermark image onto the input image img, using watermark image as the mask
    img.paste(watermark, None, watermark)

    return img
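
And here is a quick usage sketch, continuing from the code above; the font path, file names and transparency value are just placeholders:

# Load a font, watermark an image and save the result (illustrative values)
wfont = ImageFont.truetype("arial.ttf", 36)
marked = watermark("photo.jpg", "(c) My Photography", wfont, (20, 20), 128)
marked.save("photo_watermarked.png")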

So you can see that the code is fairly straightforward. Remember, this is just a demo to give you a basic idea of how to achieve watermarking. Similar libraries exist in almost every programming language, so all you have to do is apply the concepts. And then you too can get an amazing watermark like this:

[Watermarked image]

An image that I watermarked using my own program.

Using the above ideas and techniques, I was able to code up my watermarking tool in about 14 days, with a fun GUI and efficient processing. I have open-sourced it on GitHub, and hopefully you will become a contributor to it.

Eviva!