Introduction to the precision-recall plot

Great read on Precision-Recall curves. Helped clarify some doubts in my head on this very important model metric.

Source: Introduction to the precision-recall plot

An alternate read which is slightly more technical can be found here.

Why Pokemon Go! Is A Disappointment

Close to 6 months ago, the much awaited mobile game, Pokemon Go!, was released on most major mobile platforms. Now if you’ve been playing the classic Pokemon games series from Gamefreak, all the way from the GameBoy Color to the Nintendo DS, you will know that as kids, we wanted nothing more than to run into the wilderness and encounter rare pokemon to befriend, capture and battle. Go! gives us all of that. From running around catching pokemon, to challenging Gyms and Gym leaders. Unfortunately, that is where the similarities stop.

As someone who has grown up playing all the Pokemon games on the GameBoy systems, seen the first 12 seasons of the Pokemon anime and wasted a good chunk of money on trading cards, Go! was an exciting prospect that was eagerly awaited. After a good year of patience since Niantic made its initial announcement, the app was finally on the App and Play stores, and subsequently on my phone. The app was soon the most downloaded app in the history of the Play store and Nintendo stock rose an astounding 21% (which is ridiculous since a 2% increase is considered solid business). This app was next big thing.

The first few hours of Go! were bliss! Creating an avatar, selecting my starter, just walking around the city, using the proximity indicator to locate those ultra rare Pokemon, it was the largest nostalgia rush I’d had since I graduated college. So far, so good. Then it all changed. The first peeve was the candy system. For those of you in the know, (rare) candy is something we use to level up any Pokemon quickly, but the right way to do it is to battle Pokemon and give them experience points, a.k.a. real life hard-work. The candy system in Go! is nothing of the sort. It requires us to catch more Pokemon of the same species in order to collect more candy that is specific to that Pokemon species and use that to level them up. The concept of peer battles is non-existent and that killed some of my interest in the app. It was still a great app, tonnes of fun, and gave me the required motivation to get my backside into gear and move out of the house. However, constantly catching the same Pokemon again and again, just to level up the first one of that kind you caught, soon became redundant and boring. Strike one.

Then came the next big update: the removal of the proximity indicator. The footstep indicator that displayed how far you had to walk to find a Pokemon was completely removed and thus made the game almost unplayable. When you have to rely on dumb luck to find a Pokemon you’ve really been meaning to catch, it is no longer a game. This was compounded by the fact that Go! comes with no sort of instructions or gameplay objective whatsoever. Suddenly it seemed that the good people over at Niantic had no idea what they were doing. Strike two.

The final kicker for me was the proliferation of illicit means to locate Pokemon. Pokemon trackers and websites displaying where different kinds of Pokemon were located started showing up all over the internet. These were illicit since Niantic had never released an official API and they were pretty much reverse-engineered and hacked together. Niantic made poor choices trying to curb this kind of unfairness, either not addressing the issue, or making the game completely unplayable. By this time, the number of active users of Go! had declined drastically and I was pretty sure that Go! was in doldrums.

As I sit at my desk with the DesMuMe DS emulator running in a different window with Pokemon Platinum, I am reminded of all the reasons why Pokemon is a global phenomenon. Playing the original game series by GameFreak showcases how to take a good idea and make it great through sophisticated execution, something a lot of companies have taken upon themselves in recent times. In comparison, Go! was doomed by its own popularity and the poor decisions by Niantic. As I head to face the first Gym Leader in Pokemon Platinum, I’m quite content having given up Go! and gone back to the classic series. At the end of the day, you gotta catch ’em all!

Solemn OAuth2

Considering all the recent reading I’ve been doing, I decided to tackle a new challenge and read a complete RFC for a change. A RFC (literally Request For Comments) is a document that proposes a new idea for the world to comment on and if it looks good, an organization such as the Internet Engineering Task Force, better known as the IETF, goes ahead and approves it for large scale use. Reading a RFC can be a lot of fun since some of the biggest ideas of today, such as TCP/IP, HTTP and REST started out as humble RFCs. For the same reason, I decided that my first RFC read would be about the ever-so-confusing OAuth2 protocol, described in RFC 6749.

In today’s mobile and web driven world, OAuth2 has been the mainstay that has allowed the whole world to share and access data securely. If it wasn’t for OAuth, things like Facebook login, Gmail and OneDrive would not have been possible. However, OAuth2 can be tricky to get right, not because it is a difficult protocol, but because a layman would be wrapped up more in the jargon than the actual workings. I hope to review OAuth2 and provide a simple working example for everyone to benefit from.

Let’s get some terminology out of the way. The RFC describes 4 entities. Let’s take a simple example of a Twitter (because of its ubiquity) API client to create analogies:

  • Resource Owner – This is you, the user, who stores your data (or resource) on the Twitter servers in the form of tweets, likes and other micro-blogging related data.
  • Resource Server – This is the server owned by Twitter that makes sure your data is securely and safely stored.
  • Client – This is the web or mobile app that we want to use to get access to the data from anywhere in the world. Let’s assume it is a mobile app for now.
  • Authorization Server – This is an independent server (again run by Twitter) whose job is to verify that you are the owner of the data or have been granted access by the owner to access the data. Let’s call this the Twitter Auth server.

Now that that’s out of the way, the basic flow of OAuth 2 is:

  1. The API Client asks the Twitter Server for authorization (either directly or indirectly via the Twitter Auth server) to access the desired resource.
  2. The Twitter Server gives the Client an authorization grant.
  3. The Client then presents the authorization grant to the Auth server, authenticates itself and gets an Access Token in return.
  4. The Client can now use the Access Token to gain access to the desired resource on the Twitter server and perform the function it was designed for.

There you have it! OAuth2 is that easy. The specific implementation/URLs vary, but the general flow is common.

However, we still haven’t talked about the Authorization Grant, why is it important and what that means for the kind of app you’re developing. So let’s quickly go through that:

  • Authorization Code – Here the client simply asks the resource server to redirect itself to the auth server in order to perform the authentication and authorization. Via a ‘redirect_url ‘ parameter, the auth server can send the resource owner back to the desired URL to continue the flow. This is the most common OAuth2 flow you will see.
  • Implicit – Rather than provide an authorization grant, the auth server directly provides the access token, thus greatly simplifying the flow. This is especially useful for websites using Javascript where Javascript can directly access the resource. Note that authentication is not performed since the access token is already provided.
  • Resource Owner Password Credential – This involves directly sending your username and password to the resource server as the authorization grant, over the wire. This is risky and not recommended unless the resource server is highly secured and trusted by your client, which is almost never the case.
  • Client Credentials – Again uses the client’s credentials to authenticate and authorize but only for resources that the client controls or for resources that have been predetermined, and not necessarily all available resources. This flow is not very common so I wouldn’t worry too much about it.

Well that’s it for the salient stuff of the RFC. Of course you can delve deeper into it if you are comfortable with Computer Security terms, but I hope that after reading this post, you are more comfortable understanding how OAuth2 functions and how you can leverage it to power your app.


2014 in review

The stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 2,100 times in 2014. If it were a cable car, it would take about 35 trips to carry that many people.

Click here to see the complete report.

Firefox Un-Synced

Boy am I boiling mad. You’d think after all the hard work the Open Source community (and Mozilla in particular) get from contributors, you’d have a deterministic, working service, but it seems I have been mistaken.

So here’s what happened. My work laptop got kind of messed up due to some Virtual Machine (Virtualbox, in case you’re wondering) configurations and that introduced a bug in my Network centre, thus leading me to format my OS. Now, as any good hacker, I have over a dozen tabs open in my Firefox browser with sites pointing to code, articles, documentation, youtube, you name it, so I have to make sure that I have these links saved somewhere so I can reload them after I re-install Firefox. Delicious is too slow for my liking and while Pocket would get the job done, the tediousness of cleaning up later was something I dreaded.

In comes Firefox Sync, where I already have an account thanks to years of using Firefox on a Linux machine. I could simply create a folder in my bookmarks, save all the links there, hit ‘Sync Now’ and then have all my bookmarks magically restored. This option isn’t there in Internet Explorer (atleast it doesn’t seem to be) so I had to export all my bookmarks from IE to a different partition.

Now, once I was done reinstalling Windows, I signed in, booted up IE and installed Firefox. While Firefox was installing, I decided to import my bookmarks back into IE, but lo and behold, all the bookmarks were already there. It seems Microsoft added a cloud sync to IE right under our noses, so the only extra work I had to do was delete the exported bookmarks. Microsoft 1, Mozilla 0.

Back to Firefox, I quickly sign in and activate sync. I wait with anticipated breathe as my theme and bookmarks start populating. Then I try to find my backup folder, but wait! I can’t find it anywhere!! I open the Bookmarks manager and to my dismay, I see that the bookmarks are the same as the one on my Linux machine and the sync from my work machine seems to have been overridden, hence losing my saved bookmarks. This almost made me cry, and I am sure if you were someone like me, you’d feel the same way. Thus, it turns out that due to a change in the Sync system in Firefox from an older version to a new one, there was some craziness going on and that made my sync account on my work machine invalid. Microsoft 1, Mozilla -1.

After this, the only damage control I could do was remember as many links as I could (which weren’t a lot considering how dependent I was on Sync) and then unlinking and relinking the sync accounts on both my machines so that they both would be in a stable state. Ironically, I noticed that the Firefox version on my Windows machine is greater than the one on my Linux machine. Now hopefully, Sync should work for me without glitches.

However, this was a very disappointing scenario since failures like these in today’s age is almost unheard of. I have learnt my lesson to never depend on Sync again and always export and import bookmarks explicitly. All I can do now is move on and continue with my work, treating this as a cases similar to a HDD failure. I just hope anyone else using sync doesn’t have to face this situation.

Shame on you Mozilla.

Rob Pike’s 5 Rules of Programming

  • Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you’ve proven that’s where the bottleneck is.
  • Rule 2. Measure. Don’t tune for speed until you’ve measured, and even then don’t unless one part of the code overwhelms the rest.
  • Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don’t get fancy. (Even if n does get big, use Rule 2 first.)
  • Rule 4. Fancy algorithms are buggier than simple ones, and they’re much harder to implement. Use simple algorithms as well as simple data structures.
  • Rule 5. Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

2013 in Review

The stats helper monkeys prepared a 2013 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 2,200 times in 2013. If it were a cable car, it would take about 37 trips to carry that many people.

Click here to see the complete report.