Retirement

After much deliberation, I have decided that I simply don’t have the time to update my blog on a fortnightly basis as I had originally planned.

Please don’t get me wrong, I have a tonne of content to write, but actually finding spare time, when I am not cognitively overloaded, to etch out my thoughts on this page is just not happening. This, along with the fact that WordPress’ support for Markdown is rudimentary at best, has been the motivating factor.

Thus, I have decided to retire this blog and move the content to my website (varunagrawal.github.io), where I will continue to sporadically write about my daily crazies. This blog will no longer be updated, and any new content will appear only on my website.


Arch Linux – First Impressions

So thanks to the recent Creators Update from Microsoft, my Ubuntu partition became inaccessible due to what I believe was a dirty bit being set. After recovering all my data and ranting about the ordeal online, I decided to reinstall Linux so I could get back to work. That’s when it hit me that I finally had the opportunity to really get Arch Linux working on my laptop and make it my primary distro moving forward. After silently thanking Microsoft’s bad update, I got to work.

Now, fair warning: installing Arch is crazy hard, simply because you have to do everything yourself. All the way from partitioning your device to mounting your filesystem to configuring the bootloader, it’s a lot of work and very overwhelming at first. Thankfully, the Arch Wiki is one of the greatest resources for all things Linux on the internet today.

After 4 painstaking hours of going through mounds of documentation and double-checking various things to make sure I wasn’t making mistakes, I finally had Arch installed. I did document each step to streamline the process, so hopefully it will take far less time the next time I attempt this herculean labour. I now had a fully functioning Arch Linux desktop.

Finally, I began installing the necessary packages to set up my development environment. Let me go ahead and say this: pacman is one of the best package managers out there! Every package installation was smooth, the output messages were clear and meaningful, and I could see the benefit of the Arch philosophy since I always had the latest version of everything.

Software that did not have dedicated packages could be easily installed using the Arch User Repository (AUR) and the makepkg command, which simply amounts to git-cloning a repo and running a make-like command on it to install the software. Indeed, this is how I installed a lot of packages such as VS Code.

The best part about Arch though was the speed. My previous Ubuntu installation would often lag on the UI end, but even with the GNOME desktop, Arch was smooth as butter.

Battery life also improved significantly. It helped that Arch was so trimmed down and didn’t waste time installing useless junk I wouldn’t use.

The main downside to Arch was its lack of mainstream support. ROS isn’t as easy to install, RVM was considerably broken (its own fault, for not supporting OpenSSL 1.0), and lots of the latest packages were simply missing. While I hope Linuxbrew gains more momentum as the de facto way to install packages, my productivity did take a bit of a hit.

That said, Arch Linux was well worth the patience and burden to get installed. Given the option, I would always choose Arch and encourage others to do the same. Hopefully, a time will come when Arch does become mainstream and the Arch way continues to prevail without facing the same fate as Canonical.

Recovering URLs from Chrome, and other data

Last evening, while working on some C# code on Windows, I saw a notification for the Windows Creators Update. Well, this was cool since a lot of people have been raving about it, so I happily installed the update, hoping for the latest and greatest.

However, later that evening when I booted into my Linux partition, I was greeted with Emergency Mode. I assumed that the Windows update must have reset some of my old settings, which Windows is notoriously known to do, so I booted back into Windows and updated the settings again, then booted from a Live-USB to run fdisk and clean any dirty bits on my partitions, but to no avail.
Sadly, after messing around with various solutions from the internet, I had to concede defeat and accept the fact that I could no longer boot into my Linux OS. After some strong cussing at Microsoft and rage yelling, I decided to pick up the pieces and recover as much data as I could.

Booting Into Your System

One of the reasons I love working with Linux is the fact that I can use Live-USBs. This is a whole operating system that boots from your USB and runs off your RAM without actually installing the OS on your machine, i.e. it does not touch secondary memory (HDD/SSD/etc.). This is extremely useful since you can boot into your system and go rummaging through your old Linux filesystem without affecting it. Since it runs off the RAM, it’s a bit slower than a native installation, but hey, this is only for recovery purposes.

The easiest way is to just create an Ubuntu Live-USB on Windows. The distro and version don’t matter since this is only for recovery, but I generally create a Live-USB of the distro I would end up installing again in order to save time.

Make sure you’re booting off the USB, and when you come to the GRUB screen, select Try Ubuntu without installing. This will boot you into Ubuntu, where you can use the File Explorer GUI to select the old partition (which I will henceforth refer to as the partition) and mount it. From there, I could explore the file system as normal. I had to chown -R $USER some directories in order to be able to access them, but I managed to recover all the data and code from my system onto an external HDD.
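As an aside, chown -R is just a recursive ownership change. If you’re curious what it actually does, here’s a rough Python equivalent (a hypothetical helper of my own, to be run as root from the live session with your own uid/gid):

```python
import os

def take_ownership(root, uid, gid):
    """Recursively hand ownership of a directory tree to the given user,
    i.e. what `sudo chown -R $USER <dir>` does under the hood."""
    for dirpath, dirnames, filenames in os.walk(root):
        os.chown(dirpath, uid, gid)
        for name in filenames:
            os.chown(os.path.join(dirpath, name), uid, gid)
```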

Now the only thing remaining was recovering my Chrome tabs from my last web browsing session. This was non-trivial since I now had to find out how Chrome stored my previous session tabs and I also had to go about understanding OneTab’s structure to recover the URLs saved there. If you don’t use OneTab, you should – it’s great!

Thankfully, Stack Overflow was once again a saving grace and I was able to get straightforward solutions without too much trouble.

Recovering Session Tabs

Head over to ~/.config/google-chrome/Default on the partition. This will be under the /media directory and will mostly be denoted by custom GUIDs, which you can identify by looking at the folder’s properties from the GUI. Once you’re in the Default directory, all you need to do is copy over two files: Current Session and Current Tabs. These are binary files, so you can’t just read them in a text editor, despite the Unix philosophy of everything is a file.

Now, to recover your session tabs, you need a Chrome browser on another machine. First, save all the tabs on that Chrome browser (OneTab is great for this), export those tabs to a text file, then shut down the Chrome browser and make sure you kill all the Chrome tasks. Then go to the same location on your alternate machine. The location for different OSes can be found here. Replace the existing Current Session and Current Tabs files with the ones you retrieved from the partition. Start up Chrome and you should see all your “supposedly” lost tabs in front of you in all their glory.
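The copy-and-replace step can be sketched in a few lines of Python (the paths and function name here are mine, purely illustrative; point src_profile at the Default directory on the mounted partition and dst_profile at the one on your alternate machine):

```python
import os
import shutil

def recover_session(src_profile, dst_profile):
    """Copy Chrome's session files from the old (mounted) Default profile
    into the new one. Chrome must be fully closed on the destination first."""
    os.makedirs(dst_profile, exist_ok=True)
    for name in ("Current Session", "Current Tabs"):
        shutil.copy2(os.path.join(src_profile, name),
                     os.path.join(dst_profile, name))
```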

In case it didn’t work, just make sure you close Chrome completely, including any background processes, and try the above process again. Since you have all the old as well as new URLs backed up, there shouldn’t be any cause for concern.

Recovering OneTab URLs

Now comes the other bit: recovering your saved URLs from the OneTab extension. Luckily, this is easier than it sounds. First off, ensure you back up your currently saved URLs in OneTab. This can be done by exporting them and saving them to a text file. Time to recover data!

In the same config directory ~/.config/google-chrome/Default there is a directory called Local Storage. Inside that directory you’ll see a bunch of files with the extensions .localstorage and .localstorage-journal. Each pair of files corresponds to an extension installed in Chrome and holds the data storage for that extension. The extension corresponding to each file can be identified by the unique ID of the extension, so for OneTab, you’re looking for the following files:

  • chrome-extension_chphlpgkkbolifaimnlloiipkdnihall_0.localstorage-journal
  • chrome-extension_chphlpgkkbolifaimnlloiipkdnihall_0.localstorage

Copy these files over to your alternate system, into the same Local Storage directory. Restart Chrome completely, run the OneTab extension, and voila! You should now see all the tabs saved in OneTab from your old system. You can import your previously exported tabs into OneTab again, and you should have a combination of both sets of tabs.
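Since the files are identified by the extension ID, this step is easy to script. A small sketch (the function name is mine; point both arguments at the respective Local Storage directories):

```python
import glob
import os
import shutil

ONETAB_ID = "chphlpgkkbolifaimnlloiipkdnihall"  # OneTab's Chrome extension ID

def recover_onetab(src_storage, dst_storage):
    """Copy OneTab's .localstorage files from the old profile's
    'Local Storage' directory into the new one. Returns the copied names."""
    pattern = os.path.join(src_storage, "chrome-extension_%s_*" % ONETAB_ID)
    copied = []
    for path in sorted(glob.glob(pattern)):
        shutil.copy2(path, dst_storage)
        copied.append(os.path.basename(path))
    return copied
```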

Conclusion

Hopefully, at this point you should be able to resume working on your machine with no interruptions. With a little bit of understanding of how Chrome saves its data, we were able to easily recover previous sessions and extension data. Not too bad, huh?

References

  • Session Tab recovery courtesy of this SO thread.
  • OneTab URL recovery from here.

Introduction to the precision-recall plot

Great read on Precision-Recall curves. Helped clarify some doubts in my head on this very important model metric.

Source: Introduction to the precision-recall plot

An alternate read which is slightly more technical can be found here.

xkcd Font And The Fine Art of Kerning

I am a huge fan of the webcomic xkcd, not only because of its intelligent take on many scientific, mathematical and physical phenomena in a subtly humorous manner, but also due to its simple style, where all the characters are stick figures. This helps one focus on the message of the comic, as well as helping its creator, the totally awesome Randall Munroe, publish new strips very frequently, satiating our need to not only procrastinate, but also not feel too bad about it. Also, if you don’t know who Randall Munroe is, google him right now. I mean now! Yes, stop reading, go google him, and come back.

Back from our digression, I’ve noticed that the font used in xkcd is very curious. After some simple searching around, I found out that the font is Randall’s very own handwriting. What is cool about this font is the kerning involved. Kerning is nothing but the spacing between characters in text, and it matters more than you would think. For example, take a look at the strip below, courtesy of Mr. Munroe:

kerning

Being a perfectionist is the worst thing on this planet: ever since I discovered kerning, I see it everywhere. Moreover, the fact that Randall’s handwriting follows very good kerning on the webcomic is a minor miracle, in my humble opinion.

Of course, the xkcd font was enticing enough that I started to have very strong desires to use it on my system, and lo and behold, the font was actually available online in the ipython repository. So now I shall describe how to get this amazing font onto your own system to use with everything you wish. The process below is specifically for Linux, though I imagine it should be far easier to install the font on Windows and macOS since they have much better font managers.

First things first, get the font! You can simply go to the repository and either download the single font xkcd-Regular.otf, or download the whole repo as a zip file and extract it. The other file, xkcd.otf, isn’t important, so you can just leave it be.

The next step is to create a directory .fonts in your home folder, so the location of the directory is ~/.fonts. Now copy the font file to this directory: Linux is aware of this directory as a possible source of fonts, even though it may not have existed on your system a couple of minutes ago. Oh well.

Now the final step is to let Linux know that you want this font to be available system-wide, so we run fc-cache -fv to rebuild the entire font cache and make the xkcd font available to every application on your system. To confirm, you can open up Font Viewer and check that the font is installed. The nice thing about the name xkcd is that it is unique, and it will mostly be the last font on your system.
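If you’d rather script the whole thing, the three steps above condense into a small Python helper (a sketch of my own; install_font is a hypothetical name, and it shells out to the real fc-cache command):

```python
import os
import shutil
import subprocess

def install_font(font_path):
    """Copy a font file into ~/.fonts and rebuild the font cache."""
    fonts_dir = os.path.expanduser("~/.fonts")
    os.makedirs(fonts_dir, exist_ok=True)   # create ~/.fonts if missing
    shutil.copy(font_path, fonts_dir)
    try:
        # fc-cache ships with fontconfig on most distros
        subprocess.run(["fc-cache", "-f"], check=False)
    except FileNotFoundError:
        print("fc-cache not found; rebuild the font cache manually")
```

Call it as install_font("xkcd-Regular.otf") from wherever you extracted the font.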

Time for an application. Let’s make VS Code super cool by making xkcd the default font. If you don’t use VS Code, you are seriously missing out on one of the best cross-platform editors available right now. Open up the settings page in VS Code and add this line to your user settings section:

"editor.fontFamily": "xkcd, 'Droid Sans Mono', 'Courier New', monospace, 'Droid Sans Fallback'",

As you can see, xkcd is the first font in the list, so if it is correctly installed, VS Code will use it. Save the file and voila! Your text should now look like this

xkcd

Pretty cool, huh? Hopefully you enjoy your fancy new font and read more of xkcd.

Eviva!

Creating Command Line Tools with Python

After spending a considerable amount of time on a computer as a developer, the GUI starts to seem amazingly slow and you realize just how awesome command line tools can be. Especially if you follow the Unix philosophy of “one tool for one task”, you can chain together multiple in-built tools and quickly accomplish most tasks without a single line of code.

However, what about when the tool you need doesn’t exist? Wouldn’t it be great to just create it? In this post, I’d like to show you how, using Python. My reasons for using Python instead of the more native C/C++ are simple: code that even someone new to programming can read, fast prototyping, a rich set of tools, and general ease of use. You’re of course welcome to use other languages such as Ruby (I’ve seen a lot of great tools written in Ruby), but then you wouldn’t be reading this post, would you? Without further ado, let’s begin!

Project Setup

I faced a problem a couple of weeks ago where I needed a simple command to nuke a directory in order to rebuild some binaries and unfortunately no tool existed for this at the time. Hence I created nuke, a convenient command line tool that does exactly what it says, nuke directories.

To start, we create a project directory called nuke-tool. Inside nuke-tool, we create a directory called nuke, and inside that directory, two files: nuke.py, which will house all our main code logic, and __init__.py, so that Python understands this is a package called nuke.

In the root of the project directory, we should also create a file test_nuke.py to create tests for our code. Don’t want to accidentally nuke something else now, do we? We also create a file called setup.py to aid pip in installing nuke. We’ll flesh these out one at a time.
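Putting it together, the layout described above looks like this:

```
nuke-tool/
├── nuke/
│   ├── __init__.py
│   └── nuke.py
├── test_nuke.py
└── setup.py
```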

I did mention a rich set of libraries to help us out in this process. I personally use argparse for argument parsing, and clint, from the awesome Kenneth Reitz, for output text formatting and confirmation prompts (simple Yes/No prompts). Later on, you’ll see code snippets showing these libraries in action.

Nuke ’em

In any command line tool, the first thing we need is a way to parse command line arguments, i.e. the values you pass to a command, separated by spaces. For example, grep takes two arguments, a pattern and a file name: grep main nuke.py.

In python, we can easily use the argparse module to help us create a simple argument parser. I wanted to provide two options to the user, an argument specifying the directory to nuke (with the default being the current directory), and a flag -y to override the confirmation prompt. The code to do this is:

import argparse
import os
.
.
.
def _argparse():
    parser = argparse.ArgumentParser("nuke")
    parser.add_argument("directory", nargs='?', default=os.getcwd(),
                        help="Directory to nuke! Default is current directory")
    parser.add_argument("-y", help="Confirm nuking", action="store_true")
    args = parser.parse_args()
    return args
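To round out the picture, here is a minimal sketch of what the nuking logic itself might look like. This is hypothetical, not the actual nuke source: the real tool uses clint for its confirmation prompt, which a plain input() stands in for here, and nuke_directory is a name of my own choosing.

```python
import os
import shutil

def nuke_directory(directory):
    """Delete everything inside `directory`, but keep the directory itself."""
    for entry in os.listdir(directory):
        path = os.path.join(directory, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)   # recursively remove subdirectories
        else:
            os.remove(path)       # remove files and symlinks

def main():
    args = _argparse()  # the parser defined above
    # -y skips the prompt; otherwise ask before doing anything destructive
    if args.y or input("Nuke %s? [y/N] " % args.directory).lower() == "y":
        nuke_directory(args.directory)
```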


Data Visualization For Greater Good

I’ve always been interested in understanding images, from how images are formed to how we can get machines to understand images in a similar manner to the average human being. It helps that our visual world is rich and that images can capture so much information due to their high dimensionality.

The key word here is “visual”. Human beings are visual creatures. We rely on our eyes more than we would like to acknowledge for multiple tasks (as seen via various sight-related idioms). As I delve into more areas of Computer Science, the use of data to accomplish superhuman feats is ever growing. Deep Learning, for one, is a new field that has tremendously tapped into these enormous collections of data to produce computational models with astounding capabilities. However, to better understand what models can be trained, many researchers recommend visualizing the data, reiterating the first line of this paragraph. Thus, to marry the two ideas above, I decided to make my life easier by exploring some data visualization and explain why fundamental data viz is a highly useful and rewarding skill to have.

For my tooling, I use the well-designed and utilitarian Data Driven Documents, or D3, library. Of course, I could have used other libraries such as Bokeh (Python), but D3 has a lot of great features that overshadow the fact that it is written in Javascript (ughh). For one, D3 renders directly to HTML rather than generating intermediate JS or images, making the visualizations super interactive. Moreover, D3’s idiomatic approach is what won me over. Loading, filtering and manipulating the data is done asynchronously in a systematic and declarative fashion, saving me a lot of headache. Finally, D3 has the amazing bl.ocks.org, maintained by D3 creator Mike Bostock (who has a PhD in Data Visualization from Stanford, by the way), who also writes fantastic tutorials that use D3 and explain its capabilities.

Now for the actual goodness. Since we have ever more readily accessible datasets, visualizing them plays to our “visual creatures” persona, helping us better understand them and tap their latent information. For example, I created this amazing word cloud using D3, reading in the text of Andrej Karpathy’s excellent article on what it means to get a PhD:

wordcloud

All of a sudden, you understand the key themes of his article even though you may not have read it, not to mention this looks super cool! This is the power of data visualization and a key reason for my belief that data viz is a useful and rewarding skill. You can take a look at the code on my block, or fork it and add your own text corpus.

I did mention interactivity, didn’t I? That word cloud may not be interactive, but this plot sure is. It is nothing but a plot of the 1024-dimensional vectors generated by a Convolutional Neural Network on a geolocation dataset, where the idea is to train a model to predict the location where an image was taken by having the model look only at the image. If you’re confused about how I managed to plot a 1024-D vector in 2-D space, I recommend taking a look at the fabulous t-SNE algorithm and the open source implementation of the faster Barnes-Hut t-SNE available from the inventor himself (if you look closely, you may see my name in the list of contributors 😉 ).

Another cool example is that of choropleths, or heat maps as they are commonly known. Mike Bostock has some amazing visualizations using choropleths on simple statistics such as census data, which I highly recommend taking a look at, amidst his other gorgeous visualizations. I personally plan to use choropleths to visualize some of the geolocation datasets I am playing around with.

A good visualization to start off with, if data viz is new to you, is stock market data. You can pick up the data from Yahoo Finance, and then with the power of D3 you can quickly see how a particular stock has been performing over weeks, months, or even years. Kind of nice compared to staring at all those floating point numbers, and without too much trouble either.

Overall, I hope that via these simple examples I have demonstrated the inherent power and usefulness of data visualization, and how it can help tackle some of the more challenging problems our society faces. Now go out there and visualize some greater good!