Saturday, February 12, 2011

Tricky question using Python defaultdict

I came across a very interesting question that puzzled me, though I solved it in ten minutes. It's a nice challenge: try to answer the question yourself before you look for the answer. Use the Python interpreter to try it out.

The collections module in Python has defaultdict(default_factory), which takes a factory function (such as int) and calls it to produce a default value whenever we access a key that is not there. With a normal dictionary, we get a KeyError instead.

>>> d = dict()
>>> d['rr']
KeyError: 'rr'

>>> from collections import defaultdict
>>> dd = defaultdict(int)
>>> dd['rr']
0

But if we now try to index one level deeper, we get:
>>> dd['rr']['tt']
TypeError: 'int' object is unsubscriptable

Now, here is the question: how will you make it work? You also need to support an arbitrary depth of nested dictionaries (for example, >>> dd['rr']['tt']['t'] = 1). Write a function that makes this possible.
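
Spoiler: one possible answer, as a minimal sketch. The function name nested_dict is my own choice for illustration. The idea is to give defaultdict a factory that builds another defaultdict of the same kind, so every missing key creates a new level on demand.

```python
from collections import defaultdict


def nested_dict():
    """Return a defaultdict whose missing keys become new nested_dict() levels."""
    return defaultdict(nested_dict)


dd = nested_dict()
dd['rr']['tt']['t'] = 1      # intermediate levels are created automatically
print(dd['rr']['tt']['t'])   # prints 1
```

One thing to keep in mind with this approach: simply reading a missing key, such as dd['x']['y'], also creates the intermediate dictionaries.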

Friday, January 7, 2011

Top Business Models that rocked 2010.

Even if an organization has a great product or service offering, I think it is very important that it has a very good business model in place to complement it. A successful business model not only gives an organization a sustainable competitive advantage that sets it apart from others, but is often the difference between surviving and not. I came across these great slides from Board of Innovation (BOI) that portray the big hitters of 2010, and I recommend looking at them. Awesome!

Wednesday, January 5, 2011

Windows on Mac OS

There is always a bit of confusion about the right way to run Windows on a Mac: Boot Camp, or a virtual machine such as Parallels or VMware Fusion. One nice thing about using Boot Camp together with Parallels/VMware Fusion is that you need only one installation of Windows. When you are in Mac OS and need Windows for a typical office-type application, you can just launch your Boot Camp installation of Windows through Parallels/VMware Fusion.

This way there is only one installation of Windows, and only one Windows to update and keep current.

Also, if you install your Boot Camp Windows on an NTFS partition, you can run Winclone from the Mac side to back up your Windows installation. Winclone works great.

Friday, December 31, 2010

Android OAuth.

We all know the OAuth authentication mechanism is secure and works great. To understand OAuth better, see the wiki article and the official OAuth website. Implementing OAuth for Android was tricky because the user has to log in and authorize the application in the browser, which then redirects the user back to our Android application. Once an Android application/intent goes to the background and resumes, all the locally stored objects are reset. During the OAuth flow this causes an exception and breaks out of the application. We will see how to avoid this.

To use OAuth on Android we will use a library called oauth-signpost, written in Java. Download the oauth-signpost core and commons http jars.

Assuming that you know the basics of Android, like editing manifest files and configuring build paths, I will be more specific about the problem and the tricks to avoid or resolve it. One of the options I read online (Stack Overflow) to overcome objects getting reset was: "First of all, you do not need to save the whole consumer and provider object. All you need to do is store the requestToken and the requestSecret. Luckily, those are Strings, so you don't need to write them to disk or anything. Just store them in the sharedPreferences or something like that." Yes, you can use sharedPreferences, but there is a simpler way that I used: just declare all the String objects as static. They won't be reset, and it worked for me like a charm!

I will also give a step-by-step tutorial on how to use OAuth for Android, using Twitter OAuth. I will post the code soon.

Lazy Replication in a Distributed System.

I was inspired to write about replication in back-end servers after implementing a robust version of lazy replication in my distributed micro-blogging service. That is, the data servers (replica managers) synchronize with each other through lazy replication, also called gossip replication.

The other two replication strategies (active and passive) provide fault tolerance and strong consistency (synchronizing immediately, before responding to a client request), but I preferred to trade strong consistency for lower latency. Hence I implemented lazy replication to get a highly available service (return responses immediately and synchronize later) that is eventually consistent.
Note: I used JSON for communication everywhere in the system.

DATA SERVERS/REPLICAS SYNCHRONIZATION.
* Replicas synchronize with each other through gossip messages.
* A gossip message contains a log of past updates and a vector timestamp, which is the sender's vector clock. I distinguish delete updates from other updates with a flag.
* When a gossip message is received, the receiver applies all stable updates.
* After four rounds of gossiping, the data server removes entries from the log table (freeing up memory) that are known to have been applied everywhere, which is determined from the vector clocks received from all the servers.
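
The gossip steps above can be sketched roughly as follows. This is a simplified illustration, not the code from my service: the names Update, Replica, merge, dominates, receive_gossip and truncate_log are hypothetical, vector timestamps are plain dicts mapping a server id to a counter, an update is treated as stable once everything it depends on has been applied locally, and the "four rounds" schedule is left out.

```python
from dataclasses import dataclass, field


def merge(a, b):
    """Element-wise maximum of two vector timestamps."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}


def dominates(a, b):
    """True if timestamp a is >= timestamp b in every entry."""
    return all(a.get(k, 0) >= v for k, v in b.items())


@dataclass
class Update:
    uid: str                  # unique identifier of the update
    deps: dict                # timestamp this update depends on
    ts: dict                  # timestamp assigned to the update itself
    payload: str
    is_delete: bool = False   # flag separating deletes from other updates


@dataclass
class Replica:
    value_ts: dict = field(default_factory=dict)  # timestamp of the applied state
    log: dict = field(default_factory=dict)       # uid -> Update
    applied: list = field(default_factory=list)   # uids in the order applied
    seen: dict = field(default_factory=dict)      # server id -> last timestamp gossiped

    def receive_gossip(self, sender, updates, sender_ts):
        """Record the sender's clock, merge its log, then apply stable updates."""
        self.seen[sender] = sender_ts
        for u in updates:
            if u.uid not in self.applied:
                self.log.setdefault(u.uid, u)
        progress = True
        while progress:  # applying one update may make another one stable
            progress = False
            for u in list(self.log.values()):
                if u.uid not in self.applied and dominates(self.value_ts, u.deps):
                    self.applied.append(u.uid)
                    self.value_ts = merge(self.value_ts, u.ts)
                    progress = True

    def truncate_log(self, all_servers):
        """Drop applied entries that every server is known to have seen."""
        if not all(s in self.seen for s in all_servers):
            return
        for uid, u in list(self.log.items()):
            if uid in self.applied and all(
                dominates(self.seen[s], u.ts) for s in all_servers
            ):
                del self.log[uid]


r = Replica()
u1 = Update("u1", deps={}, ts={"A": 1}, payload="hello")
u2 = Update("u2", deps={"A": 1}, ts={"A": 1, "B": 1}, payload="reply")
r.receive_gossip("B", [u2], {"A": 1, "B": 1})  # u2 must wait for u1
r.receive_gossip("A", [u1], {"A": 1})          # now both become stable
print(r.applied)                                # ['u1', 'u2']
```

In the usage at the bottom, the reply u2 arrives first but is held back until u1, which it depends on, has been applied.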

DATA SERVER & HTTP SERVER COMMUNICATION.
The HTTP server associates a vector timestamp with each query when communicating with a data server. The data server also returns its vector timestamp to the front-end HTTP server. Note that each update/log entry also has a unique identifier.
* If the HTTP server's timestamp does not match the data server's timestamp (that is, the data server has not yet seen every update the HTTP server knows about), the data server queues the query until it has received enough gossip messages to bring its state up to date.
* The HTTP server sends heartbeat messages once in a while to keep track of newly added replica managers.
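
The queueing rule above can be sketched like this. It is a hypothetical illustration under the same assumptions as before (vector timestamps as dicts of server id to counter); DataServer, handle_query and on_state_advanced are names invented for the example, and the response is a plain dict standing in for the JSON reply.

```python
def dominates(a, b):
    """True if timestamp a is >= timestamp b in every entry."""
    return all(a.get(k, 0) >= v for k, v in b.items())


class DataServer:
    def __init__(self):
        self.value_ts = {}   # vector timestamp of the state applied so far
        self.pending = []    # queued (query, query_ts) pairs

    def answer(self, query):
        # The reply carries the server's timestamp back to the HTTP server.
        return {"query": query, "timestamp": dict(self.value_ts)}

    def handle_query(self, query, query_ts):
        """Answer now if our state is recent enough, otherwise queue the query."""
        if dominates(self.value_ts, query_ts):
            return self.answer(query)
        self.pending.append((query, query_ts))
        return None

    def on_state_advanced(self, new_ts):
        """Call after gossip was applied; returns answers that are now ready."""
        self.value_ts = new_ts
        ready = [q for q, ts in self.pending if dominates(new_ts, ts)]
        self.pending = [(q, ts) for q, ts in self.pending
                        if not dominates(new_ts, ts)]
        return [self.answer(q) for q in ready]


server = DataServer()
print(server.handle_query("timeline", {"A": 1}))  # None, the query is queued
print(server.on_state_advanced({"A": 1}))         # the queued query is answered
```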

I balanced the load by having the HTTP server pick a data server at random to communicate with. When that data server fails, the HTTP server adds it to its dead list and selects another available data server. In a system with N data storage servers, the system tolerates the failure of up to N-1 nodes: as long as one data storage server is available, a client request will succeed. In the data returned by the back end, the events are causally ordered with the help of vector clock timestamps, which means that updates that were initially handled by the same back end will be in chronological order. Feel free to ask if you have any questions!
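
The random selection with a dead list described above can be sketched as follows. This is only an illustration: Balancer, NoServerAvailable and the send callable are hypothetical names, and a failed server is detected here by a ConnectionError.

```python
import random


class NoServerAvailable(Exception):
    """Raised when every data server is on the dead list."""


class Balancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self.dead = set()   # servers that failed to answer

    def call(self, send):
        """Try randomly chosen live servers until one of them answers.

        send is a hypothetical callable that takes a server address and
        either returns a response or raises ConnectionError.
        """
        while True:
            alive = [s for s in self.servers if s not in self.dead]
            if not alive:
                raise NoServerAvailable("all data servers are on the dead list")
            server = random.choice(alive)
            try:
                return send(server)
            except ConnectionError:
                self.dead.add(server)   # remember the failure and try another


def fake_send(server):
    # Stand-in for a real request: only ds3 is reachable in this example.
    if server != "ds3":
        raise ConnectionError(server)
    return "ok from " + server


balancer = Balancer(["ds1", "ds2", "ds3"])
print(balancer.call(fake_send))   # ok from ds3
```

With three servers and two of them down, the request still succeeds, which is the N-1 tolerance mentioned above.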

Class/Object Diagram.

I have always wanted to see the design of a project or program visually: the classes and the relationships between the attributes/objects in those classes. Rather than drawing it manually each time I change the design or architecture of a project, I wanted to find a tool that would do it for me.

I found a cool Eclipse plug-in, ObjectAid UML, which does exactly that. All we have to do is select the Java classes and drag them into the diagram space, and the diagrams appear magically. What's even nicer is that as we update our code in Eclipse, the diagram is updated as well. We can easily control what we want to see in the diagrams, such as relationships, classifiers, attributes and operations. They have a nice one-minute tutorial that will easily get you started. Awesome!

Monday, March 30, 2009

Higher education :)

Hi folks, I have got an admit from the Australian National University (ranked #1 in Australia) for MIT/M Computing. I have also got an admit from the University of San Francisco (Bay Area, more opportunities, excellent course work) for computer science. I am finding it difficult to decide between the two, since ANU is ranked #1 in Australia! While I don't know much about the opportunities in Australia, I do know that the US has plenty if I complete my MS at USF! Any thoughts and comments will be appreciated. Thanks for reading and have a nice day!