Setting up screen on Linux

Screen is a program that runs in a terminal and lets you create multiple terminals within one SSH login. It runs in the background, so you can reattach to the same session if SSH disconnects. It's particularly useful when you are on mobile data or have to leave a process running in the foreground.

Working with Docker

While migrating a website that had not been updated in a while, it occurred to me that Docker would be ideal for it. The site did not have a lot of traffic, and its business impact was low.

Should caste-based reservation be allowed?

TL;DR: Caste-based reservations should be allowed until caste-based discrimination stops.

Spark streaming reduceByKeyAndWindow unstable application

I wanted to run a job that runs 24×7 and reports if certain keywords occur more than N times in the stream. Spark Streaming looked like an ideal candidate for this task. Spark has a reduceByKeyAndWindow function, which was exactly what I was looking for.
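
To make the windowing idea concrete, here is a minimal plain-Python sketch of counting keywords over a sliding window of batches and reporting those above a threshold. It is not Spark code and not the job from the post: the function name `windowed_alerts`, its parameters, and the sample batches are hypothetical. The add-new-batch, subtract-oldest-batch pattern loosely mirrors the optional inverse-reduce function that reduceByKeyAndWindow accepts.

```python
from collections import Counter, deque


def windowed_alerts(batches, window_size, threshold):
    """For each batch, yield the keywords seen more than `threshold` times
    across the most recent `window_size` batches (hypothetical helper)."""
    window = deque()    # per-batch counts currently inside the window
    totals = Counter()  # running keyword counts across the whole window
    for batch in batches:
        counts = Counter(batch)
        window.append(counts)
        totals.update(counts)  # "reduce": add the newest batch
        if len(window) > window_size:
            # "inverse reduce": remove the batch that slid out of the window
            totals.subtract(window.popleft())
        yield sorted(k for k, v in totals.items() if v > threshold)


# Four small batches, a window of two batches, alert above 2 occurrences.
batches = [["a", "b", "a"], ["a"], ["b"], ["b", "b"]]
for alerts in windowed_alerts(batches, window_size=2, threshold=2):
    print(alerts)
```

Keeping running totals and subtracting the expired batch avoids recounting the whole window on every slide, which is the reason an inverse function is attractive for long windows.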

Spark streaming: Fixing all executors not getting jobs

I was recently working on a feature that needed a streaming job running 24×7 and processing 100 million rows per day. The Spark web UI is a wonderful tool for seeing how things run internally. While debugging, I noticed that the streaming jobs were being allocated to only one machine. Spark dispatches jobs to executors in a set priority order based on proximity (on the same host, in the same pool, etc.), and if the executor completes each job within a fixed interval, all the jobs are sent to the same executor.
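
The fixed interval described above corresponds to Spark's locality wait, controlled by the `spark.locality.wait` setting (3s by default). The excerpt does not say which fix was applied, so the following is only a sketch under the assumption that lowering this wait is the knob being adjusted; the helper name `locality_conf_args` and the value `0s` are hypothetical choices for illustration.

```python
def locality_conf_args(wait="0s"):
    """Build spark-submit --conf arguments that set the locality wait.

    Assumption: shortening spark.locality.wait lets the scheduler fall back
    to less-local executors sooner instead of queueing on one executor.
    """
    settings = {"spark.locality.wait": wait}
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Arguments to append to a spark-submit command line.
print(" ".join(locality_conf_args()))
```

A lower wait trades data locality for parallelism, so it is worth checking the executor distribution in the web UI before and after changing it.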