Hackweek XXXIII recap

Engineering

During Hackweek, everyone at LiveRamp gets a week to work on projects they’re passionate about. Hackweek has a long history at LiveRamp; it has given us many efficiency tools, new products, interesting investigations, and other fun things. We are very proud of the results of this Hackweek, too.

In case you missed our previous Hackweek recap, you can read it here.

Project highlights

Simple KB

Alex, Harry and Meng developed a convenient Chrome extension that makes it simpler to search for articles on our internal resource website. It can also be used to add labels to, and remove labels from, existing articles.

Canary

To make sure we’re delivering customer data on time, we closely monitor our workflows, and we want the right team to be alerted whenever something goes wrong. Although the logic for auto-detecting failures and delays is mostly simple, implementing alerts used to involve a lot of overhead. Chris, Armaan and Shia removed this overhead by creating a great UI where you can easily register an alert. Canary auto-detects errors and sends an alert email to the right team. It also includes support for generating neat HTML-formatted email bodies.
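
To make the idea concrete, here is a minimal Python sketch of what a registered alert could look like. It is not Canary's actual code: the names `AlertRule`, `WorkflowRun`, `detect_problem` and `build_alert`, the fields, and the runtime threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from email.message import EmailMessage
from html import escape
from typing import Optional


@dataclass
class AlertRule:
    """A registered alert (hypothetical shape): what to watch, whom to notify."""
    workflow: str
    team_email: str
    max_runtime: timedelta


@dataclass
class WorkflowRun:
    """Latest known state of one workflow run (hypothetical shape)."""
    workflow: str
    started_at: datetime
    finished_at: Optional[datetime] = None
    failed: bool = False


def detect_problem(rule: AlertRule, run: WorkflowRun, now: datetime) -> Optional[str]:
    """Return "failed" or "delayed", or None if the run looks healthy."""
    if run.failed:
        return "failed"
    # A run that is still going is measured against the current time.
    end = run.finished_at or now
    if end - run.started_at > rule.max_runtime:
        return "delayed"
    return None


def build_alert(rule: AlertRule, run: WorkflowRun, problem: str) -> EmailMessage:
    """Build an email with a plain-text body and an HTML alternative."""
    msg = EmailMessage()
    msg["To"] = rule.team_email
    msg["Subject"] = f"[canary] {rule.workflow} {problem}"
    started = run.started_at.isoformat()
    msg.set_content(f"Workflow {rule.workflow} {problem}; started {started}.")
    msg.add_alternative(
        f"<p>Workflow <b>{escape(rule.workflow)}</b> {escape(problem)}.</p>"
        f"<p>Started: {escape(started)}</p>",
        subtype="html",
    )
    return msg
```

In this sketch, registering an alert only means creating an `AlertRule`; a scheduler (not shown) would call `detect_problem` for each rule and send whatever `build_alert` returns.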

Breaking BAdmin

Admin is our internal web interface, which we use for tasks like importing files and looking up stats. After many years of growth, it has become a huge Rails/JavaScript repo, which makes it brittle and hard to develop. Many engineers from our frontend team worked together to split Admin into smaller components. As a result, the whole Admin webpage is now split by application. They also made Rails development easier by providing reusable assets and a Rails app generator.

Slot Machine dashboard

Our MapReduce cluster is a heavily utilized (~100%) resource that is shared among many workflows. When we add a new workflow or run an existing workflow more often, we impact other workflows by increasing contention for cluster resources. This impact is not easily measurable, and we’re left with many questions that aren’t easy to answer:

  • How much processor resource are we utilizing?
  • When should we invest more in the cluster?
  • How many machines should we buy?

Alfonso, Ben, Jeremy, Tenzing and Vishrut visualized this problem for us in their Slot Machine project (in MapReduce terminology, a slot is a portion of a machine that can be reserved to run a map or reduce task). In the Slot Machine dashboard, you can easily see how much cluster capacity each workflow takes.
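
As a rough illustration of the kind of number such a dashboard might show, the Python sketch below computes each workflow's share of the cluster's slot-seconds over a time window. This is an assumption about how the metric could be calculated, not the team's implementation; the `TaskRecord` shape and the `slot_share` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record: (workflow name, task start, task end), times in seconds.
# Each task is assumed to hold exactly one slot while it runs.
TaskRecord = Tuple[str, float, float]


def slot_share(tasks: Iterable[TaskRecord], total_slots: int,
               window_start: float, window_end: float) -> Dict[str, float]:
    """Fraction of the cluster's slot-seconds each workflow used in the window."""
    capacity = total_slots * (window_end - window_start)
    if capacity <= 0:
        raise ValueError("window length and slot count must be positive")
    used: Dict[str, float] = defaultdict(float)
    for workflow, start, end in tasks:
        # Count only the part of the task that overlaps the window.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            used[workflow] += overlap
    return {workflow: seconds / capacity for workflow, seconds in used.items()}
```

For example, on a two-slot cluster over a 60-second window, tasks `("a", 0, 60)`, `("a", 30, 90)` and `("b", 0, 30)` give workflow `a` three quarters of the capacity and workflow `b` one quarter. Shares that sum to nearly 1.0 correspond to the ~100% utilization described above.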
