Engineering Blog

Learn from our challenges and triumphs as our talented engineering team offers insights for discussion and sharing.

Migrating from SVN to Git

Over the last few weeks, LiveRamp has converted from an all-SVN shop to an all-Git shop, switching over our entire stack of Java/Ruby + Ant/Ivy/Rake + Hudson. It was a big effort, but it has yielded big results. It wasn't easy to decide to commit to the migration, but in the end I think ...

More Compact Than CompactProtocol: TupleProtocol

LiveRamp makes extensive use of Thrift's CompactProtocol to save space for long-term data storage and for communicating between services. However, this summer, star LiveRamp intern Armaan Sarkar took us to a new level of compactness with his work on the new TupleProtocol. While we are completely happy with the CompactProtocol for permanent data storage and ...

Google Calendar + Arduino = The Roominator

Last week was our quarterly Hackweek, and this time around a handful of us set out to solve a slightly more physical problem than we usually tackle: conference room abuse. We have nine conference rooms in our office these days, and even though we regularly use Google Calendar to schedule meetings, we still struggle ...

Java Performance: synchronized() vs Lock

Yesterday, I noticed that one of our systems was using a Lock where a plain old synchronized() block would suffice, and I thought to myself, does this matter? Since the Lock was already fulfilling the same role, the only real question was performance. My gut told me that there should be a performance difference between ...
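
To make the comparison concrete, here is a minimal sketch of the two forms guarding the same kind of critical section. The class, fields, and `runDemo` helper are hypothetical and are not taken from the system described in the post; the sketch only shows that both constructs provide the same mutual exclusion, and it does not measure performance.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter used only for illustration.
class LockVsSynchronized {
    private final Object monitor = new Object();
    private final Lock lock = new ReentrantLock();
    private long syncCount = 0;
    private long lockCount = 0;

    // Intrinsic lock: the monitor is released automatically when the block exits.
    void incrementWithSynchronized() {
        synchronized (monitor) {
            syncCount++;
        }
    }

    // Explicit lock: unlock() must be called in finally so the lock is
    // released even if the guarded code throws.
    void incrementWithLock() {
        lock.lock();
        try {
            lockCount++;
        } finally {
            lock.unlock();
        }
    }

    long syncCount() {
        synchronized (monitor) {
            return syncCount;
        }
    }

    long lockCount() {
        lock.lock();
        try {
            return lockCount;
        } finally {
            lock.unlock();
        }
    }

    // Runs several threads that each increment both counters perThread times.
    static LockVsSynchronized runDemo(int threads, int perThread) {
        LockVsSynchronized counter = new LockVsSynchronized();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementWithSynchronized();
                    counter.incrementWithLock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter;
    }

    public static void main(String[] args) {
        LockVsSynchronized counter = runDemo(4, 10_000);
        System.out.println(counter.syncCount() + " " + counter.lockCount());
    }
}
```

Because the two methods are functionally interchangeable here, any reason to prefer one over the other comes down to performance or to features only `Lock` offers, such as `tryLock()`.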

Bringing Ruby’s ActiveRecord to Java

LiveRamp started out using Ruby and Rails extensively to build out our systems. We loved the flexibility that it gave us to quickly put together a functional application. Tools like ActiveRecord are huge productivity boosters, saving us the trouble of hand-coding database interaction and letting us focus directly on our application. However, we evolved to ...

Memory-efficient sparse bitsets

A bitset is a data structure designed to store a vector of boolean values very compactly: one bit per value. In practice, bitsets are a really handy way to save memory. However, we had a situation in one of our extremely memory-intensive applications where a simple bitset wouldn't cut it. We have over 2500 ...
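
As a rough illustration of the "one bit per value" idea, the sketch below packs booleans into a `long[]`, 64 values per word. This is a plain dense bitset written for this example (the class name `SimpleBitset` is hypothetical), not the sparse structure the post goes on to describe; in real code `java.util.BitSet` already provides the dense version.

```java
// Minimal dense bitset: each boolean occupies one bit of a long[] word.
class SimpleBitset {
    private final long[] words;
    private final int size;

    SimpleBitset(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        this.size = size;
        // Round up so every index has a word to live in.
        this.words = new long[(size + 63) / 64];
    }

    void set(int index) {
        checkIndex(index);
        // index >>> 6 selects the word, index & 63 selects the bit within it.
        words[index >>> 6] |= 1L << (index & 63);
    }

    void clear(int index) {
        checkIndex(index);
        words[index >>> 6] &= ~(1L << (index & 63));
    }

    boolean get(int index) {
        checkIndex(index);
        return (words[index >>> 6] & (1L << (index & 63))) != 0;
    }

    // Payload size only; excludes the JVM's object and array headers.
    int payloadBytes() {
        return words.length * Long.BYTES;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + " out of range");
        }
    }

    public static void main(String[] args) {
        SimpleBitset bits = new SimpleBitset(1000);
        bits.set(3);
        bits.set(999);
        System.out.println(bits.get(3) + " " + bits.get(4) + " " + bits.payloadBytes());
    }
}
```

A dense layout like this spends the same space whether or not any bits are set, which is exactly the cost a sparse representation tries to avoid.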

Día de los Proyectos Muertos

At LiveRamp, every employee is required to be entrepreneurial, and a big part of entrepreneurship is the willingness to invest enormous efforts in initiatives that might not succeed. It has thus always been an important part of our culture to celebrate both well-executed successes and well-executed failures. The cultural recognition of "constructive failure" ...

Striving for zero copies with Thrift 0.5

"Zero copies" is a common optimization principle used in high-performance applications. The gist of the technique is to have the smallest number of byte array copies necessary for the server to perform its task. Byte array copies are one of those insidious time-wasters that are hard to understand or even detect until you start ...

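The difference between copying a byte range and sharing it can be sketched with standard JDK classes. This is only an illustration of the general principle, assuming a plain `byte[]` source; the helper names are hypothetical and nothing here reflects how Thrift 0.5 itself is implemented.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class ZeroCopySketch {
    // Copying approach: allocates a new array and duplicates the bytes.
    static byte[] sliceByCopy(byte[] source, int offset, int length) {
        return Arrays.copyOfRange(source, offset, offset + length);
    }

    // View approach: wraps the existing array, so no bytes are duplicated.
    // The returned buffer shares its content with the source array.
    static ByteBuffer sliceByView(byte[] source, int offset, int length) {
        return ByteBuffer.wrap(source, offset, length).slice();
    }

    public static void main(String[] args) {
        byte[] source = {1, 2, 3, 4, 5};
        byte[] copy = sliceByCopy(source, 1, 3);
        ByteBuffer view = sliceByView(source, 1, 3);

        // Mutating the source shows which result owns its own bytes.
        source[1] = 9;
        System.out.println("copy[0] = " + copy[0]);
        System.out.println("view.get(0) = " + view.get(0));
    }
}
```

The view avoids the allocation and the copy, at the price that the caller must not modify the source array while the view is still in use.
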
Analyzing network load in Map/Reduce

Hadoop Map/Reduce can take a heavy toll on your network. Just how heavy, though, isn't obvious. This is an especially important consideration when you are expanding your cluster. LiveRamp recently encountered this situation, and in the process we devised a neat theoretical model for analyzing how network topology affects Map/Reduce. When does Hadoop put the ...

BloomFilter

We recently had a situation where we had to search a big list of 500 million hashes against a list of 40 million hashes. The 500M hashes were stored in flat, unsorted text files on 5 DVDs, so there was no easy way to search that list. The 40M hashes were stored in a ...