Engineering Blog

Engineering

Learn from our challenges and triumphs as our talented engineering team offers insights for discussion and sharing.

Automatic logging of MapReduce task failures

When using Cascading to run MapReduce jobs in production, the most common exception we find in our job logs look like this: Caused by: cascading.flow.FlowException: step failed: (1/1), with job id: job_201307251526_37599, please see cluster logs for failure messages at cascading.flow.planner.FlowStepJob.blockOnJob(FlowStepJob.java:210) at cascading.flow.planner.FlowStepJob.start(FlowStepJob.java:145) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:120) at cascading.flow.planner.FlowStepJob.call(FlowStepJob.java:42) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) This exception tells us that the job failed ...

Debugging “ClassCastException: cascading.tap.hadoop.io.MultiInputSplit” exceptions when testing Cascading flows

When testing our Hadoop data workflows we've intermittently run into this error, which ends up failing the MapReduce job being tested: java.lang.ClassCastException: cascading.tap.hadoop.io.MultiInputSplit cannot be cast to org.apache.hadoop.mapred.FileSplit at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:371) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) A quick search for the error didn't find any obvious problems. When we dug into the problem a a bit more, we noticed a couple ...