Engineering Blog



Automatic logging of MapReduce task failures

When using Cascading to run MapReduce jobs in production, the most common exception we find in our job logs looks like this:

    Caused by: cascading.flow.FlowException: step failed: (1/1), with job id: job_201307251526_37599, please see cluster logs for failure messages
        at cascading.flow.planner.FlowStepJob.blockOnJob(
        at cascading.flow.planner.FlowStepJob.start(
        at
        at
        at java.util.concurrent.FutureTask$Sync.innerRun(
        at
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
        at java.util.concurrent.ThreadPoolExecutor$
        at

This exception tells us that the job failed ...
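Because the message points at the cluster logs rather than the real cause, a natural first step is to pull the Hadoop job ID out of the exception so those logs can be looked up. The following is a minimal sketch of that step, not the approach this post goes on to describe: the class name `JobIdExtractor` and its methods are hypothetical, and the regular expression assumes the message keeps the `job id: job_<digits>_<digits>` shape shown in the trace above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Cascading): extracts the Hadoop job ID
// from a "step failed" message like the one shown in the trace above.
final class JobIdExtractor {

    // Assumption: the ID always appears as "job id: job_<digits>_<digits>".
    private static final Pattern JOB_ID = Pattern.compile("job id: (job_\\d+_\\d+)");

    private JobIdExtractor() {}

    // Returns the first job ID found in the given message text, if any.
    static Optional<String> fromMessage(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = JOB_ID.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Walks the cause chain, since the FlowException usually arrives
    // wrapped ("Caused by: ...") inside another exception.
    static Optional<String> fromThrowable(Throwable error) {
        for (Throwable cur = error; cur != null; cur = cur.getCause()) {
            Optional<String> id = fromMessage(cur.getMessage());
            if (id.isPresent()) {
                return id;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A plain RuntimeException stands in for cascading.flow.FlowException
        // so the sketch runs without Cascading on the classpath.
        Throwable cause = new RuntimeException(
            "step failed: (1/1), with job id: job_201307251526_37599, "
                + "please see cluster logs for failure messages");
        Throwable wrapped = new IllegalStateException("flow failed", cause);

        System.out.println(fromThrowable(wrapped).orElse("no job id found"));
        // prints "job_201307251526_37599"
    }
}
```

In a real job runner the wrapped exception would come from a `catch` block around the flow, and the extracted ID would be the key for locating the failed task attempts in the cluster logs.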