Facebook dives deep on their big-data stack.
Originally published on Kapuno in the Technology community
Under the Hood: Scheduling MapReduce jobs more efficiently with Corona | Facebook (Article)
Memorable excerpt:
“Over half a petabyte of new data arrives in the warehouse every 24 hours, and ad-hoc queries, data pipelines, and custom MapReduce jobs process this raw data around the clock to generate more meaningful features and aggregations.”