Hadoop is quickly being deployed at virtually every company that has scads of data to crunch. And in the rush to train up on Hadoop, it pays to know where those skills can take you.
The list of companies using Hadoop includes American Airlines, Amazon, Twitter, Foursquare, LinkedIn, HP, and IBM, and that’s just for starters. (Dice has more than 800 Hadoop-related job listings, most of which are engineering posts.)
But one of the biggest users and earliest adopters of Hadoop is Yahoo, says InformationWeek. Yahoo started out using Hadoop to speed up the indexing of its Web crawl results for its search engine, but today its use of Hadoop is much more extensive.
Yahoo uses Hadoop on 42,000 servers to:
- Analyze Yahoo Mail traffic to block 20.5 billion spam messages daily.
- Analyze users’ activity to send them individually targeted content.
- Advise its advertisers on demographic success and failure and user behavior with ads.
- Do its own internal analysis of information captured from its user interactions.
Yahoo stores an amazing 140 petabytes in Hadoop, which is big data indeed.
And at a time when the online activities of any large company can routinely generate petabytes of raw data to be analyzed every week, distributed processing of these large data sets across clusters of computers, using a simple open-source programming model, is irresistible to business intelligence experts and network engineers. Expect to hear much more about Hadoop in your conversations about big data.
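For a rough feel of that programming model, here is a minimal single-machine sketch in Python of the map, shuffle, and reduce steps, assuming the model in question is MapReduce. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are made up for illustration and are not Hadoop's API; a real cluster would run the same kind of logic spread across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in one process (hypothetical driver, not Hadoop)."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Two tiny "records" stand in for what would be huge input files.
    print(run_job(["spam ham spam", "ham eggs"]))
```

The appeal is that each step only ever looks at one record or one key at a time, which is what lets the work be split across many servers.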
And one more thing you need to know: Doug Cutting, Hadoop’s inventor, named it after his son’s toy elephant.
Yahoo And Hadoop: In It For The Long Term [InformationWeek]