Pentaho, Splunk and the Great Machine-Data Blending

There are a lot of ghosts in the machines.

Pentaho and Splunk have announced a new collaboration on machine data: a combined platform, called Pentaho Business Analytics for Splunk Enterprise, that will run machine data gathered by Splunk through Pentaho’s analytics tools.

Pentaho Business Analytics for Splunk Enterprise will also combine that machine data with NoSQL, Apache Hadoop, and enterprise application datasets. “BI analyst teams work with other datasets and visualizations, and that is why our alliance with Pentaho is so important,” Eddie Satterly, Big Data evangelist for Splunk, wrote in a statement. “We are excited to collaborate with Pentaho to enable business users to gain new insights from exploding volumes of machine-generated data alongside other structured data sources.”

That’s all well and good for Pentaho and Splunk (and for any companies that use their respective products), but what does this merging of data and analytics tools suggest for those working in the business-intelligence arena? First, that the team-ups between data firms, including a burst of alliances earlier this year, show no sign of stopping. Second, that there could be more mergers focused on integrating machine data more tightly with the massive river of data many businesses analyze every day.

Machine data, also known as machine-generated data, is most simply defined as data not created by a human end user. Automated factory tools churning out hourly reports about their work progress, for example, fall into that “machine data” category. While this type of data is generally regarded as reliable, it still needs to be analyzed and taken into account when making business decisions; given the amount of machine data generated by certain industries, the alternative is leaving a giant “black hole” of unanalyzed datasets and logs, which could come back to bite a business later on.

But it’s not enough to analyze machine data by itself; its full impact on a business is best ascertained when it is mixed with data from other sources, such as customers or vendors. The desire for that fuller picture will likely drive the emergence of even more platforms that blend machine data with a variety of other data sources and analytical tools; this Pentaho-and-Splunk alliance surely isn’t the last.
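To make the idea of blending concrete, here is a minimal sketch in plain Python. It is not the Pentaho or Splunk API; the record layouts, field names (`customer_id`, `status`, `tier`) and the `blend` helper are all hypothetical, chosen only to show machine-generated log events being joined with a structured customer table.

```python
from collections import defaultdict

# Hypothetical machine-generated records, e.g. parsed from server logs.
machine_events = [
    {"customer_id": "c1", "status": 500, "latency_ms": 1200},
    {"customer_id": "c1", "status": 200, "latency_ms": 90},
    {"customer_id": "c2", "status": 200, "latency_ms": 110},
    {"customer_id": "c3", "status": 500, "latency_ms": 2500},
]

# Hypothetical structured business data, e.g. exported from a CRM.
customers = {
    "c1": {"name": "Acme", "tier": "enterprise"},
    "c2": {"name": "Globex", "tier": "free"},
}


def blend(events, customer_table):
    """Count server errors per customer and attach customer attributes."""
    errors = defaultdict(int)
    for event in events:
        if event["status"] >= 500:
            errors[event["customer_id"]] += 1

    blended = []
    for customer_id, error_count in sorted(errors.items()):
        # Events with no matching customer row are kept, marked "unknown".
        profile = customer_table.get(customer_id, {})
        blended.append({
            "customer_id": customer_id,
            "name": profile.get("name", "unknown"),
            "tier": profile.get("tier", "unknown"),
            "error_count": error_count,
        })
    return blended


report = blend(machine_events, customers)
for row in report:
    print(row)
```

On its own, the log data says only that errors occurred; joined with the customer table, it shows which accounts were affected, and also surfaces events (here, `c3`) that match no known customer.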


Image: kentoh/Shutterstock.com
