Is There an R in Your Future?

The R programming language is fast becoming the lingua franca of data analysis

By Doug Bartholomew | February 2009 (TV April 2009)


If you have a crystal ball to see into the near future of computing, there’s a good chance you see a whole lot of R.

As the use of open source software becomes more prevalent in business, government and academia, the R programming language is fast becoming the platform of choice for statisticians, data analysts and research scientists. The adoption of R has snowballed to the point where estimates of the number of users in business and academia range from 1 million to 2 million.

“People use R to analyze and manipulate data,” says Colin Magee, vice president of sales and marketing at Revolution Computing, a software firm in New Haven, Conn. “They use it to analyze chemical compounds and perform risk analytics and predictive analytics.”

Companies using R include many of America’s top firms, among them Google, Pfizer, Bank of America, Merck, Shell, and the InterContinental Hotels Group. Why is R suddenly so de rigueur for the data crunching crowd? For one thing, it’s easy to use. You don’t have to be a programming whiz to work with it. And R, which is open-source, is free – the underlying software code can be downloaded at no cost. Finally, R contains “all the underlying modeling techniques you use for predictive analytics,” Magee explains. “And it’s easy to fold data into R, to analyze it, and to output it.”

That doesn’t mean everybody knows how to use R, though. In fact, college graduates as well as finance and life sciences professionals who can wield R handily may have a leg up on those who can’t.

“Absolutely, these institutions that use R will want people with R skills,” says Magee, whose company offers support as well as its own high-performance version of the software called Revolution R Enterprise. “As commercial enterprises, particularly those in the financial industry and life sciences, establish more work in this language, they will need the talent with this knowledge.”

Magee believes the use of R is likely to continue to expand rapidly in corporations, nonprofits, and other organizations that need extensive data analysis. “As college graduates emerge with R skills and enter industry, it’s inevitable that the growth and use of this language will continue,” he argues.

Interest is currently growing in other fields, too, notably high-tech biological applications, finance and economics, adds John M. Chambers, a former Bell Labs researcher and now a consulting professor of statistics at Stanford University. Chambers says it’s standard practice in Stanford’s statistics department to use R for class demos, and to have students use it to perform exercises. “I think a large fraction of students and recent graduates with statistics training worldwide will have experience with R,” he says.

As is the case with other open-source software programs, R isn’t totally free. The reason is that companies often need support to tweak the software to meet their specific needs. They may require expertise to install the software, customize it, and connect it with other systems. In other words, organizations that avail themselves of R typically need statisticians or data analysts handy with R to change the underlying code and adapt it to their business.

As for training, most people who use R professionally either learned it in a college statistics class or picked it up on the job. “I would guess that a large fraction of users just ‘pick it up,’ perhaps going on to a book or an online tutorial later,” Chambers says.

Doug Bartholomew is a California-based business and technology writer.

Comments

  1. BY Ed Borasky says:

    I’ve been programming in R for about ten years and it has become my favorite programming language. I’ve put up a number of resources on my web site … Google for “Data Visualization and R Programming Books” if you can’t see the link above.

  2. BY James Cooper says:

    Not a very in-depth article. How is it used? What are the system requirements? What is the learning curve? How about a couple of examples of how easily it can solve specific problems.

    And the video is excessively cute but content free. I don’t need reporters to flirt at me, or wrinkle their nose at the idea that there are technical details too complex to describe or even understand. I think it demeans both the reporter (“Cat”) and the subject.

  3. BY Rod Grisham says:

    I agree with the first comment. And why are there no links for more details, such as the principal web site for the language information and source/executables?

  4. BY Mark Feffer says:

    Rod’s point about the link is a good one. Here’s the link, which I’ve also added into the story:

    http://www.r-project.org/

  5. BY Jim Beidle says:

    The article did what it was supposed to: it piqued my interest. I understand that it is intended as a leader, drawing my attention to some area that may develop my career. However, it is nice when leader stories provide links and information on resources the author has vetted. Please consider doing so in future articles.

    I appreciate the light delivery and I find Cat charming. I also found Cat’s flirty non-sequitur distracted from the content of the piece. On the whole, I’d rather have spent the seconds getting a quick demo or two of the value the R language provides.

    Thanks for the leader! I’m spending a few minutes looking over the material on http://www.r-project.org.

  6. BY David Smith says:

    If you want to keep up with what’s happening with R, I blog about it daily at http://blog.revolution-computing.com as part of my work for REvolution Computing (which is mentioned in the article). An easy way to get started with R is to download our distribution for Windows or Mac from http://www.revolution-computing.com/downloads/revolution-r.php .

  7. BY Naomi B. Robbins says:

    Why the ugly 3-D bar chart when discussing R? The graphical defaults in R are outstanding.

  8. BY Paul van Eikeren says:

    If you are interested in using R within the familiar environment of the corporate productivity desktop, Microsoft Office (Word, Excel and PowerPoint), you can use an Office add-in called Inference. You can get started immediately by downloading a free trial version from http://www.InferenceForR.com. The download of Inference also includes Inference Studio, which provides you with the ultimate graphical environment for entering, testing and debugging your R code.

  9. BY Brian Horan says:

    R is great for Desktop, Server, and cluster (MPI) use. There are packages in CRAN for almost everything you could need – if not, write one and submit it.

    The graphics are high quality, the syntax is decent, the statistical calculations are efficient, the matrix operations are convenient, and the price is right.

    (and because it’s open source, you can check the math, if you’re into that kind of thing…)

  10. BY Mike says:

    I just stumbled across this edition of DiceTV, and after reading a bit of the “R” website, have a question for the “R” programmers:

    How does “R” improve upon a language like APL?

  11. Mike, R is a dialect of the “S” language, which originated at Bell Labs. The original “S” language inherited operations on arrays as a whole and on slices of arrays from APL, although the syntax is vastly different. “S” is a functional language, and it inherited a lot from Lisp.

    “S” and R improve on APL by being functional and object-oriented languages, and by being more widely available and easier to read than APL. And R is open source – there have been open source dialects of APL, but none of them have achieved much popularity.
