Monday, March 23, 2009

Streaming Hadoop Data Into R Scripts

Along the lines of "Mongo Measurement Requires Mongo Management," the HadoopStreaming package on CRAN provides utilities for running R scripts as Hadoop streaming jobs.
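To make the idea concrete, here is a minimal sketch of what an R script looks like from Hadoop streaming's point of view: it reads records line by line from standard input and writes tab-separated key/value pairs to standard output. This sketch deliberately uses base R only and does not call any HadoopStreaming functions. The word-count task, the names `map_line` and `run_mapper`, and the `--run` flag are all hypothetical choices made for illustration.

```r
#!/usr/bin/env Rscript
# Sketch of a word-count mapper for Hadoop streaming, base R only.
# Assumption: input is plain text, one record per line, on stdin.

# Turn one line of text into "word<TAB>1" records (hypothetical helper).
map_line <- function(line) {
  words <- strsplit(tolower(line), "[^a-z0-9]+")[[1]]
  words <- words[nzchar(words)]             # drop empty tokens
  if (length(words) == 0) return(character(0))
  paste(words, 1, sep = "\t")
}

# Read the connection in chunks so large inputs are never held in memory.
run_mapper <- function(con = file("stdin"), chunk = 1000L) {
  if (!isOpen(con)) {
    open(con, "r")
    on.exit(close(con))
  }
  while (length(lines <- readLines(con, n = chunk, warn = FALSE)) > 0) {
    out <- unlist(lapply(lines, map_line))
    if (length(out) > 0) writeLines(out)    # key/value pairs go to stdout
  }
}

# Only consume stdin when asked to (the "--run" flag is an assumption of
# this sketch, so that the file can also be source()d safely).
if ("--run" %in% commandArgs(trailingOnly = TRUE)) run_mapper()
```

Saved as `mapper.R` (a hypothetical filename), the script can be tried locally without a cluster by piping text through it, for example `cat input.txt | Rscript mapper.R --run | sort`, which roughly mimics the map and shuffle stages. Reading in chunks rather than all at once matters here because a streaming mapper may be handed far more data than fits comfortably in memory.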

Hadoop has been deployed on Amazon's EC2. See our more recent ACM article, "Hadoop Superlinear Scalability: The Perpetual Motion of Parallel Performance," for a more detailed discussion of scalability issues.
