Back to results
Cover image for book Parallel R

Parallel R

Data Analysis in the Distributed World
By:Q. Ethan McCallum; Stephen Weston
Publisher:O'Reilly Media, Inc.
Print ISBN:9781449309923
eText ISBN:9781449320331
Edition:1
Copyright:2011
Format:Reflowable

eBook Features

Instant Access

Purchase and read your book immediately

Read Offline

Access your eTextbook anytime and anywhere

Study Tools

Built-in study tools like highlights and more

Read Aloud

Listen and follow along as Bookshelf reads to you

It’s tough to argue with R as a high-quality, cross-platform, open source statistical software product—unless you’re in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets, including three chapters on using R and Hadoop together. You’ll learn the basics of Snow, Multicore, Parallel, Segue, RHIPE, and Hadoop Streaming, including how to find them, how to use them, when they work well, and when they don’t. With these packages, you can overcome R’s single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R’s memory barrier. Snow: works well in a traditional cluster environment Multicore: popular for multiprocessor and multicore computers Parallel: part of the upcoming R 2.14.0 release R+Hadoop: provides low-level access to a popular form of cluster computing RHIPE: uses Hadoop’s power with R’s language and interactive shell Segue: lets you use Elastic MapReduce as a backend for lapply-style operations

• 2026 © SAU Tech Bookstore. All Rights Reserved.