Apache Flume: Distributed Log Collection for Hadoop - Second Edition

By: Steven Hoffman
Publisher: Packt Publishing
Print ISBN: 9781784392178
eText ISBN: 9781784399146
Edition: 2
Copyright: 2015
Format: Reflowable

Book Description

If you are a Hadoop programmer who wants to learn how to use Flume to move datasets into Hadoop in a timely and replicable manner, this book is ideal for you. No prior knowledge of Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop Distributed File System (HDFS) is assumed.

What you will learn

  • Understand the Flume architecture and learn how to download and install open source Flume from Apache
  • Follow along with a detailed example of transporting weblogs in Near Real Time (NRT) to Kibana/Elasticsearch and archiving them in HDFS
  • Learn tips and tricks for transporting logs and data in your production environment
  • Understand and configure the Hadoop Distributed File System (HDFS) sink
  • Use a morphline-backed sink to feed data into Solr
  • Create redundant data flows using sink groups
  • Configure and use various sources to ingest data
  • Inspect data records and move them between multiple destinations based on payload content
  • Transform data en route to Hadoop and monitor your data flows
