Big Data: Hadoop, Spark, NoSQL and IoT

Objectives

  • Understand what big data is and how quickly it’s getting bigger.
  • Manipulate a SQLite relational database using Structured Query Language (SQL).
  • Understand the four major types of NoSQL databases.
  • Store tweets in a MongoDB NoSQL JSON document database and visualize them on a Folium map.
  • Understand Apache Hadoop and how it’s used in big-data batch-processing applications.
  • Build a Hadoop MapReduce application on Microsoft’s Azure HDInsight cloud service.
  • Understand Apache Spark and how it’s used in high-performance, real-time big-data applications.
  • Use Spark streaming to process data in mini-batches.
  • Understand the Internet of Things (IoT) and the publish/subscribe model.
  • Publish messages from a simulated Internet-connected device and visualize its messages in a dashboard.
  • Subscribe to PubNub’s live Twitter and IoT streams and visualize the data.

Outline

  • 17.5 Hadoop
    • 17.5.1 Hadoop Overview
    • 17.5.2 Summarizing Word Lengths in Romeo and Juliet via MapReduce
    • 17.5.3 Creating an Apache Hadoop Cluster in Microsoft Azure HDInsight
    • 17.5.4 Hadoop Streaming
    • 17.5.5 Implementing the Mapper
    • 17.5.6 Implementing the Reducer
    • 17.5.7 Preparing to Run the MapReduce Example
    • 17.5.8 Running the MapReduce Job

  • 17.8 Internet of Things and Dashboards
    • 17.8.1 Publish and Subscribe
    • 17.8.2 Visualizing a PubNub Sample Live Stream with a Freeboard Dashboard
    • 17.8.3 Simulating an Internet-Connected Thermostat in Python
    • 17.8.4 Creating the Dashboard with Freeboard.io
    • 17.8.5 Creating a Python PubNub Subscriber
  • 17.9 Wrap-Up
  • Exercises

©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 17 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.

DISCLAIMER: The authors and publisher of this book have used their best efforts in preparing the book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The authors and publisher make no warranty of any kind, expressed or implied, with regard to these programs or to the documentation contained in these books. The authors and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.