PHASE 1:
In Lesson 2, we discussed two major activities involved in the system planning phase.
System planning is the first phase of system development, while system analysis is a
Software Requirements Specification
[Note: The following template is provided for use with the Rational Unified Process]
[Text enclosed in square brackets and displayed in blue italics (style=InfoBlue) is included
Hadoop is an open-source framework to store and process Big Data in a
distributed environment. It contains two modules: MapReduce and the
Hadoop Distributed File System (HDFS).
MapReduce: It is a parallel programming model for proces
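To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to word counting. It is plain Python and does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run in parallel on different nodes; here they run sequentially so the data flow is easy to follow.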
Free Data Sets
If you work with statistical programming long enough, you're going to want to find
more data to work with, either to practice
What is Hive?
Hive is a data warehouse infrastructure tool to process structured data in
Hadoop. It resides on top of Hadoop to summarize Big Data, and makes
querying and analyzing easy.
Initially Hive was developed by Facebook, later the Apache Software F
Project #2: Customer Complaints Analysis
Data: Publicly available dataset containing a few lakh observations with attributes such as CustomerId,
Payment Mode, Product Details, Complaint, Location, Status of the complaint, etc.
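As a rough illustration of a first analysis step on data shaped like this, the sketch below counts complaints per value of a chosen attribute. The sample rows, their values, and the helper name count_by are all hypothetical; only the attribute names come from the description above, and the real dataset's column names may differ.

```python
from collections import Counter

# Hypothetical sample rows for illustration only; not taken from the real dataset.
complaints = [
    {"CustomerId": 1, "Payment Mode": "Card", "Location": "Pune", "Status": "Open"},
    {"CustomerId": 2, "Payment Mode": "Cash", "Location": "Delhi", "Status": "Closed"},
    {"CustomerId": 3, "Payment Mode": "Card", "Location": "Pune", "Status": "Open"},
]


def count_by(rows, field):
    """Count how many rows share each distinct value of the given field."""
    return Counter(row[field] for row in rows)


if __name__ == "__main__":
    print(count_by(complaints, "Status"))
    print(count_by(complaints, "Location"))
```

The same grouping could be done with a dataframe library once the full file is loaded; a Counter keeps the sketch dependency-free.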