Web Snippets: Features of Hadoop

Friday, January 26, 2018

Features of Hadoop

Parallel Execution

Extremely good at high-volume batch processing because of the ability to do parallel processing.
Can perform 10 times faster than on a single thread server or on mainframe

Data Locality

Data is not moved .
Processing data where it resides

Note: This is the ideal choice, however it might not be possible to always achieve data locality

Fault Tolerance

Data is stored in HDFS, where data automatically gets replicated into 2 other locations.
Level of replication is configurable and this makes it incredibly reliable data storage system

Economical

Generates cost benefits by bringing massively parallel computing to commodity servers
A rough cost of hadoop inluding hardware, software and other expenses come to about 1,000 $ a terabyte about 1/5th to 1/20th of the data management system.

Scalable

Open source platform and runs on industry-standard platform
New nodes can be easily added as data volume of processing needs grow without altering anything in the existing system and program

No comments:

Post a Comment

Subscribe to: Post Comments (Atom)