Parallel Execution
- Extremely good at high-volume batch processing because of the ability to do parallel processing.
- Can perform 10 times faster than on a single thread server or on mainframe
- Data is not moved .
- Processing data where it resides
Fault Tolerance
- Data is stored in HDFS, where data automatically gets replicated into 2 other locations.
- Level of replication is configurable and this makes it incredibly reliable data storage system
- Generates cost benefits by bringing massively parallel computing to commodity servers
- A rough cost of hadoop inluding hardware, software and other expenses come to about 1,000 $ a terabyte about 1/5th to 1/20th of the data management system.
- Open source platform and runs on industry-standard platform
- New nodes can be easily added as data volume of processing needs grow without altering anything in the existing system and program
No comments:
Post a Comment