Apache Spark RDD
RDD (RESILIENT DISTRIBUTED DATASETS)
- Basic programming abstraction in Spark
- All operations are performed on in-memory objects
- A collection of entities
- An RDD can be assigned to a variable, and methods can be invoked on it. Methods either return values or apply transformations that produce new RDDs
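The idea of assigning an RDD to a variable and invoking methods on it can be sketched with a toy model in plain Python. This is not the real Spark API; the `ToyRDD` class and its methods are invented here purely to illustrate the shape of the abstraction.

```python
# Toy model (plain Python, NOT the real Spark API) illustrating the idea:
# an RDD-like object holds a collection, and methods either transform it
# (returning a new object) or act on it (returning a plain value).

class ToyRDD:
    def __init__(self, elements):
        self._elements = list(elements)  # collection held "in memory"

    def map(self, fn):
        # transformation: returns a NEW ToyRDD; the original is untouched
        return ToyRDD(fn(x) for x in self._elements)

    def collect(self):
        # action: returns an ordinary value to the caller
        return list(self._elements)

rdd = ToyRDD([1, 2, 3])          # assigned to a variable
doubled = rdd.map(lambda x: x * 2)  # method invoked, new RDD returned
```

In real Spark the equivalent would be `sc.parallelize([1, 2, 3]).map(lambda x: x * 2)`, with the same split between transformations and actions.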
Characteristics of RDDs
Partitioned:
- An individual RDD is split into multiple partitions distributed across the cluster
- This allows the elements of an RDD to be processed in parallel
- Each node in the cluster holds its partitions in memory
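Partitioning can be pictured with a small plain-Python sketch (again, not Spark itself): the data is split into partitions, and each partition is processed independently, the way worker nodes would process their local partitions in parallel. The helper names here are invented for illustration.

```python
# Toy illustration (NOT real Spark): split a dataset into partitions and
# process each partition independently, as cluster nodes would in parallel.
from concurrent.futures import ThreadPoolExecutor

def partition(data, n):
    # round-robin split of the data into n partitions
    parts = [[] for _ in range(n)]
    for i, x in enumerate(data):
        parts[i % n].append(x)
    return parts

def process_partition(part):
    # each "node" works only on its own in-memory partition
    return [x * x for x in part]

parts = partition(range(10), 3)
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_partition, parts))
```

In Spark the number of partitions is visible via `rdd.getNumPartitions()`, and the scheduler, not the user, assigns partitions to executors.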
Immutable:
- Once created, an RDD cannot be changed
- Only two kinds of operations can be performed on an RDD: transformations, which produce new RDDs, and actions, which return values
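The split between the two operation kinds can be sketched in plain Python (a toy model, not Spark's implementation; `LazyRDD` is an invented name). In Spark, transformations are lazy, they only record a step, while an action triggers the actual computation and returns a value:

```python
# Toy sketch (plain Python, NOT real Spark) of the two operation kinds on
# an immutable RDD: transformations are recorded lazily, an action runs them.

class LazyRDD:
    def __init__(self, data, ops=()):
        self._data = tuple(data)  # immutable source data
        self._ops = tuple(ops)    # recorded transformation steps

    def map(self, fn):
        # transformation: nothing computed yet, just record the step
        return LazyRDD(self._data, self._ops + (("map", fn),))

    def filter(self, pred):
        # transformation: likewise only recorded
        return LazyRDD(self._data, self._ops + (("filter", pred),))

    def collect(self):
        # action: replay the recorded steps and return a plain value
        out = list(self._data)
        for kind, fn in self._ops:
            out = [fn(x) for x in out] if kind == "map" else [x for x in out if fn(x)]
        return out

nums = LazyRDD([1, 2, 3, 4])
evens_doubled = nums.filter(lambda x: x % 2 == 0).map(lambda x: x * 2)
```

Note that `nums` itself is never modified: each transformation returns a new object, which is the immutability property described above.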
Resilient:
- An RDD can be reconstructed even if a node crashes
- Data held in an RDD is not lost, making RDDs fault tolerant
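The mechanism behind this resilience is lineage: Spark records the chain of transformations used to build each partition, so a lost partition can be recomputed from the original source. A minimal plain-Python sketch of that idea (the names `source_partition`, `lineage`, and `compute` are invented for illustration):

```python
# Toy sketch (NOT real Spark) of lineage-based recovery: if a computed
# partition is lost (e.g. a node crashes), it can be rebuilt by replaying
# the recorded transformations against the durable source partition.

source_partition = [1, 2, 3, 4]                 # durable input (e.g. HDFS)
lineage = [lambda x: x + 10, lambda x: x * 2]   # recorded transformations

def compute(partition, ops):
    out = list(partition)
    for fn in ops:
        out = [fn(x) for x in out]
    return out

computed = compute(source_partition, lineage)
computed = None  # simulate the in-memory result vanishing with a crashed node
recovered = compute(source_partition, lineage)  # replay lineage to rebuild
```

Because recovery is recomputation rather than replication, Spark avoids keeping redundant copies of every RDD in memory.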