RDD tracking
Every RDD keeps track of:
- where it came from
- all the transformations it took to reach its current state
These steps are called the lineage (or DAG) of an RDD.
Data Visualization
- In Spark, a job is associated with a chain of RDD dependencies organized in a directed acyclic graph (DAG)
- This is a dependency graph in which every RDD knows its parent RDD and the transformation that produced it
Note: Transformations are only recorded in memory as part of the lineage; none of them
is applied until we access the results (that is, until an action is called)
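The idea in the note above can be sketched in plain Python, without Spark. ToyRDD, its methods, and the double helper are hypothetical names made up for this illustration; they only mimic the behaviour described here and are not Spark's API. Each object stores its parent and the transformation that produced it, and nothing runs until collect() is called.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: it stores a parent and a transformation, not data."""

    def __init__(self, source=None, parent=None, op_name="source", func=None):
        self.source = source      # only set on the root node
        self.parent = parent      # the RDD this one was derived from
        self.op_name = op_name    # name of the transformation that produced it
        self.func = func          # the transformation itself, not yet applied

    def map(self, f):
        # Record the transformation; do not run it.
        return ToyRDD(parent=self, op_name="map",
                      func=lambda rows: [f(r) for r in rows])

    def filter(self, pred):
        return ToyRDD(parent=self, op_name="filter",
                      func=lambda rows: [r for r in rows if pred(r)])

    def lineage(self):
        # Walk back through the parents to list every step from the source.
        node, steps = self, []
        while node is not None:
            steps.append(node.op_name)
            node = node.parent
        return list(reversed(steps))

    def collect(self):
        # The "action": only now are the recorded transformations applied.
        if self.parent is None:
            return list(self.source)
        return self.func(self.parent.collect())


calls = []  # records every time the map function actually runs

def double(x):
    calls.append(x)
    return x * 2

base = ToyRDD(source=[1, 2, 3, 4])
result = base.map(double).filter(lambda x: x > 4)

lineage_before = result.lineage()   # the chain is known before anything runs
calls_before_action = len(calls)    # still 0: nothing has been computed
output = result.collect()           # triggers the whole chain
```

In real Spark, the lineage of an RDD can be inspected with its toDebugString method; the toy lineage() above plays a similar role for this sketch.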
Advantages of Lineage
- Allows RDDs to be reconstructed when nodes crash.
- To do so, Spark starts from the source file, re-applies all the stored transformations, and recreates the RDD
- Allows RDDs to be lazily instantiated (materialized) when the results are accessed
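The recovery described above can be illustrated with a small plain-Python sketch. ToyPartition, simulate_node_crash, and read_source are hypothetical names invented for this example and are not part of Spark; the point is only that keeping the source and the list of transformations is enough to rebuild lost data.

```python
class ToyPartition:
    """Hypothetical model of one piece of an RDD together with its lineage."""

    def __init__(self, read_source, transformations):
        self.read_source = read_source          # how to re-read the source data
        self.transformations = transformations  # stored steps, in order
        self.materialized = None                # computed lazily on first access

    def get(self):
        # Materialize on demand: start from the source and apply every stored step.
        if self.materialized is None:
            rows = self.read_source()
            for step in self.transformations:
                rows = step(rows)
            self.materialized = rows
        return self.materialized

    def simulate_node_crash(self):
        # The computed data is lost, but the lineage (source + steps) survives.
        self.materialized = None


reads = []  # counts how often the source is read

def read_source():
    reads.append(1)
    return ["a b", "c", "d e f"]

part = ToyPartition(read_source, [
    lambda rows: [w for line in rows for w in line.split()],  # split lines into words
    lambda rows: [w.upper() for w in rows],                   # upper-case each word
])

first = part.get()            # first access materializes the data
part.simulate_node_crash()    # lose the materialized result
recovered = part.get()        # rebuilt from the source via the stored steps
```

The source is read a second time after the simulated crash, which is the price of recovery by recomputation rather than by keeping replicated copies of the data.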