I wrangle bytes at Cognizant, making healthcare data sing. By night, I spill the beans (and code) on #dataengineering on Hashnode. Join me to conquer coding & laugh along the way!
How Bucketing Organizes Your Apache Spark Universe ⚡ · Bucketing 🪣 Bucketing is a way to assign rows of a dataset to specific buckets and collocate...
We are all quite familiar with join operations in Spark, but do you know Spark has some built-in tricks to do joins in an efficient manner without...
Breaking News in Dataland: The WithColumn Chain is a Performance Thief! · Attention, PySpark wranglers! We've uncovered a hidden culprit that's been...