Data Engineering Community (DEC)

@data_dec

Followers
2K
Following
332
Media
345
Statuses
1K

A non-profit organisation providing Data Engineers a supportive and collaborative platform.

Nigeria
Joined April 2024
@data_dec
Data Engineering Community (DEC)
2 years
🚀 Data Engineering Community is a vibrant hub for anyone passionate about data engineering, from aspiring data engineers to seasoned professionals. We provide a supportive and collaborative platform for learning, A thread 🧵 #DataEngineering
2
7
26
@data_dec
Data Engineering Community (DEC)
7 hours
📌 Spark brings multiple data paradigms (batch, streaming, graph, and SQL) into one scalable ecosystem. #dataengineering #apachespark #graphx #sql #batchprocessing #streamingprocessing
0
0
0
@data_dec
Data Engineering Community (DEC)
7 hours
◉ Spark SQL: Queries structured data using SQL syntax or DataFrames. The Catalyst Optimiser rewrites each query into an efficient execution plan.
◉ GraphX: Enables distributed graph computation for use cases like social network analysis or route optimisation.
1
0
0
@data_dec
Data Engineering Community (DEC)
7 hours
processing when data doesn’t change rapidly.
◉ Streaming Processing: Handles real-time data such as IoT sensor readings, clicks, and financial transactions. Spark Structured Streaming treats live data as an unbounded table that continuously grows.
1
0
0
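The "unbounded table" idea can be sketched in a few lines. This is a pure-Python toy, not the Structured Streaming API: the names `running_counts` and `events` are made up for illustration, and each inner list stands in for one micro-batch of newly arrived rows.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def running_counts(micro_batches: Iterable[List[str]]) -> Iterator[Dict[str, int]]:
    """Yield the updated per-key counts after each new batch of events arrives."""
    counts: Counter = Counter()
    for batch in micro_batches:
        counts.update(batch)  # new rows are appended to the ever-growing "table"
        yield dict(counts)    # the query result is refreshed incrementally


# Three hypothetical micro-batches of click-stream events.
events = [["click", "view"], ["click"], ["purchase", "click"]]
snapshots = list(running_counts(events))
```

Each snapshot is the result of the same aggregation query over all rows seen so far, which is roughly how a streaming aggregation behaves as the table grows.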
@data_dec
Data Engineering Community (DEC)
7 hours
💡 What are the capabilities of Apache Spark? Apache Spark isn’t a single-purpose tool. It’s a unified platform that can handle different types of data workloads:
◉ Batch Processing: Processes large, static datasets, e.g., daily logs and transaction histories. Use batch
1
0
0
@data_dec
Data Engineering Community (DEC)
1 day
troubleshoot jobs, and design better data pipelines.
0
0
1
@data_dec
Data Engineering Community (DEC)
1 day
◉ The cluster manager is the kitchen supervisor (assigns resources).
◉ The executors are the line cooks (prepare the food).
◉ The DAG is the recipe, the optimized plan of steps to follow.
𝐍𝐨𝐭𝐞: Understanding this architecture helps you tune performance,
1
0
0
@data_dec
Data Engineering Community (DEC)
1 day
◉ DAG (Directed Acyclic Graph): Spark doesn’t just run your code line by line; it builds a logical plan (DAG), optimizes it, and executes it efficiently.
📌 Simple Analogy: Imagine a busy restaurant:
◉ The driver is the head chef (decides what’s cooked).
1
0
0
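The "build a plan first, execute later" behaviour can be illustrated with a small pure-Python toy. This is not Spark's API: `LazyPlan` is a hypothetical class, the plan is a simple linear chain rather than a general DAG, and no optimization step is modelled.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Step = Tuple[str, Callable]


class LazyPlan:
    """Toy stand-in for a lazy plan: records steps, runs them only on an action."""

    def __init__(self, source: Iterable[int], steps: Optional[List[Step]] = None):
        self._source = source
        self._steps: List[Step] = list(steps or [])

    def map(self, fn: Callable[[int], int]) -> "LazyPlan":
        # Transformation: extend the plan, do no work yet.
        return LazyPlan(self._source, self._steps + [("map", fn)])

    def filter(self, pred: Callable[[int], bool]) -> "LazyPlan":
        # Transformation: extend the plan, do no work yet.
        return LazyPlan(self._source, self._steps + [("filter", pred)])

    def explain(self) -> List[str]:
        # Show the recorded steps without running them.
        return [kind for kind, _ in self._steps]

    def collect(self) -> List[int]:
        # Action: only now is the recorded plan executed, step by step.
        rows = iter(self._source)
        for kind, fn in self._steps:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls: List[int] = []


def double(x: int) -> int:
    calls.append(x)  # record each invocation so laziness is observable
    return x * 2


plan = LazyPlan(range(5)).map(double).filter(lambda x: x > 2)
calls_before_action = list(calls)  # still empty: nothing has run
result = plan.collect()            # the action triggers execution
```

Because the whole plan is known before anything runs, an engine has the chance to reorder or combine steps; the toy skips that part and only shows the deferred execution.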
@data_dec
Data Engineering Community (DEC)
1 day
managers are YARN, Kubernetes, or Spark Standalone.
◉ Executors: These are processes on worker nodes that actually perform computations and store intermediate data in memory.
◉ Tasks and Jobs: Spark breaks each job into stages and tasks that run in parallel across executors.
1
0
0
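A rough way to picture tasks running in parallel across executors is the sketch below. It uses only the Python standard library, with a thread pool standing in for executors; the function names and the partition and executor counts are illustrative assumptions, not Spark settings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_into_partitions(data: List[int], num_partitions: int) -> List[List[int]]:
    """Split the data into slices, one per task."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def run_task(partition: List[int]) -> int:
    """One task: compute a partial result for a single partition."""
    return sum(x * x for x in partition)


def run_job(data: List[int], num_partitions: int = 4, num_executors: int = 2) -> int:
    partitions = split_into_partitions(data, num_partitions)
    # The pool plays the role of the executors; each mapped call is one task.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_results = list(pool.map(run_task, partitions))
    # The "driver" combines the partial results into the final answer.
    return sum(partial_results)


total = run_job(list(range(1, 11)))  # sum of squares of 1..10
```

The point of the sketch is the shape of the work: one job, split into independent tasks over partitions, with a coordinator collecting the partial results.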
@data_dec
Data Engineering Community (DEC)
1 day
kicks into action:
◉ Driver Program: This is where your code starts. It creates the SparkSession and defines transformations and actions. Think of it as the brain that plans and coordinates your work.
◉ Cluster Manager: Allocates resources and manages executors. Commo
1
0
0
@data_dec
Data Engineering Community (DEC)
1 day
💡 𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐒𝐩𝐚𝐫𝐤’𝐬 𝐀𝐫𝐜𝐡𝐢𝐭𝐞𝐜𝐭𝐮𝐫𝐞 To use Spark effectively, you need to understand its architecture: how it runs your jobs behind the scenes. When you submit a Spark job (e.g., with spark-submit or a PySpark script), Spark’s architecture
1
3
4
@data_dec
Data Engineering Community (DEC)
2 days
That’s the problem Spark was built to solve. #Apachespark #Spark #dataengineering #languages
0
0
1
@data_dec
Data Engineering Community (DEC)
2 days
In essence, Spark is like the backbone of modern data engineering. It’s what powers engines, fraud detection, real-time dashboards, and ETL pipelines at companies like Netflix, Uber, and Amazon. Note: If you can’t process data fast enough, you can’t react fast enough.
1
0
0
@data_dec
Data Engineering Community (DEC)
2 days
◉ Scalability: Handles workloads across thousands of nodes.
◉ Versatility: Supports multiple languages like Python, SQL, Java, Scala, and R.
◉ Flexibility: Works with both batch (historical) and streaming (real-time) data.
1
0
0
@data_dec
Data Engineering Community (DEC)
2 days
for small to medium datasets, but they fall short when dealing with gigabytes or terabytes of data. That’s why Apache Spark was born.
📌 It’s an open-source distributed computing engine designed for:
◉ Speed: Performs in-memory computation, reducing disk I/O.
1
0
0
@data_dec
Data Engineering Community (DEC)
2 days
💡 𝐖𝐡𝐲 𝐃𝐨𝐞𝐬 𝐀𝐩𝐚𝐜𝐡𝐞 𝐒𝐩𝐚𝐫𝐤 𝐑𝐞𝐚𝐥𝐥𝐲 𝐄𝐱𝐢𝐬𝐭? We live in a world where data is generated faster than ever, from transactions and IoT signals to social media clicks, sensor data, and system logs. Traditional tools like Excel, SQL, and pandas are powerful
1
6
6
@data_dec
Data Engineering Community (DEC)
4 days
💫 Happy New Week, Data Engineers
0
1
3
@mb_awak
Mbuotidem Awak #MS Fabric
9 days
What started as a DataFestHackathon is now published in the proceedings of the AI/Robotics Conference! Happy to see my name proudly listed alongside my co-authors @josh_bori @Mkm_world @josh_salako And even more exciting, our work has been selected for an oral presentation at
9
28
162
@data_dec
Data Engineering Community (DEC)
11 days
💫 Happy New Week, Data Engineers!
0
1
10
@CoreDataEngr
Core Data Engineers
16 days
Core Data Engineers at the Data Engineering Community (DEC) Meetup! In September, members of the Core Data Engineers family, past and current participants, instructors, and team members showed up and represented at the @data_dec Meetup!
1
6
11
@Eddie_Gregs
Eddy
18 days
Over the weekend, I had the privilege of speaking at an event hosted by @IbomData on “People, Pipeline, and Possibilities.” The topic was to remind everyone that even with the rapid growth of AI, the human factor in Data & Analytics Engineering remains irreplaceable.
2
2
6