Phillip Manywanda
@DataguyPhill
Followers: 2 · Following: 17 · Media: 137 · Statuses: 310
Joined April 2024
@TDataImmersed @DabereNnamani
✅ Cleaned messy data
✅ Uncovered job trends
✅ Created powerful visuals
EDA & Visualization are key for Data Science! Want to see everything? Check out my notebook: https://t.co/AYkbQwLt05
Which visualization do you use most? Let's discuss!
@TDataImmersed @DabereNnamani Seaborn for Advanced Plots
Heatmap: Correlation between key variables
Box Plot: Job title vs company ratings
Pair Plot: Relationships between salary, rating & founding year
Aesthetics + Insights = 💡
@TDataImmersed @DabereNnamani Matplotlib for EDA
Histogram: Salary distribution 💰
Bar Chart: Top locations for Data Science jobs 🗺️
Line Plot: Salary trends by company size 🏢
Visualizing data brings numbers to life!
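A minimal sketch of the salary histogram, assuming a handful of made-up salary values and hypothetical axis labels; it is not the notebook's actual code or data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Made-up salaries for illustration only (units are an assumption).
salaries_k = [60, 65, 70, 72, 90, 95, 120, 150]

fig, ax = plt.subplots(figsize=(6, 4))

# ax.hist returns the count per bin, the bin edges, and the drawn bars.
counts, bin_edges, _ = ax.hist(salaries_k, bins=3, edgecolor="black")

ax.set_title("Salary distribution")
ax.set_xlabel("Salary (thousands, hypothetical)")
ax.set_ylabel("Number of postings")
fig.tight_layout()
```

In a notebook the default backend already renders the figure inline, so the `matplotlib.use("Agg")` line can be dropped there.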
@TDataImmersed @DabereNnamani EDA = Knowing Your Data
Summary stats for Rating, Salary, and Revenue
Identified top job titles & their average ratings
Analyzed salary trends by company size
EDA helps spot patterns & anomalies fast!
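These EDA steps could look roughly like the sketch below. The DataFrame and its column names (`Job Title`, `Rating`, `Size`, `AvgSalaryK`) are hypothetical stand-ins, not the real dataset's schema.

```python
import pandas as pd

# Tiny made-up job-postings table; column names are assumptions.
jobs = pd.DataFrame({
    "Job Title": ["Data Scientist", "Data Scientist", "Data Analyst", "Data Engineer"],
    "Rating": [4.0, 3.0, 4.5, 3.8],
    "Size": ["Small", "Large", "Small", "Large"],
    "AvgSalaryK": [100.0, 120.0, 70.0, 110.0],
})

# Summary statistics (count, mean, std, quartiles) for the numeric columns.
summary = jobs[["Rating", "AvgSalaryK"]].describe()

# Most frequent job titles with their average rating.
title_stats = (
    jobs.groupby("Job Title")["Rating"]
    .agg(postings="count", avg_rating="mean")
    .sort_values("postings", ascending=False)
)

# Average salary for each company-size bucket.
salary_by_size = jobs.groupby("Size")["AvgSalaryK"].mean()

print(summary)
print(title_stats)
print(salary_by_size)
```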
@TDataImmersed @DabereNnamani 🧼 Data Cleaning is the foundation of good analysis!
Handled missing values 🕵️
Extracted & cleaned Salary Estimate 💰
Standardized Company Names & Locations
Data cleaning = better insights! ✅
Week 6 was all about Exploratory Data Analysis (EDA) & Visualization! I cleaned, analyzed, and visualized an uncleaned dataset of Data Science jobs using Pandas, Matplotlib & Seaborn. Let's break it down! 🧵 @TDataImmersed #TDI @DabereNnamani
Wrap-Up & Full Notebook
✅ Data cleaned
✅ New features created
✅ Data merged
✅ Insights uncovered
This was real-world data prep at its finest! Check out my full notebook here: https://anaconda.cloud/share/notebooks/bab3f1ea-092c-4be5-ac0d-4b16fad8224e/overview
String Cleaning & Deck Extraction 💡 Text manipulation in Pandas
I extracted the deck from the Cabin column to analyze survival rates by deck.
📷 Question ➡️ 📷 My Solution
Text data isn't always clean, but Pandas makes it easy!
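One way the deck extraction might be done, assuming the deck is the first letter of each `Cabin` value and that missing cabins get a placeholder label. The sample rows and the `"Unknown"` label are made up for illustration.

```python
import pandas as pd

# Made-up rows with Titanic-style columns.
df = pd.DataFrame({
    "Cabin": ["C85", None, "E46", "C123", None, "B28"],
    "Survived": [1, 0, 1, 0, 0, 1],
})

# Take the first character of Cabin as the deck; missing cabins stay missing
# after .str[0], so label them explicitly.
df["Deck"] = df["Cabin"].str[0].fillna("Unknown")

# The mean of a 0/1 column is the survival rate within each deck.
survival_by_deck = df.groupby("Deck")["Survived"].mean()

print(survival_by_deck)
```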
Merge vs. Concatenate?
merge() = Joins datasets on a key (like PassengerId)
concat() = Stacks datasets (vertically or horizontally)
📷 Question ➡️ 📷 My Solution
These techniques help when dealing with multiple data sources!
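A small sketch of the difference, using made-up frames. The frame names and the choice of an inner join are assumptions for illustration.

```python
import pandas as pd

passengers = pd.DataFrame({"PassengerId": [1, 2, 3], "Name": ["Ann", "Bob", "Cy"]})
tickets = pd.DataFrame({"PassengerId": [1, 2, 4], "Fare": [7.25, 71.28, 8.05]})

# merge(): join on a key; an inner join keeps only ids found in both frames.
merged = passengers.merge(tickets, on="PassengerId", how="inner")

# concat() along rows: stack frames that share the same columns.
more_passengers = pd.DataFrame({"PassengerId": [5], "Name": ["Dee"]})
stacked = pd.concat([passengers, more_passengers], ignore_index=True)

# concat() along columns: place frames side by side, aligned on the index
# (not on PassengerId).
side_by_side = pd.concat([passengers, tickets[["Fare"]]], axis=1)

print(merged)
print(stacked)
print(side_by_side)
```

Note that horizontal `concat` aligns rows by index position rather than by key, so `merge` is usually the safer choice when rows must match on an id.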
Creating New Features 🛠️ Feature Engineering
I added:
✅ FamilySize = (sibsp + parch + 1)
✅ FarePerPerson = Fare ÷ FamilySize
📷 Question ➡️ 📷 My Solution
Why? These features give new insights into passengers' social & economic backgrounds!
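The two formulas translate to Pandas roughly as follows. The column spellings `SibSp` and `Parch` are an assumption, and the rows are made up.

```python
import pandas as pd

# Made-up rows; SibSp = siblings/spouses aboard, Parch = parents/children aboard.
df = pd.DataFrame({
    "SibSp": [1, 0, 3],
    "Parch": [0, 0, 2],
    "Fare": [71.28, 8.05, 31.5],
})

# Family size counts relatives aboard plus the passenger themselves.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# FamilySize is always at least 1, so this division never hits zero.
df["FarePerPerson"] = df["Fare"] / df["FamilySize"]

print(df)
```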
💰 Outliers distort averages!
I detected extreme fare prices using the IQR method and capped them instead of removing them.
📷 Question ➡️ 📷 My Solution
Capping ensures we keep all data while limiting extreme values! 🛳️
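A sketch of IQR-based capping on made-up fares. The 1.5 × IQR fence is the conventional multiplier and is an assumption here, since the thread does not state which one was used.

```python
import pandas as pd

# Made-up fares with one extreme value.
fares = pd.Series([7.25, 8.05, 13.0, 26.0, 30.0, 512.33], name="Fare")

# Interquartile range: the spread of the middle 50% of the values.
q1 = fares.quantile(0.25)
q3 = fares.quantile(0.75)
iqr = q3 - q1

# Conventional fences at 1.5 * IQR beyond the quartiles (assumed multiplier).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# clip() pulls values outside the fences back to the fence; no rows are dropped.
capped = fares.clip(lower=lower, upper=upper)

print(capped)
```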
Data transformation step!
Instead of 1, 2, 3, I converted Pclass into "1st Class", "2nd Class", "3rd Class" for better readability.
📷 Question ➡️ 📷 My Solution
Why? Clear labels improve data storytelling!
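A possible version of this relabeling with `Series.map`. Writing to a separate `PclassLabel` column (a hypothetical name) is a choice made here so the numeric codes stay available.

```python
import pandas as pd

# Made-up passenger classes.
df = pd.DataFrame({"Pclass": [1, 3, 2, 3]})

# Lookup table from numeric code to readable label.
class_labels = {1: "1st Class", 2: "2nd Class", 3: "3rd Class"}

# map() replaces each code with its label; codes missing from the dict become NaN.
df["PclassLabel"] = df["Pclass"].map(class_labels)

print(df)
```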
Duplicate records skew analysis!
Using drop_duplicates(), I checked for and removed any duplicates in the Titanic data.
📷 Question ➡️ 📷 My Solution
Have you ever encountered duplicate headaches? 🤯
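Checking and removing duplicates might look like this, on a made-up frame with one repeated row.

```python
import pandas as pd

# Made-up data; the third row repeats the second exactly.
df = pd.DataFrame({
    "PassengerId": [1, 2, 2, 3],
    "Name": ["Ann", "Bob", "Bob", "Cy"],
})

# duplicated() flags every repeat of an earlier row.
n_dupes = int(df.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row by default.
deduped = df.drop_duplicates().reset_index(drop=True)

print(n_dupes)
print(deduped)
```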
You may not know what to do with missing values... 🤔 Drop or Fill?
dropna() → Remove missing data (good if there's little missing)
fillna() → Replace missing values (mean, median, etc.)
I filled Age with the median, which is less sensitive to outliers than the mean! 📷
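Both options side by side, as a sketch on made-up data. The `Embarked` column is only a hypothetical example of a column with few gaps.

```python
import numpy as np
import pandas as pd

# Made-up data with gaps in both columns; 80.0 plays the role of an outlier.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 80.0, np.nan],
    "Embarked": ["S", "C", None, "S", "Q"],
})

# Drop: remove only the rows where Embarked is missing.
dropped = df.dropna(subset=["Embarked"])

# Fill: replace missing ages with the median of the observed ages.
median_age = df["Age"].median()
filled = df.assign(Age=df["Age"].fillna(median_age))

print(dropped)
print(filled)
```

With these values the median is 26 while the mean is about 42.7, which shows how a single large age pulls the mean but not the median.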
Finding Missing Data
Identifying missing values in the Titanic dataset using Pandas:
📷 Question ➡️ 📷 My Solution
Missing values can break analysis, so step 1 is always detection!
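A common way to detect missing values, sketched on a tiny made-up frame with Titanic-style columns rather than the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up sample with gaps in Age and Cabin.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# Number of missing values per column, largest first.
missing_counts = df.isna().sum().sort_values(ascending=False)

# Share of rows missing in each column, as a percentage.
missing_pct = df.isna().mean() * 100

print(missing_counts)
print(missing_pct.round(1))
```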
🧼 Why is data cleaning important?
Missing values can bias analysis
Duplicates distort insights
Outliers skew statistics
A clean dataset = better decisions! ✅
Week 5 was all about Data Cleaning & Transformation with Pandas! From handling missing values to merging DataFrames, this was a deep dive into real-world data prep. Let's break it down! 🧵
@DabereNnamani @TDataImmersed @JacobAjala That wraps up my Week 3 highlights! Want to explore the complete code and dive into more details? Check it out here: https://t.co/VRhcj8ulP5
What was your favorite part? Let's discuss! ✨
@DabereNnamani @TDataImmersed @JacobAjala NumPy Adventures
NumPy made math magical! I:
Built and manipulated 1D/2D arrays
Found fare stats (min, max, mean) for Titanic data
Explored indexing and random arrays 🎲✨
📷 Questions ➡️ 📷 My Solutions
How do YOU use NumPy? Let me know!
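A compact sketch of these NumPy exercises. The fare values are made up, and the seed is fixed only to make the random array reproducible.

```python
import numpy as np

# 1D array, then the same data viewed as a 2 x 3 grid.
one_d = np.arange(6)
two_d = one_d.reshape(2, 3)

# Made-up fares standing in for the Titanic Fare column.
fares = np.array([7.25, 71.28, 8.05, 53.1])
fare_min, fare_max, fare_mean = fares.min(), fares.max(), fares.mean()

# Indexing: a whole row, a whole column, and a boolean mask.
first_row = two_d[0]
last_column = two_d[:, -1]
expensive = fares[fares > 50]

# Random integers in [0, 10) from a seeded generator.
rng = np.random.default_rng(seed=0)
random_grid = rng.integers(low=0, high=10, size=(2, 3))

print(two_d)
print(fare_min, fare_max, fare_mean)
print(random_grid)
```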