Data Science Fundamentals

Overview

A complete, self-contained ML reference built as Jupyter notebooks. Covers the stack from scratch: Python fundamentals, data preprocessing, classical ML, deep learning, and NLP. Each folder is a standalone topic with its own datasets and examples.

What It Covers

Topic	Content
Python	Core language, data structures, OOP, standard library
Data Science	Missing values, encoding, scaling, train/test split
Regression	Linear, polynomial, SVR, random forest
Classification	Logistic regression, KNN, Naive Bayes, SVM, decision trees, model comparison
Clustering	K-means
Deep Learning	Neural networks, customer churn, audiobooks, MNIST
NLP	Text preprocessing, sentiment analysis, NER, LDA topic modeling, spaCy (10 notebooks)
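
As an illustration of the Data Science row above, here is a minimal sketch of the four preprocessing steps (missing values, encoding, scaling, train/test split) in plain Python. It is not taken from the notebooks, which may well use a library such as pandas or scikit-learn instead. The toy records, the column names (age, city, churned), and the helper functions train_test_split, fit_preprocessor, and transform are all hypothetical and exist only for this example.

```python
import random
import statistics

# Hypothetical toy records (not from the repository); None marks a missing value.
rows = [
    {"age": 25.0, "city": "Paris", "churned": 0},
    {"age": None, "city": "Berlin", "churned": 1},
    {"age": 40.0, "city": "Paris", "churned": 0},
    {"age": 31.0, "city": "Madrid", "churned": 1},
    {"age": 52.0, "city": "Berlin", "churned": 0},
    {"age": 46.0, "city": "Madrid", "churned": 1},
]


def train_test_split(data, test_fraction=0.33, seed=0):
    """Shuffle a copy of the data and cut it into (train, test) lists."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_preprocessor(train):
    """Learn imputation, scaling, and encoding parameters from training rows only."""
    known_ages = [r["age"] for r in train if r["age"] is not None]
    mean_age = statistics.fmean(known_ages)
    # Mean imputation leaves the column mean unchanged, so mean_age is also
    # the centre used for scaling.
    imputed = [mean_age if r["age"] is None else r["age"] for r in train]
    std_age = statistics.pstdev(imputed) or 1.0  # guard against zero variance
    cities = sorted({r["city"] for r in train})
    return {"mean": mean_age, "std": std_age, "cities": cities}


def transform(data, params):
    """Impute, standardize, and one-hot encode rows with the fitted parameters."""
    features = []
    for r in data:
        age = params["mean"] if r["age"] is None else r["age"]
        scaled = (age - params["mean"]) / params["std"]
        # A city not seen during fitting becomes an all-zero vector.
        one_hot = [1.0 if r["city"] == c else 0.0 for c in params["cities"]]
        features.append([scaled] + one_hot)
    return features


# Split first, then fit on the training part only, so that no statistics
# from the test rows leak into the preprocessing.
train_rows, test_rows = train_test_split(rows)
params = fit_preprocessor(train_rows)
X_train = transform(train_rows, params)
X_test = transform(test_rows, params)
y_train = [r["churned"] for r in train_rows]
y_test = [r["churned"] for r in test_rows]

print(X_train[0], y_train[0])
```

The design choice worth noting is the order: the split happens before any statistics are computed, and the test rows are transformed with parameters learned from the training rows.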

Why It Exists

Built as a personal reference while working through the ML curriculum, and structured to be reproducible from scratch. Each notebook is self-contained: load the dataset, run the cells, see the output.