This is a bird's-eye overview of the neural-network (NN) domain, without going deep into specific applications.
Overview
Aspects of building NN
- Initialization methods
- Forward/Backpropagation
- Activation functions (sigmoid, tanh, ReLU, …)
- Layer kinds
- Cost functions
- Training/Optimization algorithms
- Batching/Epochs approaches
To sort: dropout, regularization
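The aspects above fit together in a few dozen lines. A minimal sketch in plain numpy: random initialization, a forward pass through one hidden layer, a mean-squared-error cost, hand-derived backpropagation, and a gradient-descent update. The toy XOR dataset, layer sizes, and learning rate are all illustrative choices, not canonical ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy dataset: the XOR mapping
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # (4, 2)
y = np.array([[0.], [1.], [1.], [0.]])                  # (4, 1)

# initialization: small random weights, zero biases
W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
losses = []
for _ in range(2000):
    # forward pass
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)

    # cost: mean squared error
    losses.append(np.mean((a2 - y) ** 2))

    # backpropagation (chain rule, derived by hand;
    # sigmoid'(z) = a * (1 - a))
    d_z2 = (2 * (a2 - y) / len(X)) * a2 * (1 - a2)
    d_W2 = a1.T @ d_z2; d_b2 = d_z2.sum(0)
    d_z1 = (d_z2 @ W2.T) * a1 * (1 - a1)
    d_W1 = X.T @ d_z1; d_b1 = d_z1.sum(0)

    # gradient-descent update
    W1 -= lr * d_W1; b1 -= lr * d_b1
    W2 -= lr * d_W2; b2 -= lr * d_b2

print("first loss:", losses[0], "last loss:", losses[-1])
```

Dropout and regularization (the "to sort" items) would slot into the forward pass and the cost respectively.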
Levels of usage
- understand the math (derive forward/backprop by hand)
- implement the base math (with nothing more than numpy or similar)
- build NN in framework (pytorch and similar)
- finetune existing model
- prompts lol
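For the "build NN in framework" level, the same kind of model collapses to a few declarative lines. A hedged PyTorch sketch (layer sizes arbitrary; `TinyMLP` is a made-up name for illustration):

```python
import torch
import torch.nn as nn

class TinyMLP(nn.Module):
    """Two-layer perceptron; the framework handles backprop for us."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 8), nn.ReLU(),
            nn.Linear(8, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP()
out = model(torch.rand(4, 2))  # batch of 4 inputs
print(out.shape)  # torch.Size([4, 1])
```

Compared with the hand-rolled numpy version, initialization, gradients, and layer bookkeeping are all delegated to the framework; only the architecture and the training loop remain your concern.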
Main domains
- CV (computer vision)
- LLM (text)
Specifics
Layers
- FCN (fully-connected)
- CNN and Pooling (convolution)
- RNN (recurrent), LSTM, GRU
- Attention/Transformer
To sort: Diffusion, Autoencoders, Boltzmann Machines, GAN, GNN, RL
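Of the layer kinds above, attention is the one powering current LLMs, and its core is small enough to sketch in numpy. This is only the scaled dot-product step, softmax(QK^T / sqrt(d_k)) V, without multi-head projections or masking; the token counts and dimension are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 query tokens, d_k = 4
K = rng.normal(size=(5, 4))  # 5 key tokens
V = rng.normal(size=(5, 4))  # one value vector per key
out = attention(Q, K, V)
print(out.shape)  # (3, 4): one mixed value vector per query
```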
Datasets
- MNIST (grayscale 28x28 handwritten digits)
- CIFAR-10 (32x32 color images in 10 categories)
- ImageNet-22K (14M+ color images of varying size in ~22k categories)
Milestone NN architectures
- LeNet-5, 1998 (CNN)
- AlexNet, 2012 (CNN)
- VGGNet, 2014
- ResNet, 2015
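ResNet's milestone contribution was the residual (skip) connection: each block learns a correction F(x) that is added back onto its input, which lets very deep networks train. A toy numpy sketch of the idea (dense layers instead of ResNet's actual convolutions, illustrative sizes):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, W1, W2):
    # the block computes a residual F(x) = relu(x @ W1) @ W2
    # and the skip connection adds the input x back on top
    return relu(x + relu(x @ W1) @ W2)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 16))
W1 = rng.normal(0, 0.1, (16, 16))
W2 = rng.normal(0, 0.1, (16, 16))
y = residual_block(x, W1, W2)
print(y.shape)  # (1, 16): same shape as the input, so blocks stack
```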
Python
There are two main branches of libraries with similar functionality but noticeably different programming approaches.
More popular branch (stateful, OOP-driven):
- pandas
- matplotlib
- numpy
- torch, tensorflow, keras
Alternative branch (pure, stateless, functional style):
- polars
- plotnine
- jax
- flax
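The stylistic split can be shown in plain numpy (so the example runs without jax installed; jax.numpy mirrors the numpy API, but its arrays are immutable, which forces the second style):

```python
import numpy as np

# stateful style: mutate an object in place, the way one mutates
# a Matplotlib figure or a Keras model
a = np.zeros(3)
a += 1.0            # `a` itself changes

# pure/functional style (what jax enforces): never mutate,
# always return a new value
def add_one(x):
    return x + 1.0  # input untouched, new array returned

b = np.zeros(3)
c = add_one(b)
print(a, b, c)      # b is unchanged; the result lives in c
```

The functional constraint is what lets jax trace, JIT-compile, and differentiate functions: a pure function can be transformed freely because it has no hidden state.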