Content:
– Artificial neural networks can be trained using gradient descent;
– Artificial neuron, activation functions;
– What the artificial neuron does + linear separability, ...
– Multiple layers of neurons and universal approximation;
– Feed-forward/recurrent, layered/non-layered architectures;
– Neural networks for classification and regression;
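The topics above (artificial neuron, activation function, linear separability, gradient-descent training) can be illustrated with a minimal sketch. Everything here is my own toy example, not material from the lecture: a single sigmoid neuron trained by plain gradient descent on the linearly separable AND function, using a squared-error loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: logical AND, which a single neuron can learn
# because the two classes are linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]

w = [0.0, 0.0]   # weights
b = 0.0          # bias
lr = 0.5         # learning rate

for epoch in range(5000):
    for (x1, x2), t in zip(X, y):
        a = sigmoid(w[0] * x1 + w[1] * x2 + b)   # forward pass
        # Gradient of the squared error 0.5*(a - t)^2 w.r.t. the
        # pre-activation, using sigmoid'(z) = a * (1 - a).
        delta = (a - t) * a * (1 - a)
        w[0] -= lr * delta * x1                  # gradient descent step
        w[1] -= lr * delta * x2
        b    -= lr * delta

preds = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for x1, x2 in X]
print(preds)
```

After training, the thresholded outputs match the targets; trying the same setup on XOR (not linearly separable) fails, which motivates the multi-layer networks listed above.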
– How to compute the gradients: autodiff;
– Motivation: autodiff vs. symbolic and numeric differentiation;
– Autodiff: the principle + graphical illustrations;
– Backprop through common operations (graphically);
– Defining new operations, incl. the caching of intermediate results;
– Autodiff: a numeric example.
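The autodiff principle can be sketched in a few lines of code. This is an illustrative reverse-mode implementation of my own (the `Value` class and its methods are assumptions, not the lecture's notation): each operation records its inputs and the local derivatives with respect to them, and `backward` applies the chain rule in reverse topological order over the recorded graph, caching the intermediate values from the forward pass.

```python
class Value:
    """A scalar that records the operation producing it, so gradients
    can be propagated backwards through the computation graph."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value depends on
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # Local derivatives of x*y are y and x: the cached forward values.
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Topologically sort the graph, then apply the chain rule
        # from the output node back towards the leaves.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, local in zip(v._parents, v._local_grads):
                parent.grad += v.grad * local

# Numeric example: f(x, y) = x*y + x at x=2, y=3.
x, y = Value(2.0), Value(3.0)
f = x * y + x
f.backward()
print(f.data, x.grad, y.grad)  # 8.0, df/dx = y + 1 = 4.0, df/dy = x = 2.0
```

Defining a new operation only requires supplying its forward result and its local derivatives; the generic `backward` pass handles the rest, which is the point of the "defining new operations" item above.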