ML Glossary

Machine learning / deep learning terms

Batch

The set of examples used in one training iteration. The batch size determines the number of examples in a batch.

Batchnorm

During training, this layer computes the mean (.mean()) and biased variance (.var()) over the mini-batch; $\gamma$ and $\beta$ ($\gamma = 1$, $\beta = 0$ by default) are learnable parameter vectors of size $C$ (where $C$ is the input size). Running estimates of the mean and variance collected during training are then used for normalization during evaluation (validation/inference).

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta\]
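A minimal sketch, assuming PyTorch's nn.BatchNorm1d as the reference implementation; the manual computation below reproduces its training-mode output using the formula above (note the biased variance):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)           # mini-batch of 8 examples, C = 4 features

bn = nn.BatchNorm1d(4)          # gamma = 1, beta = 0 by default
y = bn(x)                       # training mode: normalize with batch statistics

# Manual equivalent of the formula above (biased variance, eps = 1e-5)
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)
y_manual = (x - mean) / torch.sqrt(var + bn.eps) * bn.weight + bn.bias

print(torch.allclose(y, y_manual, atol=1e-6))  # True
```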

Batch size

The number of examples in a batch. For instance, if the batch size is 100, then the model processes 100 examples per iteration.

Dropout

During training, randomly zeroes some of the elements of the input tensor with probability $p$ ($p = 0.5$ by default), using samples from a Bernoulli distribution. It acts as a regularizer and helps prevent the co-adaptation of neurons.
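A minimal sketch using PyTorch's nn.Dropout; note that in training mode the surviving elements are scaled by $\frac{1}{1-p}$, while in evaluation mode the layer is an identity:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # each element is zeroed with probability 0.5
x = torch.ones(10)

drop.train()
print(drop(x))  # roughly half the elements are 0, the rest scaled to 1/(1-p) = 2.0

drop.eval()
print(drop(x))  # evaluation mode: dropout is disabled, output equals input
```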

Epoch

One full training pass over the entire training set, such that every example has been processed once. An epoch represents N / batch size training iterations, where N is the total number of examples; for instance, with N = 1000 and a batch size of 100, one epoch is 10 iterations.

Iteration

A single update of a model’s parameters (weights and biases) during training.
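A minimal training-loop sketch (hypothetical model and data) showing how batch, iteration, and epoch relate: each pass through the inner loop performs one iteration, and each pass through the outer loop is one epoch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: N = 1000 examples, batch size = 100 -> 10 iterations per epoch
N, batch_size = 1000, 100
data = TensorDataset(torch.randn(N, 20), torch.randn(N, 1))
loader = DataLoader(data, batch_size=batch_size, shuffle=True)

model = nn.Linear(20, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for epoch in range(3):          # 3 epochs = 3 full passes over the data
    for xb, yb in loader:       # each batch -> one iteration
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()        # one update of the weights and biases
```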

Mini-batch

A variant of gradient descent in which the batch size is greater than one but smaller than the full training set, usually between 10 and 1,000 examples; this range is typically the most efficient in practice.

Stochastic Gradient Descent (SGD)

A gradient descent algorithm in which the batch size is one, i.e., the model's parameters are updated after each individual example.
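As a sketch of the distinction (hypothetical data), the only difference between these variants is the batch size handed to the DataLoader:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.randn(1000, 20), torch.randn(1000, 1))

sgd_loader = DataLoader(data, batch_size=1, shuffle=True)    # SGD: one example per update
mini_loader = DataLoader(data, batch_size=64, shuffle=True)  # mini-batch: e.g., 64 per update
full_loader = DataLoader(data, batch_size=len(data))         # full batch: one update per epoch
```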