Neural Networks gone wild! They can sample from discrete distributions now!


Photo by Jonathan Petersson on Unsplash

Training a deep neural network usually boils down to defining the model's architecture and a loss function, then watching the gradients propagate.

However, sometimes it’s not that simple: some architectures incorporate a random component. The forward pass is no longer a deterministic function of the input and weights. The random…
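To make the problem concrete: a standard way to draw exact samples from a discrete (categorical) distribution using only an argmax over perturbed logits is the Gumbel-max trick. The sketch below is illustrative and not necessarily the article's own method (the article is cut off here); the function name `gumbel_max_sample` is my own, and the example only verifies that the empirical sampling frequencies match the softmax probabilities.

```python
import numpy as np

def gumbel_max_sample(logits, rng):
    # Gumbel-max trick: argmax(logits + Gumbel noise) is an exact sample
    # from Categorical(softmax(logits)). Gumbel noise is generated as
    # -log(-log(U)) with U ~ Uniform(0, 1).
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + gumbel))

rng = np.random.default_rng(0)
logits = np.array([1.0, 0.0, -1.0])
probs = np.exp(logits) / np.exp(logits).sum()  # softmax target distribution

# Empirically check that samples follow softmax(logits).
counts = np.zeros(len(logits))
for _ in range(20000):
    counts[gumbel_max_sample(logits, rng)] += 1
freq = counts / counts.sum()
```

Note that the argmax itself is non-differentiable, which is exactly why a naive stochastic forward pass blocks gradient propagation; making such sampling differentiable is the problem the article sets up.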
