Why You Need To Understand Tensor Operations.

--

It’s always best when mysteries can be explained in terms of things we know. As it turns out, all transformations learned by deep neural networks can be reduced to a handful of tensor operations applied to tensors of numeric data. It is possible to add, subtract, and even multiply tensors!

For instance, you could build a network by stacking Dense layers on top of each other. A Keras layer instance will look like this:

Keras layer instance.
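As a minimal sketch (assuming a 512-unit Dense layer with a relu activation, which is an illustrative choice rather than anything mandated here):

```python
from tensorflow import keras

# A Dense layer with 512 output units and relu activation (illustrative values).
layer = keras.layers.Dense(512, activation="relu")
```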

Element-wise operations

We could translate this layer as a function that takes a 2D tensor as input and returns another 2D tensor, a new representation of the input. The function can be written as follows:

Relu operation.
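A minimal NumPy sketch of the computation such a layer performs (the function name and argument order are illustrative, not Keras internals):

```python
import numpy as np

def dense_forward(x, W, b):
    # Dot product with the weight matrix, add the bias vector, then apply relu.
    return np.maximum(np.dot(x, W) + b, 0.)
```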

Here, W is a 2D tensor and b is a vector, both attributes of the layer. Implicitly, there are three tensor operations at work: a dot product (dot) between the input tensor and a tensor named W; an addition (+) between the resulting 2D tensor and the vector b; and finally a relu operation, where relu(x) is max(x, 0). So how do we unpack this?

The relu operation and addition are element-wise operations, that is, they are applied independently to each entry in the tensors being considered. This makes them highly amenable to vectorized implementations. A naive implementation of an element-wise relu operation using for loops could look like this:

Naive implementation.
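A minimal sketch of such a naive relu, assuming x is a 2D NumPy array:

```python
def naive_relu(x):
    assert len(x.shape) == 2          # x is a 2D NumPy array
    x = x.copy()                      # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] = max(x[i, j], 0)
    return x
```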

Similarly, for addition:

Addition.
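And a corresponding naive addition, under the same assumptions (two 2D NumPy arrays of identical shape):

```python
def naive_add(x, y):
    assert len(x.shape) == 2          # x and y are 2D NumPy arrays
    assert x.shape == y.shape         # with identical shapes
    x = x.copy()                      # avoid overwriting the input tensor
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            x[i, j] += y[i, j]
    return x
```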


By contrast, NumPy performs the same element-wise operations dramatically faster, thanks to optimized, vectorized routines:

Using Numpy.
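For example, the addition and relu above collapse into two vectorized calls (a sketch with illustrative shapes):

```python
import numpy as np

x = np.random.random((20, 100))
y = np.random.random((20, 100))

z = x + y               # element-wise addition
z = np.maximum(z, 0.)   # element-wise relu
```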

Broadcasting

The drawback to naive implementations like naive_add is that they only support the addition of 2D tensors with identical shapes. What happens when the shapes of the two tensors differ? As in the Dense layer introduced at the beginning of this article, where a 2D tensor is added to a vector, the smaller tensor will (when there’s no ambiguity) be broadcast to match the shape of the larger tensor. Broadcasting happens in two steps: first, broadcast axes are added to the smaller tensor to match the ndim of the larger tensor; second, the smaller tensor is repeated alongside these new axes to match the full shape of the larger tensor.

Concretely, consider X with shape (32, 10) and y with shape (10,). First, we add an empty first axis to y, whose shape becomes (1, 10). We then repeat y 32 times alongside this new axis to end up with a tensor Y of shape (32, 10), where Y[i, :] == y for i in range(0, 32). At this point we can add X and Y because they have the same shape. In practice, the repetition is entirely virtual: it happens at the algorithmic level rather than in memory, so no new 2D tensor is actually created!

The following example applies the element-wise maximum operation to two tensors of different shapes via broadcasting:

Broadcasting.
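A sketch of this broadcasted maximum (the shapes are illustrative):

```python
import numpy as np

x = np.random.random((64, 3, 32, 10))   # tensor of shape (64, 3, 32, 10)
y = np.random.random((32, 10))          # tensor of shape (32, 10)

# y is broadcast against x; the output z has shape (64, 3, 32, 10), like x.
z = np.maximum(x, y)
```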

Tensor dot operation

Besides broadcasting and element-wise operations, we have to address the elephant in the room: the tensor dot operation. Also referred to as the tensor product, it is the most common and most useful tensor operation. Contrary to element-wise operations, it combines entries in the input tensors. dot uses a different syntax in TensorFlow, but in both NumPy and Keras it’s done using the standard dot operator.

Tensor product.
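In NumPy this is simply np.dot (a sketch, assuming compatible shapes):

```python
import numpy as np

x = np.random.random((32, 10))
y = np.random.random((10,))

z = np.dot(x, y)   # dot product of a matrix and a vector; z has shape (32,)
```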

Let’s take the dot product between a matrix x and a vector y, which returns a vector whose coefficients are the dot products between y and the rows of x. You could implement it as follows:
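A naive sketch of this matrix-vector dot product, using for loops as before:

```python
import numpy as np

def naive_matrix_vector_dot(x, y):
    assert len(x.shape) == 2          # x is a matrix
    assert len(y.shape) == 1          # y is a vector
    assert x.shape[1] == y.shape[0]   # the inner dimensions must match
    z = np.zeros(x.shape[0])          # result vector: one coefficient per row of x
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            z[i] += x[i, j] * y[j]
    return z
```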

Lastly, we can also reshape tensors using the tensor reshaping operation. Reshaping means rearranging a tensor’s rows and columns to match a target shape. The reshaped tensor has the same total number of coefficients as the initial tensor. Here’s an example:

Tensor reshaping.
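A small sketch, reshaping a 3 x 2 matrix into shape (6, 1):

```python
import numpy as np

x = np.array([[0., 1.],
              [2., 3.],
              [4., 5.]])
print(x.shape)           # (3, 2)

x = x.reshape((6, 1))    # same 6 coefficients, rearranged into shape (6, 1)
print(x.shape)           # (6, 1)
```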

A special case of reshaping is transposition. Transposing a matrix means exchanging its rows and its columns, so that x[i, :] becomes x[:, i].

Transposition.
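A sketch of transposition in NumPy:

```python
import numpy as np

x = np.zeros((300, 20))   # all-zeros matrix of shape (300, 20)
x = np.transpose(x)       # rows and columns exchanged
print(x.shape)            # (20, 300)
```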

Conclusion

Neural networks consist entirely of chains of tensor operations. Each neural layer transforms its input with element-wise operations and tensor products. These operations are the engine of neural networks.
