Elena's AI Blog

Tensors in TensorFlow

13 Jan 2022 / 10 minutes to read

Elena Daehnhardt


Midjourney AI-generated art

Introduction

TensorFlow is a free, open-source library for machine learning created by Google Brain. TensorFlow offers excellent functionality for building deep neural networks. I have chosen TensorFlow because it is robust, efficient, and can be used with Python. In this post, I write about how we can create tensors, index and shuffle them, and get information about them, with simple examples.

# Import tensorflow
import tensorflow as tf
print(tf.__version__)
    2.7.0

Tensors

In TensorFlow, we work with tensors to hold numerical data for use in machine learning. Tensors can store data in N dimensions. When a tensor has two dimensions, it is essentially a matrix. When a tensor has only one dimension, it is a vector. A tensor can also contain just a single numerical value: a scalar, or zero-order tensor.

In the code examples below, we create these tensor structures as constants. TensorFlow also reports the number of dimensions of a tensor via its ndim attribute.

# Creating a scalar tensor
scalar = tf.constant(7)
scalar
    <tf.Tensor: shape=(), dtype=int32, numpy=7>
# Check the number of tensor dimensions
scalar.ndim 
0
# Create a vector
vector = tf.constant([5, 7])
vector
<tf.Tensor: shape=(2,), dtype=int32, numpy=array([5, 7], dtype=int32)>
vector.ndim
1

We create a matrix with the data type defined as int16.

# Create a matrix
matrix = tf.constant([[5, 7],
                      [3, 10]], dtype=tf.int16)
matrix
<tf.Tensor: shape=(2, 2), dtype=int16, numpy=
array([[ 5,  7],
       [ 3, 10]], dtype=int16)>
matrix.ndim
2

Creating Tensor Variables

Above, we have already created several tensor constants. We cannot change elements in these structures. To make mutable tensors, we use tf.Variable.

# Create a variable tensor
tensor = tf.Variable([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]])

In the code below, we use the assign method to change the first element (which is a matrix) of the tensor, filling its values with zeros.

# Change elements of the first tensor element
tensor[0].assign([[0, 0, 0], [0, 0, 0]])
<tf.Variable 'UnreadVariable' shape=(3, 2, 3) dtype=int32, numpy=
array([[[ 0,  0,  0],
        [ 0,  0,  0]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]], dtype=int32)>
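
In contrast, constant tensors have no assign method, so trying to change one raises an error (a minimal sketch):

# Constants cannot be modified in place: calling assign raises an AttributeError
constant = tf.constant([[1, 2, 3],
                        [4, 5, 6]])
try:
    constant[0].assign([0, 0, 0])
except AttributeError as error:
    print("Cannot assign to a constant tensor:", error)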

If we have a NumPy array, we can easily convert it into a tensor. In the example below, we have a NumPy array of 30 integers. Using this array, we create a tensor consisting of 5 matrices with three rows and two columns. This is useful because tensors can run on a GPU, while native NumPy arrays cannot. However, please note that we can use libraries such as CuPy to speed up NumPy processing on a GPU.

It is essential that the tensor shape is correctly defined and fits the array size. We can check this by multiplying the shape dimensions: 5 × 3 × 2 = 30.

# Turn a NumPy array into tensor
import numpy as np
numpy_array = np.arange(1, 31, dtype=np.int32)
numpy_array

X = tf.constant(numpy_array, shape=(5, 3, 2))
X
<tf.Tensor: shape=(5, 3, 2), dtype=int32, numpy=
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16],
        [17, 18]],

       [[19, 20],
        [21, 22],
        [23, 24]],

       [[25, 26],
        [27, 28],
        [29, 30]]], dtype=int32)>
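
We can confirm the shape check programmatically (a small sketch using the numpy_array and X defined above):

# Sanity check: the product of the shape dimensions equals the array size
print(np.prod(X.shape.as_list()))                      # 30
print(np.prod(X.shape.as_list()) == numpy_array.size)  # True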

Tensor Shuffle

Sometimes we need to randomly shuffle the data in a tensor, which we can do with TensorFlow's tf.random.shuffle function. This is useful to avoid any ordering biases in the data.

In the following example, we reuse the tensor X created above: a NumPy array of 30 elements arranged into five matrices with three rows and two columns.

We shuffle the data along its first dimension with an operation-level randomness seed.

# Shuffling tensor X with operation-level seed=57
tf.random.shuffle(X, seed=57)
<tf.Tensor: shape=(5, 3, 2), dtype=int32, numpy=
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[13, 14],
        [15, 16],
        [17, 18]],

       [[25, 26],
        [27, 28],
        [29, 30]],

       [[19, 20],
        [21, 22],
        [23, 24]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]]], dtype=int32)>

In my next post, I go more in-depth into how we can ensure the same order when shuffling data, which is essential for reproducing results in model training or cross-validation. Please check my post on global and operation-level seeds.
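
As a quick preview (a minimal sketch), setting a global seed with tf.random.set_seed together with the operation-level seed makes the shuffle reproducible across runs:

# Combining a global seed with an operation-level seed gives a reproducible shuffle
tf.random.set_seed(57)                      # global seed
shuffled_a = tf.random.shuffle(X, seed=57)  # operation-level seed
tf.random.set_seed(57)
shuffled_b = tf.random.shuffle(X, seed=57)
print(tf.reduce_all(shuffled_a == shuffled_b).numpy())  # True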

Random Tensors

Random tensors are filled with random numbers, which is useful for initializing network weights. Here we generate random numbers from a normal distribution.

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, meaning that data near the mean occur more frequently than data far from the mean. In graph form, the normal distribution appears as a bell curve (see Investopedia).

random1 = tf.random.Generator.from_seed(57) # set seed for reproducibility
random1 = random1.normal(shape=(3, 2))
random1
<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[ 0.49689844,  0.8259971 ],
       [ 1.0340209 , -0.24918637],
       [-1.5780283 , -0.92161775]], dtype=float32)>
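
Because the generator is seeded, recreating it from the same seed produces exactly the same values (a small sketch; random2 is just an illustrative name):

# The same seed yields the same random tensor
random2 = tf.random.Generator.from_seed(57)
random2 = random2.normal(shape=(3, 2))
print(tf.reduce_all(random1 == random2).numpy())  # True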

Tensor Information

We can get information about a tensor: its datatype with tensor.dtype, shape with tensor.shape, rank with tensor.ndim, the number of elements along an axis with tensor.shape[axis], and total size with tf.size(tensor), as in the following example:

tensor = tf.zeros(shape=[2, 3, 5])
print("Datatype of tensor elements: ", tensor.dtype)
print("Number of dimensions or rank: ", tensor.ndim)
print("Shape of tensor: ", tensor.shape)
print("Total number of elements in the tensor: ", tf.size(tensor).numpy())
print("Tensor elements along the first axis: ", tensor.shape[0])
print("Tensor elements along the last axis: ", tensor.shape[-1])
Datatype of tensor elements:  <dtype: 'float32'>
Number of dimensions or rank:  3
Shape of tensor:  (2, 3, 5)
Total number of elements in the tensor:  30
Tensor elements along the first axis:  2
Tensor elements along the last axis:  5

Conclusion

In this post, I have described my understanding of tensors, how they are created in TensorFlow, and how we can emulate randomness, shuffle data, and get information about tensors. In writing this post, I have used the Udemy course TensorFlow Developer Certificate in 2022: Zero to Mastery.

Did you like this post? Please let me know if you have any comments or suggestions.
