How to normalize array in Numpy?

Let’s learn about how to normalize an array in Numpy Python library. We will use linalg norm function for that purpose.

Numpy normalize array

Normalization scales numerical data to a standard range, often between 0 and 1 or to have a unit norm. This process is essential for algorithms sensitive to the scale of input features, such as gradient descent-based methods and distance-based algorithms.

Steps to normalize an array in Numpy

To normalize an array in Numpy you need to divide your array by np.linalg.norm of your array. Just take a look at below example or normalization.

import numpy as np

my_array = np.array([[1, 3, 5],
                     [7, 9, 11],
                     [13, 15, 17]])

print(f"My array: \n{my_array}")

my_normalized_array = my_array / np.linalg.norm(my_array)
print(f"My normalized array: \n{my_normalized_array}")

Python returned normalized array as an output. The np.linalg.norm(my_array) function, in its default configuration, calculates the L2 norm (or Frobenius norm for matrices) of the input array. The L2 norm represents the magnitude or length of the vector (or matrix), calculated as the square root of the sum of the squares of its elements – essentially the Euclidean distance from the origin in a multi-dimensional space.

Numpy normalize array

In this code, we start with the my_array and use the np.linalg.norm function to calculate the L2 norm of the array. We then divide each element in my_array by this L2 norm to obtain the normalized array, my_normalized_array.

Running this code results in a normalized array where the values are scaled to have a magnitude of 1.0. The resulting my_normalized_array will have each element scaled such that the L2 norm of the entire array becomes approximately 1.0. This is also known as unit vector normalization or scaling to unit length.

It’s important to note that L2 normalization is just one type of normalization. Other common methods include Min-Max scaling (scaling data to a fixed range like [0, 1]) and Standard scaling (standardizing data to have zero mean and unit variance). The choice of normalization technique depends on the specific dataset and the requirements of the data analysis or machine learning task.