Let’s learn how to trim an array with NumPy’s clip function.
Suppose we have an array:
[0, 2, 5, 8, 10, 25, 40]
Clipping an array
We would like to limit the extreme values. NumPy’s clip function caps very low and very high values at bounds we choose.
Say I don’t need values below 2 or above 25. The clip function replaces anything below 2 with 2 and anything above 25 with 25.
import numpy as np
my_array = np.array([0, 2, 5, 8, 10, 25, 40])
trim_array = np.clip(my_array, 2, 25)
print(f"Trimmed array 2 - 25: \n {trim_array}")
As you may have noticed, the syntax of the clip function is as follows: np.clip(my_array, min_value, max_value).
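To make the result of that call concrete, here is a minimal sketch of the same example with the clipped values shown. It also uses the equivalent array method, my_array.clip(2, 25); the name same_result is only illustrative.

```python
import numpy as np

my_array = np.array([0, 2, 5, 8, 10, 25, 40])

# Function form: values below 2 become 2, values above 25 become 25.
trim_array = np.clip(my_array, 2, 25)
print(trim_array)  # [ 2  2  5  8 10 25 25]

# Method form on the array itself gives the same result.
same_result = my_array.clip(2, 25)

# clip returns a new array; the original is left untouched.
print(my_array)  # [ 0  2  5  8 10 25 40]
```

Every element inside the range passes through unchanged, so the array keeps its original length.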
Using parameters
You don’t have to define both the min and max values. Pass None for the bound you don’t want clipped.
import numpy as np
my_array = np.array([0, 2, 5, 8, 10, 25, 40])
trim_array = np.clip(my_array, None, 25)
print(f"My array: \n {my_array}")
print(f"Trimmed array < 25: \n {trim_array}")
The output would be:
My array:
 [ 0  2  5  8 10 25 40]
Trimmed array < 25:
 [ 0  2  5  8 10 25 25]
Of course, the clip bounds can also come from variables defined earlier.
import numpy as np
min_value = 1
my_array = np.array([0, 2, 5, 8, 10, 25, 40])
trim_array = np.clip(my_array, min_value, None)
print(f"My array: \n {my_array}")
print(f"Trimmed array > min_value: \n {trim_array}")
Here's the output, where values are clipped at min_value and no maximum is defined:
My array:
 [ 0  2  5  8 10 25 40]
Trimmed array > min_value:
 [ 1  2  5  8 10 25 40]
As you can see, only the lower bound is applied: 0 is replaced by 1, and 40 is left unchanged.
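The bounds do not have to be hard-coded constants. As a rough sketch, they could be computed from the data itself, for example with np.percentile. The 10th and 90th percentiles used here are an arbitrary choice for illustration, and the names low and high are hypothetical.

```python
import numpy as np

my_array = np.array([0, 2, 5, 8, 10, 25, 40])

# Assumption for illustration: take the 10th and 90th percentiles as bounds.
low, high = np.percentile(my_array, [10, 90])

# The bounds are floats, so the clipped result is a float array.
trim_array = np.clip(my_array, low, high)

print(f"Bounds: {low} - {high}")
print(f"Trimmed array: \n {trim_array}")
```

Only the smallest and largest elements fall outside these bounds, so only those two are changed.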
These are the ways you can use clip to limit the outliers in your dataset.
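Keep in mind that clip caps out-of-range values rather than dropping them, so the array keeps its length. If you want to drop those values instead, a boolean mask is one option. The sketch below compares the two; the names in_range and filtered are only illustrative.

```python
import numpy as np

my_array = np.array([0, 2, 5, 8, 10, 25, 40])

# Clipping keeps every element and caps the extremes at 2 and 25.
clipped = np.clip(my_array, 2, 25)

# A boolean mask drops the out-of-range elements instead.
in_range = (my_array >= 2) & (my_array <= 25)
filtered = my_array[in_range]

print(f"Clipped ({len(clipped)} values): \n {clipped}")
print(f"Filtered ({len(filtered)} values): \n {filtered}")
```

Which one fits depends on whether the positions and count of the elements matter for the next step of your analysis.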