The main object that you throw around in NumPy is the multidimensional array. Typically you store numbers in it. Each “dimension” is called an axis. For example, a single co-ordinate in 3D space could be stored as:
V = [2, 4, 3]
This has one axis (one dimension).
A 2D transformation matrix could be described with:
R = [ [2, 3, 5], [1, 7, 2] ]
This has two axes.
Creating An Array
NumPy arrays can be created with standard Python lists:
>>> import numpy as np
>>> a = np.array([2,1,5])
>>> a
array([2, 1, 5])
If we wanted to create a 2 axis array we could pass in a list of lists:
>>> import numpy as np
>>> a = np.array([[1,2,3],[4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
You can continue to nest lists within lists to create an array with any number of axes (dimensions).
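As a small illustration of nesting one level deeper, the sketch below builds an array with three axes and inspects it with the ndim and shape attributes. The variable name cube is just an illustrative choice.

```python
import numpy as np

# Three levels of nested lists give an array with three axes.
cube = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
])

print(cube.ndim)   # number of axes: 3
print(cube.shape)  # length of each axis: (2, 2, 2)
```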
You can also create arrays with special values, such as arrays full of ones, arrays full of zeros, arrays full of random numbers and arrays with ones on the diagonal (like identity matrices).
An array of 1’s:
>>> a = np.ones([2,3])
>>> a
array([[1., 1., 1.],
       [1., 1., 1.]])
An array with 1’s on the diagonal:
>>> a = np.eye(3)
>>> a
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
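The arrays of zeros and random numbers mentioned above can be created in a similar way. This is a minimal sketch, assuming np.zeros() and the np.random.default_rng() generator are what you want; the seed value of 0 is arbitrary and only there to make runs repeatable.

```python
import numpy as np

# A 2x3 array filled with 0.0
zeros = np.zeros((2, 3))

# A seeded random generator (the seed is an arbitrary choice for repeatability)
rng = np.random.default_rng(seed=0)

# A 2x3 array of uniformly distributed samples in the range [0.0, 1.0)
randoms = rng.random((2, 3))

print(zeros)
print(randoms)
```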
Another really useful way of creating arrays is with np.arange(). This does exactly what it says: it creates an array with a range of values:
>>> np.arange(4)
array([0, 1, 2, 3])
>>> a = np.arange(9)
>>> np.reshape(a, [3, 3])
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
np.linspace() is another great array creating tool, which creates an array of linearly spaced numbers. The following example creates 5 numbers, linearly spaced from 4.0 to 10.0:
>>> np.linspace(4.0, 10.0, 5)
array([ 4. ,  5.5,  7. ,  8.5, 10. ])
Indexing And Reading/Writing
NumPy arrays have one index per axis, forming a tuple. The indices are zero-based, like all sensible languages/libraries :-D.
Reading from a 1 axis array:
>>> a = np.array([2,1,5])
>>> a[2]
5
Reading from a 2 axis array:
>>> a = np.array([[2,5],[1,3]])
>>> a[0,1]
5
Writing to a 2-axis array:
>>> a = np.array([[2,5],[1,3]])
>>> a[1,1] = 10
>>> a
array([[ 2,  5],
       [ 1, 10]])
Doing Basic Operations With Arrays
NumPy arrays can be added element-wise with the + operator:
>>> a = np.array([[1,2,3],[4,5,6]])
>>> b = np.array([[1,1,2],[2,2,1]])
>>> a+b
array([[2, 3, 5],
       [6, 7, 7]])
They can be multiplied element-wise with the * operator (this is the same as np.multiply()):
>>> a = np.array([[2,5],[1,3]])
>>> b = np.array([[1,4],[2,1]])
>>> a*b
array([[ 2, 20],
       [ 2,  3]])
A dot-product of two arrays can be done with np.dot():
>>> a = np.array([[2,5],[1,3]])
>>> b = np.array([[1,4],[2,1]])
>>> np.dot(a, b)
array([[12, 13],
       [ 7,  7]])
The cross-product of two arrays can be done with np.cross():
>>> a = np.array([4,5,1])
>>> b = np.array([3,1,2])
>>> np.cross(a, b)
array([  9,  -5, -11])
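Because * and np.dot() are easy to mix up, the sketch below puts them side by side using the same arrays as above. It also uses the @ operator, which is not mentioned above; for 2-axis arrays like these it gives the same result as np.dot().

```python
import numpy as np

a = np.array([[2, 5], [1, 3]])
b = np.array([[1, 4], [2, 1]])

# Element-wise: each element of a is multiplied by the matching element of b
elementwise = a * b

# Matrix product: rows of a are combined with columns of b
matrix_product = a @ b

print(elementwise)     # [[ 2 20] [ 2  3]]
print(matrix_product)  # [[12 13] [ 7  7]]
```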
One of the powerful features of NumPy arrays is the simple and terse slicing syntax (which is built upon Python’s slicing syntax). A slice is when you extract just a portion of the array for further use.
Very simple slicing is really the same as indexing:
my_array = np.array([4, 5, 6])
my_slice = my_array[1]
# my_slice = 5
Extract the first two elements:
my_array = np.array([4, 5, 6])
my_slice = my_array[0:2]
# my_slice = array([4, 5])
Some of the real power of slicing is seen when you slice multidimensional arrays (arrays with more than 1 axis).
my_array[:, 0] tells NumPy to make a slice using all elements from the 1st axis (:), and only the first element from the second axis (0). An example of this slice is shown below:
my_array = np.array([[1, 2, 3], [4, 5, 6]])
my_slice = my_array[:, 0] # Take all from 1st axis, and element 0 from second axis
# my_slice = array([1, 4])
: is the same as 0:<len>, and captures all data (the stop index of a slice is exclusive).
This is commonly used to extract columns from data. For example, if you had the following array of x, y pairs:
xy_pairs = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
You could extract all the x values and all the y values with:
x_values = xy_pairs[:, 0]
y_values = xy_pairs[:, 1]
# x_values = array([1, 3, 5, 7])
# y_values = array([2, 4, 6, 8])
You can also use this to extract “rows” from an array:
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
two_rows = data[1:3, :]
# two_rows = array([[4, 5, 6], [7, 8, 9]])
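A bare : spans a whole axis. Because the stop index of a slice is exclusive, the explicit equivalent runs all the way to the axis length. The sketch below checks this on the same data array; the variable names are illustrative only.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
n_rows = data.shape[0]  # length of the first axis (4)

all_rows = data[:, 0]             # bare colon: every row
explicit = data[0:n_rows, 0]      # same thing, written out in full
one_short = data[0:n_rows - 1, 0] # stops before the last row

print(all_rows)   # [ 1  4  7 10]
print(explicit)   # [ 1  4  7 10]
print(one_short)  # [1 4 7]
```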
Adding A Step
You can also add a step size while slicing NumPy arrays, just as you can when using standard Python slicing. The step size is the third argument in the slice syntax, i.e. start:stop:step:
data = np.arange(10)
# data = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
my_slice = data[3:8:2] # Slice from 3 to 8, with a step size of 2
# my_slice = array([3, 5, 7])
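Step sizes also work per axis on multidimensional arrays, and a negative step walks backwards, as with ordinary Python slicing. A short sketch with made-up data:

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)
# grid = [[ 0  1  2]
#         [ 3  4  5]
#         [ 6  7  8]
#         [ 9 10 11]]

every_other_row = grid[::2, :]   # rows 0 and 2, all columns
reversed_1d = np.arange(5)[::-1] # step of -1 reverses the array

print(every_other_row)  # [[0 1 2] [6 7 8]]
print(reversed_1d)      # [4 3 2 1 0]
```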
Stacking is used to join arrays along a new axis. A classic example of this would be if you had two separate arrays of x and y values, and you wanted to combine them into a single array of (x, y) coordinate pairs.
x_vals = np.array([ 1,  2,  3,  4])
y_vals = np.array([10, 20, 30, 40])
coords = np.stack([x_vals, y_vals], axis=1)
# coords = [
#     [ 1 10]
#     [ 2 20]
#     [ 3 30]
#     [ 4 40]
# ]
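The axis argument decides where the new axis goes. The sketch below compares axis=0 and axis=1 for the same inputs, which may help when picking one for your own data.

```python
import numpy as np

x_vals = np.array([1, 2, 3, 4])
y_vals = np.array([10, 20, 30, 40])

# New axis first: one row of x values, one row of y values
as_rows = np.stack([x_vals, y_vals], axis=0)

# New axis last: one (x, y) pair per row
as_pairs = np.stack([x_vals, y_vals], axis=1)

print(as_rows.shape)   # (2, 4)
print(as_pairs.shape)  # (4, 2)
```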
If you want to start off with an empty array and add values to it in a loop, you can use concatenate, but you have to reshape your empty array first to make sure it has the right dimensions:
data = np.array([]).reshape(0, 3) # This reshape() call is important!
for i in range(3):
    # concatenate() returns a new array, so assign the result back to data
    data = np.concatenate([data, np.array([[i*1, i*2, i*3]])])
print(data)
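An alternative worth considering, not taken from the text above, is to collect the rows in a plain Python list and convert once at the end. This avoids the reshape step, and np.concatenate() would otherwise copy the whole array on every iteration. A minimal sketch:

```python
import numpy as np

rows = []
for i in range(3):
    # Appending to a list is cheap; no array is copied here
    rows.append([i * 1, i * 2, i * 3])

# Convert to an array once, after the loop
data = np.array(rows)

print(data.shape)  # (3, 3)
```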
Reading A CSV File
You can use NumPy’s genfromtxt() function to read in CSV files and convert the data into a NumPy array:
data = np.genfromtxt('my_file.csv', delimiter=',')
Each line will be a different element along the first axis (axis 0) of the array. Each CSV value on a line will be a different element along the second axis (axis 1) of the array.
For example, if your CSV looked like:
0, 1, 2
10, 11, 12
The array would look like:
[[ 0.  1.  2.]
 [10. 11. 12.]]
Skip Header Rows
You can skip a header line/row in the CSV file by providing the skip_header parameter:
data = np.genfromtxt('my_file.csv', delimiter=',', skip_header=1)
This is good when you have column data names in the first row (which is a common practice), e.g.:
Time (s), Depth (m), Width (m)
0, 1, 2
10, 11, 12
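To try this without creating a file on disk, genfromtxt() also accepts a file-like object. The sketch below feeds it the example CSV above through io.StringIO; the csv_text variable stands in for the contents of my_file.csv.

```python
import io

import numpy as np

# Stand-in for the contents of my_file.csv
csv_text = """Time (s), Depth (m), Width (m)
0, 1, 2
10, 11, 12
"""

# skip_header=1 drops the row of column names before parsing
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)

print(data.shape)  # (2, 3)
print(data[1, 0])  # 10.0
```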
np.dot(a, b, out=None)
Dot product of two arrays.
np.eye(N, M=None, k=0)
Returns an array with 1’s on the diagonal and 0’s elsewhere (also known as an identity matrix).
my_array = np.eye(3)
# my_array = array([
#     [1., 0., 0.],
#     [0., 1., 0.],
#     [0., 0., 1.]])
Returns a flattened array.
NumPy has a powerful feature called a masked array. A masked array is essentially composed of two arrays, one containing the data, and another containing a mask (a boolean True or False value for each element in the data array, where True means the element is masked).
Retrieving an array value which is masked will result in masked being returned.
Creating A Masked Array
You can use the np.ma.masked_equal() function to create a masked array from a standard array, specifying the value you want to mask as the second parameter:
import numpy as np

standard_array = np.arange(4)
print(standard_array)
# stdout: [0 1 2 3]

masked_array = np.ma.masked_equal(standard_array, 2)
print(repr(masked_array))
# stdout: masked_array(data=[0, 1, --, 3],
#                      mask=[False, False,  True, False],
#                      fill_value=2)
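To see what retrieving a masked value looks like in practice, the sketch below indexes the same masked array at a visible position and at the masked position, then compares the result against the np.ma.masked constant.

```python
import numpy as np

masked_array = np.ma.masked_equal(np.arange(4), 2)

visible = masked_array[1]  # an ordinary value
hidden = masked_array[2]   # the element equal to 2 was masked

print(visible)                 # 1
print(hidden is np.ma.masked)  # True
```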
Checking If An Array Is Masked
You can check if an array is masked with np.ma.is_masked():
import numpy as np

standard_array = np.arange(4)
print(np.ma.is_masked(standard_array))
# stdout: False

masked_array = np.ma.masked_equal(standard_array, 2)
print(np.ma.is_masked(masked_array))
# stdout: True
Removing Masked Values From An Array
You can trim down an array and remove all masked values by using the compressed() method that belongs to a masked array:
masked_array = np.ma.masked_equal([1, 2, 3, 4], 2)
no_masked_values = masked_array.compressed()
print(no_masked_values)
# [1 3 4]
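You do not always need to compress first: reductions on a masked array skip the masked elements. The sketch below, using the same values as above, shows mean() and count() doing so; the variable names are illustrative.

```python
import numpy as np

readings = np.ma.masked_equal([1, 2, 3, 4], 2)

# Only the unmasked values 1, 3 and 4 take part in the calculation
mean_visible = readings.mean()
count_visible = readings.count()  # number of unmasked elements

print(count_visible)  # 3
print(mean_visible)   # (1 + 3 + 4) / 3
```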
Numpy Warnings And How To Silence Them
For example, running np.mean([]) using Python 3.7 and an up-to-date version of NumPy will produce the following:
>>> import numpy as np
>>> np.mean([])
/usr/local/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/lib/python3.7/site-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
nan
Note how this is not an exception. NumPy prints a warning stating that you are trying to calculate the mean of an empty slice, as well as that there is an invalid value, but continues execution and returns nan. These warnings are usually helpful in debugging problems with the data you are providing, but in some cases you will want to silence the warnings because the data is as expected.
The safest way to suppress NumPy warnings is to use the np.errstate context manager, which only changes the warning state while the context is active. However, this has some problems…
with np.errstate(invalid='ignore'):
    # Numpy "invalid number" warnings will be suppressed here,
    # but not the "Mean of empty slice." warnings
    np.mean([])
# Warnings are back to normal here
The problem with this is that it will silence the RuntimeWarning: invalid value encountered in double_scalars warning, BUT NOT the RuntimeWarning: Mean of empty slice. warning. A better approach is to use the warnings module (which is shipped with Python); however, this comes at the expense of silencing a larger group of warnings (what if a RuntimeWarning was emitted here for a different reason?):
import warnings
import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    # The better way! Both warnings from the line below are now silenced
    np.mean([])
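If silencing every RuntimeWarning feels too broad, one possible middle ground is to filter only on the message text and leave the floating-point warning to np.errstate. The sketch below is an assumption about how you might combine the two, not something prescribed above; quiet_mean is a hypothetical helper name.

```python
import math
import warnings

import numpy as np


def quiet_mean(values):
    """Hypothetical helper: np.mean() that stays silent for empty input."""
    with warnings.catch_warnings(), np.errstate(invalid='ignore'):
        # Only ignore RuntimeWarnings whose message starts with this text;
        # other RuntimeWarnings are still reported as usual.
        warnings.filterwarnings(
            "ignore",
            message="Mean of empty slice",
            category=RuntimeWarning,
        )
        return np.mean(values)


result = quiet_mean([])
print(math.isnan(result))  # True
```

The message argument is a regular expression matched against the start of the warning text, so it will stop matching if a future NumPy release rewords the warning.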
If you want to convert all warnings into exceptions, you can use the following code. This is particularly dangerous as it applies to all code after this call.
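One way to do this, assuming the standard warnings module is acceptable, is warnings.simplefilter("error"). The sketch below wraps it in catch_warnings() so the change is undone when the block exits; calling simplefilter("error") on its own, outside such a block, is what makes it apply to everything that runs afterwards.

```python
import warnings

import numpy as np

caught_message = None

with warnings.catch_warnings():
    # Every warning raised inside this block becomes an exception
    warnings.simplefilter("error")
    try:
        np.mean([])
    except RuntimeWarning as exc:
        caught_message = str(exc)

# The previous warning filters are restored here
print(caught_message)
```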
By default, Numpy will print warnings when you attempt to divide by 0. For example, the following code:
a = np.array([1]) # Example values
b = np.array([0])
c = a / b
will cause Numpy to print:
<ipython-input-6-186705c1cfef>:3: RuntimeWarning: divide by zero encountered in true_divide
  a/b
These warnings can be temporarily silenced with the with np.errstate(divide='ignore') context manager:
a = np.array([1]) # Example values
b = np.array([0])
with np.errstate(divide='ignore'):
    c = a / b
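Silencing the warning does not change the result: the division still produces inf wherever the divisor is 0. If you would rather get a fallback value, one option is np.divide() with its where and out arguments, sketched below. The input values and the fallback of 0.0 are illustrative assumptions.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])  # illustrative values
b = np.array([0.0, 2.0, 0.0])

with np.errstate(divide='ignore'):
    raw = a / b  # inf where b is 0, no warning printed

# Only divide where b is non-zero; other positions keep the value from out
safe = np.divide(a, b, out=np.zeros_like(a), where=(b != 0))

print(raw)   # [inf  1. inf]
print(safe)  # [0. 1. 0.]
```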