Data processing using arrays

With the NumPy package, we can easily solve many kinds of data processing tasks without writing complex loops. It is very helpful for us to control our code as well as the performance of the program. In this part, we want to introduce some mathematical and statistical functions.

See the following table for a listing of mathematical and statistical functions:

Function

Description

Example

sum

Calculate the sum of all the elements in an array or along the axis

>>> a = np.array([[2,4], [3,5]])
>>> np.sum(a, axis=0)
array([5, 9])

prod

Compute the product of array elements over the given axis

>>> np.prod(a, axis=1)
array([8, 15])

diff

Calculate the discrete difference along the given axis

>>> np.diff(a, axis=0)
array([[1,1]])

gradient

Return the gradient of an array

>>> np.gradient(a)
[array([[1., 1.], [1., 1.]]), array([[2., 2.], [2., 2.]])]

cross

Return the cross product of two arrays

>>> b = np.array([[1,2], [3,4]])
>>> np.cross(a,b)
array([0, -3])

std, var

Return standard deviation and variance of arrays

>>> np.std(a)
1.1180339
>>> np.var(a)
1.25

mean

Calculate arithmetic mean of an array

>>> np.mean(a)
3.5

where

Return elements, either from x or y, that satisfy a condition

>>> np.where([[True, True], [False, True]], [[1,2],[3,4]], [[5,6],[7,8]])
array([[1,2], [7, 4]])

unique

Return the sorted unique values in an array

>>> id = np.array(['a', 'b', 'c', 'c', 'd'])
>>> np.unique(id)
array(['a', 'b', 'c', 'd'], dtype='|S1')

intersect1d

Compute the sorted and common elements in two arrays

>>> a = np.array(['a', 'b', 'a', 'c', 'd', 'c'])
>>> b = np.array(['a', 'xyz', 'klm', 'd'])
>>> np.intersect1d(a,b)
array(['a', 'd'], dtype='|S3')

Loading and saving data

We can also save and load data to and from a disk, either in text or binary format, by using different supported functions in NumPy package.

Saving an array

Arrays are saved by default in an uncompressed raw binary format, with the file extension .npy by the np.save function:

>>> a = np.array([[0, 1, 2], [3, 4, 5]])
>>> np.save('test1.npy', a)

Note

The library automatically assigns the .npy extension, if we omit it.

If we want to store several arrays into a single file in an uncompressed .npz format, we can use the np.savez function, as shown in the following example:

>>> a = np.arange(4)
>>> b = np.arange(7)
>>> np.savez('test2.npz', arr0=a, arr1=b)

The .npz file is a zipped archive of files named after the variables they contain. When we load an .npz file, we get back a dictionary-like object that can be queried for its lists of arrays:

>>> dic = np.load('test2.npz')
>>> dic['arr0']
array([0, 1, 2, 3])

Another way to save array data into a file is using the np.savetxt function that allows us to set format properties in the output file:

>>> x = np.arange(4)
>>> # e.g., set comma as separator between elements
>>> np.savetxt('test3.out', x, delimiter=',')
Saving an array

Loading an array

We have two common functions such as np.load and np.loadtxt, which correspond to the saving functions, for loading an array:

>>> np.load('test1.npy')
array([[0, 1, 2], [3, 4, 5]])
>>> np.loadtxt('test3.out', delimiter=',')
array([0., 1., 2., 3.])

Similar to the np.savetxt function, the np.loadtxt function also has a lot of options for loading an array from a text file.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset