9.2.1 NumPy Library (1/2)


234 | Big Data Simplified

9.2.1 NumPy Library

NumPy is the fundamental Python package for scientific computing. It provides multidimensional arrays, high-level mathematical functions (for example, linear algebra and Fourier transform operations), random number generators and more. scikit-learn, the main Python library for implementing machine learning functionality, relies heavily on the NumPy array as its primary data structure. Therefore, as the initial step, let us first review the NumPy array.

NumPy Array: The array data structure in the NumPy library stores regular data, that is, elements of the same type (for example, integers), in a structured way. An array can have any number of dimensions: one-dimensional (1D), two-dimensional (2D), three-dimensional (3D) and so on. A dimension is called an axis in NumPy. Hence, a 2D array has 2 axes.

Examples:

1D array

[3, 5, 16, 18]

This array has 1 axis of length 4.

2D array

[[3, 5, 16, 18],
 [45, 3, 7, 79]]

This array has 2 axes. The first axis has a length of 2 and the second axis has a length of 4.

3D array

[[[45, 7, 0],
  [34, 2, 8]],
 [[2, 22, 9],
  [4, 5, 42]]]

This array has 3 axes. The first and second axes have a length of 2 and the third axis has a length of 3.
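The three examples above can be reproduced with np.array(). This is a minimal sketch; the variable names a1, a2 and a3 are chosen here only for illustration.

```python
import numpy as np

# The 1D, 2D and 3D examples from the text, built from nested lists.
a1 = np.array([3, 5, 16, 18])
a2 = np.array([[3, 5, 16, 18],
               [45, 3, 7, 79]])
a3 = np.array([[[45, 7, 0], [34, 2, 8]],
               [[2, 22, 9], [4, 5, 42]]])

for name, arr in (("a1", a1), ("a2", a2), ("a3", a3)):
    # ndim is the number of axes; shape holds the length of each axis.
    print(name, arr.ndim, arr.shape)
# a1 1 (4,)
# a2 2 (2, 4)
# a3 3 (2, 2, 3)
```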

The NumPy array() function creates a NumPy array. The following are a few of the salient attributes of the resulting array object.

• ndim: The number of axes of the array.

• dtype: The data type of the elements in the array.

• shape: The dimensions of the array, that is, the length of each axis.

• size: The total number of elements in the array.
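The attributes listed above can be inspected on any array. The sketch below uses illustrative names (ints, floats); the default integer type depends on the platform, so it is checked with np.issubdtype rather than printed.

```python
import numpy as np

# dtype is inferred from the data unless it is given explicitly.
ints = np.array([[10, 2, 3], [23, 45, 67]])
floats = np.array([[10, 2, 3], [23, 45, 67]], dtype=np.float64)

print(ints.ndim)     # 2
print(ints.shape)    # (2, 3)
print(ints.size)     # 6
print(floats.dtype)  # float64

# The inferred integer type is platform dependent, so test its kind instead.
print(np.issubdtype(ints.dtype, np.integer))  # True
```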

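A commonly created array is an empty array of a specific shape, which can be used as a data structure to hold dynamic data. np.empty() allocates the array without initializing its elements, so every element should be assigned before it is read. Empty arrays can be created in the following way.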

M09 Big Data Simplified XXXX 01.indd 234 5/10/2019 10:22:56 AM

Working with Big Data in Python | 235

# Defining 1-D array variable with data
>>> import numpy as np
>>> var2 = np.empty(4)
>>> var2[0] = 5.67
>>> var2[1] = 2
>>> var2[2] = 56
>>> var2[3] = 304
>>> print(var2)
[  5.67   2.    56.   304.  ]
>>> print(var2.shape) # Returns the dimensions of the array
(4,)
>>> print(var2.size) # Returns the total number of elements
4

# Defining 2-D array variable with data
>>> var3 = np.empty((2,3))
>>> var3[0][0] = 5.67
>>> var3[0][1] = 2
>>> var3[0][2] = 56
>>> var3[1][0] = .09
>>> var3[1][1] = 132
>>> var3[1][2] = 1056
>>> print(var3)
[[5.670e+00 2.000e+00 5.600e+01]
 [9.000e-02 1.320e+02 1.056e+03]]

[Note: The same result is obtained with dtype=float (np.float64), which is the default for np.empty. The older alias np.float is no longer available in current NumPy releases.]

# Collapse the 2-D array into a single dimension
>>> print(var3.flatten())
[5.670e+00 2.000e+00 5.600e+01 9.000e-02 1.320e+02 1.056e+03]
>>> print(var3.shape)
(2, 3)

# Same declaration with dtype mentioned
>>> var3 = np.empty((2,3), dtype=int)
>>> var3[0] = [5.67, 2, 56]    # Same values as before, assigned row by row
>>> var3[1] = [.09, 132, 1056]
>>> print(var3)
[[   5    2   56]
 [   0  132 1056]]

[Note: The float values are truncated when they are stored as integers, for example, 5.67 becomes 5 and .09 becomes 0.]

>>> print(var3[1]) # Returns a row of an array
[   0  132 1056]
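The float-to-integer conversion noted above can be examined in isolation. The following is a minimal sketch; the negative value is an assumption added here (it is not in the original example) to show that the fractional part is dropped, which differs from rounding down for negative numbers. np.floor and np.rint are shown as alternatives when a different rounding rule is wanted.

```python
import numpy as np

values = np.array([5.67, 0.09, -5.67])

truncated = values.astype(int)          # drops the fractional part
floored = np.floor(values).astype(int)  # rounds toward minus infinity
nearest = np.rint(values).astype(int)   # rounds to the nearest integer

print(truncated.tolist())  # [5, 0, -5]
print(floored.tolist())    # [5, 0, -6]
print(nearest.tolist())    # [6, 0, -6]
```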


>>> print(var3[[0, 1]]) # Returns multiple rows of an array
[[   5    2   56]
 [   0  132 1056]]
>>> print(var3[:, 2]) # Returns a column of an array
[  56 1056]
>>> print(var3[:, [1, 2]]) # Returns multiple columns of an array
[[   2   56]
 [ 132 1056]]
>>> print(var3[1][2]) # Returns a cell value of an array
1056
>>> print(var3[1, 2]) # Returns a cell value of an array
1056
>>> print(np.transpose(var3)) # Returns transpose of an array
[[   5    0]
 [   2  132]
 [  56 1056]]
>>> print(var3.reshape(3,2)) # Returns an array with changed dimensions
[[   5    2]
 [  56    0]
 [ 132 1056]]
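A few properties of the indexing, transpose and reshape calls above can be checked directly. This is a small sketch using an illustrative array named grid with the same integer values; the -1 argument to reshape and the .T attribute are standard NumPy features that the listing above does not use.

```python
import numpy as np

grid = np.array([[5, 2, 56],
                 [0, 132, 1056]])

# Chained and tuple indexing select the same cell.
print(grid[1][2] == grid[1, 2])  # True

# .T is shorthand for np.transpose on an array.
print(np.array_equal(grid.T, np.transpose(grid)))  # True

# -1 asks reshape to infer that axis length from the total size.
print(grid.reshape(3, -1).shape)  # (3, 2)

# reshape keeps the element count fixed; 6 elements cannot fill a 2 x 4 array.
try:
    grid.reshape(2, 4)
except ValueError as exc:
    print("reshape failed:", exc)
```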

Practice Problem: Create and concatenate arrays.

>>> import numpy as np
>>> arr1 = np.empty((2,3), dtype=int)
>>> arr1[0][0] = 5.67
>>> arr1[0][1] = 2
>>> arr1[0][2] = 56
>>> arr1[1][0] = .09
>>> arr1[1][1] = 132
>>> arr1[1][2] = 1056
>>> arr2 = np.empty((1,3), dtype=int)
>>> arr2[0][0] = 37
>>> arr2[0][1] = 2.193
>>> arr2[0][2] = 5609
>>> arr_concat = np.concatenate((arr1, arr2), axis=0) # Appends arr2 below arr1
>>> print(arr_concat)
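Concatenation requires the arrays to match on every axis except the one being joined. The sketch below illustrates this along both axes with arrays holding the practice problem's integer values; extra_col is a hypothetical array introduced only for this illustration.

```python
import numpy as np

arr1 = np.array([[5, 2, 56], [0, 132, 1056]])   # shape (2, 3)
arr2 = np.array([[37, 2, 5609]])                # shape (1, 3)

# axis=0 stacks rows: the column counts (3 and 3) must match.
rows = np.concatenate((arr1, arr2), axis=0)
print(rows.shape)  # (3, 3)

# axis=1 stacks columns: the row counts (2 and 2) must match.
extra_col = np.array([[7], [8]])                # shape (2, 1)
cols = np.concatenate((arr1, extra_col), axis=1)
print(cols.tolist())  # [[5, 2, 56, 7], [0, 132, 1056, 8]]
```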

NumPy also provides functions that create arrays filled with ones, zeros or random numbers, or with pre-filled values, as shown below.

>>> import numpy as np

# Create an array of 1s
>>> np.ones((2,3))
array([[1., 1., 1.],
       [1., 1., 1.]])

# Create an array of 0s
>>> np.zeros((2,3), dtype=int)
array([[0, 0, 0],
       [0, 0, 0]])

# Create an array with random numbers (the values differ on every run)
>>> np.random.random((2,2))
array([[0.47448072, 0.49876875],
       [0.29531478, 0.48425055]])

# Defining an array variable with pre-filled data
>>> var1 = np.array([[10,2,3], [23,45,67]])
>>> print(var1)
[[10  2  3]
 [23 45 67]]
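A few related creation functions follow the same pattern. This is a hedged sketch rather than part of the original listing: np.full, np.arange and np.random.default_rng are standard NumPy calls, while the variable names and the seed value 42 are arbitrary choices made here.

```python
import numpy as np

filled = np.full((2, 3), 7)            # every element set to 7
steps = np.arange(0, 10, 2)            # evenly spaced integers: 0, 2, 4, 6, 8
rng = np.random.default_rng(seed=42)   # seeded generator, repeatable across runs
samples = rng.random((2, 2))           # floats in the half-open interval [0, 1)

print(filled.tolist())  # [[7, 7, 7], [7, 7, 7]]
print(steps.tolist())   # [0, 2, 4, 6, 8]
print(samples.shape)    # (2, 2)
```

Unlike np.empty, these functions return fully initialized arrays, so their contents can be read immediately.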

Mathematical and Statistical Functions in NumPy: The following table summarizes the key mathematical

functions provided by NumPy.

1. Command: sin, cos, tan, arcsin, arccos, arctan, degrees, etc.
   Purpose: Trigonometric functions.
   Sample code with output:

   >>> import numpy as np
   >>> array1 = np.array([30,60,90])
   >>> np.sin(array1*np.pi/180) # Convert degrees to radians first
   array([0.5      , 0.8660254, 1.       ])


2. Command: around, floor, ceil
   Purpose: For rounding decimals to the desired precision.
   Sample code with output:

   >>> arr2 = np.array([67.07, 88.10, 34, 231.67, 0.934])
   >>> print(arr2)
   [ 67.07   88.1    34.    231.67    0.934]
   >>> np.around(arr2)
   array([ 67.,  88.,  34., 232.,   1.])
   >>> np.around(arr2, decimals = 2)
   array([ 67.07,  88.1 ,  34.  , 231.67,   0.93])
   >>> np.floor(arr2)
   array([ 67.,  88.,  34., 231.,   0.])
   >>> np.ceil(arr2)
   array([ 68.,  89.,  34., 232.,   1.])

3. Command: add, subtract, multiply, divide, power, reciprocal, mod, etc.
   Purpose: Basic mathematical operations on arrays.
   Sample code with output:

   >>> arr1 = np.arange(6, dtype=int).reshape(2,3)
   >>> arr1
   array([[0, 1, 2],
          [3, 4, 5]])
   >>> arr2 = np.arange(4, 15, 2, dtype=int).reshape(2,3)
   >>> arr2
   array([[ 4,  6,  8],
          [10, 12, 14]])
   >>> np.add(arr1, arr2)
   array([[ 4,  7, 10],
          [13, 16, 19]])
   >>> np.subtract(arr1, arr2)
   array([[-4, -5, -6],
          [-7, -8, -9]])
   >>> np.multiply(arr1, arr2)
   array([[ 0,  6, 16],
          [30, 48, 70]])
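The remaining commands named in the table (divide, power, mod, reciprocal) and the degree-to-radian conversion can be sketched in the same way. This example is an illustration built on the same arr1 and arr2 values, not the book's own listing; np.deg2rad is used here as an alternative to multiplying by pi/180.

```python
import numpy as np

arr1 = np.arange(6).reshape(2, 3)         # [[0, 1, 2], [3, 4, 5]]
arr2 = np.arange(4, 15, 2).reshape(2, 3)  # [[4, 6, 8], [10, 12, 14]]

print(np.divide(arr2, 2).tolist())  # [[2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(np.power(arr1, 2).tolist())   # [[0, 1, 4], [9, 16, 25]]
print(np.mod(arr2, 5).tolist())     # [[4, 1, 3], [0, 2, 4]]

# reciprocal is applied to floats here; on integer arrays it performs
# integer division, which is rarely what is wanted.
print(np.reciprocal(np.array([2.0, 4.0])).tolist())  # [0.5, 0.25]

# Trigonometric functions expect radians, so convert the degrees first.
angles = np.array([30, 60, 90])
print(np.round(np.sin(np.deg2rad(angles)), 4).tolist())  # [0.5, 0.866, 1.0]
```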
