This chapter is one of the most important in this book. We now begin to dive into the core of pandas. We start with a tour of NumPy ndarrays, a data structure that belongs to NumPy rather than pandas. Knowledge of NumPy ndarrays is useful because they form the foundation for the pandas data structures. Another key benefit of NumPy arrays is that they support what are known as vectorized operations: operations that would otherwise require looping over a Python sequence element by element run much faster on a NumPy array.
The topics we will cover in this chapter include the following:
numpy.ndarray data structure
pandas.Series: 1-dimensional (1D) pandas data structure
pandas.DataFrame: 2-dimensional (2D) pandas tabular data structure
pandas.Panel: 3-dimensional (3D) pandas data structure
In this chapter, I will present the material via numerous examples using IPython, a browser-based interface that allows the user to type in commands interactively to the Python interpreter. Instructions for installing IPython are provided in the previous chapter.
The NumPy library is a very important package used for numerical computing with Python. Its primary features include the following:
numpy.ndarray, a homogeneous multidimensional array
For more information about NumPy, see http://www.numpy.org.
The primary data structure in NumPy is the array class ndarray. It is a homogeneous multi-dimensional (n-dimensional) table of elements, which are indexed by integers just like a normal array. However, numpy.ndarray (also known as numpy.array) is different from the standard Python array.array class, which offers much less functionality. More information on the various operations is provided at http://scipy-lectures.github.io/intro/numpy/array_object.html.
NumPy arrays can be created in a number of ways via calls to various NumPy methods.
NumPy arrays can be created via the numpy.array constructor directly:
In [1]: import numpy as np
In [2]: ar1=np.array([0,1,2,3]) # 1 dimensional array
In [3]: ar2=np.array([[0,3,5],[2,8,7]]) # 2D array
In [4]: ar1
Out[4]: array([0, 1, 2, 3])
In [5]: ar2
Out[5]: array([[0, 3, 5],
               [2, 8, 7]])
The shape of the array is given via ndarray.shape:
In [6]: ar2.shape
Out[6]: (2, 3)
The number of dimensions is obtained using ndarray.ndim:
In [7]: ar2.ndim
Out[7]: 2
np.arange is the NumPy version of Python's range function:
In [10]: # produces the integers from 0 to 11, not inclusive of 12
         ar3=np.arange(12); ar3
Out[10]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [11]: # start, end (exclusive), step size
         ar4=np.arange(3,10,3); ar4
Out[11]: array([3, 6, 9])
np.linspace generates linearly, evenly spaced elements between the start and the end:
In [13]: # args - start element, end element, number of elements
         ar5=np.linspace(0,2.0/3,4); ar5
Out[13]: array([ 0.        ,  0.22222222,  0.44444444,  0.66666667])
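As a small sketch of the difference between the two functions, assuming Python 3 and a recent NumPy release: np.arange takes a step size and excludes the end value, while np.linspace takes a number of points and, by default, includes the end value. The variable names are illustrative only.

```python
import numpy as np

# arange: start, stop (exclusive), step
by_step = np.arange(0, 10, 2)            # 0, 2, 4, 6, 8

# linspace: start, stop (inclusive by default), number of points
by_count = np.linspace(0, 10, 5)         # 0.0, 2.5, 5.0, 7.5, 10.0

# endpoint=False leaves the stop value out, much as arange does
half_open = np.linspace(0, 10, 5, endpoint=False)   # 0.0, 2.0, 4.0, 6.0, 8.0

print(by_step, by_count, half_open)
```

np.linspace is usually the safer choice for floating-point ranges, because the number of elements is stated explicitly rather than derived from a step.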
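Several functions create arrays with predefined or random contents. These functions include numpy.zeros, numpy.ones, numpy.eye, numpy.random.rand, numpy.random.randn, and numpy.empty.
For numpy.zeros, numpy.ones, and numpy.empty, the shape must be given as a tuple; for a 1D array, you can just specify the number of elements, with no need for a tuple. numpy.eye takes the number of rows, while numpy.random.rand and numpy.random.randn take each dimension as a separate argument.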
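The following minimal sketch, assuming Python 3 and a recent NumPy release, shows the two ways of passing a shape. The variable names are illustrative.

```python
import numpy as np

flat = np.zeros(3)          # 1D: a plain integer is enough
grid = np.zeros((2, 3))     # 2D and above: the shape is passed as a tuple
noise = np.random.rand(2, 3)  # rand takes each dimension as a separate argument

print(flat.shape, grid.shape, noise.shape)
```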
The following example illustrates numpy.ones:
In [14]: # Produces 2x3x2 array of 1's.
         ar7=np.ones((2,3,2)); ar7
Out[14]: array([[[ 1.,  1.],
                 [ 1.,  1.],
                 [ 1.,  1.]],
                [[ 1.,  1.],
                 [ 1.,  1.],
                 [ 1.,  1.]]])
The following example illustrates numpy.zeros:
In [15]: # Produce 4x2 array of zeros.
         ar8=np.zeros((4,2)); ar8
Out[15]: array([[ 0.,  0.],
                [ 0.,  0.],
                [ 0.,  0.],
                [ 0.,  0.]])
The following example illustrates numpy.eye:
In [17]: # Produces identity matrix
         ar9 = np.eye(3); ar9
Out[17]: array([[ 1.,  0.,  0.],
                [ 0.,  1.,  0.],
                [ 0.,  0.,  1.]])
The following example illustrates numpy.diag:
In [18]: # Create diagonal array
         ar10=np.diag((2,1,4,6)); ar10
Out[18]: array([[2, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 4, 0],
                [0, 0, 0, 6]])
The following example illustrates numpy.random.rand and numpy.random.randn:
In [19]: # Using the rand, randn functions
         # rand(m) produces m uniformly distributed random numbers in the range 0 to 1
         np.random.seed(100) # Set seed
         ar11=np.random.rand(3); ar11
Out[19]: array([ 0.54340494,  0.27836939,  0.42451759])
In [20]: # randn(m) produces m normally distributed (Gaussian) random numbers
         ar12=np.random.randn(5); ar12
Out[20]: array([ 0.35467445, -0.78606433, -0.2318722 ,  0.20797568,  0.93580797])
Using np.empty to create an uninitialized array is a cheaper and faster way to allocate an array than using np.ones or np.zeros (malloc versus calloc). However, you should only use it if you're sure that all the elements will be initialized later:
In [21]: ar13=np.empty((3,2)); ar13
Out[21]: array([[ -2.68156159e+154,   1.28822983e-231],
                [  4.22764845e-307,   2.78310358e-309],
                [  2.68156175e+154,   4.17201483e-309]])
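A typical pattern, sketched here under the assumption of Python 3 and a recent NumPy release, is to allocate with np.empty and then write every element before reading any of them. The array name is illustrative.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point
squares = np.empty(5)

# Fill every element before the array is read
for i in range(squares.size):
    squares[i] = i ** 2

print(squares.tolist())   # [0.0, 1.0, 4.0, 9.0, 16.0]
```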
The np.tile function allows one to construct an array from a smaller array by repeating it several times on the basis of a parameter:
In [334]: np.array([[1,2],[6,7]])
Out[334]: array([[1, 2],
                 [6, 7]])
In [335]: np.tile(np.array([[1,2],[6,7]]),3)
Out[335]: array([[1, 2, 1, 2, 1, 2],
                 [6, 7, 6, 7, 6, 7]])
In [336]: np.tile(np.array([[1,2],[6,7]]),(2,2))
Out[336]: array([[1, 2, 1, 2],
                 [6, 7, 6, 7],
                 [1, 2, 1, 2],
                 [6, 7, 6, 7]])
We can specify the type of contents of a numeric array by using the dtype parameter:
In [50]: ar=np.array([2,-1,6,3],dtype='float'); ar
Out[50]: array([ 2., -1.,  6.,  3.])
In [51]: ar.dtype
Out[51]: dtype('float64')
In [52]: ar=np.array([2,4,6,8]); ar.dtype
Out[52]: dtype('int64')
In [53]: ar=np.array([2.,4,6,8]); ar.dtype
Out[53]: dtype('float64')
The default dtype in NumPy is float; it applies when the type is not inferred from the data, as with np.zeros and np.ones. In the case of strings, the dtype records the length of the longest string in the array:
In [56]: sar=np.array(['Goodbye','Welcome','Tata','Goodnight']); sar.dtype
Out[56]: dtype('S9')
You cannot create variable-length strings in NumPy, since NumPy needs to know how much space to allocate for the string. dtypes can also be Boolean values, complex numbers, and so on:
In [57]: bar=np.array([True, False, True]); bar.dtype
Out[57]: dtype('bool')
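Returning to fixed-width strings, one practical consequence can be sketched as follows, assuming Python 3 and a recent NumPy release. Under Python 3 the strings are Unicode, so the dtype is reported as a 9-character Unicode type rather than S9. The replacement string is made up for illustration.

```python
import numpy as np

sar = np.array(['Goodbye', 'Welcome', 'Tata', 'Goodnight'])
print(sar.dtype)     # width 9, the length of 'Goodnight'

# Assigning a longer string does not widen the array;
# the value is silently truncated to the existing width.
sar[2] = 'Farewell, everyone'
print(sar[2])        # Farewell,
```

If longer values are expected, a wider dtype can be requested up front, for example dtype='U30'.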
The datatype of an ndarray can be changed in much the same way as we cast in other languages such as Java or C/C++, for example, from float to int. The mechanism for doing this is the numpy.ndarray.astype() function. Here is an example:
In [3]: f_ar = np.array([3,-2,8.18])
        f_ar
Out[3]: array([ 3.  , -2.  ,  8.18])
In [4]: f_ar.astype(int)
Out[4]: array([ 3, -2,  8])
More information on casting can be found in the official documentation at http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.astype.html.
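Note that casting from float to int discards the fractional part rather than rounding. The sketch below, assuming Python 3 and a recent NumPy release, contrasts the two; the extra value -8.9 is added for illustration.

```python
import numpy as np

f_ar = np.array([3, -2, 8.18, -8.9])

truncated = f_ar.astype(int)           # fractional part dropped: 3, -2, 8, -8
rounded = np.rint(f_ar).astype(int)    # rounded to nearest first: 3, -2, 8, -9

# astype returns a new array; f_ar itself keeps its float dtype
print(truncated.tolist(), rounded.tolist(), f_ar.dtype)
```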
Array indices in NumPy start at 0, as in languages such as Python, Java, and C++, and unlike in Fortran, Matlab, and Octave, where they start at 1. Arrays can be indexed in the standard way, as we would index into any other Python sequence:
# print entire array, element 0, element 1, last element.
In [36]: ar = np.arange(5); print(ar); ar[0], ar[1], ar[-1]
[0 1 2 3 4]
Out[36]: (0, 1, 4)
# 2nd, last and 1st elements
In [65]: ar=np.arange(5); ar[1], ar[-1], ar[0]
Out[65]: (1, 4, 0)
Arrays can be reversed using the ::-1 idiom as follows:
In [24]: ar=np.arange(5); ar[::-1]
Out[24]: array([4, 3, 2, 1, 0])
Multi-dimensional arrays are indexed using tuples of integers:
In [71]: ar = np.array([[2,3,4],[9,8,7],[11,12,13]]); ar
Out[71]: array([[ 2,  3,  4],
                [ 9,  8,  7],
                [11, 12, 13]])
In [72]: ar[1,1]
Out[72]: 8
Here, we set the entry at row 1 and column 1 to 5:
In [75]: ar[1,1]=5; ar
Out[75]: array([[ 2,  3,  4],
                [ 9,  5,  7],
                [11, 12, 13]])
Retrieve row 2:
In [76]: ar[2]
Out[76]: array([11, 12, 13])
In [77]: ar[2,:]
Out[77]: array([11, 12, 13])
Retrieve column 1:
In [78]: ar[:,1]
Out[78]: array([ 3,  5, 12])
If an index is specified that is out of bounds for the array, an IndexError will be raised:
In [6]: ar = np.array([0,1,2])
In [7]: ar[5]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-8ef7e0800b7a> in <module>()
----> 1 ar[5]
IndexError: index 5 is out of bounds for axis 0 with size 3
Thus, for 2D arrays, the first dimension denotes rows and the second dimension, the columns. The colon (:) denotes selection across all elements of the dimension.
Arrays can be sliced using the following syntax: ar[startIndex: endIndex: stepValue].
In [82]: ar=2*np.arange(6); ar
Out[82]: array([ 0,  2,  4,  6,  8, 10])
In [85]: ar[1:5:2]
Out[85]: array([2, 6])
Note that the element at endIndex is excluded; if we wish to include it, we need to go above it, as follows:
In [86]: ar[1:6:2]
Out[86]: array([ 2,  6, 10])
Obtain the first n elements using ar[:n]:
In [91]: ar[:4]
Out[91]: array([0, 2, 4, 6])
The implicit assumption here is that startIndex=0, step=1.
Start at element 4 until the end:
In [92]: ar[4:]
Out[92]: array([ 8, 10])
Slice array with stepValue=3:
In [94]: ar[::3]
Out[94]: array([0, 6])
To illustrate the scope of indexing in NumPy, let us refer to this illustration, which is taken from a NumPy lecture given at SciPy 2013 and can be found at http://bit.ly/1GxCDpC:
Let us now examine the meanings of the expressions in the preceding image:
a[0,3:5] selects row 0 and columns 3 to 5, where column 5 is not included.
a[4:,4:]: the first 4: indicates the start at row 4 and, on its own, would give all columns, that is, the array [[40, 41, 42, 43, 44, 45], [50, 51, 52, 53, 54, 55]]. The second 4: shows the cutoff at the start of column 4 to produce the array [[44, 45], [54, 55]].
a[:,2] gives all rows from column 2.
a[2::2,::2]: 2::2 indicates that the start is at row 2 and the step value here is also 2. This would give us the array [[20, 21, 22, 23, 24, 25], [40, 41, 42, 43, 44, 45]]. Further, ::2 specifies that we retrieve columns in steps of 2, producing the end result array([[20, 22, 24], [40, 42, 44]]).
Assignment and slicing can be combined as shown in the following code snippet:
In [96]: ar
Out[96]: array([ 0,  2,  4,  6,  8, 10])
In [100]: ar[:3]=1; ar
Out[100]: array([ 1,  1,  1,  6,  8, 10])
In [110]: ar[2:]=np.ones(4);ar
Out[110]: array([1, 1, 1, 1, 1, 1])
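The 2D slicing expressions discussed earlier can be checked with a short sketch. It assumes, consistent with the values quoted in the text, that the 6 × 6 array in the illustration holds 10*i + j at row i and column j; the construction used here is one of several possible ones.

```python
import numpy as np

# Assumed reconstruction of the illustrated array: entry (i, j) is 10*i + j
a = 10 * np.arange(6).reshape(6, 1) + np.arange(6)

print(a[0, 3:5].tolist())      # [3, 4]
print(a[4:, 4:].tolist())      # [[44, 45], [54, 55]]
print(a[:, 2].tolist())        # [2, 12, 22, 32, 42, 52]
print(a[2::2, ::2].tolist())   # [[20, 22, 24], [40, 42, 44]]
```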
NumPy arrays can be used as masks to select or filter out elements of the original array. For example, see the following snippet:
In [146]: np.random.seed(10)
          ar=np.random.random_integers(0,25,10); ar
Out[146]: array([ 9,  4, 15,  0, 17, 25, 16, 17,  8,  9])
In [147]: evenMask=(ar % 2==0); evenMask
Out[147]: array([False,  True, False,  True, False, False,  True, False,  True, False], dtype=bool)
In [148]: evenNums=ar[evenMask]; evenNums
Out[148]: array([ 4,  0, 16,  8])
In the preceding example, we randomly generate an array of 10 integers between 0 and 25. Then, we create a Boolean mask array that is used to select only the even numbers. This masking feature can be very useful, for example, if we wish to eliminate missing values by replacing them with a default value. Here, the missing value '' (an empty string) is replaced by 'USA' as the default country:
In [149]: ar=np.array(['Hungary','Nigeria', 'Guatemala','','Poland', '','Japan']); ar
Out[149]: array(['Hungary', 'Nigeria', 'Guatemala', '', 'Poland', '', 'Japan'], dtype='|S9')
In [150]: ar[ar=='']='USA'; ar
Out[150]: array(['Hungary', 'Nigeria', 'Guatemala', 'USA', 'Poland', 'USA', 'Japan'], dtype='|S9')
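Masked assignment changes the array in place. When the original should be kept, np.where offers an alternative that builds a new array from the same mask. This is a sketch assuming Python 3 and a recent NumPy release, with a shortened, illustrative country list.

```python
import numpy as np

countries = np.array(['Hungary', 'Nigeria', '', 'Poland', ''])

missing = (countries == '')                    # Boolean mask of empty entries
filled = np.where(missing, 'USA', countries)   # new array; countries is untouched

print(int(missing.sum()))     # 2
print(filled.tolist())        # ['Hungary', 'Nigeria', 'USA', 'Poland', 'USA']
```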
Arrays of integers can also be used to index an array to produce another array. Note that this produces multiple values; hence, the output must be an array of type ndarray. This is illustrated in the following snippet:
In [173]: ar=11*np.arange(0,10); ar
Out[173]: array([ 0, 11, 22, 33, 44, 55, 66, 77, 88, 99])
In [174]: ar[[1,3,4,2,7]]
Out[174]: array([11, 33, 44, 22, 77])
In the preceding code, the selection object is a list, and the elements at indices 1, 3, 4, 2, and 7 are selected. Now, assume that we change it to the following:
In [175]: ar[1,3,4,2,7]
We get an IndexError since the array is 1D and we're specifying too many indices to access it:
IndexError                                Traceback (most recent call last)
<ipython-input-175-adbcbe3b3cdc> in <module>()
----> 1 ar[1,3,4,2,7]
IndexError: too many indices
Assignment is also possible with array indexing, as follows:
In [176]: ar[[1,3]]=50; ar
Out[176]: array([ 0, 50, 22, 50, 44, 55, 66, 77, 88, 99])
When a new array is created from another array by using a list of array indices, the new array has the same shape as the list of indices.
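The point about shape can be made concrete with a 2D index array. This is a minimal sketch assuming Python 3 and a recent NumPy release; the names idx and picked are illustrative.

```python
import numpy as np

ar = 11 * np.arange(10)
idx = np.array([[1, 3],
                [4, 7]])

picked = ar[idx]
print(picked.shape)       # (2, 2), the shape of idx
print(picked.tolist())    # [[11, 33], [44, 77]]

# Indexing with an integer array returns a copy, not a view
picked[0, 0] = -1
print(ar[1])              # 11 (the original is unchanged)
```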
Here, we illustrate how to assign values from a smaller array into a slice of a larger one:
In [188]: ar=np.arange(15); ar
Out[188]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
In [193]: ar2=np.arange(0,-10,-1)[::-1]; ar2
Out[193]: array([-9, -8, -7, -6, -5, -4, -3, -2, -1,  0])
Slice out the first 10 elements of ar, and replace them with elements from ar2, as follows:
In [194]: ar[:10]=ar2; ar
Out[194]: array([-9, -8, -7, -6, -5, -4, -3, -2, -1,  0, 10, 11, 12, 13, 14])
A view on a NumPy array is just a particular way of portraying the data it contains. Creating a view does not result in a new copy of the array; rather, the data it contains may be arranged in a specific order, or only certain data rows may be shown. Thus, if data is replaced in the underlying array, this will be reflected in the view whenever the data is accessed via indexing.
Slicing creates a view: the initial array is not copied in memory during slicing, which makes slicing more efficient. The np.may_share_memory function can be used to see whether two arrays share the same memory block. However, it should be used with caution as it may produce false positives. Modifying a view modifies the original array:
In [118]: ar1=np.arange(12); ar1
Out[118]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [119]: ar2=ar1[::2]; ar2
Out[119]: array([ 0,  2,  4,  6,  8, 10])
In [120]: ar2[1]=-1; ar1
Out[120]: array([ 0,  1, -1,  3,  4,  5,  6,  7,  8,  9, 10, 11])
To force NumPy to copy an array, we use the np.copy function or, as here, the equivalent copy method. As we can see in the following example, the original array remains unaffected when the copied array is modified:
In [124]: ar=np.arange(8); ar
Out[124]: array([0, 1, 2, 3, 4, 5, 6, 7])
In [126]: arc=ar[:3].copy(); arc
Out[126]: array([0, 1, 2])
In [127]: arc[0]=-1; arc
Out[127]: array([-1,  1,  2])
In [128]: ar
Out[128]: array([0, 1, 2, 3, 4, 5, 6, 7])
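Whether an array is a view or a copy can also be checked directly. The sketch below, assuming Python 3 and NumPy 1.11 or later, uses the base attribute and np.shares_memory, which performs an exact check and so avoids the false positives mentioned for np.may_share_memory. The variable names are illustrative.

```python
import numpy as np

ar = np.arange(8)
view = ar[:3]             # slicing gives a view
copied = ar[:3].copy()    # explicit copy

print(view.base is ar)                   # True: the view refers back to ar
print(copied.base is None)               # True: the copy owns its data
print(np.shares_memory(ar, view))        # True
print(np.shares_memory(ar, copied))      # False
```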
Here, we present various operations in NumPy.
Basic arithmetic operations work element-wise with scalar operands. They are +, -, *, /, and **.
In [196]: ar=np.arange(0,7)*5; ar
Out[196]: array([ 0,  5, 10, 15, 20, 25, 30])
In [198]: ar=np.arange(5) ** 4; ar
Out[198]: array([  0,   1,  16,  81, 256])
In [199]: ar ** 0.5
Out[199]: array([  0.,   1.,   4.,   9.,  16.])
Operations also work element-wise when another array is the second operand as follows:
In [209]: ar=3+np.arange(0, 30,3); ar
Out[209]: array([ 3,  6,  9, 12, 15, 18, 21, 24, 27, 30])
In [210]: ar2=np.arange(1,11); ar2
Out[210]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
Here, in the following snippet, we see element-wise subtraction, division, and multiplication:
In [211]: ar-ar2
Out[211]: array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20])
In [212]: ar/ar2
Out[212]: array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
In [213]: ar*ar2
Out[213]: array([  3,  12,  27,  48,  75, 108, 147, 192, 243, 300])
It is much faster to do this using NumPy rather than pure Python. The %timeit function in IPython is known as a magic function; it uses the Python timeit module to time the execution of a Python statement or expression, as follows:
In [214]: ar=np.arange(1000)
          %timeit ar**3
100000 loops, best of 3: 5.4 µs per loop
In [215]: ar=range(1000)
          %timeit [ar[i]**3 for i in ar]
1000 loops, best of 3: 199 µs per loop
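Outside IPython, the same comparison can be sketched with the standard timeit module directly. This assumes Python 3 and a recent NumPy release; the repetition count of 1000 is an arbitrary choice, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

n = 1000
ar = np.arange(n)
lst = list(range(n))

# Time 1000 repetitions of each approach
vectorized = timeit.timeit(lambda: ar ** 3, number=1000)
looped = timeit.timeit(lambda: [x ** 3 for x in lst], number=1000)

print("NumPy: %.4fs  list comprehension: %.4fs" % (vectorized, looped))
```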
Array multiplication is not the same as matrix multiplication; it is element-wise, meaning that the corresponding elements are multiplied together. For matrix multiplication, use the dot method (or the np.dot function). For more information refer to http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html.
In [228]: ar=np.array([[1,1],[1,1]]); ar
Out[228]: array([[1, 1],
                 [1, 1]])
In [230]: ar2=np.array([[2,2],[2,2]]); ar2
Out[230]: array([[2, 2],
                 [2, 2]])
In [232]: ar.dot(ar2)
Out[232]: array([[4, 4],
                 [4, 4]])
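The difference between the two kinds of multiplication is easier to see with matrices whose entries differ. The following sketch assumes Python 3.5 or later (for the @ operator) and a recent NumPy release; the matrices are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[0, 1],
              [1, 0]])

elementwise = a * b     # corresponding entries multiplied: [[0, 2], [3, 0]]
matrix = a.dot(b)       # rows of a times columns of b:     [[2, 1], [4, 3]]
with_operator = a @ b   # the @ operator performs the same matrix product

print(elementwise.tolist(), matrix.tolist(), with_operator.tolist())
```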
Comparisons and logical operations are also element-wise:
In [235]: ar=np.arange(1,5); ar
Out[235]: array([1, 2, 3, 4])
In [238]: ar2=np.arange(5,1,-1); ar2
Out[238]: array([5, 4, 3, 2])
In [241]: ar < ar2
Out[241]: array([ True,  True, False, False], dtype=bool)
In [242]: l1 = np.array([True,False,True,False])
          l2 = np.array([False,False,True, False])
          np.logical_and(l1,l2)
Out[242]: array([False, False,  True, False], dtype=bool)
Other NumPy operations such as log, sin, cos, and exp are also element-wise:
In [244]: ar=np.array([np.pi, np.pi/2]); np.sin(ar)
Out[244]: array([  1.22464680e-16,   1.00000000e+00])
Note that for element-wise operations on two NumPy arrays, the two arrays must have the same shape, or shapes that are compatible under broadcasting (covered later in this chapter); otherwise, an error will result, since the operation is applied to corresponding elements of the two arrays:
In [245]: ar=np.arange(0,6); ar
Out[245]: array([0, 1, 2, 3, 4, 5])
In [246]: ar2=np.arange(0,8); ar2
Out[246]: array([0, 1, 2, 3, 4, 5, 6, 7])
In [247]: ar*ar2
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-247-2c3240f67b63> in <module>()
----> 1 ar*ar2
ValueError: operands could not be broadcast together with shapes (6) (8)
Further, NumPy arrays can be transposed as follows:
In [249]: ar=np.array([[1,2,3],[4,5,6]]); ar
Out[249]: array([[1, 2, 3],
                 [4, 5, 6]])
In [250]: ar.T
Out[250]: array([[1, 4],
                 [2, 5],
                 [3, 6]])
In [251]: np.transpose(ar)
Out[251]: array([[1, 4],
                 [2, 5],
                 [3, 6]])
Suppose we wish to compare arrays not element-wise, but array-wise. We can achieve this by using the np.array_equal function:
In [254]: ar=np.arange(0,6)
          ar2=np.array([0,1,2,3,4,5])
          np.array_equal(ar, ar2)
Out[254]: True
Here, we see that a single Boolean value is returned instead of a Boolean array. The value is True only if all the corresponding elements in the two arrays match. The preceding expression is equivalent to the following:
In [24]: np.all(ar==ar2)
Out[24]: True
Operators such as np.sum and np.prod perform reductions on arrays; that is, they combine several elements into a single value:
In [257]: ar=np.arange(1,5)
          ar.prod()
Out[257]: 24
In the case of multi-dimensional arrays, we can specify whether we want the reduction operator to be applied row-wise or column-wise by using the axis parameter:
In [259]: ar=np.array([np.arange(1,6),np.arange(1,6)]); ar
Out[259]: array([[1, 2, 3, 4, 5],
                 [1, 2, 3, 4, 5]])
# Columns
In [261]: np.prod(ar,axis=0)
Out[261]: array([ 1,  4,  9, 16, 25])
# Rows
In [262]: np.prod(ar,axis=1)
Out[262]: array([120, 120])
In the case of multi-dimensional arrays, not specifying an axis results in the operation being applied to all elements of the array, as shown in the following example:
In [268]: ar=np.array([[2,3,4],[5,6,7],[8,9,10]]); ar.sum()
Out[268]: 54
In [269]: ar.mean()
Out[269]: 6.0
In [271]: np.median(ar)
Out[271]: 6.0
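One way to read the axis parameter is as the dimension that the reduction collapses. The sketch below, assuming Python 3 and a recent NumPy release, applies the same reduction three ways to a small, made-up array.

```python
import numpy as np

ar = np.array([[1, 2, 3],
               [4, 5, 6]])      # shape (2, 3)

col_sums = ar.sum(axis=0)   # collapses axis 0: one value per column -> 5, 7, 9
row_sums = ar.sum(axis=1)   # collapses axis 1: one value per row    -> 6, 15
total = ar.sum()            # no axis: reduces over every element    -> 21

print(col_sums.shape, row_sums.shape, total)
```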
These functions apply standard statistical operations to a NumPy array. The names are self-explanatory: np.std(), np.mean(), np.median(), and np.cumsum().
In [309]: np.random.seed(10)
          ar=np.random.randint(0,10, size=(4,5)); ar
Out[309]: array([[9, 4, 0, 1, 9],
                 [0, 1, 8, 9, 0],
                 [8, 6, 4, 3, 0],
                 [4, 6, 8, 1, 8]])
In [310]: ar.mean()
Out[310]: 4.4500000000000002
In [311]: ar.std()
Out[311]: 3.4274626183227732
In [312]: ar.var(axis=0) # across rows
Out[312]: array([ 12.6875,   4.1875,  11.    ,  10.75  ,  18.1875])
In [313]: ar.cumsum()
Out[313]: array([ 9, 13, 13, 14, 23, 23, 24, 32, 41, 41, 49, 55, 59, 62, 62, 66, 72, 80, 81, 89])
Logical operators such as np.any() and np.all() can be used for array comparison/checking.
Generate a random 4 × 4 array of ints and check if any element is divisible by 7 and if all elements are less than 11:
In [320]: np.random.seed(100)
          ar=np.random.randint(1,10, size=(4,4)); ar
Out[320]: array([[9, 9, 4, 8],
                 [8, 1, 5, 3],
                 [6, 3, 3, 3],
                 [2, 1, 9, 5]])
In [318]: np.any((ar%7)==0)
Out[318]: False
In [319]: np.all(ar<11)
Out[319]: True
In broadcasting, we make use of NumPy's ability to combine arrays that don't have exactly the same shape. Here is an example:
In [357]: ar=np.ones([3,2]); ar
Out[357]: array([[ 1.,  1.],
                 [ 1.,  1.],
                 [ 1.,  1.]])
In [358]: ar2=np.array([2,3]); ar2
Out[358]: array([2, 3])
In [359]: ar+ar2
Out[359]: array([[ 3.,  4.],
                 [ 3.,  4.],
                 [ 3.,  4.]])
Thus, we can see that ar2 is broadcast across the rows of ar by adding it to each row of ar, producing the preceding result. Here is another example, showing that broadcasting works across dimensions:
In [369]: ar=np.array([[23,24,25]]); ar
Out[369]: array([[23, 24, 25]])
In [368]: ar.T
Out[368]: array([[23],
                 [24],
                 [25]])
In [370]: ar.T+ar
Out[370]: array([[46, 47, 48],
                 [47, 48, 49],
                 [48, 49, 50]])
Here, both row and column arrays were broadcast and we ended up with a 3 × 3 array.
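The result of the last example can be reproduced by repeating each operand explicitly, which is one way to picture what broadcasting does. This is a sketch assuming Python 3 and a recent NumPy release; NumPy does not actually materialize the repeated arrays, and the names col, row, and explicit are illustrative.

```python
import numpy as np

col = np.array([[23], [24], [25]])    # shape (3, 1)
row = np.array([[23, 24, 25]])        # shape (1, 3)

# The broadcast shape takes the larger size along each axis
print(np.broadcast(col, row).shape)   # (3, 3)

# Broadcasting behaves as if each operand were repeated to that shape
explicit = np.tile(col, (1, 3)) + np.tile(row, (3, 1))
print(np.array_equal(col + row, explicit))   # True

# Sizes that differ and are not 1 cannot be broadcast
try:
    np.ones((3, 2)) + np.ones(3)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                     # False
```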
There are a number of functions for the shape manipulation of arrays.
The np.ravel() function allows you to flatten a multi-dimensional array as follows:
In [385]: ar=np.array([np.arange(1,6), np.arange(10,15)]); ar
Out[385]: array([[ 1,  2,  3,  4,  5],
                 [10, 11, 12, 13, 14]])
In [386]: ar.ravel()
Out[386]: array([ 1,  2,  3,  4,  5, 10, 11, 12, 13, 14])
In [387]: ar.T.ravel()
Out[387]: array([ 1, 10,  2, 11,  3, 12,  4, 13,  5, 14])
You can also use the ndarray.flatten method, which does the same thing, except that it always returns a copy, while np.ravel returns a view whenever possible.
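The view-versus-copy distinction can be verified with np.shares_memory. The following sketch assumes Python 3 and NumPy 1.11 or later; it also shows a case in which ravel has to copy because the data is not laid out contiguously in the requested order.

```python
import numpy as np

ar = np.array([[1, 2, 3],
               [4, 5, 6]])

raveled = ar.ravel()        # view on a contiguous array
flattened = ar.flatten()    # always a copy

print(np.shares_memory(ar, raveled))      # True
print(np.shares_memory(ar, flattened))    # False

# The transpose is not contiguous in row-major order, so ravel must copy here
raveled_t = ar.T.ravel()
print(np.shares_memory(ar, raveled_t))    # False
```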
The reshape function can be used to change the shape of or unflatten an array:
In [389]: ar=np.arange(1,16); ar
Out[389]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])
In [390]: ar.reshape(3,5)
Out[390]: array([[ 1,  2,  3,  4,  5],
                 [ 6,  7,  8,  9, 10],
                 [11, 12, 13, 14, 15]])
The np.reshape function returns a view of the data, meaning that the underlying array remains unchanged. In special cases, however, the shape cannot be changed without the data being copied. For more details on this, see the documentation at http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html.
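Because the reshaped array is a view in the common case, writing through it also changes the original data. The sketch below, assuming Python 3 and a recent NumPy release, shows this along with the -1 placeholder, which asks NumPy to infer one dimension. The name grid is illustrative.

```python
import numpy as np

ar = np.arange(1, 16)
grid = ar.reshape(3, -1)     # -1 lets NumPy infer the remaining dimension (5 here)
print(grid.shape)            # (3, 5)

# For a contiguous array such as this one, reshape returns a view,
# so a write through the reshaped array shows up in the original.
grid[0, 0] = 100
print(ar[0])                 # 100
print(ar.shape)              # (15,): the shape of ar itself is unchanged
```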
There are two resize operators: numpy.ndarray.resize, which is an ndarray operator that resizes in place, and numpy.resize, which returns a new array with the specified shape. Here, we illustrate the numpy.ndarray.resize function:
In [408]: ar=np.arange(5); ar.resize((8,)); ar
Out[408]: array([0, 1, 2, 3, 4, 0, 0, 0])
Note that this function only works if there are no other references to this array; otherwise, a ValueError results:
In [34]: ar=np.arange(5); ar
Out[34]: array([0, 1, 2, 3, 4])
In [35]: ar2=ar
In [36]: ar.resize((8,));
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-394f7795e2d1> in <module>()
----> 1 ar.resize((8,));
ValueError: cannot resize an array that references or is referenced by another array in this way. Use the resize function
The way around this is to use the numpy.resize function instead:
In [38]: np.resize(ar,(8,))
Out[38]: array([0, 1, 2, 3, 4, 0, 1, 2])
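The two outputs above differ in how the extra elements are filled: numpy.resize repeats the original data, whereas ndarray.resize pads with zeros. The sketch below, assuming Python 3 and a recent NumPy release, puts the two side by side. Passing refcheck=False is a choice made here so that the example runs regardless of how many references exist; it skips the safety check described above and should be used with care.

```python
import numpy as np

ar = np.arange(5)

# np.resize returns a new array and repeats the data to fill it
repeated = np.resize(ar, (8,))
print(repeated.tolist())     # [0, 1, 2, 3, 4, 0, 1, 2]

# ndarray.resize works in place and pads with zeros
padded = ar.copy()
padded.resize((8,), refcheck=False)
print(padded.tolist())       # [0, 1, 2, 3, 4, 0, 0, 0]

print(ar.tolist())           # [0, 1, 2, 3, 4]: the original is untouched
```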
Arrays can be sorted in various ways.
Sort along axis 1, that is, within each row:
In [43]: ar=np.array([[3,2],[10,-1]])
         ar
Out[43]: array([[ 3,  2],
                [10, -1]])
In [44]: ar.sort(axis=1)
         ar
Out[44]: array([[ 2,  3],
                [-1, 10]])
Sort along axis 0, that is, within each column:
In [45]: ar=np.array([[3,2],[10,-1]])
         ar
Out[45]: array([[ 3,  2],
                [10, -1]])
In [46]: ar.sort(axis=0)
         ar
Out[46]: array([[ 3, -1],
                [10,  2]])
Sorting is available through both in-place (numpy.ndarray.sort) and out-of-place (np.sort) functions.
Other operations available on arrays include the following:
np.min(): It returns the minimum element in the array
np.max(): It returns the maximum element in the array
np.std(): It returns the standard deviation of the elements in the array
np.var(): It returns the variance of the elements in the array
np.argmin(): It returns the index of the minimum element
np.argmax(): It returns the index of the maximum element
np.all(): It returns the logical AND of all of the elements
np.any(): It returns the logical OR of all of the elements
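The difference between in-place and out-of-place sorting, together with a few of the functions listed above, can be sketched as follows. This assumes Python 3 and a recent NumPy release; the sample values are made up for illustration.

```python
import numpy as np

ar = np.array([3, 10, -1, 2])

sorted_copy = np.sort(ar)    # out of place: returns a sorted copy
print(ar.tolist())           # [3, 10, -1, 2] (unchanged)
print(sorted_copy.tolist())  # [-1, 2, 3, 10]

lowest_at = int(ar.argmin())     # 2, the position of -1
highest_at = int(ar.argmax())    # 1, the position of 10
all_above = bool(np.all(ar > -5))    # True: every element exceeds -5
any_above = bool(np.any(ar > 5))     # True: at least one element exceeds 5

ar.sort()                    # in place: ar itself is reordered
print(ar.tolist())           # [-1, 2, 3, 10]
```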