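NumPy has a specialized chararray object, which can hold strings. It is a subclass of ndarray and has special string methods. We will download a text from the Python website and use those methods. The advantages of chararray over a normal array of strings, as described in the NumPy documentation, are that trailing whitespace is removed automatically when elements are indexed, that the comparison operators ignore trailing whitespace, and that vectorized string operations are available as methods, so explicit loops are not needed.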
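The difference from a plain string array is easiest to see on a tiny example. The following is a minimal sketch, not part of the original recipe; the sample words are made up, and the class is reached through numpy.char.chararray, which is assumed to be the same class the text refers to as numpy.chararray.

```python
import numpy

# Hypothetical sample data with trailing spaces.
plain = numpy.array(["spam  ", "eggs "])
# View the same data as a chararray (no copy is made).
chars = plain.view(numpy.char.chararray)

# Indexing a plain array keeps the trailing spaces; a chararray trims them.
print(len(plain[0]))    # 6
print(len(chars[0]))    # 4

# Comparison on a chararray ignores trailing whitespace.
print((chars == "spam").tolist())   # [True, False]

# String methods are applied to every element without an explicit loop.
words = numpy.array(["spam", "eggs"]).view(numpy.char.chararray)
print(words.upper().tolist())            # ['SPAM', 'EGGS']
print(words.startswith("sp").tolist())   # [True, False]
```

Note that only the view changes behavior here; the underlying buffer still holds the padded strings.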
Let's create the character array. One way to do this is to view an ordinary string array as a chararray:
carray = numpy.array(html).view(numpy.chararray)
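To see what this view produces without downloading anything, here is a small sketch that uses a made-up string in place of the real html variable. The sample text is an assumption, and numpy.char.chararray is used on the assumption that it is the same class as numpy.chararray.

```python
import numpy

# Hypothetical stand-in for the downloaded page text.
html = "Welcome\tto Python\nDownload\tDocs"

# numpy.array() on a single string gives a zero-dimensional array;
# view() reinterprets it as a chararray without copying the data.
carray = numpy.array(html).view(numpy.char.chararray)

print(type(carray).__name__)   # chararray
print(carray.shape)            # ()
print(carray.dtype.kind)       # U
```

Because the whole text is one string, the result is a single-element (zero-dimensional) character array rather than an array of lines.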
Expand tabs to spaces with the expandtabs function. This function accepts the tab size as an argument. The value is 8 if not specified:
carray = carray.expandtabs(1)
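The effect of the tab size can be checked on a short string. This is an illustrative sketch with a made-up input; numpy.char.chararray is assumed to be the same class as numpy.chararray.

```python
import numpy

# Hypothetical sample text containing two tab characters.
text = "a\tb\tc"
carray = numpy.array(text).view(numpy.char.chararray)

# A tab size of 1 replaces every tab with a single space.
one = carray.expandtabs(1)
# With no argument, tab stops are 8 columns apart.
default = carray.expandtabs()

print(repr(one.item()))      # 'a b c'
print(len(default.item()))   # 17
```

A tab size of 1 is a convenient choice when the tabs only serve as separators and the column alignment does not matter.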
The splitlines function can split a string into separate lines:
carray = carray.splitlines()
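As a quick check of what splitlines returns, the sketch below applies it to a made-up three-line string. The sample text is an assumption, as is the use of numpy.char.chararray for the class the text calls numpy.chararray.

```python
import numpy

# Hypothetical sample text with three lines.
text = "first line\nsecond line\nthird line"
carray = numpy.array(text).view(numpy.char.chararray)

# The result is an object array; each element holds a Python list of lines.
lines = carray.splitlines()

print(lines.dtype)    # object
print(lines.item())   # ['first line', 'second line', 'third line']
```

Since the input holds a single string, the result holds a single list, which item() extracts as an ordinary Python list.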
The following is the complete code for this example:
import urllib2
import numpy
import re

response = urllib2.urlopen('http://python.org/')
html = response.read()
html = re.sub(r'<.*?>', '', html)
carray = numpy.array(html).view(numpy.chararray)
carray = carray.expandtabs(1)
carray = carray.splitlines()
print carray
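The listing above targets Python 2 (urllib2 and the print statement). For readers on Python 3, the following is one possible port, offered as a sketch rather than as the original author's code. The helper names fetch and html_to_lines are hypothetical, the page is assumed to be UTF-8 encoded, and numpy.char.chararray is assumed to be the same class as numpy.chararray. The demonstration runs on a made-up snippet so that it works offline.

```python
import re
import urllib.request

import numpy


def fetch(url):
    """Download a page and decode it to text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url) as response:
        # read() returns bytes in Python 3, so decode before using re.sub.
        return response.read().decode("utf-8", errors="replace")


def html_to_lines(html):
    """Strip tags, expand tabs and split into lines, as in the listing above."""
    text = re.sub(r"<.*?>", "", html)
    carray = numpy.array(text).view(numpy.char.chararray)
    carray = carray.expandtabs(1)
    return carray.splitlines()


# Offline demonstration on a hypothetical HTML snippet.
sample = "<h1>Python</h1>\n<p>Batteries\tincluded</p>"
print(html_to_lines(sample).item())   # ['Python', 'Batteries included']
```

With network access, the same steps as the original recipe would be html_to_lines(fetch('http://python.org/')).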