real*8
type
on the Fortran side. If you call older Fortran code with single precision
real
variables,
float32
is the corresponding type in Python.
13.
F2PY may introduce time-consuming copying of arrays.
NumPy arrays created in Python and sent to Fortran with the aid of
F2PY, will in the wrapper code be copied to new arrays with Fortran or-
dering unless they already have been transformed to this storage scheme
(see Chapters 9.3.2 and 9.3.3). To avoid overhead in such copying, the
calling Python code can explicitly call the
asarray
function with the
order=’Fortran’
argument to ensure array storage compatible with For-
tran (see page 464 for an illustrating example). It is a good habit to
explicitly convert all arrays to Fortran storage prior to calling the For-
tran code.
14.
Calling C++ classes via SWIG involves proxy classes.
C++ extension modules created by SWIG have their classes mirrored in
Python. However, the class used from Python is actually a proxy class
implemented in Python and performing calls to pure C++ wrapper func-
tions. There is some overhead with the proxy class so calling the under-
lying C++ wrapper functions directly from Python improves efficiency.
15.
wrap2callable
may be expensive.
The convenient
wrap2callable
tool from Chapter 12.2.2 may introduce
significant overhead when you wrap a constant or discrete data, see
page 628.
8.10.4
Case Study on Numerical Efficiency
We shall in the following investigate the numerical efficiency of several im-
plementations of a matrix-vector product. Various techniques for speeding
up Python loops will be presented, including rewrite with
reduce
and
map
,

446
8. Advanced Python
migration of code to Fortran 77, use of run-time compiler tools such as Psyco
and Weave, and of course calling a built-in NumPy function for the task. All
the implementations and the test suite are available in the file
src/py/examples/efficiency/pyefficiency.py
Pure Python Loops.
Here is a straightforward implementation of a matrix-
vector product in pure Python:
def prod1(m, v):
nrows, ncolumns = m.shape
res = zeros(nrows)
for i in xrange(nrows):
for j in xrange(ncolumns):
res[i] += m[i,j]*v[j]
return res
Rewrite with
map
and
reduce
.
Loops can often be replaced by certain com-
binations of the Python functions
map
,
reduce
, and
filter
. Here is a first try
where we express the matrix-vector product as a sum of the scalar products
between each row and the vector:
def prod2(m, v):
nrows = m.shape[0]
res = zeros(nrows)
for i in range(nrows):
res[i] = reduce(lambda x,y: x+y,
map(lambda x,y: x*y, m[i,:], v))
return res
Below is an improved version where we rely on the NumPy matrix multipli-
cation operator to perform the scalar product and a new
reduce
to replace
the
i
loop:
def prod3(m, v):
nrows = m.shape[0]
index = xrange(nrows)
return array(map(lambda i:
reduce(lambda x,y: x+y, m[i,:]*v),index))
The
prod2
function runs slightly faster than
prod1
, while
prod3
runs almost
three times faster than
prod1
.

