The code was developed and tested on an Intel Pentium 330MHz with 128MB main memory, 512KB cache and the Linux operating system (Kernel 2.0.35). The compiler used on this platform was the GNU Fortran compiler g77, version 0.5.19.1. The only optimization options used were -fthread-jumps and -fdefer-pop. Care was taken that no other computationally intensive process was running while the case study was benchmarked.
Additional tests have been made on a
Silicon Graphics Challenge-L with 8 CPUs (200MHz, 4MB secondary cache
each), equipped with IRIX 5.3 operating system and
SGI MIPSpro f77 compiler, and on a Silicon Graphics Indigo
with IP26 board and IRIX64 6.1 operating system. On these platforms, only
the -static compiler option has been used. No porting problems have
been experienced. Tests made using shorter time periods confirmed that all
machines produced the same numerical values up to 5-6 digits of accuracy.
The code also compiled on a Sun UltraSparc with f77 compiler version 4.0. However, the resulting executable was not able to handle the requested number of input files.
On all platforms, computation times have been obtained with the UNIX time command.
Preparing the input files with some filter and interpolation routines took 13 minutes on the Intel platform. After that, the MESOPAC preprocessor needed 30 minutes of user time on the Intel, 69 minutes of user time on the SGI Challenge, and 116 minutes on the Indigo to process the data described above. As a consequence, all further optimization runs on the SGI platforms have been postponed. The resulting data file containing the wind field and other meteorological variables had a size of ca. 250MB. In addition, the meteorological input data files need ca. 12MB of disk space. Consequently, a full-year simulation can be expected to take about 4 hours on the Intel platform and to consume ca. 1.9GB of disk space.
OLAF saves all the time-dependent pollutant masses in the different compartments into temporary files. For the case study considered here, each temporary file needs ca. 25MB of disk space. Given that 10 compartments are used and the meteorological dispersion code writes some additional files, the amount of disk space needed by OLAF for this case study sums up to ca. 325MB (not counting the preprocessor files). Note that after the end of the optimization run these files stay on disk to allow for further investigation of the computed parameters.
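The disk space figures above can be cross-checked with a few lines of arithmetic. The following Python sketch only restates the numbers quoted in the text. The variable names are illustrative and not part of OLAF. The share attributed to the additional files of the dispersion code is not stated in the text; it is inferred here as the remainder of the ca. 325MB total. The grand total assumes that the preprocessor files are kept on disk as well.

```python
# All sizes in MB, as quoted in the text (approximate "ca." values).
TEMP_FILE_MB = 25      # one temporary file per compartment
N_COMPARTMENTS = 10
OLAF_TOTAL_MB = 325    # total reported for OLAF in this case study
WIND_FIELD_MB = 250    # data file written by the preprocessor
MET_INPUT_MB = 12      # meteorological input data files

# Space taken by the compartment files alone.
compartment_mb = TEMP_FILE_MB * N_COMPARTMENTS

# Inferred, not stated in the text: remainder left for the
# additional files written by the dispersion code.
dispersion_extra_mb = OLAF_TOTAL_MB - compartment_mb

# Assumed grand total if the preprocessor files stay on disk too.
overall_mb = OLAF_TOTAL_MB + WIND_FIELD_MB + MET_INPUT_MB

print(compartment_mb, dispersion_extra_mb, overall_mb)
```

Under these assumptions, the compartment files account for most of the space OLAF itself needs, and the preprocessor output is of the same order of magnitude.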
Figure 6.10 shows some approximated level curves of
the effect function for the starting point,
while Figure 6.11 shows the effect function
overlaid on a map of the region considered.
Figure 6.10: Some approximated level curves of the effect function
for the starting configuration.
Figure 6.11: The effect function for the starting
configuration.
Starting from an objective function value of 70.193, the code needed
ca. 260 hours to calculate a local minimum with an objective function value
of 29.018. To reach this final point, the code needed 23 iterations and
201 function evaluations.
While this number seems comparatively large, one has to take into
account that two of the derivatives are approximated by central
differences. Each gradient evaluation (of which there is one
per iteration step) therefore needs 4 function evaluations. All other function
evaluations are made during the simple Armijo-type line search. Therefore, it has
to be expected that more efficient line search mechanisms can decrease the number
of function evaluations significantly. Note that the stack height and width
remain virtually unchanged. This might indicate that the pollution load
is insensitive to small-scale changes of these variables on the data used.
Alternatively, one might have a scaling problem here, and a rescaling of
the variables might be in order.
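The evaluation counts discussed above can be illustrated with a small Python sketch. It is not the OLAF implementation. The quadratic toy objective, the choice of which two variables are differentiated analytically, the steepest descent direction and the parameters sigma and beta are all assumptions made for illustration. The sketch only shows why a gradient with two central differences costs 4 function evaluations and why every trial step of a backtracking Armijo line search costs one more.

```python
from typing import Callable, List, Sequence, Tuple

Objective = Callable[[Sequence[float]], float]


class CountingObjective:
    """Wraps an objective function and counts its evaluations."""

    def __init__(self, func: Objective) -> None:
        self.func = func
        self.calls = 0

    def __call__(self, x: Sequence[float]) -> float:
        self.calls += 1
        return self.func(x)


def toy_objective(x: Sequence[float]) -> float:
    """Hypothetical smooth stand-in for the expensive objective."""
    target = (1.0, -2.0, 0.5, 3.0)
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def toy_analytic_partials(x: Sequence[float]) -> List[float]:
    """Exact partial derivatives of toy_objective for variables 0 and 1."""
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


def mixed_gradient(f: Objective, x: Sequence[float], h: float = 1e-4) -> List[float]:
    """Gradient with variables 0 and 1 analytic, 2 and 3 by central differences.

    Each central difference needs two evaluations of f, so 4 in total.
    """
    grad = toy_analytic_partials(x) + [0.0, 0.0]
    for i in (2, 3):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad[i] = (f(x_plus) - f(x_minus)) / (2.0 * h)
    return grad


def armijo_step(
    f: Objective,
    x: Sequence[float],
    fx: float,
    grad: Sequence[float],
    sigma: float = 1e-4,
    beta: float = 0.5,
    max_trials: int = 30,
) -> Tuple[List[float], float, float]:
    """Backtrack along -grad until f(x - t*grad) <= f(x) - sigma*t*|grad|^2."""
    grad_norm_sq = sum(g * g for g in grad)
    t = 1.0
    for _ in range(max_trials):
        trial = [xi - t * gi for xi, gi in zip(x, grad)]
        f_trial = f(trial)  # one function evaluation per trial step
        if f_trial <= fx - sigma * t * grad_norm_sq:
            return trial, f_trial, t
        t *= beta
    return list(x), fx, 0.0  # no acceptable step found: direction rejected


if __name__ == "__main__":
    f = CountingObjective(toy_objective)
    x = [0.0, 0.0, 0.0, 0.0]
    fx = f(x)
    grad = mixed_gradient(f, x)
    x_new, f_new, t = armijo_step(f, x, fx, grad)
    print("step size:", t, "evaluations so far:", f.calls)
```

If exactly one gradient is evaluated in each of the 23 iterations, the gradients account for roughly 23 x 4 = 92 of the 201 function evaluations reported above, which leaves about 109 for line search trials and starting values. A line search that accepts a step after fewer trials would reduce this second share directly.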
It has to be noted that the relative decrease of the objective function value
cannot serve as a measure of the efficiency
of the code. In fact, Figure 6.3 shows that the function
value of the starting point can be made rather large simply by placing
the polluting facility as close to Paris as possible. Therefore, this quotient
has no meaningful interpretation.
Figure 6.12 shows
the iterations the locational decision variables underwent during the
optimization process. Note that this picture is incomplete, since
the whole decision space is four-dimensional.
Figure 6.12:
Iterations in the locational domain. Line searches and
iterations in the stack decision variables are not depicted.
As can be seen, the code seems to encounter some convergence problems close to the local minimum found. This might be attributed to the inevitable noise in the objective function, an effect which might also be seen in the plot of function values against iterations in Figure 6.13.
Figure 6.13: Iteration versus function values.
A notable decrease in step sizes during the iterations has not been observed,
see Figure 6.14.
Figure 6.14: Step sizes
chosen by the optimization routine. Note that the descent direction in
iteration no. 5 was rejected.
Figure 6.15 shows the effect function for the final point
overlaid on a map of the region, while Figure 6.16
shows some approximated level curves of the
same function.
Figure 6.15: The effect function for the final
configuration.
Figure 6.16: Some approximated level curves of the effect function
for the final configuration.
Figure 6.17 shows the mass of the
pollutant in different compartments at grid point (25, 19), which is a
close approximation to Paris. Similarly, Figure 6.18
shows the corresponding values for the grid point
(40, 23), which is a close approximation to Reims. Moreover, the
corresponding values are plotted for
Metz (grid point (61, 21), Figure 6.19),
Düsseldorf (grid point (68, 43), Figure 6.20),
and Brussels (grid point (45, 39), Figure 6.21).
Note that the scales of the y-axes differ
from figure to figure.
Figure 6.17: Time-dependent pollutant mass in different compartments
at Paris for the starting configuration and the final
configuration.
Figure 6.18: Time-dependent pollutant mass in different compartments
at Reims for the starting configuration and the final
configuration.
Figure 6.19: Time-dependent pollutant mass in different compartments
at Metz for the starting configuration and the final
configuration.
Figure 6.20: Time-dependent pollutant mass in different compartments
at Düsseldorf for the starting configuration and the final
configuration.
Figure 6.21: Time-dependent pollutant mass in different compartments
at Brussels for the starting configuration and the final
configuration.
For example, the total pollution load in Paris, Reims and Brussels decreases
when the starting configuration is compared to the final
configuration. However, the opposite is true for the cities
of Metz and Düsseldorf. Nevertheless, the final configuration
has a smaller objective function value
than the starting configuration. This is probably due to the high population
density around Paris. The same effect can be observed when comparing Figures 6.10
and 6.11 with Figures 6.15 and 6.16.
Finally, Table 6.5 shows the profile of OLAF when run on the
data used in the case study after two function evaluations, as reported
by gprof Version 2.9.1. The work load in the optimization module
involves mainly linear algebra on matrices of low dimension, so
the time spent in the corresponding routines can be neglected.
Table 6.5: Percentages
of total running time spent in subroutines. All other subroutines need less
time.
As can be seen, most of the computation time is spent in subroutine ODE, in which the system (3.3) is solved for all grid points. Subroutine CMPCLLR, which is called from within ODE, computes the cell death and differentiation rates of the cytodynamic model described in Section 6.8. All other routines listed are subroutines of the MESOPUFF code. The table shows the typical sharp profile of a numerical analysis code, where most of the time is spent in a small number of routines, leaving ample room for further optimization and fine tuning.