Algorithm 688; EPDCOL: a more efficient PDECOL code
Descripción
Algorithm 688 EPDCOL: A More Efficient
PDECOL Code
P. KEAST Dalhousie
University
and P. H. MUIR Saint Mary’s
The software nonlinear
package
partial
collocation
in
equations. the will
the
current
variable
to solve
the
of the equations
paper
will
performance Savings
a system
functions
which
and the
resulting
50 percent
full
differential
hnear
equation
component,
from
and
replacing
advantage
in total
of
with
in ~he collocation;
on the third
take
systems
approach,
of ordinary
equations;
of PDECOL
of over
to solve
employed
concentrate
code by modules
arise.
wishing
on a method-of-lines to
differential
This
of the
which
problem
of ordinary
in the
modules
the
the basis
algebra.
scientists
code is based
components:
system
improvement
algebra
code among
The
to reduce
principal
the linear
on the
linear
structure
space
handles
report
[7] is a popular equations.
are three
used
which
PDECOL
differential
There
method
solver
University
of the
execution
the special
time
can be
realized. Categories
and
Subject
rithms;
G. 1.8
method
of lines, parabolic
— certification General
Descriptors:
[Numerical and
Terms:
G. 1.0
Analysis]: equations;
testing,
portability,
Performance
Words
Phrases:
Key
differential
equations
and
Analysis]:
Differential
G.4 [Mathematics
efficiency,
Algorithms,
Additional
[Numerical
Partial
of Computing]:
reliability
Collocation,
General–numerical
Equations—finite
method
and
methods,
Mathematical
robustness,
of lines,
algo-
element
Software
verification
numerical
software,
partial
1. INTRODUCTION
differential equations arise in a wide variety of scientific and engiPartial neering applications. There are many techniques for obtaining numerical solutions to these problems, some of which have been implemented in large software packages. One of the more widely used packages is PDECOL [7]. The general purpose software which is available in the major packages for the solution of time dependent partial differential equations using the method of lines usually involves three main components. First of all, the space This
work
Authors’ P.H.
was supported Addresses:
Muir,
Saint
email:keast@cs. Permission not made
by the Natural
P. Keast, Mary’s
or distributed
specific
permission.
@ 1991
ACM
Transactions
Halifax, of this
commercial
date appear, Machinery.
0098-3500/91/0600-0153 ACM
Halifax, Nova
Nova
Research
Council
Scotia,
Canada,
Scotia,
Canada,
of Canada. B3H B3H
4H8; 3C3;
dal.ca.
fee all or part
for direct
of the publication and its Association for Computing
and Engineering
University,
University,
dal. ca, muir@cs.
to copy without
Sciences
Dalhousie
material
advantage,
is granted the ACM
provided copyright
and notice is given that copying To copy otherwise, or to republish,
that notice
the copies
are
and the title
is by permission of the requires a fee and/or
$01.50
on Mathematical
Software,
Vol.
17, No.
2, June
1991,
Pages
153-166.
154
.
P, Keast and
P, H. Muir
variable is discretized in some manner, so that the partial differential equation becomes a system of ordinary differential equations in time. Then a suitable ordinary differential equation solver is chosen to approximate the solution of this system. Finally, a linear algebra package is required to handle the linear equations which arise in the calculations done by the ode solver. If the linear systems are not assumed to have any special structure then a general purpose algorithm may be used. But, since a great many systems will be solved in the course of one calculation, clearly it is important to take advantage of any special properties the systems have. This paper will concentrate on the linear algebra component of the software package PDECOL. In PDECOL, the systems of linear equations are solved using a band solver. However, a closer inspection of the structure of the linear systems shows that they in fact have a special block structure called almost block diagonal. By taking this structure into account and employing a linear system solver able to take advantage of the special pattern of the nonzeros, major savings in both storage and execution time can be achieved. The package COLROW [3] is designed to solve efficiently the kinds of matrices that arise in PDECOL, while introducing no fill-in. The goal of our project has been to modify PDECOL so that it performs the solution of the linear systems in a more efficient fashion using the solver COLROW. After making the modifications, we conducted a series of tests to measure the improvements in execution time on several problems. In general, we have found major improvements in execution time for the linear system solution of as much as 70 percent in some cases. Furthermore, these savings result in overall savings in the execution of the package of as much as 50 percent with no loss in accuracy or stability in the solution of the linear systems. In addition to describing the efficiency improvements we have obtained through our modifications to PDECOL, another contribution of this paper is the provision of further external documentation for the PDECOL package. According to the original authors, PDECOL was by no means considered to be the last word in collocation based method-of-lines codes. Rather it was seen as a first attempt to produce software in that area; the later modification of the package was expected. This is generally true of all software packages; later mathematical, algorithmic, and software/hardware developments necessitate modification of the original package. The availability of good external documentation is critical not only to this software modification process, but also for those who wish to understand the implementation details of a newly published package, such as heuristic algorithms employed within the code, data structures used, data flow considerations, modularize tion concerns, etc, that is, the many important decisions that represent the difference between a precise, complete mathematical description of the solution algorithm and a high quality “production level” software package suitable for general distribution. In addition to describing the software modifications mentioned earlier, this paper provides further explanation of some of the implementation details of method-of-lines software, in general, and in particular, the specific implementation in PDECOL. ACM
Transactions
on Mathematical
Software,
Vol
17, No.
2, June
1991,
Algorithm
688:
EPDCOL:
A More
Efficient
PDECOL
Code
.
155
This paper is organized as follows. In Section 2 we describe certain aspects of the numerical algorithm employed by PDECOL. In Section 3 we look more closely at how the linear systems are generated and at the structure of the matrices which result, and point out why a band solver is inappropriate. This closer examination of the structure of the linear systems reveals that they are in fact almost block diagonal, and thus the use of a special block solver is suggested. Section 4 describes the software modifications we have undertaken and gives descriptions of the specific changes made to the four core modules INITAL, ADDA, PSETIB, and DIFFUN, as well as providing brief descriptions of these modules. In Section 5 we compare PDECOL with the modified version which we call EPDCOL, with respect to operation counts, storage, and execution time and comment on our results. We close, in Section 6, with some conclusions and suggestions for future work. 2. OVERVIEW
OF THE NUMERICAL
TECHNIQUE
A description of the class of partial differential equations which can be treated with PDECOL is given by Sincovec and Madsen [7]. The new version of PDECOL, EDPCOL, handles this same problem class except for a slight restriction, to be discussed shortly. The basic method-of-lines with collocation techniques as implemented in the PDECOL package is also described by Sincovec and Madsen [7] to which we refer the reader for the background to our current discussion. In the description of that algorithm, the k-th component of the vector solution U( t,x) of the system of partial differential equations, is denoted by uh(t,x) and is approximated by NCPTS
Uk(t,
x) =
~ 1=1
cL,k(t)@ z(x),
where c,, ~(t)is the k-th component of the vector of unknown coefficients !,(t) is the number of associated with the i-th basis function 0,(x) and NCPTS collocation points. Collocation at the internal collocation points leads to NPDE*( NCPTS-2) ordinary differential equations for the NPDE*NCPTS unknowns, where NPDE is the number of partial differential equations. In order to establish ordinary differential equations for the remaining unknowns, derivatives of the boundary conditions are employed for components where boundary conditions exist and collocation otherwise. We will return to this point shortly. As indicated by Sincovec and Madsen [71, the particular form of the ~(U, lJX) = equations obtained by differentiating the boundary conditions, 2 in the above equations and q~(x~) = O, results from the fact that OZ(XL) = 0, i = 2, . . . . NCPTS, i = 3, . ., NCPTS, when the basis functions are B-splines. Similarly only ‘NCPTS
and
‘NCPTS
I appear
Let us define the NPDE of the form
Then the above equations
[w
in
the
by NPDE
‘R.
equations
at
matricies
W and V to have components
based on the boundary
equations
can be written
as
dcd;t) .=$
v]
[1
–2
Ldt]
where g,(t) is the vector whose j-th component is C:,J(t). A similar condition can be written down for the right hand endpoint, using the boundary conditions at matricies W and V derived from differentiating the right endpoint. This leads to
of the right side values for the where dj( t)/ dt is the vector of derivatives boundary conditions imposed at XR. It is very important to note that, depending on the system of partial differential equations, not all components of the boundary condition equations will be applied at both XL and XR. We will say that a null boundary condition has arisen for the k-th partial differential equation, if both 8 b~ /8 U~ and 8 b~ /8( UX)h are zero. In such cases, PDECOL applies a collocation This ensures that condition in place of the missing boundary condition. equations will be associated with each endpoint. Therefore exactly NPDE some of the matrix equations above might be collocation equations (at either &~ or uxjuxz)
au af aq =+————
af
agx”ac+agxx”
agxz ac”
Of course, g
= [@,(&,)]
02($,)1
..
ONCPTS($J)I],
and the expressions for dQX,1d C and d~x, / d C are similar. substitutions, the ~“-th block row of d G( t, C)/ a C becomes [ x,,,
xJ,z
“““
‘,, AW*s]
With
these
,
where X~, ~ = (a f/ag)@l($J) + (a f/du.)@; (&J) + (~ f/aLL.)@j’(&J). Ho---, because of the properties of the B-splines only KORD basis functions will be nonzero for any collocation point. Thus the matrix is considerably sparser than indicated above. We consider this structure in the next section. 3. THE LINEAR SYSTEMS In this section we discuss the structure of the linear systems that arise during the numerical solution of partial differential equations using PDECOL. As a result of the support properties of B-splines and the choice of ACM
TransactIons
on
Mathematical
Soft
ware,
Vol
17,
No
2, June
1991
Algorithm 688: EPDCOL: A More Efficient PDECOL Code lVCC collocation A( t, C) and ~G(t,
points
in
.
159
each
subinterval, both of the block matrices, block diagonal structure. For examhave the following — structure with KORD = 4 and
C)/tI C, have an almost
ple, these matrices NCC = 2,
will
% Each of these blocks is of size NPDE by NPDE. The row of blocks at the top is associated with the boundary conditions at the left end; the row of blocks at the bottom is associated with the right hand boundary conditions. The other rows of blocks are associated with collocation points. Rows associated with collocation points on the same subinterval will have the same nonzero block structure. Each column of blocks is associated with one basis function. In the PDECOL code this matrix is viewed as having a banded structure as follows.
Obviously, there is substantial excess storage used just to store elements that are in fact known to be zero. Asymptotically, the extra storage represents about 50 percent of the actual nonzero storage. The band matrix decomposition and solution routines, DECB and SOLB, are used in PDECOL to solve this system. Obviously, since only the white blocks are actually nonzero, considerable fill-in occurs beyond the shaded area, since Gaussian elimination with partial pivoting on a banded structure increases the bandwidth by a further 50 percent. Clearly, by paying closer attention to the block structure of the matrix, a more efficient treatment of the matrix is possible. The key to improving this situation is to take advantage of the almost block diagonal structure of the linear system. Some preliminary work in this direction was done by Hindmarsh, Madsen and Dyer in 1977. They replaced DECB/SOLB with a slightly modified version of the block solver SOLVEBLOK [2]. In preliminary testing of the modified code improvements of 10 percent to 20 percent were noted [61, but because SOLVEBLOK still caused some fill-in, further improvements were possible. These can be achieved by ACM
Transactions
on Mathematical
Software,
Vol
17, No. 2, June 1991.
160
.
P. Keast and P.
H. Muir
means of the package COLROW, which makes use of an alterm ting row and column elimination algorithm to factor an almost block diagonal system no fill-in. For a detailed discussion of the COLROW while introducing package see Diaz et al. [3]. 4. MODIFICATION
OF THE PDECOL
PACKAGE
In this section we discuss only those modules in which extensive modifications were required in order to employ the almost block diagonal system solver, COLRO W. For each of the routines, we give a brief description and describe the changes made to it. The four modules in which major modifications were made to create the improved version, EPDCOL, are INITAL, PSETIB, DIFFUN and ADDA. There were two major modifications: (1) replacement of calls to the band matrix factorization and solution routines DECB and SOLB by the corresponding routines CRDCMP and CRSLVE which handle almost block diagonal matrices, and (2) changing the algorithm which assigns elements of the Newton matrix to the vector in which the matrix is stored. For this latter modification, since there is considerable difference in the way the assignment is done, depending on whether the Newton matrix is viewed as being banded or almost block diagonal, the details of the PDECOL and EPDCOL versions of these routines differ considerably. In the PDECOL package, the values for the basis functions and their first and second derivatives are assigned simply in the order of the collocation points. This is not a convenient ordering for the coefficient matrix stored in band form and thus considerable stepping through the matrix is required to make the assignments. The information associated with a collocation point is stored in the same block row of the coefficient matrix, but because this matrix is stored in a banded form, elements in the same block row are not stored together. In EPDCOL on the other hand, all the block rows associated with a particular subinterval are stored nearby, so the treatment of coefficient matrix elements by collocation point is quite convenient, and of course leads to more efficient access of the computer memory. We now give a brief overview of each of the four new routines which received major modification. For a more detailed description, we refer the reader to the internal documentation of each of these modules, included in the source listings in the Appendix of this paper. We first consider the INITAL routine; it received both modifications (1) and (2). The activities of this module include setting up the breakpoint sequence for the B-splines, the collocation point sequence, the matrix which contains the B-spline basis functions and their first and second derivatives evaluated at each collocation point, and the coefficient matrix and right hand side vector which will be used to produce the basis function coefficients for the initial conditions. The last action of this routine is to solve the linear system to obtain these coefficients. We next consider the routine PSETIB which is used to construct the Newton matrix required by the IVP solver STIFIB. It received modifications (1) and (2). This routine will calculate the Jacobian of the system, one block ACM
TransactIons
on Mathematical
Software,
Vol.
17, No
2, June
1991
Algorithm 688: EPDCOL: A More Efficient PDECOL Code
.
161
row at a time, corresponding to each collocation point. For each block row, it evaluates the solution approximation at the current collocation point and provides this information to a routine which computes the Jacobian of the system of ordinary differential equations. These values are combined with the basis function values and the derivative values in an appropriate fashion and are then assigned to the matrix PW. Again the major difference between the P13ECOL and EPDCOL versions is in the manner in which values are assigned to the P W matrix. The routine called ADDA received modification (1). This routine adds the matrix of basis function values and derivatives (stored in a vector called A) to the main data structure, PW, which contains the coefficient matrix to be factored. The routine DIFFUN received both modifications (1) and (2). This routine sets up and calculates the solution to the system PWY = G where G is the right hand side vector and P W is the matrix mentioned above. Two other routines received minor modifications. In STIFIB, the only change is the replacement of the SOLB routine by CRSLVE, which solves the decomposed linear systems. The main routine PDECOL received minor changes corresponding to the different storage requirements of the new version of the code. Thus, in order to make the transformation from PDECOL to EPDCOL, the subroutines to be replaced are PDECOL, INITAL, STIFIB, DIFFUN, PSETIB, ADDA, DECB, and SOLB. Once the first six are replaced by the modified modules, and the last two by CRDCMP and CRSLVE, the changes will be transparent to the user. No changes are made to the calling sequence of PDECOL. 5. EVALUATION
OF EPDCOL
In this section we compare the performance of the original version of PDECOL and the new version of the code, EPDCOL. We will compare the codes with respect to storage requirements and operations counts, in general, and with respect to execution time on a number of sample problems. 5.1.
Storage
and Operation
Count
Comparison
In our consideration of storage requirements, we will restrict ourselves to PW, since examining only the requirements for storing the primary matrix, all other storage requirements for the two codes are the same. It can be shown that the highest power terms in the storage cost expression for the band and almost block diagonal representations of PW are, respectively, and NPDE2 *NZNT*KORD2, where we have cho3*NPDE2 *NINT*KORD2 sen NCC = 2. Thus, the storage requirements for P W in the original code PDECOL are three times those of the new version, EPDCOL. Let us now consider the operation counts for the solution of the linear systems using the band solver and the almost block diagonal solver COL ROW. Upon substitution of the appropriate values in the expression for the operation costs involved in the decomposition phase of COLROW, as given by Diaz et al. [3], and for DECB, used in PDECOL, and considering only cubic ACM
Transactions
on Mathematical
Software,
Vol.
17, No. 2, June 1991.
162
P. Keast and P. H. Muir
.
terms,
we get, respectively, ~ *NINT*NPDE3*KORD3 and 2*NINT*NPDE3* where have set NCC’ = 2. Thus the expected execution time for the decomposition of the linear system based on COLROW is approximately + of the execution time using DECB. KORD3,
5.2.
Execution
Time Comparison
In this section we first describe the problems we have selected in order to perform a brief computational comparison of the two codes. For each problem we provide a table of execution times for various choices of the parameters NINT, NPDE, and KORD (we make the usual choice of NCC = 2 for all our tests). All times were computed on the VAX-785 of the Department of Mathematics, Statistics, and Computing Science, Dalhousie University, using double precision Fortran 77. All software was compiled using the UNIX !i’77 compiler with no optimization or other special switches set. 5.2.1 The Test Problems. The first and second problems are taken from Sincovec and Madsen [7], to which we refer the reader for detailed descripthe point in the t direction tions. For both of these problems, we set TOUT, where we want the solution, to be 1.0. For the first problem we set EPS, the tolerance to be used for the time integration, to 10-4 and DT, the initial step size for the time integration, to 10-7. For the second problem we set EPS = 10-4 and DT = 10-9. The third problem we consider is in fact a family of related problems in which we can easily control the size of the system of partial differential equations. The system consists only of a simple linear problem. Problem
3:
where ~ is a vector matrix. Boundary
a2u ax2’ and
NPDE
A is a constant
NPDE
by NPDE
Conditions: g=o _,
Initial
of length
ag —=A~ at
atx=O,
t>O,
u=
(),
atx=l,
t>O
Conditions:
g=
sin(mx),
at t=O,
x=[O,
l],
to be 0.5. We where Q is the zero vector of length NPDE. We choose TOUT choose A simply to be the identity; this does not change the cost of solving the system. This is not a difficult problem; we choose it to examine the KORD and NINT for larger values of execution cost as a function of NPDE, NPDE than we have considered in the previous problems. For this problem we set EPS = 10–4 and DT = 10-7. It should be emphasized that in all of these problems, we are not interested in the performance of the codes with respect to handling a difficult problem since both codes are applying exactly the same numerical technique. It is only the manner in which the linear systems are solved that differs between ACM
‘lkansactlons
on
Mathematical
Software,
Vol.
17,
No
2, June
1991
Algorithm 668: EPDCOL: A More Efficient PDECOL Code Table
I.
Execution
Times
(in seconds)
for Problem
PDECOL KORD
.
163
1.
EPDCOL KORD
NINT
4
6
8
4
6
8
30 60 90 120
76.12 157.73 217.10 303.13
217.90 406.82 663.85 866.90
457.78 987.44 1496.37 1894.48
71.77 146.13 199.27 273.02
171.78 318.15 482.77 712.13
313.92 651.53 1114.06 1452.54
Table
II.
Execution
Times
(in seconds)
Problem
PDECOL KORD NINT
30 60 90 120
4
6
16.92 28.75 43.85 57.75
40.40 68.82 110.53 152.32
2,
EPDCOL KORD 8
4
6
8
68.45 119.37 201.45 273,72
18.38 28.90 46.25 57.38
37.92 75.62 101.60 152.47
64.06 135.85 175.77 238.25
the two codes. The only differences that occur between the solutions are from the small differences caused by round-off error in the solution of the linear systems. 5.2.2 Execution Time Results. For the first sample problem, we present total execution time in seconds for values of KORD of 4, 6 and 8, and values of 30, 60, 90 and 120 (Table I). of NINT For the second problem we give total execution time results for the same parameter values as in problem one (Table II). For the third family of problems, we compare the execution times of the two codes for values of KORD of 4, 6 and 8; values of NINT of 20, 40, 60 and 80; and values of NPDE of 2, 4, 6 and 8 (Tables HI-VI). 5.2.3 Discussion of Results. A comparison of the results for the two small problems of size two and one shows fairly modest improvements in execution of perhaps 20 percent, at most. For the small systems this is the expected kind of performance improvement because the contribution of the cost of solving the linear systems is a minor part of the overall costs. However, as the size of the system increases, the relative contribution from the solution of the linear system to the overall costs increases and it thus becomes more important to perform the linear algebra efficiently. A more informative comparison can be obtained by considering the performance of PDECOL and EPDCOL on a family of slightly larger systems. By examining Tables III and IV, we see that the larger NPDE and KORD are chosen, the greater the difference in overall execution time. In fact, for NPDE = 8, KORD = 8 and NINT = 80, for example, the execution time for PDECOL is 7937 seconds, while for EPDCOL it is 2158 seconds. This is a saving of about 70 percent in overall execution time. ACM
Transactions
on Mathematical
Software,
Vol.
17, No. 2, June 1991.
164
P. Keast and P, H Muir
.
Table
KORD NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
=+ = = = = = = = = = = = = = = = =
2 20 4 20 6 20 8 20 2 40 4 40 6 40 8 40
NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
+ = 2 = 20 = 4 = 20 = 6 = 20 = 8 = 20 = 2 = 40 = 4 = 40 = 6 = 40 = 8 = 40
Execution
Times
6
(in seconds)
31.45
6840
35.87
112.05
27763
79.53
276.40
762.15
14928
62(I 83
22,67
63.27
137,70
6758
233.20
569.40
148.35
593.60
155578
282.52
1258.78
3523.08
IV
Execution
NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NZNT NPDE NINT
174663
Times
4
6
1093
2390
47.15
26.58
63.87
135.22
49,88
130,63
290.27
83.80
22955
54373
(in seconds)
47.88
4937
129.05
26817
92.62
263.35
57912
153.25
469,83
1086.15
+ = = = = = = = = = = = = = = = =
NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
94.22
2 60 4 60 6 60 8 60 2 80 4 80 6 80 8 80
2 60 4 60 6 60 8 60 2 80 4 80 6 80 8 80
3.
6
35.02
98.73
8
210,97
108.72
37060
880.83
232.98
906.91
2333,87
443.55
1917,57
5339,38
4687
133.82
27880
144,35
481,57
1233,83
32802
121455
351620
638.80
2580,27
7937,06
a = = = = = = = = = = = = = = = =
Problem
4
for EPDCOL,
KORD
8
20.20
for PDECOL,
KORD
8
12.17
Table
KORD
III
4
Problem
3
4
6
8
31.23
75.57
141.38
76.87
202.90
40473
145.40
41187
873.47
241.00
729.85
1633,38
41.27
102.30
188.67
102.88
2’70.50
537.38
194.62
54847
116242
329.58
969.30
215855
The primary difference between the two versions is of course the cost of decomposing the linear systems. In Tables V and VI we compare the times of performing this operation in PDECOL and EPDCOL. The major savings that occur can be clearly seen by examining these tables. As expected, the larger KORD and NPDE chosen, the more dramatic the savings. For example, with KORD = 8, NPDE = 8, and NINT = 80, the total cost of decomposing all the linear systems that arise in EPDCOL is less than 20 percent of the cost using PDECOL. It should be emphasized that the numbers produced by both versions of the code are the same to within the requested tolerance and that, except for the ACM
TransactIons
on
Mathematical
Software,
Vol
17,
No
2, June
1991
Table
KORD
=)
NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
= = = = = = = = = = = = = = = =
2 20 4 20 6 20 8 20 2 40 4 40 6 40 8 40
Table
KORD NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
V.
=) = 2 = 20 = 4 = 20 = 6 = 20 = 8 = 20 = 2 = 40 = 4 = 40 = 6 = 40 = 8 = 40
Algorithm 688: EPDCOL: A More Efficient PDECOL Code
.
Linear
3.
System
Factoring
4
6
0.98
5.20
17,23
8.1’7
43.12
139.22
28.30
146.93
496.20
68.05
402.78
1305,32
1.85
11.65
35.27
15.10
96.15
290.50
52.12
334.40
1019.65
128.70
837.07
2628.48
VI.
Linear
4
System
8
Factoring
6
8
0.38
1.57
4.70
2.83
11.18
33.95
9.15
37.28
111.25
21.42
86.83
265.95
0.72
3.47
9.30
5.17
24.87
67.85
16.72
80.93
223.42
38.38
189.50
531.95
solution of the linear identical.
Times
system,
(in seconds)
KORD NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
Times
= = = = = = = = = = = = = = = =
2 60 4 60 6 60 8 60 2 80 4 80 6 80 8 80
(in seconds)
KORD NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT NPDE NINT
the numerical
=) = 2 = 60 = 4 = 60 = 6 = 60 = 8 = 60 = 2 ==80 = 4 = 80 = 6 = 80 = 8 = 80
for PDECOL,
4
Problem
165
6
8
3,08
17.58
53.75
26,08
150.30
456.80
86.03
498.93
1529.03
209.12
1251.93
4007.58
4.92
23.97
71.83
39.65
194.27
657’ .55
136.60
669.05
2384,37
333,27
1691.52
6037.57
for EPDCOL,
4
Problem
3.
6
8
1.25
5.28
13.88
8.67
37.47
102.40
27.82
122.35
337.02
63.57
286.05
801.22
2.02
6.83
18.72
13.53
5010
135.37
43.07
162.93
448.80
99.78
379.73
1058.23
techniques
being
applied
are
6. CONCLUSIONS
PDECOL is a software package for the solution of second order partial differential equations. It is based on the method of lines with collocation. The matrices that arise during the computation of the solution are sparse. In PDECOL, these matrices are viewed as being banded, and thus decomposition and solution routines for a band matrix are employed by PDECOL to handle these systems. However, by examining the structure of these matrices more closely, it can be seen that they are in fact almost block diagonal. ACM
TransactIons
on Mathematical
Software,
Vol.
17, No
2, June 1991
166
P. Keast and P. H Muir
.
Our work has involved replacing the band matrix routine with a package, COLROW, designed to handle almost block matrices. This work has involved a very detailed investigation of the PDECOL package; however the results have been most rewarding. Except for very small systems of one or two partial differential equations, very significant savings in execution time are obtained for EPDCOL, over the original PDECOL code. Even for systems with only four to eight equations, we have found that savings in execution time of approximately 50 percent are obtained for various values of the parameters. In addition, savings in storage requirements of around 50 percent would be obtained. This suggests that the use of the COLROW package has produced a dramatic improvement in the performance of the PDECOL package. Our investigation of the PDECOL package has suggested several other possibilities for future work. One project would be to replace the STIFIB IVP solver with a modification of a more advanced IVP solver such as LSODI [5]. Another area of investigation would involve replacing the B-spline package with a package based on monomial splines, as has recently been done in another collocation based software package, COLSYS [1]. We wish to thank Dr. Graeme Fairweather for his many helpful suggestions during the course of this work. We also wish to thank the anonymous referees for many useful comments. REFERENCES
1. BADER,G , AND ODE
2. DEBOOR, linear
ASCHER, U.
SIAM
solver. C,,
AND WEISS,
systems.
ACM
R.
Trans.
4,
diagonal Math.
HINDMARSH, ODE
linear
with
Softw. A
C.
banded
basis
Cornput.
implementation 8, 4 (July
SOLVEBLOK:
!l’rans,
3. DIAZ, J. C., FAIRWEATHER, block
A new
J. SCZ. Stat.
G.,
systems
by
9, 3 (Sept.
order
boundary
value
483-500,
package
for
solving
1 (Mar. 1980), 80-87. ANDKEAST,P. FORTRAN packages
Math.
almost
block
diagonal
6,
So/tw.
modified
alternate
row
and
for solving column
certain
almost
ACM
ehmination,
1983),358-375.
Preliminary Jacobian.
A
for a mixed
1987),
documentation
of GEARIB:
Rep UCID-30130,
Lawrence
Solution Livermore
of implicit Lab,
systems
Livermore,
Calif.,
1976 5. HINDMARSH, Newsl. 6
A, C.
15, 4 (Dec.
HINDMARSH,
A. C.
LSODE 1980),
and LSODI,
private
communication,
7. SINCOVEC, R, F., AND MADSEN, 1979),
Received
ACM
two
new
initial
value
ODE
N.
TransactIons
1989;
on
revised
ACM
SIGNUM
1984.
K.
Algorithm
1989;
accepted
540.
ACM
Trans.
326-351,
June
solvers.
10-11
March
Mathematical
Software,
Vol
17,
April
1990
No
2, June
1991
Math.
Softzo, 5, 3 (Sept
of
Lihat lebih banyak...
Comentarios