USU Math 4610
Routine Name: parallel_contents_gradient
Author: Philip Nelson
Language: C++. The code can be compiled using the GNU C++ compiler (gcc). A make file is included to compile an example program
For example,
make
will produce an executable ./conjugateGradient.out that can be executed.
Description/Purpose: Conjugate Gradient is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is symmetric and positive-definite. conjugate_gradient
is implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation. 1
This code uses OpenMP to parallelize the iterations
Input: This routine takes two inputs, a matrix A, and a right hand side b
Output: The routine returns the solution x to the equation A * x = b.
Usage/Example:
int main()
{
auto A = generate_square_symmetric_diagonally_dominant_matrix(4u);
auto b = generate_right_side(A);
auto x = parallel_conjugate_gradient(A, b);
auto Ax = A * x;
std::cout << " A\n" << A << std::endl;
std::cout << " x\n" << x << std::endl;
std::cout << " b\n" << b << std::endl;
std::cout << " A * x\n" << Ax << std::endl;
}
Output from the lines above
A
| 10 -1 2 0 |
| -1 11 -1 3 |
| 2 -1 10 -1 |
| 0 3 -1 8 |
x
[ 1 2 -1 1 ]
b
[ 6 25 -11 15 ]
A * x
[ 6 25 -11 15 ]
explanation of output:
First, the matrix A is generated and displayed. It is a square matrix with uniformly distributed numbers and is symmetric and diagonally dominant. Then the rhs is computed and x is solved for and displayed. Finally b is shown and A * x is shown. We can see that b == A * x which is good.
Implementation/Code: The following is the code for parallel_conjugate_gradient
template <typename T>
std::vector<T> parallel_conjugate_gradient(
Matrix<T>& A,
std::vector<T> const& b,
unsigned int const& MAX_ITERATIONS = 1000u)
{
static const T tol = std::get<1>(maceps<T>());
std::vector<T> x_k(A.size(), 0), x_k1(A.size(), 0);
std::vector<T> r_k = b;
std::vector<T> r_k1 = r_k, r_k2 = r_k;
std::vector<T> p_k = r_k, p_k1 = r_k;
#pragma omp parallel
{
#pragma omp for
for (auto k = 0u; k < MAX_ITERATIONS; ++k)
{
if (k != 0)
{
auto b_k = (r_k1 * r_k1) / (r_k2 * r_k2);
p_k = r_k1 + b_k * p_k1;
}
auto s_k = A * p_k;
auto a_k = r_k1 * r_k1 / (p_k * s_k);
x_k = x_k1 + a_k * p_k;
r_k = r_k1 - a_k * s_k;
if (allclose(x_k, x_k1, tol))
{
return x_k;
}
r_k2 = r_k1;
r_k1 = r_k;
x_k1 = x_k;
p_k1 = p_k;
}
}
return x_k;
}
Last Modified: December 2018