Nils Berglund, who is the author of many amazing simulation videos on YouTube, recently posted videos showing visualizations of waves on a sphere. In particular, this video caught my eye, depicting tsunami waves of the 2011 Japan earthquake travelling across the Pacific Ocean.
In fact, back in 2011 I had made a very similar visualization inspired by the events, and later uploaded it to YouTube - see the video below. The video was made as promotional material to encourage new students to major in mathematics at Aalto University. At that time I was a graduate student at the maths department, working toward my PhD.
Anyway, it was as an undergrad in 2007 when I first got interested in solving the wave equation on the sphere. I had been studying the finite element method with great interest, and had programmed my own simple implementation of nodal P1 elements on triangles. I had successfully used that to solve waves in planar domains - see the video below. I thought that in order to solve the wave equation on a sphere, I would just use a mesh approximating the sphere instead. It turns out it's almost as simple as that, but there are a couple of nontrivial details.
My undergrad studies included a mandatory "special assignment". The assignment was typically to write a research paper on some subject, though these were published only internally at the department. I suggested to my professor that, as my assignment, I could study solving the wave equation on spheres using FEM.
Now, finally, I thought I'd revisit the topic and write down something more general on the subject, a bit more publicly this time.
Wave equation and the finite element method
To get up to speed, let's look at the wave equation and the finite element method in a quick and non-rigorous fashion. The wave equation (in strong form) is
\[ \ddot{u} - c^2 \Delta u = 0 \quad \text{in } \Omega \]
in which \( \ddot{u} \) denotes the second derivative of \( u \) with respect to time, \( c \) is the wave speed, which we'll just take as 1 from here on, and \( \Omega \) is the domain in which we're solving the equation. In addition, there are initial conditions and boundary conditions that would need to be specified, but let's skip those for now.
FEM works with the weak formulation of the problem. To get the weak form from the strong form, both sides of the equation are multiplied by a test function \( v \) and integrated over the domain \( \Omega \). Integration by parts is applied to reduce the regularity requirement of \( u \) from 2 derivatives down to 1. (The boundary term produced by the integration by parts is dropped here; notably, on a closed surface like the sphere there is no boundary at all.) The weak form is then: find \( u \in V \) such that
\[ \int_\Omega \ddot{u} v\ \mathrm{d}x + \int_\Omega \nabla u \cdot \nabla v\ \mathrm{d}x = 0 \quad \forall v \in V \]
With FEM, as with other Galerkin methods, we replace the space \(V\) with a finite dimensional approximation \(V_h\). We come up with a basis \( \{v_1,\ldots,v_N\} \) of \(V_h\), and write the finite dimensional approximate solution \( u_h \) as
\[ u_h = \sum_{i=1}^N U_i v_i \]
The problem then becomes: find \( U_i \) such that
\[ \sum_{i=1}^N \ddot{U}_i \int_\Omega v_i v_j\ \mathrm{d}x + \sum_{i=1}^N U_i \int_\Omega \nabla v_i \cdot \nabla v_j\ \mathrm{d}x = 0 \quad \forall j \]
The integrals here define the mass matrix \( M_{ij} = \int_\Omega v_i v_j\ \mathrm{d}x \) and the stiffness matrix \( K_{ij} = \int_\Omega \nabla v_i \cdot \nabla v_j\ \mathrm{d}x \).
The idea in the finite element method specifically is to define the basis functions \( v_i \) using a mesh, which fills the domain \(\Omega\). Assuming now a triangular mesh, a simple approach is to take each basis function as a piecewise first order polynomial, each piece being one triangle. Then, assign \(v_i = 1\) when evaluated at vertex \(p_i\), and \(v_i = 0\) when evaluated at any other vertex.
The integrals can be written as sums of separate contributions from each element:
\[ M_{ij} = \sum_{T} \int_{T} v_i v_j\ \mathrm{d}x, \qquad K_{ij} = \sum_{T} \int_{T} \nabla v_i \cdot \nabla v_j\ \mathrm{d}x \]
For any given \(i\) and \(j\), \(v_i\) or \(v_j\) (or both) is identically zero on the vast majority of the elements, and those elements thus don't contribute to the integral. Contributions come only from the triangles that contain both vertex \(p_i\) and vertex \(p_j\).
Now, to actually compute the integrals over a triangle, a standard trick is to do a change of variables and perform the integration over a reference triangle instead. A coordinate on the reference triangle \( \hat{x} \) is easily mapped to \( x \) on the target triangle \(T\) via an affine transform \( x = A_T \hat{x} + b_T \). Let's take the triangle \(\hat{T}\) with vertices \( \hat{p}_1 = (0,0) \), \( \hat{p}_2 = (1,0) \), \( \hat{p}_3 = (0,1) \) as our reference triangle. With this choice, and the fact that affine transforms map first order polynomials back to first order polynomials, the basis functions can be expressed on the reference element in a simple way.
The 3 basis functions on the reference element are such that each gets the value 1 at one vertex and 0 at the others. Written explicitly, they are
\[ \hat{v}_1(\hat{x}) = 1 - \hat{x}_1 - \hat{x}_2, \qquad \hat{v}_2(\hat{x}) = \hat{x}_1, \qquad \hat{v}_3(\hat{x}) = \hat{x}_2 \]
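To make the mapping concrete, here's a minimal Python sketch (the names are my own, not from any particular library): the matrix \(A_T\) simply has the triangle's edge vectors as its columns.

```python
import numpy as np

def affine_map(p1, p2, p3):
    """Affine map x = A_T @ x_hat + b_T taking the reference triangle
    with vertices (0,0), (1,0), (0,1) to the triangle with vertices
    p1, p2, p3. For a surface mesh in R^3, A_T has shape (3, 2)."""
    A = np.column_stack([p2 - p1, p3 - p1])
    b = p1
    return A, b
```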
Here's where things get a bit non-standard compared to the usual FEM implementation. Our reference element lives in \(\mathbb{R}^2\) while the global element lives in \(\mathbb{R}^3\). This means that the matrix in our affine mapping is non-square, and we need to be a bit more careful in considering how the area scales and how the functions change under the change of variables.
Usually, with the change of variables \(x = f(\hat{x})\), we see the differential area element change as \( \mathrm{d}x = |\det J_f(\hat{x})|\ \mathrm{d}\hat{x} \), where \( J_f(\hat{x}) \) is the Jacobian of \( f \). This does not, however, make sense for a non-square \(J_f\), and thus something more general is needed. It turns out that a slightly more general expression for the differential area element is \( \mathrm{d}x = \sqrt{\det J_f(\hat{x})^t J_f(\hat{x}) }\ \mathrm{d}\hat{x} \). Since \(J_f^t J_f\) is always square, this is well-defined in our case. Note that in the case where \(J_f\) is square, this reduces to exactly the usual form. In our case, the Jacobian of the mapping is simply the matrix \(A_T\) of the affine transform, and we have \(\mathrm{d}x = \sqrt{\det A_T^t A_T}\ \mathrm{d}\hat{x} \).
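Continuing the sketch, the scaling factor is a one-liner. For a triangle it equals \( |(p_2 - p_1) \times (p_3 - p_1)| \), i.e. twice the triangle's area, which gives an easy sanity check:

```python
def area_element(A):
    """Scaling factor sqrt(det(A^t A)) of the differential area element.
    Works for a non-square A, e.g. shape (3, 2) for a triangle in R^3."""
    return np.sqrt(np.linalg.det(A.T @ A))

# Sanity check: the factor should equal twice the triangle's area.
p1, p2, p3 = np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.])
A, _ = affine_map(p1, p2, p3)
assert np.isclose(area_element(A), np.linalg.norm(np.cross(p2 - p1, p3 - p1)))
```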
The contribution of element \(T\) to \(M_{ij}\) can be computed as
\[ \int_T v_i v_j\ \mathrm{d}x = \sqrt{\det A_T^t A_T} \int_{\hat{T}} \hat{v}_{\hat{i}} \hat{v}_{\hat{j}}\ \mathrm{d}\hat{x} \]
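The reference integrals can be evaluated by hand once and for all: \( \int_{\hat{T}} \hat{v}_{\hat{i}} \hat{v}_{\hat{j}}\ \mathrm{d}\hat{x} \) equals \( 1/12 \) on the diagonal and \( 1/24 \) off it. Continuing the Python sketch:

```python
# Reference mass integrals: 1/12 on the diagonal, 1/24 off the diagonal.
M_REF = np.array([[2, 1, 1],
                  [1, 2, 1],
                  [1, 1, 2]]) / 24.0

def local_mass(A):
    """3x3 contribution of one element to the mass matrix M."""
    return area_element(A) * M_REF
```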
While that's all for the mass matrix, for the stiffness matrix we have the additional headache of how the gradient will transform under the mapping. For basis function \(v_i\) we have \(v_i(x) = v_i(A_T \hat{x} + b_T) = \hat{v}_\hat{i}(\hat{x})\). Skipping a bit of detail for brevity, it follows from the chain rule that \( (\nabla \hat{v}_\hat{i})(\hat{x}) = A_T^t (\nabla v_i)(x) \), which by itself is enough for the typical case as \(A_T^t\) is invertible. For our case, however, we must consider additionally that the gradient \((\nabla v_i)(x)\) is actually a vector in the tangent space of \(\Omega\) at \(x\). Since the tangent space of \(\Omega\) within element \(T\) is spanned by the range of \(A_T\), there must exist some \(g\) such that \((\nabla v_i)(x) = A_T g\). Thus
\[ A_T^t A_T g = (\nabla \hat{v}_\hat{i})(\hat{x}) \]
from which it follows that
\[ g = (A_T^t A_T)^{-1} (\nabla \hat{v}_\hat{i})(\hat{x}) \]
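Putting the pieces together: since \( (\nabla v_i)(x) = A_T g \), the integrand of the stiffness contribution becomes \( \nabla v_i \cdot \nabla v_j = (\nabla \hat{v}_\hat{i})^t (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j} \), which is constant on the element. Here's how the element stiffness matrix and the global assembly could look in the same Python sketch (scipy.sparse is assumed for the global matrices; all names are my own):

```python
import scipy.sparse

# Constant gradients of the reference basis functions, one per row.
G_REF = np.array([[-1.0, -1.0],
                  [ 1.0,  0.0],
                  [ 0.0,  1.0]])

def local_stiffness(A):
    """3x3 contribution of one element to the stiffness matrix K:
    the constant integrand times the reference area 1/2, scaled by
    the area element."""
    C = np.linalg.inv(A.T @ A)
    return 0.5 * area_element(A) * (G_REF @ C @ G_REF.T)

def assemble(verts, tris):
    """Assemble global M and K: verts is an (n_verts, 3) coordinate array,
    tris an (n_tris, 3) array of vertex indices. Duplicate COO entries
    are summed on conversion, which performs the assembly."""
    rows, cols, mvals, kvals = [], [], [], []
    for tri in tris:
        A, _ = affine_map(*verts[tri])
        Mloc, Kloc = local_mass(A), local_stiffness(A)
        for a in range(3):
            for b in range(3):
                rows.append(tri[a])
                cols.append(tri[b])
                mvals.append(Mloc[a, b])
                kvals.append(Kloc[a, b])
    n = len(verts)
    M = scipy.sparse.coo_matrix((mvals, (rows, cols)), shape=(n, n)).tocsr()
    K = scipy.sparse.coo_matrix((kvals, (rows, cols)), shape=(n, n)).tocsr()
    return M, K
```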
After spatial discretization with FEM we're left with the ordinary differential equation
\[ M \ddot{U} + K U = 0 \]
Without going into detail, the problem is particularly well suited for, e.g., the Crank-Nicolson method or leapfrog integration. These methods are stable for undamped oscillating problems and do not produce artificially decaying solutions. The key word for such methods is symplectic.
Crank-Nicolson
For Crank-Nicolson, we'll first convert the equation into first order form as
\[ \begin{pmatrix} I & 0 \\ 0 & M \end{pmatrix} \begin{pmatrix} \dot{U} \\ \ddot{U} \end{pmatrix} = \begin{pmatrix} 0 & I \\ -K & 0 \end{pmatrix} \begin{pmatrix} U \\ \dot{U} \end{pmatrix} \]
Denote then
\[ B = \begin{pmatrix} I & 0 \\ 0 & M \end{pmatrix} \]
\[ A = \begin{pmatrix} 0 & I \\ -K & 0 \end{pmatrix} \]
and
\[ z = \begin{pmatrix} U \\ \dot{U} \end{pmatrix} \]
to get
\[ B \dot{z} = A z \]
For the discretization, consider the derivative on the left side replaced by the approximation \( \frac{z_{n+1} - z_n}{\Delta t} \), and the function evaluation on the right side performed at the midpoint \( \frac{z_{n+1} + z_n}{2} \). With this, we'll arrive at
\[ B \frac{z_{n+1}-z_n}{\Delta t} = A \frac{z_{n+1} + z_n}{2} \]
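Rearranging terms gives the linear system that needs to be solved at each time step:
\[ \left( B - \frac{\Delta t}{2} A \right) z_{n+1} = \left( B + \frac{\Delta t}{2} A \right) z_n \]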
The main downside of the Crank-Nicolson method is that it requires solving a large non-symmetric system of equations at every time step.
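As a minimal sketch of a stepper (assuming \(M\) and \(K\) are scipy.sparse matrices): since the coefficient matrix stays fixed, it can be LU-factorized once and reused at every step.

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def crank_nicolson_stepper(M, K, dt):
    """Return a function advancing z = (U, U') by one Crank-Nicolson
    step of B z' = A z."""
    n = M.shape[0]
    I = sp.identity(n)
    B = sp.block_diag([I, M])
    A = sp.bmat([[None, I], [-K, None]])
    lhs = (B - dt / 2 * A).tocsc()
    rhs = (B + dt / 2 * A).tocsr()
    solve = spla.factorized(lhs)  # sparse LU factorization, computed once
    return lambda z: solve(rhs @ z)
```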
Leapfrog integration
Leapfrog integration is an explicit symplectic method particularly well suited for mechanical problems where the acceleration depends on position only, i.e. is of the form
\[ \ddot{u}(t) = f(u(t)) \]
which our problem is an example of, as for our problem we have
\[ \ddot{U}(t) = -M^{-1} K U(t) \]
Stepping the velocity first and then the position, the update reads
\[ \dot{U}_{n+1} = \dot{U}_n - \Delta t\, M^{-1} K U_n \]
\[ U_{n+1} = U_n + \Delta t\, \dot{U}_{n+1} \]
This might look like Euler's method, but it is subtly different. The position update at \(n+1\) uses the velocity at \(n+1\) and not at \(n\) as for Euler's method.
This still requires solving a system of equations (with \(M\) as the coefficient matrix), but that matrix is now half the size of the Crank-Nicolson one, and also symmetric.
It turns out, however, that the mass matrix \(M\) is strongly diagonally dominant and spectrally equivalent to the identity operator. Thus, for large systems, \(M\) is quite accurately approximated by a diagonal matrix that has the row (or column) sums of \(M\) as its diagonal elements - a trick known as mass lumping. The system of equations thus becomes trivial.
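To close the loop, here's a sketch of the resulting time stepper, building on the assembly sketch above (the lumping and the update are standard; the names are my own):

```python
def lumped_mass(M):
    """Diagonal (lumped) approximation of M: row sums on the diagonal."""
    return np.asarray(M.sum(axis=1)).ravel()

def leapfrog_step(U, V, K, Ml, dt):
    """One leapfrog step for M U'' + K U = 0 with lumped mass Ml:
    velocity update first, then position update with the new velocity."""
    V = V - dt * (K @ U) / Ml
    U = U + dt * V
    return U, V
```

Starting from initial conditions for \(U\) and \(\dot{U}\), repeated calls to the step function march the wave forward in time on the sphere mesh.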