Sunday, March 3, 2024

GPS disciplined oscillator

Introduction

The best (overall) frequency reference I have is in my Racal-Dana 1992 frequency counter, which has the ovenized oscillator option 04A. It's great, but calibrating it with the equipment I have is an annoying task. I calibrate it against GPS. However, as GPS time is stable enough only in the long term, calibration requires measuring a lot of GPS PPS samples. This takes a long time, and even longer if I want to adjust the time base. This got me wondering whether I could automate the process, and after some thought I realized the best approach would be a GPS disciplined oscillator.

While there are commercial GPSDO devices for sale, I can't justify the cost of buying one. I could, however, design and build one. Turns out I can justify a much higher cost for such a project, as there's a lot of benefit in the learning experience.

My goal would be to produce 10 MHz +- 1 ppb. I also want to keep the phase noise low, but that's somewhat secondary and without quantitative requirements.

The GPS modules I've used are really old Fastrax parts. They are specified to 50 ns RMS jitter on the PPS output. Not great, but workable. Since averaging \(N\) one-second samples reduces the RMS jitter by a factor of \(\sqrt{N}\), this pushes the integration time to around 1 hour (3600 samples) to get down to about 1 ns. I would not trust the PPS pulse below a 1 hour window anyway (more on that later). Anyway, this puts a requirement on the stability of the oscillator: it would need to be stable (better than +-1 ppb) for much longer than the GPS integration period - say 10 hours. This puts me well into the OCXO domain.

DIY OCXO - fail

I happen to have some old VCTCXOs (voltage controlled temperature compensated crystal oscillators), which I found thrown out years ago. They seemed high quality and expensive (from Rakon), so I always wanted to find a use for them. I first thought to build temperature control around one of those and convert it to an OCXO. Using a DS18B20 and a PWM-controlled heater I managed to keep the temperature controlled to well within 0.1 degrees Celsius. However, I could not get the crystal oscillator to behave. I kept the control voltage grounded to make sure any noise on the control voltage wasn't causing the issue. In the end it may have been due to the supply voltage sensitivity of the part, which is not too good at +-300 ppb for +-5% supply voltage. This would require about 1 mV of supply voltage stability for my application.

Rakon VCTCXO frequency and temperature, moving average over 5 minutes
Measured with Racal-Dana 1992 and DS18B20

Looking around the internets, I found that all the cool cats are playing with OSC5A2B02 OCXOs, which are available on Aliexpress for very cheap. So I ordered some. At about 3€ a piece it didn't seem like too big an investment.

OSC5A2B02 testing

The OSC5A2B02 is much less sensitive to supply variations (+-2 ppb for +-5% supply voltage), so just about any regulator suffices. What is important to note, however, is the control voltage sensitivity, which is about 1000 ppb per volt, or equivalently 1 ppb per millivolt. For the initial tests, I simply grounded the control voltage to eliminate its contribution.

OSC5A2B02 frequency (insulated in PE foam), moving average over 5 minutes
Measured with Racal-Dana 1992

The variation was much larger than the oscillator's datasheet promised. This is when I realized that I actually didn't have a clear idea of what the stability of my frequency counter was, and that I could just be seeing its variation.

To get a better idea of the frequency counter variation, and thus the true stability of the OCXO, I connected a GPS PPS to the B channel of the frequency counter.

OSC5A2B02 frequency (top) and GPS PPS (bottom), moving average over 20 minutes
Measured with Racal-Dana 1992

Bingo! Turns out most of the variation observed in the OSC5A2B02 frequency is actually due to variation of my frequency counter! Who would have guessed that a 3€ Aliexpress OCXO today is so much better than an instrument worth thousands in the 1980s.

Normalizing the OCXO frequency with the PPS frequency (after heavy filtering), the OCXO appears to be within +-1ppb over a day as long as there is no control voltage variation.

Phase noise and jitter

The OCXO datasheet gives some spot values for the phase noise:

  • -80 dBc/Hz @ 1 Hz
  • -120 dBc/Hz @ 10 Hz
  • -140 dBc/Hz @ 100 Hz
  • -145 dBc/Hz @ 1 kHz
  • -150 dBc/Hz @ 10 kHz
Integrating over this range gives ~0.3 mrad of RMS phase jitter, or about 4.8 ps.
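For reference, the conversion from RMS phase jitter to RMS time jitter is just division by the angular carrier frequency:

\[ t_\text{RMS} = \frac{\phi_\text{RMS}}{2 \pi f_c} = \frac{3 \cdot 10^{-4}}{2 \pi \cdot 10^7\ \text{Hz}} \approx 4.8\ \text{ps} \]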

Control voltage generation

As said earlier, the control voltage sensitivity is about 1 ppb per millivolt over a range of 4 volts total. Using 16-bit discretization for the range leads to about 0.06 ppb resolution (4 V / \(2^{16}\) ≈ 61 µV per LSB). To get a ballpark figure of how least significant bit transitions affect the phase noise, let's consider the case of modulating the 10 MHz carrier by +-0.03 ppb ( = +-0.3 mHz) at a 1 Hz frequency (assume single tone sinusoidal modulation for simplicity). This is given by

\[ y(t) = \cos(2 \pi f_c t + A \sin(2 \pi f_m t)) \]

In which \( A = \frac{\Delta f}{f_m} = \frac{0.3 \text{mHz}}{1 \text{Hz}} = 3 \cdot 10^{-4} \).

Using the sum of angles identity for cosine, we get

\[ y(t) = \cos(2 \pi f_c t) \cos(A \sin(2 \pi f_m t)) - \sin(2 \pi f_c t) \sin(A \sin(2 \pi f_m t)) \]

Since \( A \ll 1 \) we can approximate \( \cos(A \sin(2 \pi f_m t)) \approx 1 \) and \( \sin(A \sin(2 \pi f_m t)) \approx A \sin(2 \pi f_m t) \). This gives

\[ y(t) \approx \cos(2 \pi f_c t) - A \sin(2 \pi f_c t) \sin(2 \pi f_m t) \]

And finally, by the sine product identity we get

\[ y(t) \approx \cos(2 \pi f_c t) + \frac{A}{2} ( \cos(2 \pi (f_c+f_m) t) - \cos(2 \pi (f_c - f_m) t) ) \]

Thus considering the single sideband phase noise, we see that the modulating frequency is attenuated by \( \frac{A}{2} \) with respect to the carrier, which is -76.5 dBc with our numbers - or an additive 3.3 ps of jitter. So not great, but not terrible - especially if we minimize the occurrence of transitions by adding some hysteresis. In any case, we shouldn't use fewer than 16 bits of resolution.
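A quick numeric sanity check of the above (a throwaway sketch of my own, taking the RMS phase of a single-tone modulation as \( A / \sqrt{2} \)):

    import numpy as np

    A = 0.3e-3 / 1.0  # modulation index: 0.3 mHz peak deviation at 1 Hz
    print(20 * np.log10(A / 2))                 # sideband level: ~-76.5 dBc
    print(A / np.sqrt(2) / (2 * np.pi * 10e6))  # additive RMS jitter: ~3.4e-12 s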

The next issue is how to implement a 16-bit DAC cheaply. Here a pretty obvious candidate is to use PWM. Proper DAC chips with 16 bits cost close to 10 euros, while PWM and a lot of filtering can be achieved with less than a euro. Much less than 1 Hz of bandwidth is perfectly fine, so the question is just how much filtering is needed. Assume the PWM repetition rate is 500 Hz and the amplitude is 4 V. Take then the attenuation of the filter as \(G\). Recalling that the sensitivity of the OCXO is 1 ppm per volt, the power of such a modulation relative to the carrier is given by \( 0.05 G \). To push the additive jitter down to the same scale as the intrinsic jitter of the oscillator, the attenuation needs to be at least 44 dB. This should be easily achievable with simple RC filters.
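A rough sketch of where the attenuation requirement comes from (my own numbers; the exact figure depends on how the PWM ripple and its fundamental are modeled):

    import numpy as np

    f_m = 500.0         # PWM repetition rate, Hz
    V = 4.0             # PWM swing, V
    k = 10.0            # sensitivity: 1 ppm/V of 10 MHz = 10 Hz per volt
    phi_target = 3e-4   # rad; the intrinsic RMS phase jitter from earlier

    # Ripple attenuated by G deviates the frequency by ~k*V*G peak, giving a
    # modulation index A = k*V*G/f_m and an RMS phase of ~A/sqrt(2).
    G = phi_target * np.sqrt(2) * f_m / (k * V)
    print(20 * np.log10(G))  # ~-45 dB, close to the ~44 dB quoted above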

The main issue with the control voltage is thus the stability requirement. The control voltage reference should remain within 1 mV over several hours. This could be combated with a high stability voltage reference. The problem is that those are expensive, and I would like to keep everything as cheap as possible. As we already have a temperature control loop (in the OCXO itself, that is), we might as well use it to keep a voltage reference at a constant temperature as well. Turns out a TL431C has a typical stability of 4 mV over the entire temperature range, and hopefully better than 1 mV when kept at a constant temperature. Also turns out TL431s are really cheap.

Microcontroller

The plan is to implement the PLL using a microcontroller. The microcontroller would be clocked from the OCXO and would time the interval between PPS pulses. This allows determining the OCXO frequency relative to GPS.

The OCXO works on 5 V. It would be useful if the microcontroller also operated on a 5 V supply. It also needs hardware input capture features, allowing precise measurement of the PPS period against the OCXO frequency. And like everything else, it must be cheap. As luck would have it, I recently ordered 50 units of CH32V003 controllers for 0.20€ per piece (including shipping). They check all the boxes for this project. They even have an internal PLL to allow doubling the 10 MHz OCXO clock for improved timing resolution.

Prototype


I implemented a very quick hack of the system on a CH32V003. It is clocked from the OSC5A2B02 with the internal PLL configured to double the frequency. This clock drives a timer peripheral, which is configured for input capture from the PPS pulse. Timer resolution is extended in software from the hardware-provided 16 bits to 32 bits - otherwise a full PPS period could not be counted.
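One classic subtlety in the 16-to-32-bit extension is the race between the overflow interrupt and the capture event. A minimal sketch of the idea (in Python for readability; the flag handling is hypothetical and not the actual CH32V003 register interface):

    class Capture32:
        """Extend a 16-bit input-capture timer to 32 bits in software."""

        def __init__(self):
            self.high = 0  # upper 16 bits, maintained by software

        def on_overflow(self):
            # overflow (update) interrupt: count 16-bit timer wraparounds
            self.high = (self.high + 1) & 0xFFFF

        def on_capture(self, captured_low, overflow_pending):
            high = self.high
            # If an overflow has fired but is not yet serviced and the captured
            # value is small, the capture happened just after the wrap: credit
            # the pending overflow to this capture.
            if overflow_pending and captured_low < 0x8000:
                high = (high + 1) & 0xFFFF
            return (high << 16) | captured_low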

After each captured PPS pulse, a simple validation is performed to try to ignore erroneous pulses. A simple PI controller then acts to minimize the frequency error (a frequency-locked loop). The controller is parametrized just by the control bandwidth and was designed to have critically damped dynamics. Controlling just the frequency still leaves an uncontrolled phase error. I'll be looking into that too, but so far it is of no concern.
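A minimal sketch of such a bandwidth-parametrized PI controller (my own construction, not the actual firmware: it assumes the loop behaves like an integrator plant, so placing a double closed-loop pole at \( \omega_n \) gives critical damping):

    import numpy as np

    def make_fll_pi(bw_hz, k_ppb_per_lsb, dt=1.0):
        """PI gains derived from loop bandwidth; one frequency sample per PPS."""
        wn = 2 * np.pi * bw_hz
        kp = 2 * wn / k_ppb_per_lsb   # together with ki: double pole at wn
        ki = wn ** 2 / k_ppb_per_lsb
        integ = 0.0

        def step(err_ppb):
            nonlocal integ
            integ += ki * err_ppb * dt
            return kp * err_ppb + integ  # DAC adjustment in LSBs

        return step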

The DAC is implemented with a 16-bit PWM running at 305 Hz (the 20 MHz timer clock divided by \(2^{16}\)) and filtered with a ~0.2 Hz (-3 dB) first order lowpass filter. The filtering leaves a lot to be desired, and this will have to be improved for the final product. The voltage reference is provided by a jellybean TL431C, which is not temperature controlled.

Crude schematic of the prototype voltage control

Operating the control voltage as shown in the figure above allows reducing the full swing range, which increases resolution.

Experiments with the prototype have shown better than expected performance. I'm still lacking a data channel for the GPS data, though, which means that I don't know when the PPS is valid, nor can I keep absolute phase. This leads to some bad samples passing validation and causing trouble.

Closed loop controlled frequency, locked to GPS PPS
GPS PPS moving average over 5 minutes

The figure above shows closed loop control of the OCXO locking on to the GPS PPS frequency. The initial condition was deliberately set about 100 ppb off to see the dynamics of the control. The bandwidth of the controller was chosen as ~4 mHz so the experiment would run at a reasonable speed; this is too wide a bandwidth for proper operation though. From the plots, we see that the dynamics are nearly critically damped, with only very minor overshoot. This is good enough for me!

75 hour long term stability experiment
GPS PPS moving average over 1 hour

The next experiment was a long term stability test. Here I set the controller bandwidth to ~0.1 mHz to better reject noise in the PPS signal. During the experiment, there were a few moments at which the PPS became invalid. This causes the sudden spikes seen in the error graph. Regardless, the error remained well within +-1 ppb, which was the design goal. Also, the prototype only implements very simple control voltage filtering, and the voltage reference is still just at room temperature. The design goal thus seems very much achievable, and going down to +-0.25 ppb may be possible with more care. The control voltage graph is computational, based on an assumed ideal voltage. I don't know for sure whether the drift seen in it is due to the OCXO really needing adjustment or the voltage reference drifting and needing compensation, but my guess is the voltage reference.

Sunday, December 10, 2023

Wave equation on curvilinear domains

Introduction

Nils Berglund, who is the author of many amazing simulation videos on Youtube, recently posted videos showing visualizations of waves on a sphere. In particular, this video caught my eye, depicting tsunami waves of the 2011 Japan earthquake travelling across the Pacific ocean.

In fact, back in 2011 I had made a very similar visualization inspired by the events, and later uploaded it to Youtube - see video below. The video was made as promotional material to get new students to major in mathematics at Aalto University. At that time I was a graduate student at the maths department working toward my PhD.

Anyway, it was as an undergrad in 2007 when I first got interested in solving the wave equation on the sphere. I had been studying the finite element method with great interest, and had programmed my own simple implementation of nodal P1 elements on triangles. I had successfully used that to solve waves in planar domains - see video below. I thought that in order to solve the wave equation on a sphere, I would just use a mesh approximating the sphere instead. Turns out it's almost as simple as that, but there are a couple of nontrivial details.

My undergrad studies included a mandatory "special assignment". These assignments were often to write a research paper on a subject - though they were published only internally at the department. I suggested to my professor that as my assignment I could study solving the wave equation on spheres using FEM.

Now, finally, I thought I'd revisit the topic and write down something more general on the subject a bit more publicly.

Wave equation and the finite element method

To get up to speed, let's look at the wave equation and the finite element method in a quick and non-rigorous fashion. The wave equation (in strong form) is
\[ \ddot{u} - c^2 \nabla \cdot \nabla u = 0 \quad \mathrm{in}\ \Omega \]
in which \( \ddot{u} \) is the second derivative of \( u \) with respect to time, \( c \) is the wave speed, which we'll just take as 1 from here on, and \( \Omega \) is the domain in which we're solving the equation. In addition, there are initial conditions and boundary conditions that need to be prescribed, but let's skip those for now.

FEM works with the weak formulation of the problem. To get the weak form from the strong form, both sides of the equation are multiplied by a test function \( v\) and integrated over the domain \( \Omega \). Integration by parts is applied to reduce the regularity requirement of \( u \) from 2 derivatives down to 1. The weak form is then: find \( u \in V \) such that
\[ \int_\Omega \ddot{u} v\ \mathrm{d}x + \int_\Omega \nabla u \cdot \nabla v\ \mathrm{d}x = 0 \quad \forall v \in V \]

With FEM, as with other Galerkin methods, we replace the space \(V\) with a finite dimensional approximation \(V_h\). We come up with a basis \( \{v_1,\ldots,v_N\} \) of \(V_h\), and write the finite dimensional approximate solution \( u_h \) as
\[ u_h = \sum_{i=1}^N U_i v_i \]

The problem then becomes: find \( U_i \) such that
\[ \sum_{i=1}^N \ddot{U_i} \int_\Omega v_i v_j\ \mathrm{d}x + \sum_{i=1}^N U_i \int_\Omega \nabla v_i \cdot \nabla v_j\ \mathrm{d}x = 0 \quad \forall j=1,\ldots,N \]
In other words,
\[ M \ddot{U} + K U = 0 \]
in which
\[ M_{ij} = \int_\Omega v_i v_j\ \mathrm{d}x \]
is called the mass matrix and
\[ K_{ij} = \int_\Omega \nabla v_i \cdot \nabla v_j\ \mathrm{d}x \]
is the stiffness matrix.

The idea in the finite element method specifically is to define the basis functions \( v_i \) using a mesh, which fills the domain \(\Omega\). Assuming now a triangular mesh, a simple approach is to take each basis function as a piecewise first order polynomial, each piece being one triangle. Then, assign \(v_i = 1\) when evaluated at vertex \(p_i\), and \(v_i = 0\) when evaluated at any other vertex.

The integrals can be written as the sum of the separate contributions from each element
\[ M_{ij} = \sum_{T} \int_T v_i v_j\ \mathrm{d}x \]
and
\[ K_{ij} = \sum_{T} \int_T \nabla v_i \cdot \nabla v_j\ \mathrm{d}x \]
For any given \(i\) and \(j\), on the vast majority of the elements, \(v_i\) or \(v_j\) (or both) is identically zero on the element, and thus they don't contribute to the integral. Contributions come only from the triangles that contain both vertex \(p_i\) and vertex \(p_j\).

Now, to actually compute the integrals over a triangle, a standard trick is to do a change of variables and perform the integration over a reference triangle instead. A coordinate on the reference triangle \( \hat{x} \) is easily mapped to \( x \) on the target triangle \(T\) via an affine transform \( x = A_T \hat{x} + b_T \). Let's take the triangle \(\hat{T}\) with vertices \( \hat{p}_1 = (0,0) \), \( \hat{p}_2 = (1,0) \), \( \hat{p}_3 = (0,1) \) as our reference triangle. With this choice, and the fact that affine transforms map first order polynomials back to first order polynomials, the basis functions can be expressed on the reference element in a simple way.

The 3 basis functions on the reference element are such that they get the value 1 in one vertex and 0 in the others. Written explicitly, they are
\[ \begin{align} \hat{v}_1(\hat{x}) & = 1 - \hat{x}_1 - \hat{x}_2 \\ \hat{v}_2(\hat{x}) & = \hat{x}_1 \\ \hat{v}_3(\hat{x}) & = \hat{x}_2 \end{align} \]

Change of variables

Here's where things get a bit non-standard compared to the usual FEM implementation. Our reference element lives in \(\mathbb{R}^2\) while the global element lives in \(\mathbb{R}^3\). This means that the matrix in our affine mapping is non-square, and we need to be a bit more careful in considering how the area scales and how the functions change under the change of variables.

Usually, with the change of variables \(x = f(\hat{x})\), we see the differential area element change as \( \mathrm{d}x = |\det J_f(\hat{x})|\ \mathrm{d}\hat{x} \), where \( J_f(\hat{x}) \) is the Jacobian of \( f \). This does not, however, make sense for a non-square \(J_f\), and thus something more general is needed. Turns out that a slightly more general expression for the differential area element is \( \mathrm{d}x = \sqrt{\det J_f(\hat{x})^t J_f(\hat{x}) }\ \mathrm{d}\hat{x} \). Since \(J_f^t J_f\) is always square, this is well-defined in our case. Note that in the case where \(J_f\) is square, this reduces to exactly the usual form. In our case, the Jacobian of the mapping is simply the matrix \(A_T\) of the affine transform, and we have \(\mathrm{d}x = \sqrt{\det A_T^t A_T}\ \mathrm{d}\hat{x} \).

The contribution of element \(T\) to \(M_{ij}\) can be computed as
\[ \begin{aligned} \int_T v_i v_j\ \mathrm{d}x &= \int_\hat{T} v_i(A_T\hat{x} + b_T) v_j(A_T \hat{x} + b_T) \sqrt{\det A_T^tA_T}\ \mathrm{d}\hat{x} \\ &= \sqrt{\det A_T^tA_T} \int_\hat{T} \hat{v}_\hat{i}(\hat{x}) \hat{v}_\hat{j}(\hat{x})\ \mathrm{d}\hat{x} \end{aligned} \]
where \( \hat{i} \) and \(\hat{j}\) are the reference element basis indexes that global basis indexes \(i\) and \(j\) map to, respectively.

With the choice of reference basis functions from earlier, the integrals evaluate to one of two cases:
\[ \int_\hat{T} \hat{v}_\hat{i}(\hat{x}) \hat{v}_\hat{j}(\hat{x})\ \mathrm{d}\hat{x} = \begin{cases} \frac{1}{12}, & \text{when }\hat{i} = \hat{j} \\ \frac{1}{24}, & \text{when }\hat{i} \neq \hat{j} \end{cases} \]
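These two values are quick to verify symbolically, e.g. with sympy (a throwaway check of my own):

    import sympy as sp

    x, y = sp.symbols('x y')
    v = [1 - x - y, x, y]  # the reference basis functions
    tri = lambda f: sp.integrate(sp.integrate(f, (y, 0, 1 - x)), (x, 0, 1))
    print(tri(v[0] * v[0]), tri(v[0] * v[1]))  # 1/12, 1/24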

While that's all for the mass matrix, for the stiffness matrix we have the additional headache of how the gradient will transform under the mapping. For basis function \(v_i\) we have \(v_i(x) = v_i(A_T \hat{x} + b_T) = \hat{v}_\hat{i}(\hat{x})\). Skipping a bit of detail for brevity, it follows from the chain rule that \( (\nabla \hat{v}_\hat{i})(\hat{x}) = A_T^t (\nabla v_i)(x) \), which by itself is enough for the typical case as \(A_T^t\) is invertible. For our case, however, we must consider additionally that the gradient \((\nabla v_i)(x)\) is actually a vector in the tangent space of \(\Omega\) at \(x\). Since the tangent space of \(\Omega\) within element \(T\) is spanned by the range of \(A_T\), there must exist some \(g\) such that \((\nabla v_i)(x) = A_T g\). Thus
\[ A_T^t A_T g = (\nabla \hat{v}_\hat{i})(\hat{x}) \]
from which it follows that
\[ g = (A_T^t A_T)^{-1} (\nabla \hat{v}_\hat{i})(\hat{x}) \]
and finally
\[ (\nabla v_i)(x) = A_T (A_T^t A_T)^{-1} (\nabla \hat{v}_\hat{i})(\hat{x}) \]

With this, the contribution of element \(T\) to \(K_{ij}\) can be computed as
\[ \begin{aligned} \int_T \nabla v_i \cdot \nabla v_j\ \mathrm{d}x &= \int_\hat{T} A_T (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{i} \cdot A_T (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j} \sqrt{\det A_T^tA_T}\ \mathrm{d}\hat{x} \\ &= \sqrt{\det A_T^tA_T} \int_\hat{T} A_T^t A_T (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{i} \cdot (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j}\ \mathrm{d}\hat{x} \\ &= \sqrt{\det A_T^tA_T} \int_\hat{T} \nabla \hat{v}_\hat{i} \cdot (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j}\ \mathrm{d}\hat{x} \end{aligned} \]

As the basis functions are piecewise first order polynomials, the gradient will be a piecewise constant and thus
\[ \int_\hat{T} \nabla \hat{v}_\hat{i} \cdot (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j}\ \mathrm{d}\hat{x} = \frac{1}{2} \nabla \hat{v}_\hat{i} \cdot (A_T^t A_T)^{-1} \nabla \hat{v}_\hat{j} \]
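Putting the pieces together, the per-element computation is only a few lines. A sketch of my own (vertices given as numpy arrays of length 3):

    import numpy as np

    def element_matrices(p1, p2, p3):
        """P1 mass and stiffness element matrices for a triangle in R^3."""
        A = np.column_stack([p2 - p1, p3 - p1])  # 3x2 matrix of the affine map
        G = A.T @ A                              # A^t A
        s = np.sqrt(np.linalg.det(G))            # area scaling sqrt(det A^t A)
        # gradients of the reference basis functions, one per row
        g = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
        M = s * (np.full((3, 3), 1 / 24) + np.eye(3) / 24)  # 1/12 diag, 1/24 off
        K = s * 0.5 * g @ np.linalg.solve(G, g.T)
        return M, K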

Time propagation

After spatial discretization with FEM we're left with the ordinary differential equation
\[ M \ddot{U} + K U = 0 \]

Without going into detail, the problem is particularly well suited for e.g. the Crank-Nicolson method or the leapfrog integration method. These methods are stable for undamped oscillating problems and they do not produce decaying solutions. The key word for such methods is symplectic.

Crank-Nicolson

For Crank-Nicolson, we'll first convert the equation into first order form as
\[ \begin{pmatrix} I & 0 \\ 0 & M \end{pmatrix} \begin{pmatrix} \dot{U} \\ \ddot{U} \end{pmatrix} = \begin{pmatrix} 0 & I \\ -K & 0 \end{pmatrix} \begin{pmatrix} U \\ \dot{U} \end{pmatrix} \]
Denote then
\[ B = \begin{pmatrix} I & 0 \\ 0 & M \end{pmatrix} \]
\[ A = \begin{pmatrix} 0 & I \\ -K & 0 \end{pmatrix} \]
and
\[ z = \begin{pmatrix} U \\ \dot{U} \end{pmatrix} \]
to get
\[ B \dot{z} = A z \]

For the discretization, consider the derivative on the left side replaced by the approximation \( \frac{z_{n+1} - z_n}{\Delta t} \), and consider the function evaluation on the right side being performed at the point \( \frac{z_{n+1} + z_n}{2} \). With this, we arrive at
\[ B \frac{z_{n+1}-z_n}{\Delta t} = A \frac{z_{n+1} + z_n}{2} \]
which, after some manipulation, yields
\[ z_{n+1} = (B - \frac{\Delta t}{2} A)^{-1} (B + \frac{\Delta t}{2} A) z_n \]

The main downside of the Crank-Nicolson method is the requirement of solving a large non-symmetric system of equations at every time step.
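Since the system matrix is constant, it can at least be factorized once and reused on every step. A sketch with scipy sparse matrices (my own, for illustration):

    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    def cn_stepper(M, K, dt):
        """Return a function advancing z = (U, dU/dt) one Crank-Nicolson step."""
        n = M.shape[0]
        I = sp.identity(n, format='csc')
        B = sp.bmat([[I, None], [None, M]], format='csc')
        A = sp.bmat([[None, I], [-K, None]], format='csc')
        lu = splu((B - dt / 2 * A).tocsc())  # factorize once
        rhs = (B + dt / 2 * A).tocsc()
        return lambda z: lu.solve(rhs @ z)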

Leapfrog integration

Leapfrog integration is an explicit symplectic method particularly well suited for mechanical problems, where the accelerations are due to position only, i.e. of the form
\[ \ddot{u}(t) = f(u(t)) \]
which our problem is an example of, as for our problem we have
\[ \ddot{U} = -M^{-1}KU \]

The method is then as follows
\[ \begin{aligned} \dot{U}_{n+1} &= \dot{U}_n - \Delta t\ M^{-1} K U_n \\ U_{n+1} &= U_n + \Delta t\ \dot{U}_{n+1} \end{aligned} \]

This might look like Euler's method, but it is subtly different. The position update at \(n+1\) uses the velocity at \(n+1\) and not at \(n\) as for Euler's method.

This still requires solving a system of equations, but now the coefficient matrix is half the size of that in the Crank-Nicolson method, and is also symmetric.

Turns out, however, that the mass matrix \(M\) is strongly diagonally dominant and spectrally equivalent to the identity operator. Thus, for large systems \(M\) is quite accurately approximated by a diagonal matrix that has the row (or column) sums of \(M\) as the diagonal elements. The system of equations thus becomes trivial.
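With the lumped mass, a leapfrog step reduces to a sparse matrix-vector product and an elementwise division. A sketch continuing in the same vein:

    import numpy as np

    def leapfrog_stepper(M, K, dt):
        """Leapfrog with a lumped (row-sum) diagonal approximation of M."""
        Ml = np.asarray(M.sum(axis=1)).ravel()  # lumped diagonal of M

        def step(U, V):
            V = V - dt * (K @ U) / Ml  # velocity update from current position
            U = U + dt * V             # position update with the NEW velocity
            return U, V

        return step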

Further videos






Monday, October 2, 2023

Skin and proximity effects

Introduction

In an earlier post I looked at coaxial cables over a ground plane as multiconductor transmission lines. This was to understand quarter wave folded balun operation. Computing the transmission line inductances, however, I realized that the current distribution would depend on how the return current flows.

If the current flows through the center conductor and returns through the shield, the current would be most concentrated on the outside of the center conductor, and the return current would concentrate on the inside of the shield. However, if the current flows through the shield and returns through the ground plane, then the current in the shield would concentrate on the outside part of the shield facing the ground plane - and the return current would concentrate on the ground plane surface closest to the cable. This concentration of the current is caused by what are known as the skin and proximity effects.

The figures below show a visualization of the current density in a coax under the previously described situations. Coax geometry is that of RG316, which is placed very close to a ground plane. The right hand side figure shows the situation where current flows through the inner conductor and back through the shield, while the left hand side shows the situation where current flows through the shield and back through the ground plane. Color indicates current density: dark red is maximal density into the display, dark blue is maximal density out of the display, and light green is zero current density. The current is driven by a sinusoidally oscillating uniform electromotive force.


All computations in this article are performed using scikit-fem and my own code, and visualized using matplotlib.

The calculations in my earlier post didn't take these effects into account, and the current was assumed to have a uniform density over the entire conductor in each case. This caused the inductance matrix of the transmission line system to be somewhat incorrect. I mentioned at the time of writing that post that accounting for this would be a whole new can of worms. It's now time to open that can.

Deriving the appropriate equations

I'll try to keep the following fairly general, but here we are mostly interested in a long multiconductor transmission line (with \(n\) conductors), with dielectrics separating the conductors from each other. The cross-section of the transmission line is assumed to remain constant along its length.

To simplify notation, we'll consider all fields as time-harmonic at frequency \(\omega\), such that e.g. \(\mathbf{B}(x,y,z,t) = \mathrm{Re} \{ \mathbf{B}(x,y,z) \exp(i \omega t) \}\), and omit the time variable in the following. In particular, we'll omit writing down the time derivative operators and just expand out the result.

Let's start with Ampere-Maxwell's law
\[\nabla \times \mathbf{H} = \mathbf{J} + i\omega \mathbf{D}\]
Inside the conductors we assume Drude's conduction model. The model gives us that \(\mathbf{J} = \sigma \mathbf{E}\) (when considering time scales significantly longer than the mean collision time). Also, it states that there is no polarization in the medium, giving \(\mathbf{D} = \epsilon_0 \mathbf{E}\). (Note: alternatively we could have assumed that there is no conduction current \(\mathbf{J}\), but that we're dealing with an extremely lossy dielectric medium. This would have been represented by a large imaginary component of the permittivity).

Following Drude, inside the conductor we have
\[\nabla \times \mathbf{H} = \sigma \mathbf{E} + i\omega \epsilon_0 \mathbf{E}\]
However, for copper and the frequencies we're interested in, \(\sigma \mathbf{E}\) dominates \(i\omega \epsilon_0 \mathbf{E}\) by more than a factor of a million, so we'll ignore the term \(i\omega \epsilon_0 \mathbf{E}\).

Inside the insulators we assume no conduction and that \(\mathbf{D} = \epsilon \mathbf{E}\), where \(\epsilon\) is the permittivity of the insulator (and is taken as different within each material). This gives
\[\nabla \times \mathbf{H} = i\omega \epsilon \mathbf{E}\]
Again, for the materials and frequencies we're interested in, \(i\omega \epsilon \mathbf{E}\) is significantly smaller than \(\sigma \mathbf{E}\), and thus we can ignore this source of the magnetic field.

For simplicity, let us assume that all magnetic materials are linear and isotropic, and thus that the constitutive relation \(\mathbf{H} = \mu^{-1} \mathbf{B}\) holds within each material, where \(\mu\) is the permeability of the material (and is taken as different within each material).

So, thus far, inside the conductors we have
\[\nabla \times (\mu^{-1} \mathbf{B}) = \sigma \mathbf{E}\]
and elsewhere
\[\nabla \times (\mu^{-1} \mathbf{B}) = 0\]
To simultaneously also satisfy Gauss's law for magnetism and Faraday's law, let us take the potential field approach by introducing the magnetic vector potential \(\mathbf{A}\) and the electric potential \(\varphi\), such that \(\mathbf{B} = \nabla \times \mathbf{A}\) and \(\mathbf{E} = -\nabla \varphi - i\omega \mathbf{A}\).

Thus inside the conductors we have
\[\nabla \times (\mu^{-1} \nabla \times \mathbf{A}) = -\sigma \nabla \varphi - i \omega \sigma \mathbf{A}\]
while elsewhere
\[\nabla \times (\mu^{-1} \nabla \times \mathbf{A}) = 0\]

Solving the equations

Take the geometry such that the transmission line is directed along the \(z\) coordinate direction. The cross-sectional geometry of the transmission line is then in the \(xy\) plane. Consider then that for each conductor \(c\) our electric potential is such that \( (\nabla \varphi)(x,y) = -U_c \mathbf{e}_z\) when \( (x,y) \in \Omega_c\), where \(\Omega_c\) is the cross-section surface of conductor \(c\), and \(U_c\) is a given constant value. In other words, consider that for each conductor \(c\), we have some voltage drop of \(U_c\) per unit length which drives the current in the conductor.

Assume then that \(\mathbf{A} = \mathbf{A}(x,y)\), i.e. that \(\mathbf{A}\) is only a function of position in the cross-section of the transmission line and does not vary in the length-wise direction (i.e. the transmission line is long). It then follows (though not entirely trivially) that \(\mathbf{A}(x,y) = A(x,y) \mathbf{e}_z\), and our problem actually reduces to a two-dimensional scalar problem in the \(xy\) plane.

For such an \(\mathbf{A}\), we have
\[\nabla \times (\mu^{-1} \nabla \times \mathbf{A}) = -\nabla \cdot (\mu^{-1} \nabla A) \mathbf{e}_z\]
Thus, inside conductor \(c\) we have the scalar problem
\[i \omega \sigma A - \nabla \cdot (\mu^{-1} \nabla A) = \sigma U_c\]
while outside the conductors
\[-\nabla \cdot (\mu^{-1} \nabla A) = 0\]
With finite elements, and a finite computation domain, we must consider boundary conditions also. We choose to enforce the boundary condition \(\nabla A \cdot \mathbf{n} = 0\), which corresponds with the magnetic field being normal to the boundary. (Note: the boundary condition \(A = 0\) corresponds with the magnetic field being tangential to the boundary). Our choice of boundary condition results in all currents in the system always getting automatically balanced, although it's otherwise a somewhat dubious condition to enforce. The domain must be large enough that these edge effects become negligible.
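As a concrete sketch of the assembly with scikit-fem (a minimal toy of my own, not the code behind the figures: the geometry, the conductivity field and the complex-valued assembly are all stand-ins; the boundary condition \(\nabla A \cdot \mathbf{n} = 0\) is the natural one and needs no extra code):

    import numpy as np
    from skfem import (MeshTri, Basis, ElementTriP1, BilinearForm, LinearForm,
                       solve)
    from skfem.helpers import dot, grad

    mu0 = 4e-7 * np.pi
    omega = 2 * np.pi * 1e6
    U_c = 1.0  # voltage drop per unit length, V/m

    def sigma(x):
        # stand-in geometry: a single round conductor of 0.5 mm radius
        return 5.96e7 * (x[0] ** 2 + x[1] ** 2 < 0.5e-3 ** 2)

    @BilinearForm(dtype=np.complex128)
    def lhs(u, v, w):
        return dot(grad(u), grad(v)) / mu0 + 1j * omega * sigma(w.x) * u * v

    @LinearForm(dtype=np.complex128)
    def rhs(v, w):
        return sigma(w.x) * U_c * v

    mesh = MeshTri().refined(6).scaled((4e-3, 4e-3)).translated((-2e-3, -2e-3))
    basis = Basis(mesh, ElementTriP1())
    A = solve(lhs.assemble(basis), rhs.assemble(basis))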

The problem is then solved using standard finite element methods. The solution of course gives the magnetic vector potential. To find the currents in the conductors, we'll need to do some post-processing still. For this, we'll recall that
\[\mathbf{J} = \nabla \times (\mu^{-1} \mathbf{B}) = \nabla \times (\mu^{-1} \nabla \times \mathbf{A})\]
Let's denote \(\mathbf{J} = J \mathbf{e}_z\), so that for the scalar current density field we have
\[J = -\nabla \cdot (\mu^{-1} \nabla A),\]
which is very straightforward to compute from \(A\) using our finite element approach. Further, the total current in conductor \(c\) is then the integral
\[I_c = \int_{\Omega_c} J\ \mathrm{d}\mathbf{x}, \]
where \(\Omega_c\) is again the cross-section surface of conductor \(c\).

Recall then that the voltage drop per unit length applied to conductor \(c\) was \(U_c\). This allows the conductor currents to be described by the admittance matrix \(\mathbf{Y}\) of the system as
\[\begin{pmatrix} I_1 \\ I_2 \\ \vdots \\ I_n \end{pmatrix} = \mathbf{Y} \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{pmatrix}\]
or
\[\mathbf{I} = \mathbf{Y} \mathbf{U}\]
To find the admittance matrix of our system, we can solve the currents \(\mathbf{I}\) a total of \(n\) separate times, each time using a different linearly independent set of voltage drops. For example, we can take \(\mathbf{U}\) as each of the members of the canonical basis in turn and solve the currents.

Perhaps an even more useful way to look at the system is the impedance matrix, which is the inverse of the admittance matrix.
\[\mathbf{Z} = \mathbf{Y}^{-1}\]
We expect that our system has some resistance, which is due to the resistivity of the conductors. This would be described by the real part of the impedance matrix. The system is also expected to exhibit inductive reactance, which is described by the imaginary part of the impedance matrix. Dividing the reactance by \(\omega\) gives us the inductance. Thus, the inductance matrix of the system is simply the imaginary part of the impedance matrix divided by \(\omega\)
\[\mathbf{L} = \omega^{-1} \mathrm{Im}\{ \mathbf{Z}\}\]
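Collecting the above into code, a sketch of the extraction loop (solve_vector_potential and conductor_current are hypothetical stand-ins for the FEM solve and the current integral described earlier):

    import numpy as np

    def extract_matrices(n, omega, solve_vector_potential, conductor_current):
        """Drive each conductor with a unit voltage drop; collect the currents."""
        Y = np.zeros((n, n), dtype=complex)
        for k in range(n):
            U = np.zeros(n)
            U[k] = 1.0                     # canonical basis vector
            A = solve_vector_potential(U)  # FEM solve for this excitation
            for c in range(n):
                Y[c, k] = conductor_current(A, c)
        Z = np.linalg.inv(Y)               # impedance matrix
        L = Z.imag / omega                 # inductance matrix
        return Y, Z, L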

Applications

Skin effect in a circular conductor

A copper conductor of 1 mm diameter is subjected to various frequencies of AC excitation. The permeability of both copper and air in this system is taken as approximately \(\mu_0\), the vacuum permeability.

Current density is visualized in the cross-section surface with color, but also along the diameter as a 1D plot.
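For reference (my addition, using the textbook skin depth \( \delta = \sqrt{2 / (\omega \mu \sigma)} \)), the expected skin depths for copper:

    import numpy as np

    mu0, sigma_cu = 4e-7 * np.pi, 5.96e7
    skin_depth = lambda f: np.sqrt(2 / (2 * np.pi * f * mu0 * sigma_cu))
    for f in (50.0, 1e3, 1e6, 100e6):
        print(f'{f:>11.0f} Hz: {skin_depth(f) * 1e6:7.1f} um')
    # ~9220 um at 50 Hz, ~2060 um at 1 kHz, ~65 um at 1 MHz, ~6.5 um at 100 MHz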

Coaxial cable

A coax of RG316 dimensions
  • Inner conductor OD: 0.53 mm
  • Inner insulator OD: 1.53 mm
  • Outer conductor OD: 1.98 mm
  • Outer insulator OD: 2.5 mm
is placed over a ground plane, with center to plane height 1.75 mm. All conductors are made of copper, while \(\mu \approx \mu_0\) for all materials, where \(\mu_0\) is the vacuum permeability.


Consider the two cases:
  1. Current is balanced between inner and outer conductors: no current flows through the ground plane
  2. Current flows through the outer conductor and ground plane: no current flows through the inner conductor

Inner - outer




Outer - ground plane




Differential microstrip

Take a differential microstrip of dimensions
  • Trace width: 0.2 mm
  • Trace separation: 0.2 mm
  • Trace thickness: 0.035 mm
  • PCB layer thickness: 0.2 mm
All conductors are made of copper, with \(\sigma = 5.96 \cdot 10^7\ \mathrm{S}\ \mathrm{m}^{-1}\), while \(\mu \approx \mu_0\) for all materials, where \(\mu_0\) is the vacuum permeability.


Consider two cases:
  1. The current enters the left conductor and returns via the right conductor: the odd mode
  2. The current enters through both the left conductor and the right conductor and returns through the ground plane: the even mode

Odd mode


Even mode


Observations

The higher the frequency, the more the current and its return current concentrate toward each other. With increasing frequency the current passes through an ever decreasing cross-sectional area, and thus the resistance increases with frequency. On the other hand, the closer the opposite currents flow to each other, the more their magnetic fields cancel each other out. Thus the inductance decreases with increasing frequency.