Monday, February 17, 2025

GPS disciplined oscillator - part 2

Introduction

After the good results I got from my prototype GPSDO, I wanted to go a bit further than the mess of wires that was the prototype. Besides achieving a better build quality, I needed more than just one 10 MHz output to support a few key instruments in my lab - and a PPS output would also be nice to have. I also wanted to reduce the temperature dependency of the device by insulating it better, and by keeping the control voltage reference at a constant temperature.

The device was built in April - June of 2024, but I only now got around to documenting it.

Overview

The device is composed of two parts: the main unit and the GNSS receiver unit. They are connected to each other through standard twisted pair Ethernet cabling, which allows placing the receiver unit in a convenient location that may be far away from the main unit.

GNSS receiver unit on the left, main unit on the right
Reverse sides

GNSS receiver unit

The GNSS receiver unit gets power through one of the four pairs of the cable. A suitable voltage for the receiver is then regulated locally from the supplied voltage. The PPS output of the receiver is fed into an RS422 line driver and transmitted back to the main unit over another pair of the cable.

The receiver serial TX and RX are also connected to an RS422 transceiver to allow data communication with the main unit. Thus, all four pairs of the cable are used.

Cover dome, board carrier and pole mount are held together with 4 M3 screws




Close up on board carrier. RJ45 is on a separate board and connects via ribbon cable

RJ45 board is held in place with snap tabs. Receiver PCB is attached with 2 M2.5 screws and hold down tabs.
Reverse sides

Main unit

The main unit comprises several sub-units: the main board sub-unit, the OCXO board sub-unit and the LCD board sub-unit. Each sub-unit handles a separate part of the system, and they communicate with each other through I2C.

Main board sub-unit and LCD board sub-unit are attached to the cover plate. OCXO board sub-unit is inside block of foam insulation.
Mess of wires for the connectors. All wires connect to the main board. RJ45 connector is on a separate board, which is held in place by snap tabs.
LCD board sub-unit (top) and main board sub-unit (bottom)
Block of foam insulation, with the OCXO board sub-unit inside, removed from the main unit enclosure

LCD board sub-unit

The LCD board sub-unit handles displaying information to the user through an HD44780 type liquid crystal display. It acts as an I2C master and pulls the information to display from the main board sub-unit as well as from the OCXO board sub-unit. It is built around a CH32V003 MCU.

Close up on the LCD board sub-unit with main board sub-unit

OCXO board sub-unit

The OCXO board sub-unit handles the 10 MHz frequency generation and its control voltage generation, and thus carries the OSC5A2B02 OCXO itself. It is housed inside a thermally insulating foam block to reduce the temperature dependency of the 10 MHz output.

The OCXO adjustment is about 1 ppb per millivolt, so the control voltage needs to be quite stable. This stability is provided by a TL431C voltage reference, which is in close thermal contact with the OCXO and thus also at a constant temperature. A 16-bit DAC is then implemented using a PWM signal from a CH32V003, followed by heavy filtering to try to keep the phase noise low.

The OCXO board sub-unit also features a DS18B20 temperature sensor, which is in close thermal contact with the OCXO. For now this is only used for monitoring, but it could perhaps be used for some additional temperature compensation in the future.

The 10 MHz output from the OCXO is not used by any of the circuitry on the sub-unit itself, but is instead passed to the main board sub-unit.

The OCXO board sub-unit acts as an I2C slave and allows setting the OCXO control voltage based on a control word, which it receives from the main board sub-unit. Additionally, the LCD board sub-unit queries it for temperature information.

Block of foam insulation with OCXO board sub-unit inside
Foam cover removed
OCXO board sub-unit removed from block of insulation

Main board sub-unit

The main board sub-unit performs three functions: GNSS data interfacing, OCXO PLL control and clock buffering.

Main board
Reverse side

GNSS data interface

The GNSS data interface has an RS422 transceiver to allow serial data communication with the GNSS receiver. This is used to configure the receiver, as well as for receiving a time stamp for each PPS pulse for absolute phase control. The GNSS data interface is built around a CH32V003 and acts as both an I2C master and an I2C slave. As a master, it actively pushes validity and time stamp information to the OCXO phase-locked loop system. As a slave, it allows the LCD board sub-unit to fetch the validity, time stamp, positioning and other GNSS information.

PFD controller

The phase-frequency detector controller is the heart of the system, and is built around a CH32V003. The MCU is clocked from the OCXO board sub-unit 10 MHz output. This 10 MHz is internally clock doubled for a 20 MHz cycle rate at the MCU core.

Via an RS422 receiver, the MCU gets the PPS signal from the GNSS unit. This allows the MCU to count the number of 10 MHz cycles between PPS pulses at half-cycle (i.e. 50 nanosecond) resolution. This, averaged over a long period, represents the true frequency of the OCXO output.
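As a rough illustration of the counting scheme, the sketch below averages the number of 20 MHz ticks seen between consecutive PPS pulses. The function names and the sample counts are invented for the example and are not taken from the actual firmware.

```python
from statistics import fmean

# The PFD counts the clock-doubled OCXO signal, so one second is nominally
# 20 000 000 ticks of 50 ns each.
NOMINAL_TICKS_PER_SECOND = 20_000_000


def estimate_output_frequency(ticks_per_pps):
    """Estimate the 10 MHz output frequency (Hz) from ticks counted per PPS interval."""
    # Two ticks of the doubled clock correspond to one cycle of the 10 MHz output.
    return fmean(ticks_per_pps) / 2.0


def fractional_offset(ticks_per_pps):
    """Average fractional frequency offset of the OCXO relative to nominal."""
    mean_ticks = fmean(ticks_per_pps)
    return (mean_ticks - NOMINAL_TICKS_PER_SECOND) / NOMINAL_TICKS_PER_SECOND


# Hypothetical counts for four consecutive PPS intervals.
counts = [20_000_000, 20_000_001, 20_000_000, 20_000_000]
print(estimate_output_frequency(counts))
print(fractional_offset(counts))
```

A single interval can only resolve the frequency to one tick in 20 million (5e-8), which is why the averaging period has to be long before the estimate becomes useful.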

At each PPS, the accumulated phase count of the OCXO is compared to the ideal phase (i.e. the phase derived from the time stamp). The difference of these represents the phase error. The phase error is then propagated through the PLL software controller to compute an OCXO control word.
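The comparison can be sketched as follows. This assumes that the time stamps are whole seconds, that the accumulated count was zeroed at a known time stamp, and that a positive error means the OCXO is lagging; none of these details are confirmed by the description above, and the names are made up for the example.

```python
TICKS_PER_SECOND = 20_000_000  # clock-doubled OCXO, 50 ns per tick
NS_PER_TICK = 50


def phase_error_ticks(accumulated_ticks, timestamp_s, start_timestamp_s):
    """Phase error in ticks: ideal phase from the time stamp minus the counted phase.

    accumulated_ticks  -- ticks counted since the counter was zeroed
    timestamp_s        -- GNSS time stamp (whole seconds) of the current PPS
    start_timestamp_s  -- GNSS time stamp at which the counter was zeroed
    """
    ideal_ticks = (timestamp_s - start_timestamp_s) * TICKS_PER_SECOND
    return ideal_ticks - accumulated_ticks


# Hypothetical case: 10 s after the start, the OCXO has produced 3 ticks too many.
error = phase_error_ticks(200_000_003, timestamp_s=1010, start_timestamp_s=1000)
print(error, "ticks =", error * NS_PER_TICK, "ns")
```

Because the ideal phase is derived from the time stamp rather than from the number of pulses seen, a missed PPS pulse does not shift the phase reference in this scheme.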

The OCXO PFD controller MCU acts both as an I2C master and as an I2C slave. As a master, it transmits the OCXO control word to the OCXO board sub-unit. As a slave it allows the LCD board sub-unit to fetch status information regarding the PLL lock.

Clock buffers

The clock buffers allow isolating the external users of the 10 MHz clock signal from the clock generation. Thus the PLL keeps running even if some clock outputs were e.g. accidentally shorted or otherwise abused. There are a total of four 10 MHz outputs and one PPS output.

Each clock buffer is made from a 74HC04 hex inverter. Five of the inverters are paralleled for the output stage. This is done to provide a strong drive for a 50 ohm output. The sixth inverter is used as a buffer between the OCXO signal and the five paralleled inverters. This reduces the capacitance which the OCXO needs to drive as it now only sees the capacitance of one CMOS input per clock buffer instead of six.

Characterization

I think I've now spent more time measuring and tuning the device than actually building and designing it. In fact, I'm not yet even using it as a time base for any of my lab instruments.

OCXO gain

The OCXO is modeled as having a linear response to its control voltage (see previous post), while the control voltage is assumed to be linear to the control word. The OCXO frequency change per control word change is here called the OCXO gain. In order to tune the controller for the best response, the OCXO gain needs to be measured at the operating point.

OCXO gain measurement

The control word given to the OCXO board sub-unit was varied between two values, one of which was 1000 counts above the operating point and the other 1000 counts below it. The frequency of the OCXO was then measured against the GNSS PPS signal. Note that since the PFD operates with a clock-doubled OCXO input, the gain is defined with respect to the clock-doubled 20 MHz frequency and not the 10 MHz output.

The clock-doubled OCXO output changed frequency by 0.458 Hz with a control word change of 2000, thus giving the OCXO gain an approximate value of 229 micro-Hz per count.
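The arithmetic behind that figure, and the corresponding fractional frequency step per count, can be written out as below. Only the two measured numbers come from the text; the variable names are illustrative.

```python
OCXO_DOUBLED_HZ = 20e6       # clock-doubled frequency seen by the PFD

freq_change_hz = 0.458       # measured change of the doubled clock
control_word_change = 2000   # from 1000 counts below to 1000 counts above

# OCXO gain in Hz per control word count, referred to the 20 MHz clock.
gain_hz_per_count = freq_change_hz / control_word_change

# The same gain as a dimensionless fractional frequency step per count.
fractional_gain_per_count = gain_hz_per_count / OCXO_DOUBLED_HZ

print(f"{gain_hz_per_count * 1e6:.0f} micro-Hz per count")
print(f"{fractional_gain_per_count:.3e} fractional frequency per count")
```

The fractional form does not depend on whether the 10 MHz or the 20 MHz frequency is used as the reference, which makes it convenient for comparing against stability figures.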

OCXO PLL time-scale

A critical parameter to choose is the bandwidth of the PLL. This is because the GNSS PPS is noisy at short time scales, while the OCXO is unstable at long time scales. To get a better understanding of the matter, I collected phase error data from the OCXO while it was free running.

As comparison points, I found some stability data published for the OSC5A2B02 as well as for bare GPS PPS signals. These came from the blogs PA1EJO and www.febo.com. These sources used rubidium standards as their reference clocks, while I could only compare against the GNSS PPS.

Phase error of the OCXO against GNSS PPS
Low frequency phase noise on the 10 MHz output
Modified Allan deviations. My measurement and other published data.

Looking especially at the modified Allan deviation graphs, my data follows the [PA1EJO] GPS PPS data quite well at the low frequencies. This indicates that the deviation at those frequencies comes from the PPS and not from the OCXO. On the other hand, at slightly higher frequencies my data follows the [www.febo.com] OSC5A2B02 data, indicating that here the OCXO is causing the deviation. My OCXO in this measurement does, however, appear to be more stable than the unit in the external data, which could be due to the additional thermal insulation my unit has.

The OCXO has an apparent linear frequency drift, which at such a long time scale could easily be corrected by the PLL. To get a better understanding of the stability, I computed the Hadamard deviation as well. It is insensitive to linear frequency drift, and the [PA1EJO] GPS PPS data also has this metric published.

Hadamard deviation comparison

In the Hadamard sense, the OCXO is remarkably stable. There is a slight change in the slope at a time scale of around 10 seconds, but it isn't anything to really worry about.
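To illustrate why the Hadamard deviation ignores linear frequency drift, here is a minimal sketch using the standard overlapping estimators computed from phase data. It uses the plain Allan deviation rather than the modified one for brevity, and the drift value is made up; it is not the analysis code used for the graphs.

```python
import math


def allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at tau = m * tau0, from phase data in seconds."""
    tau = m * tau0
    # Second differences of the phase, spaced m samples apart.
    terms = [
        (phase[i + 2 * m] - 2 * phase[i + m] + phase[i]) ** 2
        for i in range(len(phase) - 2 * m)
    ]
    return math.sqrt(sum(terms) / (2 * tau**2 * len(terms)))


def hadamard_deviation(phase, tau0, m):
    """Overlapping Hadamard deviation at tau = m * tau0, from phase data in seconds."""
    tau = m * tau0
    # Third differences cancel any quadratic phase term, i.e. linear frequency drift.
    terms = [
        (phase[i + 3 * m] - 3 * phase[i + 2 * m] + 3 * phase[i + m] - phase[i]) ** 2
        for i in range(len(phase) - 3 * m)
    ]
    return math.sqrt(sum(terms) / (6 * tau**2 * len(terms)))


# Synthetic oscillator with nothing but linear frequency drift (made-up value).
drift_per_second = 1e-12
tau0 = 1.0
phase = [0.5 * drift_per_second * (n * tau0) ** 2 for n in range(1000)]

print("ADEV at 10 s:", allan_deviation(phase, tau0, 10))
print("HDEV at 10 s:", hadamard_deviation(phase, tau0, 10))
```

For pure drift the Allan deviation grows in proportion to the averaging time, while the Hadamard deviation only shows floating point residue.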

It looks like the only important thing is to select a time scale short enough that the linear frequency drift is compensated, but still long enough that the phase error measurements (through low-pass filtering) become accurate. This is probably a pretty wide range of time scales. Mostly it's a matter of getting the time scale as short as possible while keeping the control stable.

To explain the last part a bit: the hardware can directly observe the phase error at only 50 ns resolution. This by itself is not nearly good enough for proper control. However, due to jitter in the PPS, the observed phase error actually alternates between two values. Low-pass filtering this alternating raw error gives much improved resolution. As an additional trick, the controller can provoke jitter in the measurement by deliberately controlling the OCXO phase to lie at a midpoint between two values.
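The effect of jitter on the quantized measurement can be demonstrated with a small simulation. This is a sketch under simplifying assumptions: the jitter is modeled as uniform over exactly one 50 ns count, and a plain average stands in for the low-pass filter. The real PPS jitter distribution is not known here.

```python
import math
import random


def measure(true_error_counts, rng):
    """One raw reading: true error plus jitter, truncated to whole 50 ns counts."""
    jitter = rng.random()  # uniform in [0, 1) counts, an assumed jitter model
    return math.floor(true_error_counts + jitter)


rng = random.Random(1)  # fixed seed so the run is repeatable
true_error = 0.3        # true phase error in counts, i.e. 15 ns

readings = [measure(true_error, rng) for _ in range(100_000)]

# Each reading is either 0 or 1, but their average approaches the true value.
estimate = sum(readings) / len(readings)
print("distinct raw readings:", sorted(set(readings)))
print("averaged estimate (counts):", estimate)
```

Without jitter every reading would be 0 and no amount of filtering could recover the fraction, which is why provoking jitter by steering the phase to a midpoint helps.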

Conclusion

I'm still in the process of measuring the response to parameter choices, as well as making tweaks to the controller algorithm itself. There will be a part 3, in which I'll try to get some performance measurements. I'll also try to get the electronics and code published.

Tuesday, July 23, 2024

Single parameter controller for a GPS disciplined oscillator phase locked loop

Introduction

In a previous post I described some of my experiments with building a GPS disciplined oscillator. In my experiments, I lock a voltage controlled oscillator to the PPS signal of a GPS receiver. The locking is achieved by a control loop, which observes the phase error between GPS and the VCO and produces a control signal to adjust the frequency of the VCO.

The GPS PPS signal is very noisy at higher frequencies (above, say, 1 mHz), so the controller must have a low bandwidth to reject the higher frequency noise. Also, since I don't want to spend a lot of time tuning the controller, I want the bandwidth to be the only parameter.

VCO phase error model

The VCO has a control voltage input \(u\), which controls its frequency \(f_\text{vco}\). This relationship is well approximated as linear around an operating point. Thus near our target frequency \(f_\text{target}\) we can model the VCO frequency as \[f_\text{vco}(t) = f_\text{target} + g (u(t) - u_o),\] in which \(g\) is the VCO gain and \(u_o\) is the needed control value to attain the target frequency. The frequency error is thus given by \[ f(t) = f_\text{target} - f_\text{vco}(t) = -g (u(t) - u_o) \]

As the goal is to achieve phase lock, the quantity of interest is actually the phase error. We take the unit of phase to be a full cycle. This makes the phase error \(e\) simply the integral of the frequency error over time. \[e'(t) = - g (u(t) - u_o) \]

As the controller will be implemented in the digital domain, \(u\) will be piecewise constant. This allows us to discretize the model as the recurrence \[ e_{n+1} = e_n - \Delta t g (u_n - u_o),\] in which \(\Delta t\) is the time between the discrete changes of \(u\), \(u_n\) is the value of \(u\) at \(t \in [n \Delta t, (n+1) \Delta t) \) and \(e_n = e(n \Delta t)\).

Controller

The simplest form of controller which can control the phase error to zero is a PI controller. Here the control is determined as a linear combination of the phase error \(e\) and its integral \(i\) as \[ u_n = P e_n + I i_n, \] in which \(P\) and \(I\) are tuning parameters of the controller.

Unfortunately for us, our phase error measurement is very noisy at short time scales. With the simple PI controller the noise would couple straight to the control output through the P term. We thus need to consider something else. One could obviously use a longer interval between the measurements, but that introduces its own set of problems. Our solution is to instead apply a simple first order low-pass filter to the phase error measurement and then combine that with a PI controller.

Since the filter and the controller are implemented in the digital domain, let's transition now to an entirely discretized domain and ignore the continuous domain altogether. In this context, we take our low-pass filter as \[ \hat{e}_{n+1} = (1-\alpha) \hat{e}_n + \alpha e_n, \] in which \(\hat{e}\) is the filtered phase error and \(0 < \alpha < 1\) is a parameter defining the filter bandwidth.

The integral of the filtered phase error, similarly, is considered only in the discretized sense, and is given as \[ \hat{i}_{n+1} = \hat{i}_n + \hat{e}_n \]

The control law of the controller is then \[ u_n = P\hat{e}_n + I\hat{i}_n, \] in which \(P\) and \(I\) are tuning parameters to set the behavior of the controller.
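The filter, the integrator and the control law translate fairly directly into code. The class below is a minimal sketch that follows the update order of the recurrences above; the class and attribute names are chosen for the example only.

```python
class FilteredPIController:
    """First order low-pass filter on the phase error, followed by a PI control law."""

    def __init__(self, alpha, p_gain, i_gain):
        self.alpha = alpha           # filter coefficient, 0 < alpha < 1
        self.p_gain = p_gain         # P
        self.i_gain = i_gain         # I
        self.filtered_error = 0.0    # e-hat
        self.integral = 0.0          # i-hat

    def step(self, error):
        """Take the raw phase error e_n, return u_n and advance the state to n+1."""
        # u_n uses the state from before this measurement is folded in.
        control = self.p_gain * self.filtered_error + self.i_gain * self.integral
        # i-hat_{n+1} = i-hat_n + e-hat_n
        self.integral += self.filtered_error
        # e-hat_{n+1} = (1 - alpha) e-hat_n + alpha e_n
        self.filtered_error = (
            (1.0 - self.alpha) * self.filtered_error + self.alpha * error
        )
        return control


controller = FilteredPIController(alpha=0.5, p_gain=2.0, i_gain=1.0)
outputs = [controller.step(1.0) for _ in range(3)]
print(outputs)
```

Note that the raw error never reaches the output directly: it only affects the control after passing through the filter, which is the point of the construction.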

Combining the VCO phase error model with the controller gives the model of the entire system as \[ \begin{equation} \begin{pmatrix} e_{n+1} \\ \hat{e}_{n+1} \\ \hat{i}_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & -\Delta t g P & -\Delta t g I \\ \alpha & 1-\alpha & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} e_n \\ \hat{e}_n \\ \hat{i}_n \end{pmatrix} + \begin{pmatrix} \Delta t g u_o \\ 0 \\ 0 \end{pmatrix} \end{equation} \]

For the controller to work, we want the errors to converge. The dynamics of the convergence are determined by the eigenvalues of the matrix. For simplicity, we'll choose all three eigenvalues as \(r\), where \(0 < r < 1\).

The characteristic polynomial of the matrix is \[ \lambda^3 + (\alpha - 3) \lambda^2 + (\Delta t g P \alpha - 2\alpha + 3) \lambda + \Delta t g I \alpha - \Delta t g P \alpha + \alpha - 1. \] On the other hand, to have the desired eigenvalues, we want the characteristic polynomial to be \[ \lambda^3 - 3r\lambda^2 + 3r^2 \lambda - r^3. \] Matching the coefficients, we get the equations \[ \begin{align} \alpha - 3 &= -3r \\ \Delta t g P \alpha - 2\alpha + 3 &= 3r^2 \\ \Delta t g I \alpha - \Delta t g P \alpha + \alpha - 1 &= -r^3 \end{align} \]

Solving those equations yields \[ \begin{align} \alpha &= 3(1-r) \\ P &= \frac{1-r}{\Delta t g} \\ I &= \frac{(1-r)^2}{3 \Delta t g} \end{align} \]
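As a numerical sanity check, the solution can be substituted back into the coefficients of the characteristic polynomial. The values of \(r\), \(\Delta t\) and \(g\) below are arbitrary illustration values; \(r = 0.9\) was picked so that \(\alpha = 0.3\) stays within \(0 < \alpha < 1\).

```python
import math


def controller_gains(r, dt, g):
    """Return (alpha, P, I) placing all three eigenvalues at r."""
    alpha = 3.0 * (1.0 - r)
    p_gain = (1.0 - r) / (dt * g)
    i_gain = (1.0 - r) ** 2 / (3.0 * dt * g)
    return alpha, p_gain, i_gain


def char_poly_coefficients(alpha, p_gain, i_gain, dt, g):
    """Coefficients (c2, c1, c0) of lambda^3 + c2 lambda^2 + c1 lambda + c0."""
    c2 = alpha - 3.0
    c1 = dt * g * p_gain * alpha - 2.0 * alpha + 3.0
    c0 = dt * g * i_gain * alpha - dt * g * p_gain * alpha + alpha - 1.0
    return c2, c1, c0


r, dt, g = 0.9, 1.0, 229e-6  # illustration values only
alpha, p_gain, i_gain = controller_gains(r, dt, g)
c2, c1, c0 = char_poly_coefficients(alpha, p_gain, i_gain, dt, g)

# These should match the coefficients of (lambda - r)^3.
print(c2, -3 * r)
print(c1, 3 * r**2)
print(c0, -(r**3))
```

The printed pairs should agree up to floating point rounding.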

The resulting matrix is unfortunately non-diagonalizable, which makes analyzing the resulting dynamics a bit tedious. The controller, however, appears to be well-behaved and produces convergence without too much overshoot.
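Instead of analyzing the dynamics in closed form, the recurrences can simply be iterated. The sketch below does that for the complete system; the parameter values are assumptions chosen for illustration and are not the ones used for any figures in this post.

```python
def simulate(r, dt, g, u_offset, initial_error, steps):
    """Iterate the closed loop and return a list of (phase error, control) pairs."""
    alpha = 3.0 * (1.0 - r)
    p_gain = (1.0 - r) / (dt * g)
    i_gain = (1.0 - r) ** 2 / (3.0 * dt * g)

    e, e_filt, integ = initial_error, 0.0, 0.0
    history = []
    for _ in range(steps):
        u = p_gain * e_filt + i_gain * integ
        history.append((e, u))
        # All three updates use the values from step n, as in the matrix form.
        e, e_filt, integ = (
            e - dt * g * (u - u_offset),
            (1.0 - alpha) * e_filt + alpha * e,
            integ + e_filt,
        )
    return history


# Illustration values: one cycle of initial phase error, and a control offset
# u_o that the controller does not know in advance.
history = simulate(r=0.9, dt=1.0, g=229e-6, u_offset=30000.0,
                   initial_error=1.0, steps=1000)
final_error, final_control = history[-1]
print("largest |phase error|:", max(abs(e) for e, _ in history))
print("final phase error:", final_error)
print("final control:", final_control)
```

The phase error should settle to zero while the control settles at \(u_o\), with the integral term supplying the offset.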

The following figures show simulated responses from a controller with \( r = 0.001 \).

Impulse response of the controller against phase error change
Impulse response of the controller against control offset change

Sunday, April 28, 2024

The confusing I2C bit rate register of CH32V003

The CH32V003 reference manual does not explain some key information for using the I2C peripheral. The issue is mainly with the I2C1_CKCFGR register, but there is also confusion around the FREQ field of the I2C1_CTLR2 register.

Digging a bit, it turns out that WCH appears to be using the same I2C peripheral IP as GigaDevice and Puya use in many of their microcontrollers. Both of the other vendors have a bit more documentation in their manuals.

The explanation of the I2C1_CKCFGR register turns out to be fairly simple, and the values used by the I2C examples going around the net appear to be correct. Its documentation in the reference manual says the following

However, what is left unsaid is the following

| Mode | Duty | T_high (clock cycles) | T_low (clock cycles) |
|---|---|---|---|
| F/S=0 | Either | CCR | CCR |
| F/S=1 | DUTY=0 | CCR | 2*CCR |
| F/S=1 | DUTY=1 | 9*CCR | 16*CCR |

The bitrate is just BR = F_APB1 / (T_high + T_low), and thus the magical factors of 2, 3 and 25 used in the examples are explained. This also explains where the quoted duty cycles of 33% and 36% come from.
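The relationship can be checked with a few lines of code. The helper below is only a calculator for the timing rules described above, not a driver, and the 48 MHz peripheral clock in the examples is an assumed value.

```python
from fractions import Fraction


def i2c_timing(f_apb1_hz, ccr, fast_mode, duty):
    """Return (bit rate in Hz, fraction of the SCL period spent high)."""
    if not fast_mode:        # F/S=0: symmetric clock, DUTY is ignored
        t_high, t_low = ccr, ccr
    elif duty == 0:          # F/S=1, DUTY=0
        t_high, t_low = ccr, 2 * ccr
    else:                    # F/S=1, DUTY=1
        t_high, t_low = 9 * ccr, 16 * ccr
    bit_rate = f_apb1_hz / (t_high + t_low)
    high_fraction = Fraction(t_high, t_high + t_low)
    return bit_rate, high_fraction


F_APB1 = 48_000_000  # assumed peripheral clock

print(i2c_timing(F_APB1, 240, fast_mode=False, duty=0))  # CCR = F / (2 * 100 kHz)
print(i2c_timing(F_APB1, 40, fast_mode=True, duty=0))    # CCR = F / (3 * 400 kHz)
print(i2c_timing(F_APB1, 5, fast_mode=True, duty=1))     # CCR near F / (25 * 400 kHz)
```

The high fractions of 1/3 and 9/25 are the 33% and 36% duty cycles, and the period lengths of 2, 3 and 25 times CCR are the divisors seen in the example code.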

This still leaves the FREQ field in I2C1_CTLR2 to be explained. It is described as

The GigaDevice and Puya documentation agree with WCH's documentation on this: you're expected to program the field with the peripheral clock frequency given in integer megahertz.

All examples for the CH32V003 I've seen on the net, however, program it as F_APB1 / 2000000, which is half of the correct value. Also, many of the examples document the value 2000000 as the frequency of the peripheral, which is not correct at all. WCH's documentation is worded in less obvious terms than the other vendors' documentation, which may be the source of the confusion.
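To make the difference concrete, the two computations are shown side by side below. The function names are made up for the comparison, and the 48 MHz clock is again an assumed value.

```python
def freq_field(f_apb1_hz):
    """FREQ as documented: the peripheral clock in integer megahertz."""
    return f_apb1_hz // 1_000_000


def freq_field_as_in_examples(f_apb1_hz):
    """FREQ as computed by the examples found on the net: half the documented value."""
    return f_apb1_hz // 2_000_000


F_APB1 = 48_000_000  # assumed peripheral clock
print(freq_field(F_APB1), freq_field_as_in_examples(F_APB1))
```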

But what does this field do? If it's possible to write only half of the correct value to the field and still have the peripheral seemingly work, it can't be very critical. This is only covered in the Puya documentation, and even there without enough detail. The English translation states:

This register must be configured with the value of the APB clock frequency to generate data setup and hold times that are compatible with the I2C protocol.
I'm not sure why the timings given in I2C1_CKCFGR are not enough to guarantee proper setup and hold times, but apparently there is additional signal conditioning which this field affects.