Friday, June 27, 2025

GPS disciplined oscillator - part 3

Stability characterization

Unfortunately I don't have access to a frequency standard which would be accurate enough for proper comparisons. What I can do, however, is look at the Allan deviation referenced to the GPS PPS.

I measured the OCXO as free running twice and then running with the PLL controller. The PLL controller was configured to estimate the phase error every 8 seconds. The control law was then set to a time constant of 48 cycles - leading to a total time constant of around 384 seconds for the PLL.

Modified Allan deviation of the OCXO as free running (2 measurements) and under PLL control

In the short time scales the Allan deviation is seen to be approximately the same between the free running oscillator measurement and the PLL controlled oscillator measurement. This is desirable, as in the short time scales it is expected that the free running OCXO has smaller phase noise than the GPS PPS. Since the stability here is measured against the GPS PPS, the true short time scale behavior cannot be observed as it is corrupted by the noise of the GPS PPS itself.

On the other hand, going to larger time scales, we know that the GPS PPS noise becomes very low. In those time scales we see that the PLL controlled oscillator Allan deviation keeps decreasing as it follows the GPS PPS ever closer, while running free, the OCXO would start exhibiting an increase in the deviation.
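For readers who want to reproduce this kind of plot, here is a minimal sketch of the standard modified Allan deviation estimator computed from phase samples. It is not the script used for the figure above; the function name and the synthetic drift data are assumptions made for illustration only.

```python
import math

def modified_allan_deviation(phase_s, tau0_s, n):
    """Modified Allan deviation at tau = n * tau0_s.

    phase_s: phase (time error) samples in seconds, spaced tau0_s apart.
    """
    count = len(phase_s) - 3 * n + 1  # number of overlapping windows
    if n < 1 or count < 1:
        raise ValueError("need at least 3*n phase samples")
    total = 0.0
    for j in range(count):
        # Sum of second differences over n consecutive starting points.
        inner = sum(
            phase_s[i + 2 * n] - 2.0 * phase_s[i + n] + phase_s[i]
            for i in range(j, j + n)
        )
        total += inner * inner
    variance = total / (2.0 * n**4 * tau0_s**2 * count)
    return math.sqrt(variance)

# Synthetic example (assumed data): pure linear frequency drift of 1e-10 per second.
drift = 1e-10
phase = [0.5 * drift * t * t for t in range(300)]
for n in (1, 10, 50):
    print(n, modified_allan_deviation(phase, 1.0, n))
```

With pure linear frequency drift the result grows in proportion to the averaging time, which is the kind of upturn a free running oscillator shows at long time scales.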

For the long time scale it's quite easy to get to a low deviation with just about any PLL implementation locked to the GPS PPS. It seems to be more difficult to not amplify the noise too much in the short time-scale. Even after several iterations and a lot of tuning, there is some noise gain left. I am however happy with this for now. Perhaps in the future I'll revisit this by measuring it against an atomic reference, but for my home lab needs, this is most likely more than good enough.

Phase noise characterization

At the time of completing the build I happened to have access to a lab with a Rohde & Schwarz FSW26 spectrum analyzer. I used it to measure the phase noise behavior of the device.

Phase noise from 10 Hz to 10 MHz

The OCXO datasheet gives the typical phase noise values as

-80 dBc/Hz @ 1 Hz
-120 dBc/Hz @ 10 Hz
-140 dBc/Hz @ 100 Hz
-145 dBc/Hz @ 1 kHz
-150 dBc/Hz @ 10 kHz

I did not measure down to 1 Hz or below due to the sheer amount of time it would have taken, and I only had daytime access to said lab. It would seem likely, however, that the unit is better than -80 dBc/Hz at 1 Hz. Noise density of -120 dBc/Hz at 10 Hz appears to be met. However, for the higher frequencies the datasheet noise performance is clearly not reached. This discrepancy can be caused either by the OCXO itself or by additive noise from the 74HC04s used as clock buffers. The result is still pretty good, so I don't mind.

The RMS period jitter can be calculated from the phase noise by correlating the phase noise at time t with the phase noise at time t + period, which reduces to an integral over the phase noise spectrum. We get

50 ns: RMS jitter = 11.48 ps
100 ns: RMS jitter = 14.88 ps
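One common way to carry out such a calculation is sketched below; it is an illustration under stated assumptions, not the exact computation behind the numbers above. The variance of the phase difference between two instants a delay \(\tau\) apart is taken as \(8 \int L(f) \sin^2(\pi f \tau)\, df\), where \(L(f)\) is the single sideband phase noise, and it is converted to time by dividing the RMS phase by \(2 \pi f_0\). The function name, the interpolation scheme and the input profile (datasheet typical values plus an assumed flat floor out to 10 MHz) are all hypothetical.

```python
import math

def rms_jitter_from_phase_noise(points, carrier_hz, delay_s, steps_per_segment=4000):
    """RMS jitter between edges delay_s apart, from SSB phase noise L(f).

    points: list of (offset_hz, dbc_per_hz) sorted by offset; the profile is
    interpolated linearly in dB versus log10(offset).
    """
    def integrand(f, dbc):
        ssb = 10.0 ** (dbc / 10.0)  # L(f) as a power ratio per Hz
        return 8.0 * ssb * math.sin(math.pi * f * delay_s) ** 2

    variance_rad2 = 0.0
    for (f_lo, l_lo), (f_hi, l_hi) in zip(points, points[1:]):
        log_lo, log_hi = math.log10(f_lo), math.log10(f_hi)
        prev_f, prev_y = f_lo, integrand(f_lo, l_lo)
        for k in range(1, steps_per_segment + 1):
            frac = k / steps_per_segment
            f = 10.0 ** (log_lo + frac * (log_hi - log_lo))
            y = integrand(f, l_lo + frac * (l_hi - l_lo))
            variance_rad2 += 0.5 * (prev_y + y) * (f - prev_f)  # trapezoid rule
            prev_f, prev_y = f, y
    # Convert RMS phase (radians) to time at the carrier frequency.
    return math.sqrt(variance_rad2) / (2.0 * math.pi * carrier_hz)

# Assumed profile: datasheet typical values, extended flat to 10 MHz (not measured data).
profile = [(10.0, -120.0), (100.0, -140.0), (1e3, -145.0), (1e4, -150.0), (1e7, -150.0)]
for delay in (50e-9, 100e-9):
    jitter = rms_jitter_from_phase_noise(profile, 10e6, delay)
    print(f"{delay * 1e9:.0f} ns: {jitter * 1e12:.2f} ps")
```

Because of the \(\sin^2\) weighting, close-in noise contributes very little at short delays, so the result for a 50 ns or 100 ns delay is dominated by the broadband noise floor.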

To sanity check the calculations, I used a Rohde & Schwarz RTP164 oscilloscope - also present in the lab - to directly measure the period jitter at different time offsets

50 ns: RMS jitter = 15.53 ps
100 ns: RMS jitter = 21.05 ps
10 ms: RMS jitter = 18.36 ps
100 ms: RMS jitter = 17.88 ps

Jitter performance is thus quite uniform over a large correlation time range and matches the order of magnitude computed from the phase noise.

While I wasn't targeting a low jitter clock source, I think the jitter performance is easily good enough for anything I could consider doing at my home lab currently. I'm extremely pleased with the results.

Schematics, layout and source code

Resources are published at https://github.com/ahhuhtal/gpsdo

Sunday, February 23, 2025

On quantization errors, and recovering resolution from dithering and filtering

Preface

Just a quick post about recovering a signal that has been corrupted by quantization error. This is something I discovered while working on my GPSDO project, but it has general applicability. This all might be common knowledge and trivial, but I enjoyed coming up with my own solution. I'm documenting it here so I don't forget about it.

Introduction

Consider a continuous observable, which is sampled by a low resolution process. By low resolution, I mean that there is a limited number of steps that can be differentiated. In a noiseless ideal case the response would look something like shown in figure 1.

Figure 1. Ideal quantization of continuous observable.

However, consider then that the observable is corrupted by additive noise. This causes the sampled signal to also exhibit noise. The result might look as shown in figure 2.

Figure 2. Effect of additive noise on sampling.

This noise in the sampled signal can be used as a way to increase resolution, where several samples are averaged to produce a single sample. Noise is sometimes deliberately injected into measurement signals to take advantage of this. With enough samples averaged, not only is the additive noise reduced, but so is the quantization error. This is illustrated in figure 3.

Figure 3. Result of averaging noisy sampled data.

Now, the example above exhibits a large amount of additive noise. Noise standard deviation was taken as 40% of quantization resolution. However, if the additive noise is small compared to the quantization resolution, this breaks down. See figure 4, where the noise standard deviation is just 15% of quantization resolution.


Figure 4. Effect of small additive noise.

Suddenly, the quantization error is no longer removed through averaging and a step-like pattern emerges.
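The effect is easy to reproduce numerically. The following is a small Monte Carlo sketch, assuming a rounding quantizer with a step of 1 and Gaussian noise; the function names, the true value of 0.25 and the sample count are arbitrary choices for illustration, not the code behind the figures.

```python
import math
import random

def quantize(value):
    """Round to the nearest integer step, with ties going upward."""
    return math.floor(value + 0.5)

def averaged_reading(true_value, noise_sigma, samples, rng):
    """Average of quantized samples of true_value plus Gaussian noise."""
    total = 0
    for _ in range(samples):
        total += quantize(true_value + rng.gauss(0.0, noise_sigma))
    return total / samples

rng = random.Random(1)  # fixed seed so the run is repeatable
for sigma in (0.40, 0.15):
    estimate = averaged_reading(0.25, sigma, 20000, rng)
    print(f"sigma={sigma:.2f}: average={estimate:.3f} (true value 0.250)")
```

With the larger noise the average lands close to the true value, while with the smaller noise it stays stuck near the lower quantization level.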

An important thing to note is that there are again some continuous values for which there is only a single sampled value. It seems rather clear that in general no further resolution can be extracted for those values. However, for the following discussion it's more interesting to observe the values that do still exhibit two distinct sampled values. This is also the situation I have in my GPSDO project that motivated all this.

In the quantization transition regions averaging clearly gives more resolution, but the averaged values do not match the true values. The obvious question is whether this can be improved.

Bayes saves the day

Initially I took a Bayesian approach to the problem. When presented with the quantized value \(q\), what can be inferred of the continuous value \(c\)? This is answered by Bayes' theorem. \[ p(c | q) = \frac{p(q | c) p(c)}{p(q)}. \]

That is, the probability of the continuous variable having value \(c\) after we've observed quantized value \(q\) is proportional to the prior probability of observing \(c\) times the probability of observing quantized value \(q\) given that the continuous variable has value \(c\), which is also called the likelihood. The probability of observing the quantized value \(q\) in the denominator can be thought of as just a normalizing factor.

The formula expresses how we update our knowledge of the continuous variable, when presented with a measurement. The posterior distribution is proportional to the prior times the likelihood.

Before we can formulate the likelihood, we must properly define the additive noise and the quantization. To keep things simple, let's take almost the simplest quantization possible: the round function. Also let's take the noise as normally distributed and zero mean \(n \sim N(0,\sigma) \).

The likelihood is the probability of sampling a value, given that the continuous variable has some value. With our choice of quantization, we observe quantized value \(q\) if the continuous value plus noise \(c + n\) is in \([q-0.5, q+0.5)\). The probability can be evaluated as \[ p(q|c) = p(q-0.5 \leq c+n < q+0.5) = p(q-0.5-c \leq n < q+0.5-c), \] which, since the noise is normally distributed, can be further evaluated as \[ p(q|c) = F_n(q+0.5-c) - F_n(q-0.5-c) = \frac{1}{2} \mathrm{erf}\left(\frac{q+0.5-c}{\sqrt{2}\sigma}\right) - \frac{1}{2} \mathrm{erf}\left(\frac{q-0.5-c}{\sqrt{2}\sigma}\right), \] where \(F_n\) is the cumulative distribution function of the noise.
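The likelihood translates directly into code. Below is a minimal sketch using only the Python standard library; the function name `likelihood` is an illustrative choice, and \(\sigma\) is treated as the noise standard deviation.

```python
import math

def likelihood(q, c, sigma):
    """p(q | c): probability of observing quantized value q when the
    continuous value is c and the noise standard deviation is sigma."""
    scale = math.sqrt(2.0) * sigma
    upper = math.erf((q + 0.5 - c) / scale)
    lower = math.erf((q - 0.5 - c) / scale)
    return 0.5 * upper - 0.5 * lower

# Probabilities of observing 0 and 1 for a few continuous values.
for c in (0.0, 0.4, 0.5, 0.6, 1.0):
    print(c, likelihood(0, c, 0.1), likelihood(1, c, 0.1))
```

Summing the likelihood over all quantized values for a fixed \(c\) gives 1, which is a handy check on an implementation.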

To gain more intuition, let's look at what the likelihoods look like. Let's take the noise standard deviation as \(\sigma = 0.1\). This is shown in figure 5.

Figure 5. Likelihood functions for observing quantized value 0 and 1.

Near the continuous values 0 and 1 there is an almost 100% probability for observing quantized values 0 and 1, which causes the stairs we saw in figure 4. The significant overlap area of the probabilities is only a small fraction of the range, which leads to the average not matching the true value also seen in figure 4.

Then, say that we have observed a pattern 1, 0, 0, 0, 1, 0, 0, 0. What can we say about the continuous variable? Taking the prior distribution as uniform (i.e. no knowledge of the continuous variable), the Bayesian inference progresses as shown in figure 6.

Figure 6. Inference of pattern 1, 0, 0, 0, 1, 0, 0, 0.
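A grid-based version of this inference can be sketched as follows. It is an illustration only: the grid range, the number of grid points and \(\sigma = 0.1\) are assumptions, and the function names are hypothetical.

```python
import math

def likelihood(q, c, sigma):
    """p(q | c) for a rounding quantizer with Gaussian noise."""
    scale = math.sqrt(2.0) * sigma
    return 0.5 * math.erf((q + 0.5 - c) / scale) - 0.5 * math.erf((q - 0.5 - c) / scale)

def infer_posterior(observations, sigma, c_min=-0.5, c_max=1.5, points=2001):
    """Sequential Bayesian update on a grid, starting from a uniform prior."""
    step = (c_max - c_min) / (points - 1)
    grid = [c_min + i * step for i in range(points)]
    posterior = [1.0 / (c_max - c_min)] * points  # uniform prior density
    for q in observations:
        # Posterior is proportional to prior times likelihood.
        posterior = [p * likelihood(q, c, sigma) for p, c in zip(posterior, grid)]
        norm = sum(posterior) * step  # plays the role of p(q)
        posterior = [p / norm for p in posterior]
    return grid, posterior, step

grid, posterior, step = infer_posterior([1, 0, 0, 0, 1, 0, 0, 0], sigma=0.1)
mean = sum(c * p for c, p in zip(grid, posterior)) * step
print(f"posterior mean of c: {mean:.3f}")
```

The posterior ends up concentrated somewhat below 0.5, reflecting that ones were observed less often than zeros.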

This approach is very powerful, but also very heavy. And as an additional complexity, if the system is dynamic, we need to add uncertainty to the distribution between inference steps, to allow the inferred continuous variable to evolve. For the usual applications, this likely adds too much complexity to be practical.

This exercise, however, gave me insight into why the simple averages of the sampled values don't give the correct result, as well as what the likelihoods look like.

A practical solution

In practice, the sampled values alternate at most between 2 distinct values. While it's not strictly impossible to observe 3 or more values, for the situations of interest here, the probability is just extremely low. This allows us to simplify things by allowing only two values. The likelihood of observing a zero is then simplified as \[ p(q=0|c) = F_n(0.5-c) = \frac{1}{2} + \frac{1}{2} \mathrm{erf}\left(\frac{0.5-c}{\sqrt{2}\sigma}\right), \] while the likelihood of observing a 1 is the complement of this. These likelihood functions are visualized in figure 7.

Figure 7. Likelihoods of the two observable values.

Since the two cases are complements of each other, inferring the continuous variable becomes much easier: the probability of observing, say, the value 1 can be directly mapped to the value of the continuous variable. The probability, in turn, can be estimated simply as the average of the observed samples (for the values 0 and 1). Figure 8 shows the mapping from probability to value.

Figure 8. Continuous variable inference from probability of observed value.

While it appears in figure 8 that a probability of 0 gets mapped to the continuous variable value 0 and that a probability of 1 gets mapped to the continuous variable value 1, this is not quite the case. This is because there are still very tiny probabilities unaccounted for in figure 8. However, since these are practically impossible, we can round the remaining probability to zero.

Since estimating the probability of observing a 1 is the same thing as computing the average of ones and zeros, we can extend the concept of averaging the values to the other values as well. Turns out this way we can define a single curve that takes the average of the observations and outputs an estimate of the continuous variable. See figure 9.

Figure 9. Continuous variable estimation from average of observations.

This approach is very lightweight to compute, and allows handling dynamic situations by choosing the length of time that the average is computed over.
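One possible way to express such a mapping in code is to invert the two-value likelihood with the inverse normal CDF. The sketch below assumes a known noise standard deviation; the function name, the `eps` threshold and the fallback for averages that sit on a single level are illustrative choices, not necessarily the curve shown in figure 9.

```python
import math
from statistics import NormalDist

def estimate_from_average(average, sigma, eps=1e-6):
    """Map an average of quantized samples to an estimate of the true value."""
    base = math.floor(average)  # lower of the two observed levels
    fraction = average - base   # estimated probability of the upper level
    if fraction < eps or fraction > 1.0 - eps:
        # Only one level was observed: no extra resolution is available,
        # so fall back to the quantized level itself.
        return float(round(average))
    # P(upper) = Phi((c - (base + 0.5)) / sigma), solved for c.
    return base + 0.5 + sigma * NormalDist().inv_cdf(fraction)

# With sigma = 0.1, an average of 0.25 maps to a value well above 0.25.
print(estimate_from_average(0.25, 0.1))
```

The inverse CDF is steep near probabilities of 0 and 1, so estimates from averages close to a single level are much more sensitive to noise in the average itself.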

Let's revisit the situation of figure 4 using this approach. The results are presented in figure 10.

Figure 10. Estimation method demonstration.

Monday, February 17, 2025

GPS disciplined oscillator - part 2

Introduction

After the good results I got from my prototype GPSDO, I wanted to go a bit further than the mess of wires that was the prototype. Besides achieving a better build quality, I needed more than just one 10 MHz output to support a few key instruments in my lab - and a PPS output would also be nice to have. I also wanted to improve the temperature dependency of the device by insulating it better, and by keeping the control voltage reference at constant temperature.

The device was built in April - June of 2024, but I only now got to documenting it.

Overview

The device is composed of two parts: the main unit and the GNSS receiver unit. They are connected to each other through standard twisted pair Ethernet cabling, which allows placing the receiver unit in a convenient location, which may be far away from the main unit.

GNSS receiver unit on the left, main unit on the right

Reverse sides

GNSS receiver unit

The GNSS receiver unit gets power through one of the four pairs of the cable. A suitable voltage for the receiver is then regulated locally from the provided voltage. The PPS output of the receiver is fed into a RS422 line driver and transmitted back to the main unit over another pair of the cable.

The receiver serial TX and RX are also connected to a RS422 transceiver to allow data communication with the main unit. Thus, all 4 pairs of the cable are used.


Cover dome, board carrier and pole mount are held together with 4 M3 screws

Close up on board carrier. RJ45 is on a separate board and connects via ribbon cable

RJ45 board is held in place with snap tabs. Receiver PCB is attached with 2 M2.5 screws and hold down tabs.

Reverse sides

Main unit

The main unit consists of several sub-units. These are the main board sub-unit, the OCXO board sub-unit and the LCD board sub-unit. Each sub-unit handles a separate part of the system, and the sub-units communicate with each other through I2C.

Main board sub-unit and LCD board sub-unit are attached to the cover plate. OCXO board sub-unit is inside block of foam insulation.

Mess of wires for the connectors. All wires connect to the main board. RJ45 connector is on a separate board, which is held in place by snap tabs.

LCD board sub-unit (top) and main board sub-unit (bottom)

Block of foam insulation with OCXO board sub-unit inside removed from the main unit enclosure

LCD board sub-unit

The LCD board sub-unit handles displaying information to the user through a HD44780 type liquid crystal display. It acts as an I2C master and pulls information to display from the main board sub-unit as well as from the OCXO board sub-unit. It is built around a CH32V003 MCU.

Close up on the LCD board sub-unit with main board sub-unit

OCXO board sub-unit

The OCXO board sub-unit handles the 10 MHz frequency generation and its control voltage generation, and thus carries the OSC5A2B02 OCXO itself. It is housed inside a thermally insulating foam block to reduce the temperature dependency of the 10 MHz output.

The OCXO adjustment is about 1 ppb per millivolt, so the control voltage needs to be quite stable. This stability is provided by a TL431C voltage reference, which is in close thermal contact with the OCXO and thus also at a constant temperature. A 16-bit DAC is then implemented using a PWM signal from a CH32V003, followed by heavy filtering to try to keep the phase noise low.

The OCXO board sub-unit also features a DS18B20 temperature sensor, which is in close thermal contact with the OCXO. For now this is only used for monitoring, but it could perhaps be used for some additional temperature compensation in the future.

The 10 MHz output from the OCXO is not used by any of the circuitry on the sub-unit itself, but is instead passed to the main board sub-unit.

The OCXO board sub-unit acts as an I2C slave and allows setting the OCXO control voltage based on a control word, which it receives from the main board sub-unit. Additionally, the LCD board sub-unit queries it for temperature information.

Block of foam insulation with OCXO board sub-unit inside

Foam cover removed

OCXO board sub-unit removed from block of insulation

Main board sub-unit

The main board sub-unit performs three functions: GNSS data interfacing, OCXO PLL control and clock buffering.

Main board

Reverse side

GNSS data interface

The GNSS data interface has a RS422 transceiver to allow serial data communication with the GNSS receiver. This is used to configure the receiver, as well as for receiving a time stamp for each PPS pulse for absolute phase control. The GNSS data interface is built around a CH32V003 and acts as both an I2C master and an I2C slave. As a master, it actively pushes validity and time stamp information to the OCXO phase lock loop system. As a slave it allows the LCD board sub-unit to fetch the validity, time stamp, positioning and other GNSS information.

PFD controller

The phase-frequency detector controller is the heart of the system, and is built around a CH32V003. The MCU is clocked from the OCXO board sub-unit 10 MHz output. This 10 MHz is internally clock doubled for a 20 MHz cycle rate at the MCU core.

Via an RS422 receiver, the MCU gets the PPS signal from the GNSS unit. This allows the MCU to count the number of 10 MHz cycles between PPS pulses at half-cycle (i.e. 50 nanosecond) resolution. This, averaged over a long period, represents the true frequency of the OCXO output.

At each PPS, the accumulated phase count of the OCXO is compared to the ideal phase (i.e. the phase derived from the time stamp). The difference of these represents the phase error. The phase error is then propagated through the PLL software controller to compute an OCXO control word.
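The bookkeeping can be illustrated with a short sketch. This is a simplified Python model and not the firmware: it assumes consecutive PPS pulses exactly one second apart instead of using the receiver time stamps, and the names and tick counts are made up for the example.

```python
TICK_RATE_HZ = 20_000_000  # clock-doubled OCXO, one tick = 50 ns
TICK_NS = 1e9 / TICK_RATE_HZ

def phase_errors_ns(ticks_per_pps):
    """Accumulated phase error in nanoseconds after each PPS pulse.

    ticks_per_pps: number of ticks counted between consecutive PPS pulses.
    """
    errors = []
    accumulated_ticks = 0
    for seconds_elapsed, ticks in enumerate(ticks_per_pps, start=1):
        accumulated_ticks += ticks
        ideal_ticks = seconds_elapsed * TICK_RATE_HZ  # phase implied by elapsed time
        errors.append((accumulated_ticks - ideal_ticks) * TICK_NS)
    return errors

# One extra tick in the second interval, one missing tick in the fourth.
print(phase_errors_ns([20_000_000, 20_000_001, 20_000_000, 19_999_999]))
# prints [0.0, 50.0, 50.0, 0.0]
```

Because the error is accumulated rather than recomputed per interval, a single miscounted tick shows up as a persistent 50 ns phase offset until the controller steers it away.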

The OCXO PFD controller MCU acts both as an I2C master and as an I2C slave. As a master, it transmits the OCXO control word to the OCXO board sub-unit. As a slave it allows the LCD board sub-unit to fetch status information regarding the PLL lock.

Clock buffers

The clock buffers allow isolating the external users of the 10 MHz clock signal from the clock generation. Thus the PLL keeps running even if some clock outputs were e.g. accidentally shorted or otherwise abused. There are a total of four 10 MHz outputs and one PPS output.

Each clock buffer is made from a 74HC04 hex inverter. Five of the inverters are paralleled for the output stage. This is done to provide a strong drive for a 50 ohm output. The sixth inverter is used as a buffer between the OCXO signal and the five paralleled inverters. This reduces the capacitance which the OCXO needs to drive as it now only sees the capacitance of one CMOS input per clock buffer instead of six.

Characterization

I think I've now spent more time measuring and tuning the device than actually building and designing it. In fact, I'm not yet even using it as a time base for any of my lab instruments.

OCXO gain

The OCXO is modeled as having a linear response to its control voltage (see previous post), while the control voltage is assumed to be linear to the control word. The OCXO frequency change per control word change is here called the OCXO gain. In order to tune the controller for the best response, the OCXO gain needs to be measured at the operating point.

OCXO gain measurement

The control word given to the OCXO board sub-unit was varied between two values, one of which was 1000 counts above the operating point and one which was 1000 counts below the operating point. The frequency of the OCXO was then measured against the GNSS PPS signal. Note that since the PFD operates with a clock-doubled OCXO input, the gain is defined with respect to the clock-doubled 20 MHz frequency and not the 10 MHz output.

The clock-doubled OCXO output changed frequency by 0.458 Hz with a control word change of 2000, thus giving the OCXO gain an approximate value of 229 micro-Hz per count.
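The arithmetic, including the equivalent fractional frequency step per count, can be spelled out as below. The fractional figure is derived here from the numbers in the text and is not a separately measured value; the function name is just for illustration.

```python
def ocxo_gain_hz_per_count(delta_f_hz, delta_counts):
    """Frequency change per control word count."""
    return delta_f_hz / delta_counts

gain = ocxo_gain_hz_per_count(0.458, 2000)  # measured at the 20 MHz doubled clock
fractional = gain / 20e6                     # relative frequency step per count
print(f"gain: {gain * 1e6:.0f} uHz per count")
print(f"fractional step: {fractional:.3e} per count")
```

The fractional step is the same whether it is referred to the 20 MHz doubled clock or to the 10 MHz output, since doubling scales the frequency and its change by the same factor.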

OCXO PLL time-scale

A critical parameter to choose is the bandwidth of the PLL. This is because the GNSS PPS is noisy at short time scales, while the OCXO is unstable at long time scales. To gain understanding on the matter, I collected phase error data from the OCXO while set as free running.

As comparison points, I found some stability data published for the OSC5A2B02 as well as for bare GPS PPS signals. These came from the blogs PA1EJO and www.febo.com. These sources used rubidium standards as their reference clocks, while I could only compare against the GNSS PPS.

Phase error of the OCXO against GNSS PPS

Low frequency phase noise on the 10 MHz output

Modified Allan deviations. My measurement and other published data.

Looking especially at the modified Allan deviation graphs, my data follows the [PA1EJO] GPS PPS data quite well at the low frequencies. This indicates that the deviation at those frequencies comes from the PPS and not from the OCXO. On the other hand, at slightly higher frequencies my data follows the [www.febo.com] OSC5A2B02 data, indicating that here the OCXO is causing the deviation, though my OCXO appears to be more stable in this measurement than the unit in the external data. This could be due to the additional thermal insulation my unit has.

The OCXO has an apparent linear frequency drift, which at such a long time scale could easily be corrected by the PLL. To get a better understanding of the stability, I computed the Hadamard deviation as well. This is insensitive to linear frequency drift, and the [PA1EJO] GPS PPS data has this metric published as well.

Hadamard deviation comparison

In the Hadamard sense, the OCXO is remarkably stable. There is a slight change in the slope at around 10 seconds time scale, but it isn't anything to really worry about.
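For reference, a minimal sketch of the standard overlapping Hadamard deviation estimator is given below. It is not the tool used for the plot above, and the synthetic data is an assumption chosen to show the insensitivity to linear frequency drift.

```python
import math

def hadamard_deviation(phase_s, tau0_s, m):
    """Overlapping Hadamard deviation at tau = m * tau0_s.

    phase_s: phase (time error) samples in seconds, spaced tau0_s apart.
    """
    count = len(phase_s) - 3 * m  # number of third differences available
    if m < 1 or count < 1:
        raise ValueError("need more than 3*m phase samples")
    tau = m * tau0_s
    total = 0.0
    for i in range(count):
        third_diff = (phase_s[i + 3 * m] - 3.0 * phase_s[i + 2 * m]
                      + 3.0 * phase_s[i + m] - phase_s[i])
        total += third_diff * third_diff
    return math.sqrt(total / (6.0 * tau**2 * count))

# Linear frequency drift gives a quadratic phase, whose third difference vanishes.
drifting_phase = [0.5 * 1e-10 * t * t for t in range(200)]
print(hadamard_deviation(drifting_phase, 1.0, 10))  # zero up to rounding error
```

The third difference removes constant, linear and quadratic phase terms, so only the variations left after subtracting a linear frequency drift contribute.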

It looks like it's only important to select the time scale short enough that the linear frequency drift is compensated, but still long enough that the phase error measurements (through low-pass filtering) become accurate. This is probably a pretty wide time scale range. Mostly it's a matter of getting the scale as short as possible, while keeping the control stable.

To explain the last part a bit: the hardware can directly observe the phase error at only 50 ns resolution. This by itself is not nearly good enough for proper control. However, due to jitter in the PPS, the phase error is actually observed alternating between two values. Low-pass filtering this alternating raw error gives much improved resolution. As an additional trick, the controller can provoke jitter in the measurement by deliberately controlling the OCXO phase to lie at a midpoint between two values.
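The filtering idea can be illustrated with a first-order low-pass filter applied to a raw error that alternates between 0 ns and 50 ns. This is only a sketch of the principle: the filter type, the time constant of 48 samples and the repeating input pattern are assumptions, not the controller's actual implementation.

```python
def low_pass(samples, time_constant_samples, initial=0.0):
    """First-order IIR low-pass (exponential moving average)."""
    alpha = 1.0 / time_constant_samples
    state = initial
    filtered = []
    for sample in samples:
        state += alpha * (sample - state)
        filtered.append(state)
    return filtered

# Raw phase error reads 50 ns a quarter of the time and 0 ns otherwise.
raw_ns = [50.0, 0.0, 0.0, 0.0] * 500
smoothed = low_pass(raw_ns, time_constant_samples=48)
print(f"last filtered value: {smoothed[-1]:.1f} ns")
```

The filtered value settles near 12.5 ns, the average of the raw readings, which is far finer than the 50 ns step of the raw measurement. Turning that average into an unbiased phase estimate still requires accounting for the noise distribution, as discussed in the quantization post.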

Conclusion

I'm still in the process of measuring the effect of parameter choices, as well as making tweaks to the controller algorithm itself. There will be a part 3, in which I'll try to get some performance measurements. I'll also try to get the electronics and code published.