Let's suppose you've observed a pulsar. You've built a reference pulse profile and used it to calculate the precise time of arrival (TOA) of a number of pulses from the pulsar. Now you want to use these arrival times to measure, say, the spin period of the pulsar (it could as easily be any other parameter that affects the timing, like spin-down rate, binary period, or position).
In principle, the procedure is simple: for each plausible spin period, you predict the phase at the time each pulse arrived. Since the pulse arrived at phase zero, the size of the non-zero prediction - the residual - serves as a measure of how far wrong your model was. Hopefully your procedure for calculating the arrival times yields an uncertainty, so that you can see how likely the error is to be due to the random scatter we expect from a noisy signal. You can then build a measure of total size of all the residuals - a chi-squared, probably. Then you vary the spin period (or whatever) to try to minimize the chi-squared. Simple, right?
Well, yes and no. If your model is too far wrong, it may predict the wrong number of turns; since all you have are arrival phases, your residuals are modulo one turn. This means that, for example, if you started from the best-fit solution and began changing your parameter, the fit would get worse smoothly up to a point - but then it would suddenly start decreasing as your error wrapped around. Once the number of turns becomes ambiguous, your minimization problem begins to become very rough.
Traditionally, tempo has left the number-of-turns issue to the user. If the number of turns is wrong, well, tough luck. tempo simply assumes the number of turns is whatever is closest to your initial model's prediction. There are various schemes to change this number of turns that tempo assumes, but tempo leaves the problem up to you. In this tempo2 takes the same approach.
The second problem with this minimization is that it's a very large problem - one can easily have tens of parameters to fit, and thousands of residuals to try to match. On the other hand, the problem is very nearly linear (aside from that nasty phase wrapping, which is your problem, not tempo/tempo2's) - while the ways the parameters affect the residuals can be complicated, they rarely interact very much. So tempo and tempo2 treat the problem as a linear one. This means that there's no need for complex nonlinear optimizers.
The way tempo2 does the actual fitting is this:
- Take the initial guess at model parameters and form residuals for each TOA.
- Adjust these residuals using any PHASE or TRACK commands the user specified; this locks in the number of turns between each pair.
- Compute the partial derivatives of the residuals with respect to each parameter.
- Use an SVD to do a linear least-squares fit, finding the parameter offsets that minimize the residuals.
- Output the modified model.
Tempo2 can be told to repeat this process a number of times; this is necessary in principle to take into account nonlinearities in the model. In fact, iterating this process is technically the steepest-descent optimization algorithm, which is, frankly, bad. But for a very-nearly-linear problem, in which the objective function and its derivatives are very expensive, this gives an approximate solution much less slowly than a "better" algorithm.
What could go wrong with this? Well, nonlinearities are bad news. Phase wraps, if present on input, make fitting pointless, but at least the linear approximation ensures that they don't appear after you've started. It's also the case that tempo2 doesn't actually check whether the "best-fit" model is better than the initial model. But the problem I recently tripped over is, I think, numerical: although the SVD is a pretty good way to do least-squares fitting, you can still have problems if the matrix of partial derivatives has entries of wildly different sizes. Unfortunately, this can easily happen, since tempo's standard way of representing something that varies is as a truncated Taylor series. The magnitude of the coefficient of x^10/10! can easily be 10^-40 even for a rather modest polynomial.
How does tempo2 deal with this? After all, it includes a model for the pulsar's spin frequency that accepts up to twelve derivatives. Well, in fact it scales those derivatives during fitting so that they, hopefully, have reasonable sizes. I am hoping that there is some general mechanism for accomplishing this.