In the modern science of data analytics, sometimes oldies are goodies. I once took an optimization class where the answer to every question posed by the professor was “the Taylor series,” referring to a popular numerical method that will be 300 years old next year. Brook Taylor’s 1715 formulation, which can be traced back even further to James Gregory in the seventeenth century, is the foundation of a great many of today’s numerical methods, of which one of the most powerful is nonlinear batch least squares.

Depending on your perspective, the Taylor series can be both mundane and profound. Its basic idea is that the value of a function, say y = x², at a particular point, say x = 3.1, can be computed as the value at some other point, say x = 3, with an adjustment to account for the difference. The slope of y = x² at x = 3 is 6, so between x = 3 and x = 3.1, the function will increase about 6 × 0.1, or 0.6. So if y is 9 at x = 3 (3² is 9), then y is about 9 + 0.6, or 9.6, at x = 3.1. My calculator tells me the exact value is 9.61.
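The arithmetic above can be sketched in a few lines of Python. This is only an illustration; the function names (f, slope, first_order) are my own and do not come from any library.

```python
def f(x):
    # The example function y = x^2.
    return x ** 2

def slope(x):
    # Analytic first derivative of x^2.
    return 2 * x

def first_order(x0, x):
    # Value at the known point plus slope times the distance moved.
    return f(x0) + slope(x0) * (x - x0)

approx = first_order(3.0, 3.1)   # 9 + 6 * 0.1
exact = f(3.1)
print(approx, exact, exact - approx)
```

The leftover difference between the two numbers is the part the slope alone cannot explain.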

If you draw this out with pencil and paper, you will see the idea is quite simple. If you know the slope, or derivative, of a function, then you can approximate nearby values of the function. I once knew a fellow who could tell you the values of functions, such as the square root or cosine, at arbitrary points faster than you could punch them into a calculator.

But the Taylor series is a bit more profound when you consider higher order derivatives. When we used the first derivative, or slope, of y = x² above, we could approximate nearby values fairly accurately. But if we also account for the second derivative (the slope of the slope), we get the answer exactly.

In fact, we could compute the exact value of y = x² at any point, knowing only the function value at a single given point, along with its first and second derivatives there. Simply put, what the Taylor series tells us is that for well-behaved functions such as y = x², the entire function can be described in terms of information at only a single point in the function.
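As a quick check of that claim, here is a minimal sketch that adds the second-derivative term. For y = x² the second derivative is the constant 2 and all higher derivatives are zero, so the series stops there. The name second_order is my own.

```python
def second_order(x0, x):
    # Taylor expansion of y = x^2 about x0, kept to the second-order term.
    dx = x - x0
    value = x0 ** 2      # function value at x0
    first = 2 * x0       # first derivative at x0
    second = 2.0         # second derivative (constant for x^2)
    return value + first * dx + 0.5 * second * dx ** 2

near = second_order(3.0, 3.1)    # a nearby point
far = second_order(3.0, -7.5)    # a point nowhere near x0
print(near, far)
```

Both results agree with squaring the number directly, up to floating-point rounding, even for the point far from x = 3.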

In practice there are many applications in which there are multiple output and input variables (y1, y2, …, x1, x2, …) and the function derivatives cannot be analytically derived. In such applications the Taylor series can nonetheless be useful.

Consider the problem of modeling an aircraft flight using radar tracking data. If you compute the trajectory using flight segments such as climbs, descents and turns, then you can derive the radar measurements. The difference between your derived measurements and the actual measurements indicates your modeling error. You may have errors in the start and stop times of the segments, or in the aircraft performance, such as the rate of climb, during the segments.

Using the Taylor series idea you can perturb each of the modeling parameters to estimate the slope of each of the output variables (the errors in your derived radar measurements) with respect to each of the input variables (the flight parameters).
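The perturbation idea can be sketched as a finite-difference slope estimate. Everything specific here is an assumption made for illustration: the toy straight-line climb model, the parameter names start_alt and climb_rate, the units (thousands of feet, minutes), the invented measurements, and the step size. A real radar model would be far more involved.

```python
def climb_errors(params, times, measured):
    # Hypothetical model: altitude = start_alt + climb_rate * t.
    # Returns modeled minus measured altitude at each time.
    start_alt, climb_rate = params
    return [start_alt + climb_rate * t - m for t, m in zip(times, measured)]

def numerical_slopes(errors, params, step=1e-6):
    # Nudge one parameter at a time and see how every error responds.
    base = errors(params)
    table = [[0.0] * len(params) for _ in base]
    for j in range(len(params)):
        nudged = list(params)
        nudged[j] += step
        shifted = errors(nudged)
        for i in range(len(base)):
            table[i][j] = (shifted[i] - base[i]) / step
    return table

times = [0.0, 1.0, 2.0, 3.0]
measured = [5.1, 6.4, 8.2, 9.4]   # invented numbers, not real radar data
slopes = numerical_slopes(lambda p: climb_errors(p, times, measured), [5.0, 1.5])
for row in slopes:
    print(row)
```

Each row belongs to one measurement and each column to one flight parameter. For this toy model the true slopes are 1 and t, so the table is easy to check by eye.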

You can then adjust the flight parameters in the direction that reduces the radar measurement errors, until those errors are minimized. This method was suggested by Donald Marquardt in 1963 and traces back to Kenneth Levenberg in 1944. There is no guarantee of success and you must check your answer carefully, but the Marquardt-Levenberg algorithm (more often written Levenberg-Marquardt) is robust and has been successfully applied to a wide range of problems. In the aircraft trajectory example, the result is a description of the flight, not in terms of a long list of radar measurements, but rather a short list of meaningful performance parameters. These can then be used, for example, to predict future trajectories.
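To make the adjustment loop concrete, here is a minimal sketch in the spirit of the Marquardt-Levenberg method, not a production implementation. The climb model (altitude levelling off toward a cruise altitude), the parameter names cruise and k, the synthetic noise-free measurements, the starting guess, and the simple multiply-or-divide-by-ten damping schedule are all assumptions chosen for illustration.

```python
import math

def errors(params, times, measured):
    # Hypothetical model: altitude = cruise * (1 - exp(-k * t)).
    cruise, k = params
    return [cruise * (1.0 - math.exp(-k * t)) - m for t, m in zip(times, measured)]

def slopes(params, times, measured, rel_step=1e-6):
    # Finite-difference slopes; cols[j][i] = d(error i) / d(parameter j).
    base = errors(params, times, measured)
    cols = []
    for j, p in enumerate(params):
        h = rel_step * max(abs(p), 1.0)
        nudged = list(params)
        nudged[j] += h
        shifted = errors(nudged, times, measured)
        cols.append([(s - b) / h for s, b in zip(shifted, base)])
    return base, cols

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for q in range(c, n + 1):
                m[r][q] -= factor * m[c][q]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][q] * x[q] for q in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit(params, times, measured, damping=1e-3, iterations=100):
    params = list(params)
    n = len(params)
    cost = sum(e * e for e in errors(params, times, measured))
    for _ in range(iterations):
        err, cols = slopes(params, times, measured)
        jtj = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
               for i in range(n)]
        jte = [sum(a * e for a, e in zip(cols[i], err)) for i in range(n)]
        # Damping scaled by the diagonal, as in Marquardt's variant.
        for i in range(n):
            jtj[i][i] *= 1.0 + damping
        step = solve(jtj, [-g for g in jte])
        trial = [p + s for p, s in zip(params, step)]
        trial_cost = sum(e * e for e in errors(trial, times, measured))
        if trial_cost < cost:
            params, cost = trial, trial_cost   # accept, trust the slopes more
            damping /= 10.0
        else:
            damping *= 10.0                    # reject, take a more cautious step
    return params, cost

times = [float(t) for t in range(11)]                      # minutes
measured = [10.0 * (1.0 - math.exp(-0.3 * t)) for t in times]  # synthetic data
fitted, final_cost = fit([9.0, 0.4], times, measured)
print(fitted, final_cost)
```

Because the measurements here are generated from the model itself with no noise, the fit should recover the generating values of cruise and k. With real radar data the errors would not go to zero, and the answer would need the careful checking mentioned above.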