Adding Constrained Discontinuities to Gaussian 
Process Models of Wind Fields 
Dan Cornford* Ian T. Nabney Christopher K. I. Williams†
Neural Computing Research Group 
Aston University, BIRMINGHAM, B4 7ET, UK 
d. cornfordaston.ac.uk 
Abstract 
Gaussian Processes provide good prior models for spatial data, but can 
be too smooth. In many physical situations there are discontinuities 
along bounding surfaces, for example fronts in near-surface wind fields. 
We describe a modelling method for such a constrained discontinuity 
and demonstrate how to infer the model parameters in wind fields with 
MCMC sampling. 
1 INTRODUCTION 
We introduce a model for wind fields based on Gaussian Processes (GPs) with 'constrained 
discontinuities'. GPs provide a flexible framework for modelling various systems. They 
have been adopted in the neural network community and are interpreted as placing priors 
over functions. 
Stationary vector-valued GP models (Daley, 1991) can produce realistic wind fields when 
run as a generative model; however, the resulting wind fields do not contain some features 
typical of the atmosphere. The most difficult features to include are surface fronts. Fronts 
are generated by complex atmospheric dynamics and are marked by large changes in the 
surface wind direction (see for example Figures 2a and 3b) and temperature. In order 
to account for such features, which appear discontinuous at our observation scale, we have 
developed a model for vector-valued GPs with constrained discontinuities which could also 
be applied to surface reconstruction in computer vision, and geostatistics. 
In section 2 we illustrate the generative model for wind fields with fronts. Section 3 ex- 
plains what we mean by GPs with constrained discontinuities and derives the likelihood of 
data under the model. Results of Bayesian estimation of the model parameters are given, 
*To whom correspondence should be addressed. 
†Now at: Division of Informatics, University of Edinburgh, 5 Forrest Hill, Edinburgh EH1 2QL,
Scotland, UK
862 D. Cornford, I. T. Nabney and C. K. I. Williams 
using a Markov Chain Monte Carlo (MCMC) procedure. In the final section, the strengths 
and weaknesses of the model are discussed and improvements suggested. 
2 A GENERATIVE WIND FIELD MODEL 
We are primarily interested in retrieving wind fields from satellite scatterometer observations
of the ocean surface¹. A probabilistic prior model for wind fields will be used in
a Bayesian procedure to resolve ambiguities in local predictions of wind direction. The 
generative model for a wind field including a front is taken to be a combination of two 
vector-valued GPs with a constrained discontinuity. 
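A common method for representing wind fields is to put GP priors over the velocity potential
$\Phi$ and stream function $\Psi$, assuming the processes are uncorrelated (Daley, 1991). The
horizontal wind vector $\mathbf{u} = (u, v)$ can then be derived from:
$$u = -\frac{\partial \Psi}{\partial y} + \frac{\partial \Phi}{\partial x}, \qquad v = \frac{\partial \Psi}{\partial x} + \frac{\partial \Phi}{\partial y}. \tag{1}$$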
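As a minimal sketch of how equation (1) turns the two potentials into a wind vector, the following Python assumes (1) is the usual Helmholtz decomposition $u = -\partial\Psi/\partial y + \partial\Phi/\partial x$, $v = \partial\Psi/\partial x + \partial\Phi/\partial y$. The function names and the test fields are hypothetical illustrations, not part of the authors' MATLAB code.

```python
def wind_from_potentials(phi, psi, x, y, h=1e-3):
    """Estimate (u, v) at (x, y) from equation (1) using central differences.

    phi: velocity potential, psi: stream function, both callables of (x, y).
    """
    dphi_dx = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    dphi_dy = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    dpsi_dx = (psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    dpsi_dy = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    u = -dpsi_dy + dphi_dx
    v = dpsi_dx + dphi_dy
    return u, v


# Hypothetical test fields: no divergent part, and a stream function
# psi = (x^2 + y^2) / 2, which gives the rotational flow (u, v) = (-y, x).
def no_divergence(x, y):
    return 0.0


def solid_rotation(x, y):
    return 0.5 * (x * x + y * y)


u_rot, v_rot = wind_from_potentials(no_divergence, solid_rotation, 1.0, 2.0)
```

In the model itself the potentials are GP realisations rather than closed-form functions, so the derivatives would be applied to the covariance function instead of being differenced numerically.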
This produces good prior models for wind fields when a suitable choice of covariance
function for $\Phi$ and $\Psi$ is made. We have investigated using a modified Bessel function
based covariance² (Handcock and Wallis, 1994) but found, using three years of wind data
for the North Atlantic, that the maximum a posteriori value for the smoothness parameter³
in this covariance function was approximately 2.5. Thus we used the correlation function:
$$\rho(r) = \left(1 + \frac{r}{L} + \frac{r^2}{3L^2}\right) \exp\left(-\frac{r}{L}\right) \tag{2}$$
where $r$ is the separation distance and $L$ is the correlation length scale. This is equivalent
to the modified Bessel function covariance with smoothness parameter 2.5 and is less
computationally demanding (Cornford, 1998).
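The equivalence can be checked numerically. The sketch below assumes equation (2) reads $\rho(r) = (1 + r/L + r^2/(3L^2))\exp(-r/L)$ and compares it with the textbook Matérn correlation of smoothness 5/2, whose length scale $\ell$ is related to $L$ by $\ell = \sqrt{5}\,L$. The function names and the distances used are hypothetical.

```python
import math


def correlation(r, L):
    """Equation (2): rho(r) = (1 + r/L + r^2 / (3 L^2)) * exp(-r / L)."""
    s = r / L
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def matern_52(r, ell):
    """Textbook Matern correlation with smoothness 5/2 and length scale ell."""
    s = math.sqrt(5.0) * r / ell
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


# Hypothetical values in km: separation 300, correlation length scale 500.
rho_eq2 = correlation(300.0, 500.0)
rho_matern = matern_52(300.0, math.sqrt(5.0) * 500.0)
```

Because the closed form needs only a polynomial and an exponential, no Bessel function evaluation is required, which is where the computational saving comes from.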
[Figure 1, panel (a): flowchart with the steps: simulate frontal position, orientation and direction; simulate frontal wind angle; simulate wind speed at front; simulate along both sides of front using GP1; simulate wind fields either side of front, conditionally on that side's frontal winds, using GP2. Panel (b): schematic of the front relative to the origin.]
Figure 1: (a) Flowchart describing the generative frontal model. See text for full description.
(b) A description of the frontal model.
The generative model has the form outlined in Figure 1a. Initially the frontal position and
orientation are simulated. They are defined by the angle clockwise from north ($\phi_f$) that
the front makes and a point on the line ($x_f$, $y_f$). Having defined the position of the front,
¹ See http://www.ncrg.aston.ac.uk/Projects/NEUROSAT/NEUROSAT.html
for details of the scatterometer work. Technical reports describing, in more detail, methods for
generating prior wind field models can also be accessed from the same page.
² The modified Bessel function allows us to control the differentiability of the sample realisations
through the 'smoothness parameter', as well as the length scales and variances.
³ This varies with season, but is the most temporally stable parameter in the covariance function.
Adding Constrained Discontinuities to GP Models of Wind Fields 863 
the angle of the wind across the front ($\alpha_f$) is simulated from a distribution covering the
range $[0, \pi)$. This angle is related to the vertical component of vorticity ($\zeta$) across the front
through $\zeta = \mathbf{k} \cdot \nabla \times \mathbf{u} \propto \cos(\alpha_f / 2)$ and the constraint $\alpha_f \in [0, \pi)$ ensures cyclonic vorticity
at the front. It is assumed that the front bisects $\alpha_f$. The wind speed ($s_f$) is then simulated at
the front. Since there is generally little change in wind speed across the front, one value is
simulated for both sides of the front. These components $\theta_f = (\phi_f, x_f, y_f, \alpha_f, s_f)$ define
the line of the front and the mean wind vectors just ahead of and just behind the front
(Figure 1b):
$$m_{1a} = (u, v) = \left( s_f \sin\left(\phi_f + \frac{\alpha_f}{2}\right),\; s_f \cos\left(\phi_f + \frac{\alpha_f}{2}\right) \right) \tag{3}$$
$$m_{1b} = (u, v) = \left( -s_f \sin\left(\phi_f - \frac{\alpha_f}{2}\right),\; -s_f \cos\left(\phi_f - \frac{\alpha_f}{2}\right) \right) \tag{4}$$
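The constraints encoded by the two frontal mean vectors can be illustrated with a short Python sketch. It assumes the reading of (3) and (4) in which the wind ahead of the front is $s_f(\sin(\phi_f + \alpha_f/2), \cos(\phi_f + \alpha_f/2))$ and the wind behind is $-s_f(\sin(\phi_f - \alpha_f/2), \cos(\phi_f - \alpha_f/2))$; the function names and the wind speed of 10 are hypothetical, while the angles are the generating values quoted later in the text.

```python
import math


def frontal_means(phi_f, alpha_f, s_f):
    """Mean wind vectors (u, v) just ahead of and just behind the front."""
    ahead = (s_f * math.sin(phi_f + alpha_f / 2.0),
             s_f * math.cos(phi_f + alpha_f / 2.0))
    behind = (-s_f * math.sin(phi_f - alpha_f / 2.0),
              -s_f * math.cos(phi_f - alpha_f / 2.0))
    return ahead, behind


def along_across(wind, phi_f):
    """Resolve a wind vector into components along and across the front.

    The front direction is phi_f clockwise from north, so its unit vector
    is (sin phi_f, cos phi_f) in (east, north) components.
    """
    u, v = wind
    along = u * math.sin(phi_f) + v * math.cos(phi_f)
    across = u * math.cos(phi_f) - v * math.sin(phi_f)
    return along, across


phi_f, alpha_f, s_f = 0.555, 2.125, 10.0  # s_f is a hypothetical value
m_1a, m_1b = frontal_means(phi_f, alpha_f, s_f)
along_a, across_a = along_across(m_1a, phi_f)
along_b, across_b = along_across(m_1b, phi_f)
```

Under this reading both vectors have speed $s_f$, the across-front components agree, and the along-front components differ by $2 s_f \cos(\alpha_f/2)$, which is the shear that the vorticity relation above refers to and whose sign is fixed for $\alpha_f \in [0, \pi)$.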
A realistic model requires some variability in wind vectors along the front. Thus we use a
GP with a non-zero mean ($m_{1a}$ or $m_{1b}$) along the line of the front. In the real atmosphere
we observe a smaller variability in the wind vectors along the line of the front compared
with regions away from fronts. Thus we use different GP parameters along the front (GP1)
from those used in the wind field away from the front (GP2), although the same GP1
parameters are used on both sides of the front, just with different means. The winds just ahead
of and behind the front are assumed conditionally independent given $m_{1a}$ and $m_{1b}$, and
are simulated at a regular 50 km spacing. The final step in the generative model is to
simulate wind vectors using GP2 in both regions either side of the front, conditionally on
the values along that side of the front. This model is flexible enough to represent fronts, yet
has the required constraints derived from meteorological principles, for example that fronts
should always be associated with cyclonic vorticity and that discontinuities at the model
scale should be in wind direction but not in wind speed⁴. To make this generative model
useful for inference, we need to be able to compute the data likelihood, which is the subject 
of the next section. 
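The last two steps of the generative model can be sketched for one side of the front as follows. This is a simplified scalar illustration in Python with NumPy, not the authors' MATLAB implementation: it treats a single wind component, assumes the correlation function $(1 + r/L + r^2/(3L^2))\exp(-r/L)$, and uses hypothetical parameter values, point layouts and function names throughout.

```python
import numpy as np


def cov(xa, xb, variance, L):
    """Covariance between two sets of 2-D points (one point per row)."""
    r = np.linalg.norm(xa[:, None, :] - xb[None, :, :], axis=-1)
    s = r / L
    return variance * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)


def simulate_side(r1, r2, m1, theta1, theta2, rng, jitter=1e-8):
    """Sample Z1 on the front from GP1, then Z2 given Z1 from GP2."""
    n1, n2 = len(r1), len(r2)
    # Step 1: values along the front, GP1 with non-zero mean m1.
    K11_1 = cov(r1, r1, *theta1) + jitter * np.eye(n1)
    z1 = rng.multivariate_normal(np.full(n1, m1), K11_1)
    # Step 2: values in the region, GP2 conditioned on the front values.
    K11_2 = cov(r1, r1, *theta2) + jitter * np.eye(n1)
    K12_2 = cov(r1, r2, *theta2)
    K22_2 = cov(r2, r2, *theta2)
    A = np.linalg.solve(K11_2, K12_2).T          # K12' K11^{-1}, shape (n2, n1)
    cond_mean = A @ z1
    cond_cov = K22_2 - A @ K12_2
    cond_cov = 0.5 * (cond_cov + cond_cov.T) + jitter * np.eye(n2)
    z2 = rng.multivariate_normal(cond_mean, cond_cov)
    return z1, z2


rng = np.random.default_rng(0)
# Hypothetical layout: 12 front points at 50 km spacing, 36 region points.
r1 = np.column_stack([50.0 * np.arange(12), np.zeros(12)])
gx, gy = np.meshgrid(np.linspace(0.0, 550.0, 6), np.linspace(50.0, 300.0, 6))
r2 = np.column_stack([gx.ravel(), gy.ravel()])
theta1 = (0.25, 400.0)   # (variance, length scale) for GP1, hypothetical
theta2 = (4.0, 150.0)    # (variance, length scale) for GP2, hypothetical
z1, z2 = simulate_side(r1, r2, 8.0, theta1, theta2, rng)
```

Running the same function a second time with the other frontal mean gives the field on the opposite side of the front; the two sides share parameters but not samples, which is what produces the discontinuity.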
3 GPs WITH CONSTRAINED DISCONTINUITIES 
Figure 2: (a) The discontinuity in one of the vector components in a simulation. (b) Framework
for GPs with boundary conditions. The curve $D_1$ has $n_1$ sample points with values
$Z_1$. The domain $D_2$ has $n_2$ points with values $Z_2$.
⁴ The model allows small discontinuities in wind speed, which are consistent with frontal dynamics.
We consider data from two domains $D_1$ and $D_2$ (Figure 2b), where in this case $D_1$ is a
curve in the plane which is intended to be the front and $D_2$ is a region of the plane. We
obtain $n_1$ variables $Z_1$ at points $r_1$ along the curve, and we assume these are generated
under GP1 (a GP which depends on parameters $\theta_1$ and has mean $\mathbf{m}_1 = m_1 \mathbf{1}$, which will be
determined by (3) or (4)). We are interested in determining the likelihood of the variables
$Z_2$ observed at $n_2$ points $r_2$ under GP2, which depends on parameters $\theta_2$, conditioned on
the 'constrained discontinuities' at the front.
We evaluate this by calculating the likelihood of $Z_2$ conditioned on the $n_1$ values of $Z_1$
from GP1 along the front and marginalising out $Z_1$:
$$p(Z_2 \mid \theta_2, \theta_1, \mathbf{m}_1) = \int_{-\infty}^{\infty} p(Z_2 \mid Z_1, \theta_2, \theta_1, \mathbf{m}_1)\, p(Z_1 \mid \theta_1, \mathbf{m}_1)\, dZ_1. \tag{5}$$
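From the definition of the likelihood of a GP (Cressie, 1993) we find:
$$p(Z_2 \mid Z_1, \theta_2, \theta_1, \mathbf{m}_1) = (2\pi)^{-n_2/2}\, |S_{22}|^{-1/2} \exp\left(-\tfrac{1}{2}\, \tilde{Z}_2' S_{22}^{-1} \tilde{Z}_2\right) \tag{6}$$
where:
$$S_{22} = K_{22|2} - K_{12|2}' K_{11|2}^{-1} K_{12|2}, \qquad \tilde{Z}_2 = Z_2 - K_{12|2}' K_{11|2}^{-1} Z_1.$$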
To understand the notation consider the joint distribution of $Z_1$, $Z_2$ and in particular its
covariance matrix:
$$K = \begin{bmatrix} K_{11|2} & K_{12|2} \\ K_{21|2} & K_{22|2} \end{bmatrix} \tag{7}$$
where $K_{11|2}$ is the $n_1 \times n_1$ covariance matrix between the points in $D_1$ evaluated using
$\theta_2$, $K_{12|2} = K_{21|2}'$ is the $n_1 \times n_2$ (cross) covariance matrix between the points in $D_1$ and $D_2$
evaluated using $\theta_2$, and $K_{22|2}$ is the usual $n_2 \times n_2$ covariance for points in $D_2$. Thus we
can see that $S_{22}$ is the $n_2 \times n_2$ modified covariance for the points in $D_2$ given the points
along $D_1$, while $\tilde{Z}_2$ is the corrected mean that accounts for the values at the points in
$D_1$, which have non-zero mean.
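We remove the dependency on the values $Z_1$ by evaluating the integral in (5).
$p(Z_1 \mid \theta_1, \mathbf{m}_1)$ is given by:
$$p(Z_1 \mid \theta_1, \mathbf{m}_1) = (2\pi)^{-n_1/2}\, |K_{11|1}|^{-1/2} \exp\left(-\tfrac{1}{2} (Z_1 - \mathbf{m}_1)' K_{11|1}^{-1} (Z_1 - \mathbf{m}_1)\right) \tag{8}$$
where $K_{11|1}$ is the $n_1 \times n_1$ covariance matrix between the points in $D_1$ evaluated under
the covariance given by $\theta_1$. Completing the square in $Z_1$ in the exponent, the integral (5)
can be evaluated to give:
$$p(Z_2 \mid \theta_2, \theta_1, \mathbf{m}_1) = (2\pi)^{-n_2/2}\, |S_{22}|^{-1/2}\, |K_{11|1}|^{-1/2}\, |B|^{-1/2} \exp\left(\tfrac{1}{2}\left[ C' B^{-1} C - Z_2' S_{22}^{-1} Z_2 - \mathbf{m}_1' K_{11|1}^{-1} \mathbf{m}_1 \right]\right) \tag{9}$$
where:
$$B = \left(K_{12|2}' K_{11|2}^{-1}\right)' S_{22}^{-1} \left(K_{12|2}' K_{11|2}^{-1}\right) + K_{11|1}^{-1},$$
$$C' = Z_2' S_{22}^{-1} K_{12|2}' K_{11|2}^{-1} + \mathbf{m}_1' K_{11|1}^{-1}.$$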
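A minimal NumPy sketch of the log of the marginal likelihood (9) is given below, assuming the forms $S_{22} = K_{22|2} - K_{12|2}' K_{11|2}^{-1} K_{12|2}$, $B = A' S_{22}^{-1} A + K_{11|1}^{-1}$ and $C = A' S_{22}^{-1} Z_2 + K_{11|1}^{-1}\mathbf{m}_1$ with $A = K_{12|2}' K_{11|2}^{-1}$. It is checked against the equivalent direct statement that $Z_2$ is Gaussian with mean $A\mathbf{m}_1$ and covariance $S_{22} + A K_{11|1} A'$. The function names and the random test matrices are hypothetical and stand in for covariances built from (2).

```python
import numpy as np


def log_lik_eq9(Z2, m1, K11_1, K11_2, K12_2, K22_2):
    """Log of equation (9); K12_2 has shape (n1, n2)."""
    n2 = len(Z2)
    A = np.linalg.solve(K11_2, K12_2).T           # K12' K11^{-1}, (n2, n1)
    S22 = K22_2 - A @ K12_2
    S22_inv_Z2 = np.linalg.solve(S22, Z2)
    S22_inv_A = np.linalg.solve(S22, A)
    K11_1_inv = np.linalg.inv(K11_1)
    B = A.T @ S22_inv_A + K11_1_inv
    C = A.T @ S22_inv_Z2 + K11_1_inv @ m1
    quad = C @ np.linalg.solve(B, C) - Z2 @ S22_inv_Z2 - m1 @ K11_1_inv @ m1
    log_dets = sum(np.linalg.slogdet(M)[1] for M in (S22, K11_1, B))
    return -0.5 * n2 * np.log(2.0 * np.pi) - 0.5 * log_dets + 0.5 * quad


def log_lik_direct(Z2, m1, K11_1, K11_2, K12_2, K22_2):
    """Reference value: Z2 ~ N(A m1, S22 + A K11_1 A')."""
    n2 = len(Z2)
    A = np.linalg.solve(K11_2, K12_2).T
    S22 = K22_2 - A @ K12_2
    total_cov = S22 + A @ K11_1 @ A.T
    d = Z2 - A @ m1
    log_det = np.linalg.slogdet(total_cov)[1]
    return -0.5 * (n2 * np.log(2.0 * np.pi) + log_det
                   + d @ np.linalg.solve(total_cov, d))


# Hypothetical, well-conditioned test matrices.
rng = np.random.default_rng(1)
n1, n2 = 3, 5
R = rng.standard_normal((n1 + n2, n1 + n2))
K = R @ R.T + (n1 + n2) * np.eye(n1 + n2)        # joint covariance under theta_2
Q = rng.standard_normal((n1, n1))
K11_1 = Q @ Q.T + n1 * np.eye(n1)                # front covariance under theta_1
m1 = np.full(n1, 2.0)
Z2 = rng.standard_normal(n2)
args = (Z2, m1, K11_1, K[:n1, :n1], K[:n1, n1:], K[n1:, n1:])
lik_eq9 = log_lik_eq9(*args)
lik_direct = log_lik_direct(*args)
```

The direct form factorises an $n_2 \times n_2$ matrix once, whereas (9) keeps $S_{22}$ and the $n_1 \times n_1$ matrix $B$ separate; which is preferable in practice depends on how the parameters $\theta_1$ and $\theta_2$ are updated.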
The algorithm has been coded in MATLAB and can deal with reasonably large numbers of
points quickly. For a two dimensional vector-valued GP with $n_1 = 12$ and $n_2 = 200$⁵ and
a covariance function given by (2), computation of the log likelihood takes 4.13 seconds on
an SGI Indy R5000.
⁵ This is equivalent to $n_1 = 24$ and $n_2 = 400$ for a scalar GP.
The mean values just ahead of and behind the front define the mean values for the constrained
discontinuity (i.e. $\mathbf{m}_1$ in (9)). Conditional on the frontal parameters the wind fields either
side (Figure 3a) are assumed independent:
$$p(Z_{2a}, Z_{2b} \mid \theta_2, \theta_1, \theta_f) = p(Z_{2a} \mid \theta_2, \theta_1, m_{1a})\, p(m_{1a} \mid \theta_f) \times p(Z_{2b} \mid \theta_2, \theta_1, m_{1b})\, p(m_{1b} \mid \theta_f)$$
where we have performed the integration (5) to remove the dependency on $Z_{1a}$ and $Z_{1b}$.
Thus the likelihood of the data $Z_2 = (Z_{2a}, Z_{2b})$ given the model parameters $\theta_2$, $\theta_1$, $\theta_f$
is simply the product of the likelihoods of two GPs with a constrained discontinuity, which
can be computed using (9).
Figure 3: (a) The division of the wind field using the generative frontal model. $Z_{1a}$, $Z_{1b}$
are the wind fields just ahead of and behind the front, along its length, respectively. $Z_{2a}$,
$Z_{2b}$ are the wind fields in the regions ahead of and behind the front respectively. (b) An
example from the generative frontal model: the wind field looks like a typical 'cold front'.
The model outlined above was tested on simulated data generated from the model to assess
parameter sensitivity. We generated a wind field $Z^o = (Z_{2a}, Z_{2b})$ using known model
parameters (e.g. Figure 3b). We then sampled the model parameters from the posterior
distribution:
$$p(\theta_2, \theta_1, \theta_f \mid Z^o) \propto p(Z^o \mid \theta_2, \theta_1, \theta_f)\, p(\theta_2)\, p(\theta_1)\, p(\theta_f) \tag{10}$$
where $p(\theta_2)$, $p(\theta_1)$, $p(\theta_f)$ are prior distributions over the parameters in the GPs and front
models. This brings out one advantage of the proposed model. All the model parameters
have a physical interpretation and thus expert knowledge was used to set priors which
produce realistic wind fields. We will also use (10) to help set (hyper)priors using real data
in $Z^o$.
MCMC using the Metropolis algorithm (Neal, 1993) is used to sample from (10) using the
NETLAB⁶ library. Convergence of the Markov chain is currently assessed using visual
inspection of the univariate sample paths since the generating parameters are known, although
other diagnostics could be used (Cowles and Carlin, 1996). We find that the procedure is
insensitive to the initial value of the GP parameters, but that the parameters describing the
location of the front ($\phi_f$, $d_f$) need to be initialised 'close' to the correct values if the chain
is to converge on a reasonable time-scale. In the application some preliminary analysis of
the wind field would be necessary to identify possible fronts and thus set the initial
parameters to 'sensible' values. We intend to fit a vector-valued GP without any discontinuities
and then measure the 'strain' or misfit of the locally predicted winds with the winds fitted
by the GP. Lines of large 'strain' will be used to initialise the front parameters.
⁶ Available from http://www.ncrg.aston.ac.uk/netlab/index.html.
[Figure 4: both panels are plotted against sample number (× 10⁴).]
Figure 4: Examples from the Markov chain of the posterior distribution (10). (a) The
energy = negative log posterior probability. Note that the energy when the chain was
initialised was 2789 and the first 27 values are outside the range of the y-axis. (b) The angle
of the front relative to north ($\phi_f$).
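The Metropolis sampling described above can be sketched with a random-walk proposal, as below. This is an illustration in plain Python, not the NETLAB routine: the energy function is a hypothetical Gaussian stand-in for the negative log of (10) over the two front angles only, centred on the generating values 0.555 and 2.125 quoted in the text, and the step size and posterior width of 0.05 are assumptions. The starting point and the 10000 iteration burn-in follow the values quoted in the text.

```python
import math
import random


def metropolis(energy, theta0, step, n_samples, rng):
    """Random-walk Metropolis; energy is the negative log posterior."""
    theta = list(theta0)
    e = energy(theta)
    chain = []
    for _ in range(n_samples):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        e_new = energy(proposal)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(e - e_new).
        if e_new <= e or rng.random() < math.exp(e - e_new):
            theta, e = proposal, e_new
        chain.append(list(theta))
    return chain


TRUE_PHI_F, TRUE_ALPHA_F = 0.555, 2.125
WIDTH = 0.05  # hypothetical posterior standard deviation


def toy_energy(theta):
    """Hypothetical stand-in for the negative log posterior (10)."""
    phi_f, alpha_f = theta
    return 0.5 * (((phi_f - TRUE_PHI_F) / WIDTH) ** 2
                  + ((alpha_f - TRUE_ALPHA_F) / WIDTH) ** 2)


chain = metropolis(toy_energy, [0.89, 1.49], step=0.05,
                   n_samples=20000, rng=random.Random(0))
kept = chain[10000:]  # discard the burn-in period
mean_phi_f = sum(s[0] for s in kept) / len(kept)
mean_alpha_f = sum(s[1] for s in kept) / len(kept)
```

In the full model the energy would be the negative log of (10), built from two evaluations of (9) plus the log priors, and the proposal would cover all of $\theta_2$, $\theta_1$ and $\theta_f$.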
[Figure 5: panel (a) is plotted against sample number; panel (b) is a histogram over the angle of wind (radians).]
Figure 5: Examples from the Markov chain of the posterior distribution (10). (a) The angle
of the wind across the front ($\alpha_f$). (b) Histogram of the posterior distribution of $\alpha_f$ allowing
a 10000 iteration burn-in period.
Examples of samples from the Markov chain from the simulated wind field shown in Figure 3b
can be seen in Figures 4 and 5. Figure 4a shows that the energy level (= negative log
posterior probability) falls very rapidly to near its minimum value from its large starting
value of 2789. In these plots the true parameters for the front were $\phi_f = 0.555$, $\alpha_f = 2.125$
while the initial values were set at $\phi_f = 0.89$, $\alpha_f = 1.49$. Other parameters were also
incorrectly set. The Metropolis algorithm seems to be able to find the minimum and then
stays in it.
Figures 4b and 5a show the Markov chains for $\phi_f$ and $\alpha_f$. Both converge quickly to
apparently stationary distributions, which have mean values very close to the 'true' generating
parameters. The histogram of the distribution of $\alpha_f$ is shown in Figure 5b.
4 DISCUSSION AND CONCLUSIONS 
Simulations from our model are meteorologically plausible wind fields which contain 
fronts. It is possible similar models could usefully be applied to other modelling prob- 
lems where there are discontinuities with known properties. A method for the computation 
of the likelihood of data given two GP models, one with non-zero mean on the boundary 
and another in the domain in which the data is observed, has been given. This allows us 
to perform inference on the parameters in the frontal model using a Bayesian approach of 
sampling from the posterior distribution using an MCMC algorithm.
There are several weaknesses in the model specifically for fronts, which could be improved
with further work. Real atmospheric fronts are not straight, thus the model would be
improved by allowing 'curved' fronts. We could represent the position of the front, oriented
along the angle defined by $\phi_f$, using either another smooth GP, B-splines or possibly
polynomials.
Currently the points along the line of the front are simulated at the mean observation
spacing in the rest of the wind field (approximately 50 km). Interesting questions remain about the (in-fill)
asymptotics (Cressie, 1993) as the distance between the points along the front tends to zero.
Empirical evidence suggests that as long as the spacing along the front is 'much less' than
the length scale of the GP along the front (which is typically around 1000 km) then the spacing
does not significantly affect the results.
Although we currently use a Metropolis algorithm for sampling from the Markov chain,
the derivative of (9) with respect to the GP parameters $\theta_1$ and $\theta_2$ could be computed
analytically and used in a hybrid Monte Carlo procedure (Neal, 1993).
These improvements should lead to a relatively robust procedure for putting priors over 
wind fields which will be used with real data when retrieving wind vectors from scatterom- 
eter observations over the ocean. 
Acknowledgements 
This work was partially supported by the European Union funded NEUROSAT programme 
(grant number ENV4 CT96-0314) and also EPSRC grant GR/L03088 Combining Spatially 
Distributed Predictions from Neural Networks. 
References 
Cornford, D. 1998. Flexible Gaussian Process Wind Field Models. Technical Report NCRG/98/017, Neural Computing Research Group, Aston University, Aston Triangle, Birmingham, UK.
Cowles, M. K. and B. P. Carlin 1996. Markov-Chain Monte-Carlo Convergence Diagnostics: A Comparative Review. Journal of the American Statistical Association 91, 883-904.
Cressie, N. A. C. 1993. Statistics for Spatial Data. New York: John Wiley and Sons.
Daley, R. 1991. Atmospheric Data Analysis. Cambridge: Cambridge University Press.
Handcock, M. S. and J. R. Wallis 1994. An Approach to Statistical Spatio-Temporal Modelling of Meteorological Fields. Journal of the American Statistical Association 89, 368-378.
Neal, R. M. 1993. Probabilistic Inference Using Markov Chain Monte Carlo Methods. Technical Report CRG-TR-93-1, Department of Computer Science, University of Toronto. URL: http://www.cs.utoronto.ca/radford.
