One min and 5-min solar wind data sets
at the Earth's bow shock nose

Joe King and Natalia Papitashvili, GSFC/SPDF and ADNET Systems, Inc.

New upgrades to this data set were made on February 15, 2013

Wind and ACE proton parameters. Previously we used Wind/SWE parameters based on anisotropic nonlinear fits to Wind/SWE plasma distributions through November 2004, and we used cross-normalized Wind/SWE Key Parameter data thereafter. Now, owing to their greater "robustness," the only Wind/SWE proton data we use for 1995-current are the cross-normalized SWE KP data. (SWE KP cross-normalization is to the SWE nonlinear fit data.)
New cross-normalizations for Wind KP and ACE proton density and temperature values have been developed and used. These are:

```
For Wind/SWE KP Np and Tp data:
For Np, for all V and time,
LogN(Wind/KP, norm) = -0.055 + 1.037 * LogN(Wind/KP, obsvd)
For Tp, for all V and for 1995-7,
LogT(Wind/KP, norm) = -0.030 + 1.055 * LogT(Wind/KP, obsvd)
For Tp, for all V and for >= 1998,
LogT(Wind/KP, norm) = LogT(Wind/KP, obsvd)

ACE/SWEPAM Np and Tp data:
Let t be fractional years since 1998.0. (E.g., t = 1.5 on July 1, 1999.)
Let V = solar wind speed
N = ACE/SWEPAM proton density as observed
Nn = value of N as normalized to equivalent Wind/SWE nonlinear fit proton densities
For V < 395 km/s, Nn = [0.925 + 0.0039 * t] * N
For V > 405 km/s, Nn = [0.761 + 0.0210 * t] * N
For 395 <= V <= 405 km/s, Nn = [74.02 - 0.164*V - 6.72*t + 0.0171*t*V] * N/10
For temperature (all V), LogT(norm) = -0.069 + 1.024 * LogT(obsvd)

Below is the full old OMNI documentation package
A new comprehensive documentation page will be available soon.
```
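As a reading aid, the cross-normalizations listed above can be written as a short Python sketch. It is not the production code. The function names are hypothetical, "Log" is assumed to mean base-10 logarithm, "1995-7" is read as 1995 through 1997, and the middle ACE speed band is read as 395 to 405 km/s.

```python
import math

def normalize_wind_kp(np_obs, tp_obs, year):
    """Wind/SWE KP density and temperature normalized to NLF equivalents."""
    log_n = -0.055 + 1.037 * math.log10(np_obs)
    if year <= 1997:                      # assumed reading of "1995-7"
        log_t = -0.030 + 1.055 * math.log10(tp_obs)
    else:                                 # 1998 and later: Tp left unchanged
        log_t = math.log10(tp_obs)
    return 10 ** log_n, 10 ** log_t

def ace_density_factor(v, t):
    """Multiplier Nn/N for ACE/SWEPAM; v in km/s, t in years since 1998.0."""
    if v < 395:
        return 0.925 + 0.0039 * t
    if v > 405:
        return 0.761 + 0.0210 * t
    # transition band between the slow and fast expressions
    return (74.02 - 0.164 * v - 6.72 * t + 0.0171 * t * v) / 10

def normalize_ace(n_obs, tp_obs, v, t):
    """ACE/SWEPAM density and temperature normalized to Wind/SWE NLF equivalents."""
    nn = ace_density_factor(v, t) * n_obs
    tn = 10 ** (-0.069 + 1.024 * math.log10(tp_obs))
    return nn, tn
```

Evaluating `ace_density_factor` at 395 and 405 km/s shows that the transition-band expression joins the slow and fast expressions to within about 0.001, which supports reading it as an interpolation between them.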

Contents

This note describes the building and contents of several 1-min and 5-min resolution solar wind magnetic field and plasma data sets time-shifted to the Earth's bow shock nose. Data from the ACE, Wind and IMP 8 spacecraft were processed in 2005-6, while Geotail data were added later, in 2007. Initially the data were for 1995 to near-current. In 2009, the IMP 8 shifted data were extended back in time to November 4, 1973, shortly after launch. Also in 2009, we added GOES fluxes of protons above 10, 30 and 60 MeV to 5-min OMNI. These products are primarily intended to support studies of the effects of solar wind variations on the magnetosphere and ionosphere. In addition, we address 1998-2000 1-min ACE data sets shifted using various techniques to the Wind location.

Time shifting is based on the assumption that solar wind magnetic field values observed by a spacecraft at a given time and place lie on a planar surface (a "phase front") convecting with the solar wind, and that the same values will be seen at a different place at the time that the phase front sweeps over that location. A key element of the time shifting is use of the phase front normal (PFN) directions, which are to be determined individually for each input 15-16 sec magnetic field observation by analysis of it and its near neighbors. We identify and compare results of two distinct PFN determination analysis techniques (minimum variance and cross products) and two separate combinings of these, for a total of four shift techniques.

The family of products introduced herein consists of
(a) 1-min averaged 1998-2000 ACE magnetic field and plasma data shifted to the Wind location by each of the four shift techniques, along with 1-min unshifted Wind averages, with which interested persons can make independent judgements on the relative effectiveness of the various shift techniques,
(b) 1-min and 5-min averaged ACE (1998-2006), Wind (1995-2006), IMP 8 (1973-2000) and Geotail (1995-2005) magnetic field and plasma data sets shifted to the Earth's bow shock nose,
(c) a 1-min spacecraft-interspersed data set at the bow shock nose that we call the High Resolution OMNI (HRO) data set and
(d) a 5-min averaged version of HRO having GOES proton fluxes appended.
Time tags in records of all these products are target-arrival times and not observation times.

This note addresses in sequence: (a) the input data sets and their preparations, (b) the time shifting used, including discussion of the multiple PFN determination techniques available and including consideration and handling of "out-of-sequence" arrivals, (c) the building of 1-min averages from the shifted 15-16 sec IMF values and shifted 1-2 min plasma values, (d) discussion of the various data sets created (spacecraft-specific and the spacecraft-interspersed HRO), including their record formats and meanings of each word in the records, and (e) results of analysis of the 1998-2000 Wind data and ACE data shifted to Wind for predictability of IMF and plasma variations at one point, given observations elsewhere, as a function of the two-point separation vector, of the solar wind state (variation level, fast or slow, etc.), and of the PFN determination technique. In addition, a series of Appendices address (f) interspacecraft comparisons of magnetic field and plasma parameter values for finding systematic differences and parameter cross-normalization used in interspersing data from three spacecraft, and (g) selection criteria for which data to use in High Resolution OMNI when data from multiple spacecraft are available for a given interval.

The table below identifies the dates for which different parameters of high res. OMNI data are available.

```
YYYY DDD - YYYY DDD
--------------------
1981 001 - 1994 365 IMF (from IMP8 only)
1981 001 - 1994 365 Plasma (from IMP8 only)
---------------------------------------

1995 001 - 2014 010 IMF (from Wind, ACE, IMP8)
1995 001 - 2014 010 Plasma (from Wind, ACE, IMP8)

1981 001 - 1988 366  Final AE, AL, AU indexes
1989 001 - 2013 243  Provisional AE, AL, AU indexes

1981 001 - 2013 365  Provisional SYM/D, SYM/H, ASY/D, ASY/H indexes

1981 001 - 2014 031  PCN index

1986 001 - 2014 031  Fluxes from GOES (5-min res. only)
-------------------
Time span of the data shifted to bow shock nose
Geotail:  1995-03-15 - 2006-12-31(365)
IMP-8:    1973-11-04 - 2000-06-09(366)
ACE:      1998-02-05 - 2013-10-21(294)
Wind:     1995-01-01 - 2014-01-10(010)

```
Proton Fluxes from GOES (>10 MeV, >30 MeV, >60 MeV ) are taken from NGDC: http://goes.ngdc.noaa.gov/data/avg/ originally, and more recently, http://satdat.ngdc.noaa.gov/sem/goes/data/new_avg/. See near end of Section 2 below for further detail.
Minute AE, AL, AU and SYM/D, SYM/H, ASY/D, ASY/H indexes have been computed at the WDC for Geomagnetism at Kyoto University:
http://swdcwww.kugi.kyoto-u.ac.jp/aeasy/
PC(N) is the Polar Cap Index determined from the north polar cap station at Thule, Greenland.
It has been computed at the World Data Center for Geomagnetism, Copenhagen, at the National Space Institute (DTU Space), Technical University of Denmark:
ftp://ftp.space.dtu.dk/WDC/indices/pcn/.

We have used publicly available ACE, Wind, IMP 8 and Geotail magnetic field and plasma data in building the 1-min and 5-min data products described herein.

ACE (Advanced Composition Explorer) was launched August 25, 1997, and continues to provide magnetic field, plasma and energetic particle data from a ~180 day L1 orbit having X, Y, and Z (GSE) ranges of 220 to 250 Re, -40 to +40 Re, and -24 to +24 Re. The ACE home page is at http://www.srl.caltech.edu/ACE/.

Wind was launched November 1, 1994, as part of NASA's contribution to the International Solar Terrestrial Program. It continues to obtain magnetic field, plasma, energetic particle and plasma wave data. Since mid-2004, it has been in an L1 orbit with excursions in Y(GSE) between +/- 100 Re. It had multiple earlier phases, including an interval spanning the last third of 2000 through mid 2002 with Y(GSE) excursions in excess of 200 Re and an interval in late 2003 and early 2004 in orbit about the Lagrange point on the anti-sunward side of Earth. The Wind home page is at http://pwg.gsfc.nasa.gov/wind.shtml.

IMP 8 was launched October 26, 1973, into a low eccentricity Earth orbit. Apogee and perigee distances have been in the ranges 38-45 Re and 28-34 Re. On average IMP 8 was out of the solar wind for about 5 days of every 12.5 day orbit. The IMP 8 magnetometer failed June 10, 2000. Data from the MIT plasma instrument and from three energetic particle detectors were acquired until October, 2006. The IMP 8 web page is at http://spdf.gsfc.nasa.gov/imp8/project.html.

Geotail was launched July 24, 1992, into an eccentric orbit with apogee deep in the geotail. In early 1995, the Geotail orbit was adjusted to about 10 x 30 Re, and then to 9 x 30 Re in 1997 where it continues today (2008). In this orbit, Geotail has annual solar wind "seasons" with apogee local times on or near the Earth's dayside, and it has solar wind intervals during each ~5 day orbit of the solar wind seasons.

```
The web pages for the contributing investigations are:

Magnetic field
ACE:   http://www.srl.caltech.edu/ACE/ASC/level2/index.html
Wind:   http://wind.gsfc.nasa.gov/mfi/
IMP 8:   http://wind.gsfc.nasa.gov/imp8/
Geotail:   http://www.stp.isas.jaxa.jp/geotail/

Plasma:
ACE:   http://swepam.lanl.gov/
Wind:   http://web.mit.edu/space/www/wind/wind.html
IMP 8:   ftp://space.mit.edu/pub/plasma/imp/www/imp.html
Geotail:   http://www-pi.physics.uiowa.edu/www/cpi/

The input data were pulled from:

Magnetic field
ACE: (ACE Science Center): http://www.srl.caltech.edu/ACE/ASC/
Wind:  ftp://spdf.gsfc.nasa.gov/pub/data/wind/mag/3sec_ascii/
IMP 8:  ftp://spdf.gsfc.nasa.gov/pub/data/imp/imp8/mag/15s_ascii_v3/
Geotail:   http://cdaweb.gsfc.nasa.gov/ (GE_EDB3SEC_MGF)

Plasma
ACE: (ACE Science Center): http://www.srl.caltech.edu/ACE/ASC/
Wind:  ftp://spdf.gsfc.nasa.gov/pub/data/wind/plasma_swe/swe_kp_unspike/
IMP 8:  ftp://spdf.gsfc.nasa.gov/pub/data/imp/imp8/plasma_mit/sw_msheath_min/
Geotail:   http://cdaweb.gsfc.nasa.gov/ (GE_H0_CPI)

Key persons for these data sets include:

Magnetic field:
ACE:  Chuck Smith (UNH), Norman Ness (Bartol)
Wind:   Ron Lepping (GSFC), Adam Szabo (GSFC), Norman Ness
IMP 8:  Adam Szabo, Ron Lepping, Norman Ness
Geotail:  Tsugunobu Nagai (Tokyo Inst. Tech.), S. Kokubun
Plasma:
ACE:  Dave McComas (SWRI), Ruth Skoug (LANL)
Wind:  Al Lazarus (MIT), Justin Kasper (MIT), Keith Ogilvie (GSFC)
IMP 8:  Al Lazarus, John Richardson (MIT)
Geotail:  Bill Paterson (Hampton U.), L. Frank & K. Ackerson (U. Iowa)
```

ACE magnetic field and plasma data

"Level 2" 16-s magnetic field data and 64-s plasma data were pulled from the ACE Science Center. (Credit goes to Andrew Davis and the ASC team for a very effective data management and distribution facility). The field and plasma data there start on September 2, 1997, and February 5, 1998, respectively. Owing to the critical need for plasma flow speed data in time shifting magnetic field data to the bow shock nose or elsewhere, we limit the coverage of ACE data in our new data products to February 5, 1998, and later.

Wind magnetic field data.

The Wind magnetic field data are standardly produced by the instrument team at 3-s, 1-min and 1-h resolutions. Because we apply phase front normal determination algorithms to 15.36-s IMP magnetic field data and to 16-s ACE data, we form 15-s averages from the available 3-s data to have similarly resolved Wind magnetic field data as input.

The Wind magnetic field data are standardly available at 3-sec resolution with no discrimination for orbit phase, in particular, for solar wind vs. non-solar wind phases. We have filtered, at hourly resolution, the time-continuous 3-sec data against the Wind bow shock crossing identifications made by the Wind magnetometer team and available at http://lepmfi.gsfc.nasa.gov/mfi/bow_shock.html to give a solar-wind-only input data set. We have made our own identifications of the few crossings that occurred after the October, 2003, end of the Wind team's list.

In October 2011, the Wind/MFI team finished the reprocessing of all MFI data. Among other things, well-determined Bz offset values were used. The new MFI data were inserted into High Resolution OMNI when they became available, replacing the earlier MFI data. The new data were used to re-determine solar wind phase front normals used in shifting data. The next four paragraphs, inside the ************* marks, become obsolete and irrelevant to the new OMNI data, but are retained since many files having old Wind/MFI data were downloaded from OMNI during 2008-2011.

****************
There are rare spikes in the Wind magnetometer data. We have taken a simple approach to eliminating most of these by rejecting any 3-sec record with a magnetic field magnitude or component in excess of 70 nT.

Finally, it should be noted that the magnetometer team has made its data accessible in a series of versions of increasing reliability. Version 4 data are the most reliable and, as of November, 2006, were available through November 28, 2004. The version 3 data are available to within about two weeks of current and are used as input to our data products, at least in their initial creation in 2006, when the Version 4 data are unavailable.

As of December, 2008, we still had only MFI V3 data in OMNI for times after November 28, 2004. While it is likely V4 data for most of this interval will become available soon, we have adjusted all MFI V3 data in both hourly OMNI 2 and in High Resolution OMNI (HRO) with the equation:

Bz(adjusted) = Bz(original) - {1.16 + 0.36 * sin[(DFDOY - 76)/58.13]}

where DFDOY is the day of year including the fraction of day (e.g., 76.5 for noon on day 76). Parameters depending on Bz have been recomputed, including phase front normals, time shifts, etc. Further details on the adjustment equation are given at http://omniweb.gsfc.nasa.gov/html/omni2_doc.html#source
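For readers still working with files downloaded during 2008-2011, the adjustment equation above can be sketched in Python as follows. The function name is hypothetical, Bz is taken to be in nT, and the sine argument is assumed to be in radians (58.13 days is close to one year divided by 2*pi, so the correction has an annual period).

```python
import math

def adjust_mfi_v3_bz(bz_original, dfdoy):
    """Apply the annual-sinusoid Bz offset correction used for MFI V3 data."""
    offset = 1.16 + 0.36 * math.sin((dfdoy - 76.0) / 58.13)  # radians assumed
    return bz_original - offset
```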
******************

Wind/SWE plasma data

Wind/SWE plasma parameter data are available at ~92-s resolution in three versions corresponding to three approaches to their production from underlying distribution functions. There are "key parameter" (KP) data, non-linear fits-based data (fits to assumed convecting bi-Maxwellian distributions), and anisotropic moments-based data. These are discussed at the MIT Wind/SWE web page cited above. The latter two are further discussed in Justin Kasper's dissertation, whose most salient parts are web-accessible at ftp://spdf.gsfc.nasa.gov/pub/data/wind/plasma_swe/2-min/thesis.pdf. Finally, "physics-based" tests of the goodness of the nonlinear fits (NLF)-based velocities (~0.16% in speed, ~3 deg in direction), densities (~3%) and temperatures (~8%) are discussed in Kasper et al. (2006).

The NLF data and the anisotropic moments-based data are accessible through the November 28, 2004, date of availability of Wind magnetic field version 4 data. The SWE KP data, on the other hand, are typically available to within several weeks of the current date. Given this and given the urging of the MIT plasma team to use the very good and more robust KP data, we have chosen to use the KP data in our high resolution OMNI data set.

But given that it was the NLF data for which the relatively small uncertainties cited above were determined, we shall normalize the KP density and temperature values to equivalent NLF values in the spacecraft-interspersed HRO data set. This point is further discussed and quantified in Appendices 1 and 2 addressing comparisons and cross-normalizations of the available multi-spacecraft data.

As for the Wind magnetic field data, the SWE KP data are available with no discrimination for orbit phase. We have extracted a solar wind-only set of SWE KP data by again filtering at hourly resolution against the Wind bow shock crossing identifications cited above.

The SWE KP data are initially computed and loaded to CDAWeb. The SWE team at MIT improves this product by passing it through a despiking routine that compares a value with the median of three points (the point being tested and its immediate predecessor and follower). Some spikes elude detection. We have run a further despiking routine requiring (to be a non-spike) that the difference between a parameter value and the mean of the two preceding and two following values should be less than four times the standard deviation in that mean or that that difference relative to the mean should be less than some (parameter-dependent) value. This is further discussed in Appendix 3.
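The second despiking pass described above can be sketched in Python as below. This is an illustration only, not the Appendix 3 software. The function names are hypothetical, `rel_limit` stands in for the parameter-dependent relative limit, and "the standard deviation in that mean" is assumed to be the standard deviation of the four neighboring values.

```python
from statistics import mean, pstdev

def is_spike(values, i, rel_limit):
    """True if values[i] fails both non-spike tests against its four neighbors."""
    neighbors = values[i - 2:i] + values[i + 1:i + 3]
    m = mean(neighbors)
    s = pstdev(neighbors)               # assumed meaning of "standard deviation"
    diff = abs(values[i] - m)
    passes_sigma = diff < 4.0 * s
    passes_relative = m != 0 and diff / abs(m) < rel_limit
    return not (passes_sigma or passes_relative)

def despike(values, rel_limit):
    """Return a copy of the list with flagged points replaced by None.

    The first two and last two points cannot be tested and are kept as is.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        if is_spike(values, i, rel_limit):
            out[i] = None
    return out
```

For example, `despike([400, 402, 900, 401, 399, 400, 403], 0.2)` flags only the 900 value. Its neighbors are not flagged, because a spike inside their four-point neighborhood inflates the standard deviation they are tested against.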

IMP 8 magnetic field data

IMP 8 magnetic field data have long been available at 15.36 sec resolution (cf. ftp://spdf.gsfc.nasa.gov/pub/data/imp/imp8/mag/15s_ascii/) in a data set that makes no distinction between the solar wind and non-solar wind phases of the IMP orbit. We have used the IMP 8 bow shock crossing information at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/bowshock.html to separate, at 1-minute resolution, the solar wind and non solar wind phases of the IMP orbit to ensure that only IMP 8 solar wind magnetic field data are included in the products described herein.

There are occasional data spikes in the 15.36 sec data. We believe we have eliminated most, if not all, of these by applying the spike finder software discussed in Appendix 3.

As noted above, the IMP data in the products discussed in these notes run only to the June 10, 2000, failure of the IMP 8 magnetometer.

IMP 8 plasma data

Plasma parameters from the MIT Faraday cup are available at ~1 min resolution (cf. http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/imp_mit_min.html). Parameter values as determined both by non-linear fitting to assumed convecting Maxwellian distributions and by taking moments are available. As in earlier work, we use the non-linear fit-based data, as these are believed by the MIT team to be the more reliable. (Note that readers may compare the fit-based and moment-based parameter values using the interface at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/imp_mit_min_s.html.)

This data set has data from both the solar wind and magnetosheath phases of the IMP 8 orbit. However, each record has an MIT-assigned flag indicating whether the data definitely are, or are not, from the solar wind, or whether they may be from solar wind or magnetosheath. We have used this flag to eliminate from the products discussed in this documentation any data not tagged as being definitely in the solar wind.

There are some spikes in the IMP 8 plasma data. To eliminate most of these, we have applied the spike finder software discussed in Appendix 3 to the data. However, because the software assumes that the first two and last two data points of every interval not having a data gap in excess of one hour are good data, we have visually scanned plots of data after the application of the spike finder software, and have identified and eliminated a few extra points as being likely bad points.

The IMP 8 plasma flow elevation angle has long been recognized as having a ~2 deg offset. This is further discussed in Appendix 1. We have not taken this bias out of the data of the products discussed herein.

Geotail magnetic field and plasma data

First, we created 15-s magnetic field averages from 3-sec values for input compatibility with the ACE, Wind and IMP IMF data used. Second, we determined the principal time intervals during which Geotail was beyond the Earth's bow shock, in the solar wind. This process, which does not distinguish foreshock intervals from non-foreshock solar wind intervals, is extensively discussed at ftp://spdf.gsfc.nasa.gov/pub/data/geotail/sw_min_merged/00readme. Our despiking of Geotail magnetic field and plasma data is also described in this readme file.

The despiked, 15-s, solar-wind-only magnetic field data set is accessible from http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/geotail_mag15s.html

We used CPI plasma data rather than Geotail LEP plasma data as the former seemed to have cleaner solar wind parameter values and were more immediately accessible to us.

As we were doing this work, the magnetometer PI team was working to reprocess its data using more definitive Bz offset values. As of this date (February 5, 2008) and with less than three months left on our grant, we had not received reprocessed data. So we have done our Bz corrections using the expectation that, when averaged over a year, the Bz component in geocentric solar ecliptic coordinates should be within 0.1 of zero. It is possible that one day our new Geotail data sets and multi-spacecraft OMNI data sets will incorporate the not-yet-available reprocessed data of the PI team.

GOES energetic proton fluxes

Fluxes of protons above 10, 30 and 60 MeV, as measured by NOAA's geosynchronous GOES spacecraft, are included in 5-minute OMNI. Data from the following spacecraft were used for the indicated years: GOES 7, 1995; GOES 8, 1996-2002; GOES 10, 2003; GOES 11, 2004-2010; GOES 13, 2011 and later. Data are as taken from http://satdat.ngdc.noaa.gov/sem/goes/data/new_avg/ except for GOES 13, for which separate fluxes are given at NGDC for eastward- and westward-looking sensors. For GOES 13, we have averaged these two fluxes for inclusion in 5-min OMNI. To view separate eastward- and westward-looking fluxes, and their ratios, see the FTPBrowser interface at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/goes13_flux_5m.html. The Principal Investigator for the GOES energetic particle instruments is currently T. Onsager, and the key responsible NGDC person is D. Wilkinson.

Extra notes

Data providers may occasionally create replacement versions of their data. In such cases, we replace the superseded data in OMNI with the newer data values, and typically make note that this has happened at http://omniweb.gsfc.nasa.gov/html/ow_news.html. Such changes are relatively rare and typically involve only small parameter value changes.

We sometimes refer to "15-s" input magnetic field data throughout these pages. Readers should appreciate this is a shorthand notation for 16-s ACE data, 15-s Wind and Geotail data and 15.36-s IMP data.

To best support solar wind - magnetosphere coupling studies, it is desired to time-shift solar wind magnetic field and plasma data from their location of observation, which may be an hour upstream of the magnetosphere and several tens of Re or more removed from the Earth-sun line, to a point close to the magnetosphere. We choose this point to be the bow shock nose. In addition, to assess the goodness of such shifts, we separately shift ACE data to Wind (by each of several shift techniques) and compare the shifted ACE data and in situ Wind data.

Given the availability of data on a specific solar wind magnetic field or plasma parameter P as a function of time t at the location Ro of an observing spacecraft, i.e., P(t, Ro), it is desired to infer values of this parameter at some displaced location Rd, i.e., P(t', Rd). The key underlying assumption enabling estimation of the time shift, delta-t = t'-t, between observation of the parameter at Ro at time t and arrival of this value/variation at Rd at time t', is that solar wind variations are organized in series of phase fronts (flat planes) that convect with the solar wind velocity V. Curvature of variation surfaces is ignored, as is propagation of these phase fronts relative to the solar wind flow. The unphysical interpenetration of these phase fronts is discussed later. Thus the time shift equation is delta-t = n · (Rd - Ro) / (n · V), where n is the variation phase front normal (PFN) and where "·" is the ordinary dot (scalar) product of two vectors.
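A minimal Python sketch of the time shift equation follows. The function names and the Earth-radius constant `RE_KM` are assumptions made for the illustration. Positions are taken in km and velocity in km/s, so the result is in seconds.

```python
RE_KM = 6371.2  # assumed Earth radius in km, used only to convert Re to km

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def time_shift_seconds(n, r_obs, r_target, v):
    """delta-t = n . (Rd - Ro) / (n . V) for phase front normal n."""
    separation = [d - o for d, o in zip(r_target, r_obs)]
    n_dot_v = dot(n, v)
    if n_dot_v == 0:
        raise ValueError("phase front contains the flow direction; no finite shift")
    return dot(n, separation) / n_dot_v

# Observer at X = 230 Re, target at X = 14 Re, 400 km/s anti-sunward flow,
# phase front normal along X: the shift is the simple convection time.
shift = time_shift_seconds((1.0, 0.0, 0.0),
                           (230 * RE_KM, 0.0, 0.0),
                           (14 * RE_KM, 0.0, 0.0),
                           (-400.0, 0.0, 0.0))
```

Because n appears in both numerator and denominator, the result does not depend on the sign or the length of n. A tilted normal brings the Y and Z separations into the shift, which is why the PFN determination matters for spacecraft far from the Earth-sun line.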

The target Rd to which we shall shift ACE, Wind, IMP 8 and Geotail data is the bow shock nose. This will best support future solar wind - magnetosphere coupling studies. We use the field and plasma parameters determined at a given time, and the bow shock model of Farris and Russell (1994) with the magnetopause model of Shue et al (1997), to determine where the bow shock will be when the phase front reaches it. See Appendix 4 for a discussion of these models. We include solar wind flow aberration associated with Earth's ~30 km/s orbital motion about the sun in bow shock nose location determination.

It is recognized that this is a very simplified approach, neglecting finite response times of the magnetosphere to solar wind variations, that may introduce some error. However, except for extreme excursions in solar wind parameters, the bow shock will not move enough to introduce significant uncertainty in the timing of arrival of solar wind structures observed upstream. (Uncertainties connected with other factors such as planarity of features and the interpenetration of variation phase planes are larger and affect the parameter profiles and not merely the timing of arriving plasma.)

The bow shock location to which the data are shifted is included in the output data records, among many other parameters.

In addition to shifting data to the bow shock nose, we shall also shift ACE data, by each of four techniques, to the location of the Wind spacecraft so that we can assess the predictability of solar wind variations as a function of the shift technique, the observer-target separation geometry, the variation level in the solar wind, and the nature of the flow (e.g., fast vs. slow).

There was no shifting of GOES energetic particle fluxes.

Minimum variance analysis (MVA) has long been used to determine normals to discontinuity planes in the solar wind magnetic field. See for example Sonnerup and Cahill, 1968. In this approach, a 3x3 variance matrix

Mij = Sum[Bi(t)*Bj(t)]/N - [Sum Bi(t)] * [Sum Bj(t)]/N**2 = <Bi*Bj> - <Bi>*<Bj>

is formed, with averages taken over a set of N points spanning the discontinuity and with i,j representing any two spatial directions. The matrix is diagonalized, and the eigenvector associated with the minimum eigenvalue gives the minimum variance direction (MVD). The number of points N to be used in the analysis, and the ratio of intermediate to minimum eigenvalues to take as a lower limit below which the MVD is considered not reliably determined, are part of the “art” of MVA.
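The basic MVA computation can be sketched with NumPy as below. This is a generic illustration of the variance matrix and its diagonalization, not any of the codes used for the OMNI products, and the function names are hypothetical.

```python
import numpy as np

def variance_matrix(b):
    """Mij = <Bi*Bj> - <Bi>*<Bj> for an (N, 3) array of field vectors."""
    b = np.asarray(b, dtype=float)
    mean_b = b.mean(axis=0)
    return (b.T @ b) / len(b) - np.outer(mean_b, mean_b)

def minimum_variance_direction(b):
    """Return (MVD unit vector, ascending eigenvalues, intermediate/minimum ratio)."""
    # eigh returns eigenvalues in ascending order with eigenvectors as columns
    eigenvalues, eigenvectors = np.linalg.eigh(variance_matrix(b))
    mvd = eigenvectors[:, 0]
    if eigenvalues[0] > 0:
        ratio = eigenvalues[1] / eigenvalues[0]
    else:
        ratio = np.inf
    return mvd, eigenvalues, ratio
```

The sign of the returned MVD is arbitrary. A caller would compare the returned ratio against its chosen lower limit before accepting the direction.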

3a.1 Technique 1, "Modified" MVA

Weimer et al (2003) applied the basic concepts of MVA to determine an MVD for each point of a continuous time series of interplanetary magnetic field data. In effect, they assumed each point lay on a planar phase front whose normal could be used, along with the solar wind flow velocity, in the determination of when that value (assumed constant everywhere on the plane) would be seen elsewhere in space.

After determining surprisingly good correspondence of time-varying time shifts thus determined with shifts determined by multi-spacecraft analysis (e.g., Weimer et al, 2002), an error was discovered in the Weimer et al. (2003) application of MVA. In particular, the 1/N**2 in the expression above was inadvertently replaced by 1/N. When the correct expression above was used, agreement with the multi-spacecraft time shift determinations deteriorated.

Shortly thereafter, Bargatze et al. (2005) demonstrated that the MVA equations used in Weimer et al (2003) corresponded approximately to an MVA constrained by the condition that the mean magnetic field vector over the analysis interval should lie in the plane of minimum variance, that is, that <B>·n ~ 0 (n is the MVD). The Weimer et al (2003) technique came to be known, at least on a limited basis, as Modified MVA.

Much of our early work in this two-year effort utilized the Weimer-provided code used in his 2003 analysis. None of the final products made available from our effort are based on this technique, although some interim products, no longer available, were.

3a.2 Technique 2, MVAB-0

A Comment by Haaland et al (2006) pointed out that MVA exactly constrained by the <B>·n = 0 condition was first used by Sonnerup and Cahill (1968) and has been discussed by Sonnerup and Scheible (1998). Such an MVA, called MVAB-0 by Haaland et al, diagonalizes not the matrix M (see above), but the matrix P*M*P, where the symmetric matrix P (Pij = deltaij - ei*ej; deltaij is the Kronecker delta and e is the unit vector in the direction of the mean magnetic field) projects each vector B onto the plane perpendicular to e.
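The projection step can be sketched with NumPy as follows. This is not the Weimer code. The function name is hypothetical, and the sketch relies on a standard property of the projected matrix: e is an eigenvector of P*M*P with eigenvalue zero, so the normal is taken as the eigenvector with the smaller of the two remaining eigenvalues.

```python
import numpy as np

def mvab0_normal(b):
    """Return (normal, its eigenvalue, largest eigenvalue) for an (N, 3) field series."""
    b = np.asarray(b, dtype=float)
    mean_b = b.mean(axis=0)
    e = mean_b / np.linalg.norm(mean_b)            # unit vector along <B>
    m = (b.T @ b) / len(b) - np.outer(mean_b, mean_b)
    p = np.eye(3) - np.outer(e, e)                 # Pij = deltaij - ei*ej
    eigenvalues, eigenvectors = np.linalg.eigh(p @ m @ p)
    # Discard the eigenvector parallel to e (its eigenvalue is zero by construction)
    along_e = int(np.argmax(np.abs(eigenvectors.T @ e)))
    keep = [k for k in range(3) if k != along_e]   # remaining two, ascending
    normal = eigenvectors[:, keep[0]]
    return normal, eigenvalues[keep[0]], eigenvalues[keep[1]]
```

By construction the returned normal is perpendicular to the mean field, which is the <B>·n = 0 constraint.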

Weimer has developed and provided to us new code that correctly implements the MVAB-0 approach.

We have used the MVAB-0 code generously provided by Weimer in mid 2006. It is the only MVA code used in our final products.

Weimer spent significant effort determining parameters for the MVAB-0 technique, by seeking parameter sets whose results gave best agreement with multi-spacecraft determinations of phase front normals. In particular, he found, and we have used, for the MVAB-0 technique optimal results with 77 15-s points in each analysis (~19 min spans for each MVD determination), eigenvalue ratio greater than or equal to 5.2 (for a reliable MVD determination), and angle between MVD and solar wind flow vector less than 73 deg. (Larger angles lead to excessively long predicted delays.)

To eliminate spurious PFN determinations associated with data gaps, we added the requirement that the interval between the first and last point involved in each PFN determination should be no more than 1.25 times what it would be in the absence of data gaps.
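The data-gap requirement above amounts to a one-line test, sketched here in Python. The function name and argument names are hypothetical, and times are assumed to be in seconds.

```python
def span_is_acceptable(times, cadence_s, max_stretch=1.25):
    """True if the first-to-last interval is at most 1.25 times its gap-free length."""
    nominal = (len(times) - 1) * cadence_s
    return (times[-1] - times[0]) <= max_stretch * nominal
```

For 77 points at a 15-s cadence the gap-free span is 1140 s, so a set of points is rejected once gaps stretch it beyond 1425 s.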

3a.3 Technique 3, Cross Product (CP)

A totally distinct approach to determining a phase front normal, one that should be perfect for an ideal tangential discontinuity, is to take a cross product of magnetic field vectors just prior to, and following, a discontinuity. Weimer has developed code that determines phase front normals continuously using the cross product concept and has generously also provided this to us. In a private communication to us, Weimer cites the work of Knetter et al (2004) as the inspiration for developing this cross product (CP) code.

Weimer also spent significant effort determining parameters for the CP technique, by seeking parameter sets whose results gave best agreement with multi-spacecraft determinations of phase front normals. In particular, he found, and we have used, for the CP technique optimal results with the angle between the "before" and "after" vectors greater than 13 deg, with these vectors based on 17 points each, centered on the points 14 points before and after the point for which the PFN is sought (thus a span of 46 points, or ~12 mins, for the PFN determination for each point), and with the component of the mean field vector normal to the phase front less than 0.035 nT. He also used a 73 deg limiting angle as for the MVAB-0 technique. As for the MVAB-0 technique, we added the requirement that the interval between the first and last point involved in each PFN determination should be no more than 1.25 times what it would be in the absence of data gaps.
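The CP tests described above can be sketched with NumPy as follows. This is an illustration, not the Weimer code. The function names are hypothetical, the mean field is assumed to be the average over the whole span, the angle to the flow is folded into 0-90 deg because the sign of the normal is arbitrary, and the data-gap test is omitted.

```python
import numpy as np

HALF_WINDOW = 8             # 17 points = window center +/- 8
OFFSET = 14                 # window centers sit 14 points before and after
MIN_ANGLE_DEG = 13.0
MAX_BN_NT = 0.035
MAX_FLOW_ANGLE_DEG = 73.0

def angle_deg(a, b):
    """Angle between two vectors in degrees."""
    c = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))

def cross_product_normal(b, i, v):
    """Unit PFN for point i of the (N, 3) series b, or None if a test fails."""
    b = np.asarray(b, dtype=float)
    lo, hi = i - OFFSET - HALF_WINDOW, i + OFFSET + HALF_WINDOW
    if lo < 0 or hi >= len(b):
        return None
    before = b[lo:lo + 2 * HALF_WINDOW + 1].mean(axis=0)
    after = b[hi - 2 * HALF_WINDOW:hi + 1].mean(axis=0)
    if angle_deg(before, after) <= MIN_ANGLE_DEG:
        return None
    n = np.cross(before, after)
    norm = np.linalg.norm(n)
    if norm == 0:                        # antiparallel vectors give no normal
        return None
    n = n / norm
    if abs(np.dot(b[lo:hi + 1].mean(axis=0), n)) >= MAX_BN_NT:
        return None
    flow_angle = angle_deg(n, v)
    if min(flow_angle, 180.0 - flow_angle) >= MAX_FLOW_ANGLE_DEG:
        return None
    return n
```

With these index conventions each determination uses points i-22 through i+22.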

3a.4 Technique 4, JK/NP Combination of Techniques 2 and 3.

Now, having two fundamentally different techniques for PFN determination, we are able to add combinations of these two. We devised one, called Technique 4, which is the one we in fact used for producing the bow shock nose-shifted products discussed in these notes. The technique consists of first applying the CP method for a given point and its relevant neighbors. If an acceptable PFN is determined, it is used for this point. If CP does not produce an acceptable PFN (e.g., if the included angle between the "before" and "after" vectors is less than 13 deg), then the MVAB-0 technique is applied and its resultant PFN, if acceptable, is used. If neither the CP nor the MVAB-0 technique produces an acceptable PFN, that point is marked for later interpolation, and a PFN is attempted for the next point in the time series.
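The Technique 4 decision order can be sketched in Python as below. The function name is hypothetical, and the two callables are placeholders for CP and MVAB-0 routines that return a PFN, or None when their acceptance tests fail.

```python
def technique4_normal(i, cp_normal, mvab0_normal):
    """Try CP first, then MVAB-0; (None, None) marks point i for interpolation."""
    n = cp_normal(i)
    if n is not None:
        return n, "CP"
    n = mvab0_normal(i)
    if n is not None:
        return n, "MVAB-0"
    return None, None
```

Returning the name of the method alongside the normal is a convenience added for this illustration, so that the source of each PFN can be tallied.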

3a.5 Technique 5, DW Combination of Techniques 2 and 3.

Weimer and King (2008) took an alternative approach and required that both the CP and MVAB-0 techniques should produce the same PFN (to within some accuracy, arbitrarily set at 5 deg) in order to be acceptable; otherwise the point was marked for later interpolation. Weimer has provided the code implementing this technique, which we call Technique 5.
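The agreement test of Technique 5 can be sketched in Python as follows. This is not the Weimer code. The function names are hypothetical, n and -n are treated as the same plane, and returning the CP normal when the two agree is an assumption, since the text does not say which of the two is kept.

```python
import math

def angle_between_deg(a, b):
    """Angle between two 3-vectors in degrees."""
    d = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    c = max(-1.0, min(1.0, d / (na * nb)))
    return math.degrees(math.acos(c))

def technique5_normal(n_cp, n_mvab0, tolerance_deg=5.0):
    """Accept a PFN only if CP and MVAB-0 agree to within tolerance_deg."""
    if n_cp is None or n_mvab0 is None:
        return None
    angle = angle_between_deg(n_cp, n_mvab0)
    angle = min(angle, 180.0 - angle)     # n and -n describe the same plane
    return n_cp if angle <= tolerance_deg else None
```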

In all cases (Techniques 2-5), a PFN direction satisfying the relevant tests may or may not be determined for a given point. Points for which no acceptable PFN is determined are marked. Then, in a second pass, for each such point, a PFN is determined by linear interpolation between the last good and next good PFN. In our implementations, the span across which such interpolations are made can be no longer than 3 hours. Data belonging to longer gaps are neither shifted nor included in our new data products.
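The second pass can be sketched in Python as below. The function name is hypothetical, times are assumed to be in seconds, and interpolating component by component followed by renormalization to unit length is an assumption, since the text specifies only that the interpolation is linear.

```python
def interpolate_normals(times, normals, max_gap_s=3 * 3600.0):
    """Fill None entries by linear interpolation between bracketing good PFNs."""
    out = list(normals)
    good = [k for k, n in enumerate(normals) if n is not None]
    for left, right in zip(good, good[1:]):
        span = times[right] - times[left]
        if right - left < 2 or span > max_gap_s:
            continue                      # nothing to fill, or gap too long
        for k in range(left + 1, right):
            w = (times[k] - times[left]) / span
            vec = [(1 - w) * a + w * b
                   for a, b in zip(normals[left], normals[right])]
            norm = sum(c * c for c in vec) ** 0.5
            if norm > 0:
                out[k] = [c / norm for c in vec]   # keep unit length
    return out
```

Points that remain None after this pass correspond to the extended gaps that are not shifted.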

We hope to modify this in the future, as an IMF that was not varying over many hours would be highly predictable at the bow shock nose yet would not lead to acceptable PFN's and hence would not be "shifted" and included in our products. We have searched the interval March-December, 1998, for such occurrences, and find 45 days with multi-hour data gaps in shifted data despite there being no gaps in the input ACE data. The average gap duration is 4-6 hours, so the fraction of data lost in our shifted data set is about 45*6 / (300*24) = 4%. Fortunately, this is when the IMF is most quiet and accurate bow shock nose predictions least critical.

Technique 5, the DW combination of 2 and 3, involves more interpolation of PFNs than the individual MVAB-0 or CP technique, or than the JK/NP combination thereof, which is one of the main reasons we did our production work with Technique 4. In the same search of March-December, 1998, data mentioned in the preceding paragraph, we found 60 days having intervals of 3 hours or more having Technique 4 data but not Technique 5 data (because no good PFN's were produced over such intervals by Technique 5). Again assuming an average 6-hour gap duration, the fraction of time for which we do not have Technique 5 data relative to the time for which we have Technique 4 data is 60 * 6 / (0.96 * 300 * 24) = 5%.

Interestingly, which technique is used does not have a statistically significant effect on the profiles, as will be further discussed later.

We introduced above the time shift equation as delta-t = n · (Rd – Ro) / (n · V), where n is the phase plane normal, determined by analysis of magnetic field data only, and V is the solar wind velocity, including the ~30 km/s in the Ygse direction associated with the Earth’s orbital motion about the sun. We initially shift 15-sec magnetic field data using the vector velocity determined by interpolating velocity values most immediately preceding and following the time tag of the observed magnetic field value, as long as the interval of interpolation is less than one hour. Magnetic field data points whose most immediately preceding and following velocity data are separated by more than an hour are not carried forward into our output data products.
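As a worked illustration of the time shift equation, the sketch below evaluates delta-t for a single record. The function name is hypothetical, and positions are assumed to have been converted from Re to km beforehand so that the result is in seconds.

```python
def time_shift_seconds(n, r_target_km, r_obs_km, v_kms):
    """Evaluate delta-t = n . (Rd - Ro) / (n . V).

    n: phase front normal; r_target_km (Rd) and r_obs_km (Ro): GSE positions
    in km; v_kms: solar wind velocity vector in km/s (including the Ygse
    component from the Earth's orbital motion). Returns seconds.
    """
    n_dot_dr = sum(ni * (rd - ro) for ni, rd, ro in zip(n, r_target_km, r_obs_km))
    n_dot_v = sum(ni * vi for ni, vi in zip(n, v_kms))
    if n_dot_v == 0.0:
        raise ValueError("phase front is parallel to the flow; shift undefined")
    return n_dot_dr / n_dot_v


# Radial flow at 400 km/s, phase front normal along X, spacecraft
# 1.4 million km upstream of the target: the shift is 3500 s.
dt = time_shift_seconds((1.0, 0.0, 0.0),
                        (100_000.0, 0.0, 0.0),
                        (1_500_000.0, 0.0, 0.0),
                        (-400.0, 0.0, 0.0))
```

With a tilted phase front and a transverse offset of the spacecraft, the same function gives a shift that differs from the simple X-distance over speed estimate, which is the point of using the PFN.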

Shifting means changing the time tags of data records. There is no changing of observed parameter values in the process.

After shifting magnetic field data, plasma data are shifted by using the time shift duration associated with the magnetic field observation whose pre-shift time tag lies closest to the plasma record’s time tag, so long as the two time tags lie within 2 minutes of each other.
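The nearest-neighbor matching of plasma records to magnetic field time shifts can be sketched as follows. The function name and arguments are hypothetical; time tags are assumed to be in seconds, the magnetic field time tags are assumed sorted, and treating a separation of exactly 2 minutes as acceptable is an assumption of this sketch.

```python
import bisect


def shift_plasma_times(plasma_times, mag_times, mag_shifts, max_sep_s=120.0):
    """Assign each plasma record the shift of the closest magnetic field record.

    mag_times: pre-shift magnetic field time tags, sorted ascending.
    mag_shifts: time shift (s) of each magnetic field record.
    Returns the shifted plasma time tags, with None where no magnetic field
    record lies within max_sep_s of the plasma time tag.
    """
    shifted = []
    for t in plasma_times:
        j = bisect.bisect_left(mag_times, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(mag_times)]
        if not candidates:
            shifted.append(None)
            continue
        k = min(candidates, key=lambda c: abs(mag_times[c] - t))
        if abs(mag_times[k] - t) <= max_sep_s:
            shifted.append(t + mag_shifts[k])
        else:
            shifted.append(None)
    return shifted
```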

Because the n and the V in the time shift equation vary at various time scales, it sometimes happens that, if phase front A is observed before phase front B, B may nevertheless be predicted to arrive at a remote location (e.g., the BSN location) before A arrives there. Such out-of-sequence arrivals may be due to “overtaking” associated with speed gradients or to “interpenetration” of variously oriented phase planes (especially given a significant separation of the locations of the BSN and of the observing spacecraft in the direction normal to the solar wind flow).

This “interpenetration” is clearly unphysical and is one of the primary shortcomings in our work. But there is no physically justified alternative yet. Two different alternatives have been considered. In Weimer’s earliest work, he imagined that, for any pair of out-of-sequence phase fronts, the later-arriving phase front would be precluded from arriving by the earlier-arriving phase front and so could be dropped from further consideration. In more recent work, he imagined that the later-arriving phase front would displace the earlier-arriving phase front, so that the earlier-arriving phase front could be dropped from further consideration.

Our sense is that, while we cannot specify the physical processes that will occur and prevent interpenetration and out-of-sequence arrivals, there is no good a priori reason for favoring earlier-arriving or later-arriving phase fronts in cases of out-of-sequence arrivals. As such, our approach is to accept all shifted data as belonging to the newly assigned time tags that each record acquires via our simple time shift equation, and to build 1-min data products with averages over all points shifting into a given minute. We recognize this involves an unphysical mixing of plasma elements from differing domains. But in some sense it emulates our ignorance of the dynamical processes that happen in the real solar wind.

There is a parameter in the output 1-min data records, the duration between observing times (DBOT), whose negative values indicate that out-of-sequence arrivals have occurred.

[Note added 01/22/2007. It should be recognized that occasionally our approach to averaging over all data shifting into a given minute leads to a series of minutes whose parameter values alternate between those characteristic of different plasma domains. That is, each minute average may not simply be an average of values from two domains, especially for Wind plasma data which starts at 92-s resolution. For a recent example of this, see 2007/10/25 Wind SWE plasma data prior to shifting at http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/wind_swe_kp.html, and see corresponding shifted data at http://omniweb.gsfc.nasa.gov/form/sc_merge_min1.html. There is a clean interplanetary shock in the unshifted data at 10:44 UT, while there is an interval of ~50 minutes duration, spanning 11:47 - 12:36, of shifting between pre-shock and post-shock parameter values in the minute averages built from shifted data. Users must exercise care in using spacecraft-specific, bow-shock-nose shifted data, or High Resolution OMNI data created from them, in the presence of significant variability in field and plasma parameters and in derived phase front normal directions.]

This section describes the common format of (a) the 1-min ACE, Wind, IMP 8 and Geotail spacecraft-specific data sets that have been created at the bow shock nose, (b) the ACE data sets shifted by various techniques to the location of Wind and (c) the unshifted Wind data. It also describes the shared format of the 1-min and 5-min spacecraft-interspersed OMNI data sets.

The 1-min field and plasma averages are built from 15-s magnetic field and ~1-min plasma records whose shifted time tags indicate that any portion of the data underlying the parameter values (i.e., the higher resolution field values from which the 15 sec field averages were determined or the plasma spectra from which the bulk plasma parameters were determined) were observed during the relevant minute of interest. See Appendix 5 for more discussion of the averaging, including the weighting used.

The 1-min time tags are at the start (not midpoint) of the data used in the average.

The 5-min OMNI averages are built from the five relevant 1-min averages. The standard deviations in these averages correspond to the process of building the 5-min averages and do not retain knowledge of the standard deviations in the 1-min averages.

Identification of spacecraft and of shift technique, for the spacecraft-specific data sets, is captured in file names rather than in data records. To review, we use the following identifiers:

```
Spacecraft:
ACE	71
Geotail	60
IMP 8	50
Wind	51
Shift technique:
2	MVAB-0 (min variance constrained by <B> · n = 0)
3	Cross Product
4	Mixed - use PFN(3) if good, otherwise use PFN(2)
5	Mixed - use PFN(3) = PFN(2) only if they agree
Identification of source spacecraft for field and plasma data in
OMNI is contained in data records, using above spacecraft ID's.
```
Only shift technique 4 is used in the spacecraft-specific data sets shifted to the bow shock nose and in the HRO data set created from them, while each shift technique is used in the ACE data sets shifted to Wind.

The common format for the spacecraft-specific data sets is as follows. Of the 37 words, words 8-15, 23-28, and 32-34 are 1-min averages formed over native-time-resolution data.

```
Word		        Format   	Comment

Year		         I4		1995 ... 2006
Day		         I4		1 ... 365 or 366
Hour		         I3		0 ... 23
Minute		         I3		0 ... 59 at start of average
# of points in IMF avgs  I4
Percent interp.	         I4		See footnote A below
CP/MV Flag	         F4.1		See footnote A below
Timeshift, sec	         I6
Phase_frnt_nrml, X,GSE	 F6.2		GSE components of unit vector,
Phase_frnt_nrml, Y,GSE	 F6.2			X comp. always > 0.
Phase_frnt_nrml, Z,GSE	 F6.2
Scalar B, nT	         F8.2
Bx, nT (GSE, GSM)	 F8.2
By, nT (GSE)	         F8.2
Bz, nT (GSE)	         F8.2
By, nT (GSM)	         F8.2	        Determined from post-shift GSE components
Bz, nT (GSM)	         F8.2	        Determined from post-shift GSE components
RMS, timeshift, sec	 I7
RMS, Phase front normal  F6.2	        See footnote B below
RMS, Scalar B, nT	 F8.2
RMS, Field vector, nT    F8.2		See footnote B below
# of points in plasma avgs I4
Flow speed, km/s	 F8.1
Vx Velocity, km/s, GSE   F8.1
Vy Velocity, km/s, GSE   F8.1
Vz Velocity, km/s, GSE   F8.1
Proton Density, n/cc	 F7.2
Temperature, K	         F9.0
X(s/c), GSE, Re	         F8.2		Position of spacecraft
Y(s/c), GSE, Re	         F8.2
Z(s/c), GSE, Re	         F8.2
X(target), GSE, Re	 F8.2		Position of bow shock nose or Wind
Y(target), GSE, Re	 F8.2
Z(target), GSE, Re	 F8.2
RMS(target), Re	         F8.2		See footnote B below
DBOT1, sec	         I7		See footnote C below
DBOT2, sec	         I7		See footnote C below

The data may be read with the format statement:
(I4,I4,2I3,2I4,F4.1,I7,3F6.2,6F8.2,I7,F6.2,2F8.2,I4,4F8.1,F7.2,F9.0,3F8.2,4F8.2,2I7)
```
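For readers not using Fortran, the format statement above can be turned into fixed-width slices. The following Python sketch does this; the function names are hypothetical and only I and F descriptors are handled. Note that the table lists the time shift word as I6 while the format statement uses I7; the sketch follows the format statement, which gives 37 words in 250 characters.

```python
import re

SC_FORMAT = ("(I4,I4,2I3,2I4,F4.1,I7,3F6.2,6F8.2,I7,F6.2,2F8.2,"
             "I4,4F8.1,F7.2,F9.0,3F8.2,4F8.2,2I7)")


def expand_format(fmt):
    """Expand a simple Fortran format string into (kind, width) pairs."""
    fields = []
    for count, kind, width in re.findall(r"(\d*)([IF])(\d+)(?:\.\d+)?", fmt):
        fields.extend([(kind, int(width))] * (int(count) if count else 1))
    return fields


def parse_record(line, fmt=SC_FORMAT):
    """Split one fixed-width record into a list of ints and floats."""
    values, pos = [], 0
    for kind, width in expand_format(fmt):
        text = line[pos:pos + width]
        values.append(int(text) if kind == "I" else float(text))
        pos += width
    return values
```

Fill values (a blank followed by 9's) are returned as ordinary numbers by this sketch, so a caller would still need to compare each word against its fill value.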
Note that missing data are represented by fill values consisting of a blank followed by 9's, which together fill the width of the Ix or Fx.y format (e.g., 99.99 for an F6.2 word).

Footnote A: Percent interp: The percent (0-100) of the points contributing to the 1-min magnetic field averages whose phase front normal (PFN) was interpolated because neither the MVAB-0 nor Cross Product shift techniques yielded a PFN that satisfied its respective tests (see above for these).

CP/MV flag: The fraction (0-1) of the points, that contribute to the 1-min magnetic field averages and that are not based on interpolated PFN's, whose PFN was based on the MVAB-0 method.

If in a given 1-min magnetic field average, there are n points with CP-based PFN's, p points with MVAB-0 PFN's and q points with interpolated PFN's, then Percent interp = 100 * q/(n+p+q) and CP/MV flag = p/(p+n) (or = 9.9 if p+n = 0).
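These two words can be computed directly from the three counts, as in the following sketch. The function name is hypothetical, and the values are returned unrounded (how they are rounded into the I4 and F4.1 fields is not addressed here).

```python
def interp_stats(n_cp, p_mvab0, q_interp):
    """Return (percent_interp, cp_mv_flag) for one 1-min field average.

    n_cp, p_mvab0, q_interp: numbers of contributing points whose PFN came
    from CP, from MVAB-0, and from interpolation, respectively.
    """
    total = n_cp + p_mvab0 + q_interp
    if total == 0:
        raise ValueError("no points contribute to this average")
    percent_interp = 100.0 * q_interp / total
    determined = n_cp + p_mvab0
    # 9.9 is the fill value when every PFN in the minute was interpolated.
    cp_mv_flag = p_mvab0 / determined if determined else 9.9
    return percent_interp, cp_mv_flag
```

For example, two CP points, one MVAB-0 point and one interpolated point give Percent interp = 25 and CP/MV flag = 1/3.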

Footnote B: Note that standard deviations for the three vectors (phase front normal, magnetic field and target position) are given as the square roots of the sum of squares of the standard deviations in the component averages. The component averages are given in the records but not their individual standard deviations.

Footnote C: The DBOT (Duration Between Observing Times) words: For a given record, we take the 1-min average time shift and estimate, using the solar wind velocity and the location of the observing spacecraft, the time at which the corresponding observation would have been made at the spacecraft. Then we take the difference between this time and the corresponding time of the preceding 1-min record and define this as DBOT1. This difference would be one minute in the absence of PFN (phase front normal) and/or flow velocity variations. When this difference becomes negative, we have apparent out-of-sequence arrivals of phase planes. That is, if plane A is observed before plane B at the spacecraft, plane B is predicted to arrive at the target before plane A. Searching for negative DBOT enables finding of such cases.

DBOT2 is like DBOT1 except that the observation time for the current 1-min record is compared to the latest (most time-advanced) previous observation time and not to the observation time of the previous record. Use of DBOT2 helps to find extended intervals of out-of-sequence arrivals.
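The difference between DBOT1 and DBOT2 can be illustrated with the sketch below. It makes a simplifying assumption that is not the production procedure: the estimated observation time of each record is taken to be its shifted time tag minus its 1-min average time shift. The function name is hypothetical.

```python
def dbot(time_tags_s, shifts_s):
    """Return (dbot1, dbot2) lists for a series of 1-min records.

    time_tags_s: shifted (target) time tags in seconds.
    shifts_s: 1-min average time shifts in seconds.
    The first record has no predecessor, so its entries are None.
    """
    # Simplified estimate of when each record was observed at the spacecraft.
    obs = [t - s for t, s in zip(time_tags_s, shifts_s)]
    if not obs:
        return [], []
    dbot1, dbot2 = [None], [None]
    latest = obs[0]  # most time-advanced observation time seen so far
    for prev, cur in zip(obs, obs[1:]):
        dbot1.append(cur - prev)    # relative to the preceding record
        dbot2.append(cur - latest)  # relative to the latest earlier observation
        latest = max(latest, cur)
    return dbot1, dbot2


d1, d2 = dbot([0, 60, 120, 180], [3000, 3000, 3200, 3100])
```

In this example the third record is out of sequence (both DBOT values are negative), and the fourth record has DBOT1 = 160 s but DBOT2 = 20 s, because DBOT2 is measured from the most time-advanced earlier observation rather than from the out-of-sequence record.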

We do not capture out-of-sequence-arrival information at 15-s resolution but only at 1-min resolution. The standard deviation in the 1-min averaged time shifts may be used to help find cases of out-of-sequence 15-s data.

End of footnotes for spacecraft-specific data format

The common format for the 1-min and 5-min OMNI data sets is
```

Year			        I4	      1995 ... 2006
Day			        I4	1 ... 365 or 366
Hour			        I3	0 ... 23
Minute			        I3	0 ... 59 at start of average
ID for IMF spacecraft	        I3	See  footnote D below
ID for SW Plasma spacecraft	I3	See  footnote D below
# of points in IMF averages	I4
# of points in Plasma averages	I4
Percent interp		        I4	See  footnote A above
Timeshift, sec		        I7
RMS, Timeshift		        I7
RMS, Phase front normal	        F6.2	See Footnotes E, F below
Time btwn observations, sec	I7	DBOT1, See  footnote C above
Field magnitude average, nT	F8.2
Bx, nT (GSE, GSM)		F8.2
By, nT (GSE)		        F8.2
Bz, nT (GSE)		        F8.2
By, nT (GSM)	                F8.2	Determined from post-shift GSE components
Bz, nT (GSM)	                F8.2	Determined from post-shift GSE components
RMS SD B scalar, nT	        F8.2
RMS SD field vector, nT	        F8.2	See  footnote E below
Flow speed, km/s		F8.1
Vx Velocity, km/s, GSE	        F8.1
Vy Velocity, km/s, GSE	        F8.1
Vz Velocity, km/s, GSE	        F8.1
Proton Density, n/cc		F7.2
Temperature, K		        F9.0
Flow pressure, nPa		F6.2	See  footnote G below
Electric field, mV/m		F7.2	See  footnote G below
Plasma beta		        F7.2	See  footnote G below
Alfven mach number		F6.1	See  footnote G below
X(s/c), GSE, Re		        F8.2
Y(s/c), GSE, Re		        F8.2
Z(s/c), GSE, Re		        F8.2
BSN location, Xgse, Re	        F8.2	BSN = bow shock nose
BSN location, Ygse, Re	        F8.2
BSN location, Zgse, Re 	        F8.2

AE-index, nT                    I6      See World Data Center for Geomagnetism, Kyoto
AL-index, nT                    I6      See World Data Center for Geomagnetism, Kyoto
AU-index, nT                    I6      See World Data Center for Geomagnetism, Kyoto
SYM/D index, nT                 I6      See World Data Center for Geomagnetism, Kyoto
SYM/H index, nT                 I6      See World Data Center for Geomagnetism, Kyoto
ASY/D index, nT                 I6      See World Data Center for Geomagnetism, Kyoto
ASY/H index, nT                 I6      See World Data Center for Geomagnetism, Kyoto
PC(N) index,                    F7.2    See World Data Center for Geomagnetism, Copenhagen
Magnetosonic mach number        F5.1    See  footnote G below
-------
Proton flux (>10 MeV)           F9.2    In 5-min OMNI, but not in 1-min OMNI
Proton flux (>30 MeV)           F9.2    In 5-min OMNI, but not in 1-min OMNI
Proton flux (>60 MeV)           F9.2    In 5-min OMNI, but not in 1-min OMNI

The data may be read with the format statement
1-min: (2I4,4I3,3I4,2I7,F6.2,I7, 8F8.2,4F8.1,F7.2,F9.0,F6.2,2F7.2,F6.1,6F8.2,7I6,F7.2,F5.1)
5-min: (2I4,4I3,3I4,2I7,F6.2,I7, 8F8.2,4F8.1,F7.2,F9.0,F6.2,2F7.2,F6.1,6F8.2,7I6,F7.2,F5.1,3F9.2)

Note that for missing data, fill values consisting of a blank followed by 9's which
together constitute the Ix or Fx.y format are used.

Footnote D:

The following spacecraft ID's are used:
ACE	71
Geotail	60
IMP 8	50
Wind	51

Footnote E:

Note that standard deviations for the minute-averaged phase front
normal and magnetic field vectors are given as the square roots of
the sum of squares of the standard deviations in the component
averages.  For the magnetic field vectors only, the component
averages are given in the records but not their individual
standard deviations.  1-min averaged phase front normal directions
are given in the spacecraft-specific data sets but not in the
high resolution OMNI data set.

Footnote F:

There are no phase front normal standard deviations in the 5-min
records.  This word has fill (99.99) for such records.

Footnote G:

Derived parameters are obtained from the following equations.

Flow pressure = (2*10**-6)*Np*Vp**2 nPa (Np in cm**-3,
Vp in km/s, subscript "p" for "proton")

Electric field = -V(km/s) * Bz (nT; GSM) * 10**-3

Plasma beta = [(T*4.16/10**5) + 5.34] * Np / B**2 (B in nT)
(Note that very low |B| values (<~ 0.3 nT) encountered rarely
in high resolution data can drive plasma beta values to above
1000.  In high resolution OMNI, there were about 20 such minutes
encountered in ~12 years.  We have assigned the value 998.0 to
plasma beta in such cases.  Correct values of T, Np and B are available
in the records for recomputation of plasma beta values.)

Alfven Mach number = (V * Np**0.5) / (20 * B)

Magnetosonic Mach Number = V/Magnetosonic_speed
Magnetosonic speed = [(sound speed)**2 + (Alfven speed)**2]**0.5
The Alfven speed = 20. * B / N**0.5
The sound speed = 0.12 * [T + 1.28*10**5]**0.5

For details on these, see http://ftpbrowser.gsfc.nasa.gov/bow_derivation.html
```
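The derived parameters of footnote G can be recomputed from the record words as in the following sketch. The function name and dictionary keys are hypothetical. The Alfven Mach number is computed as V divided by the Alfven speed 20 * B / N**0.5 given in the footnote, and the 998.0 cap on plasma beta is not applied here.

```python
import math


def derived_parameters(v, n_p, t, b, bz_gsm):
    """Recompute footnote G quantities.

    v: flow speed (km/s); n_p: proton density (cm**-3); t: temperature (K);
    b: field magnitude (nT); bz_gsm: GSM Bz (nT).
    """
    alfven_speed = 20.0 * b / math.sqrt(n_p)       # km/s
    sound_speed = 0.12 * math.sqrt(t + 1.28e5)     # km/s
    magnetosonic_speed = math.sqrt(sound_speed ** 2 + alfven_speed ** 2)
    return {
        "flow_pressure_nPa": 2.0e-6 * n_p * v ** 2,
        "electric_field_mV_m": -v * bz_gsm * 1.0e-3,
        "plasma_beta": ((t * 4.16 / 1.0e5) + 5.34) * n_p / b ** 2,
        "alfven_mach": v / alfven_speed,
        "magnetosonic_mach": v / magnetosonic_speed,
    }


# Illustrative input values, not taken from any data record.
params = derived_parameters(v=400.0, n_p=5.0, t=1.0e5, b=5.0, bz_gsm=-2.0)
```

For these illustrative inputs the flow pressure is 1.6 nPa, the electric field 0.8 mV/m and the plasma beta 1.9.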

It is an important current research topic to determine under what conditions single-spacecraft observations of solar wind field and plasma variations upstream (and possibly off to the side of) the Earth's magnetosphere can lead to reliable predictions of the solar wind variations to occur at the Earth's bow shock. Goodness of predictability may depend on many variables, including the spacecraft-to-bow shock separation geometry, the level of variation in the solar wind, the nature of the solar wind (e.g., fast vs. slow flows) and the technique used to shift data from the observation point to the bow shock.

It is possible to assess predictability goodness by multiple techniques. One would be to compare single-spacecraft predictions with the results of multi-spacecraft analyses, as was done by Weimer et al. (2003), but done over a statistically significant number of independent time intervals. Another would be to search out a statistically significant number of major solar wind field and/or plasma discontinuities or other variations, and to note agreement level between spacecraft A's observations and spacecraft B's observations as shifted to A (A - shifted_B cross correlation functions - ccf's). A third would be to simply compute A - shifted_B ccf's in a large number of fixed-duration time intervals, each characterized by A-B separation geometry, mean physical parameter values in the intervals, parameter variance levels and the shift technique.

We are taking the last approach of the above paragraph. We have built a database of ccf's for field and plasma parameters for ~6000 4-hour intervals in 1998-2000, for ACE data shifted to the Wind spacecraft by each of the four shift techniques discussed in Section 3a of these notes. While a final and comprehensive assessment of goodness of predictability as a function of ACE-Wind separation, solar wind flow state, solar wind variation level and shift technique lies in the near future, we report herein some preliminary results. It is intended that the final assessment will be published and will be reproduced here when completed.

We focus here on predictability of Bz variations as the most geoeffective of the solar wind parameters. Imagine that computed 4-hour ccf's are the dependent variable in an independent variable space consisting of Wind-ACE separation geometry (along and across the flow direction), the means and standard deviations for each physical parameter in the 4-hour intervals, and the shift technique. For any bin in independent variable space, we find a certain number of intervals whose ccf's make up a distribution itself having a mean, median, standard deviation, etc. We examine the medians of these distributions as indicating dependence of predictability on the independent variables.

With no selection of parameters but exercising each of the 4 shift techniques, we find four distributions with numbers of 4-hour intervals ranging between 5109 and 5288 and with medians ranging between 0.691 and 0.706. Standard deviations of the (non-Gaussian) ccf distributions are ~0.31. Thus, at least in the case of looking over all the data, the various shift techniques are giving statistically equivalent results. In fact this is also the case for virtually all the binned analyses we've done.

Except where noted, additional results in this section are for shifts by Technique 4, which we have used in our production work.

To first assess dependence of predictability on the transverse separation of Wind and ACE, we do a series of runs binned only by ACE-Wind Impact Parameter (IP). We find that the median of the Bz ccf distributions increases through the values 0.35, 0.34, 0.54, 0.63, 0.75, 0.85, 0.87 as the IP decreases through the bins >150, 120-150, 90-120, 60-90, 30-60, 15-30 and 0-15 Re. The numbers of 4-hour intervals in these distributions range from 159 (120-150 Re) to 2001 (30-60 Re). It is interesting that the ccf is nearly the same for the 120-150 Re bin and the >150 Re bin, and that the ccf is nearly the same for the 0-15 and 15-30 Re bins. The latter may be due to the occurrence of rotational discontinuities which, because of their propagation relative to the ambient solar wind, are not well accommodated by the shift assumptions. If we define a Bz scale length as the distance over which the Bz ccf falls by 10% (cf. Richardson and Paularena, 2001), then the scale length is approximately (135-15)/(0.85-0.35)*0.10 = 24 Re.

Interestingly, when we look at medians of Bz ccf distributions involving MVAB-0-determined and CP-determined PFN's in the IP = 0-15 Re and 15-30 Re bins, we find 0.87 (0-15 Re) and 0.84 (15-30 Re) for both methods. That both methods give the same result may run counter to an expectation that the MVAB-0 method may be good for PFN determination for both tangential and rotational discontinuities, while the CP method should be better for PFN determination for non-propagating tangential discontinuities having no field component normal to the discontinuity plane.

To examine the dependence of predictability on the solar wind variability level, we did a series of runs for various values of the standard deviation in the 4-hour Bz average (sigma-Bz). Upon limiting the ACE-Wind Impact Parameter to be less than 60 Re, we found median values of the Bz ccf distributions of 0.66, 0.76, 0.82, 0.85, and 0.91 in the sigma-Bz bins 0-1, 1-2, 2-3, 3-4, >4 nT. The numbers of intervals per distribution range between 277 (sigma-Bz > 4 nT) and 1129 (1 < sigma-Bz < 2 nT). Removing the constraint on the Wind-ACE IP almost doubled the numbers of 4-hour intervals per sigma-Bz run, but decreased the median ccf's only by 7% (at largest sigma-Bz) to 13% (at smallest sigma-Bz). The conclusion here is that the higher the variation level in Bz, the more predictable are bow shock nose Bz variations, given upstream Bz observations.

To examine possible dependence of predictability on the X distance upstream, we define bins by X(ACE) - X(Wind). For a series of runs all having Wind-ACE IP < 60 Re, we find medians in Bz ccf distributions of 0.78, 0.77, 0.81, 0.74 for bins of <50 Re, 50-125 Re, 125-200 Re, >200 Re respectively. The numbers of intervals in the distributions range between 345 (delta X > 200 Re) and 1248 (125 Re < delta X < 200 Re). The conclusion here is that, while there's a hint of a downturn in the Bz ccf at delta X > 200 Re, there's no major dependence of predictability on delta X.

Finally, to assess dependence of predictability on flow speed, we do runs in flow speed bins <350, 350-450, 450-550 and >550 km/s for IP < 60 Re and for sigma-Bz > 1 nT. We find medians in the Bz ccf distributions of 0.84, 0.83, 0.79, 0.72 as the speed increases through the four indicated bins. Numbers of 4-hour intervals in the bins range from 346 (V > 550 km/s) to 1210 (350-450 km/s). Predictability in Bz variations decreases modestly as the solar wind flow speed increases.

(February, 2008: We will very shortly add to this Appendix 1 the results of all the comparisons of Geotail data with data from the other spacecraft. Note that they underlie the Geotail data cross-normalizations that have been applied and that are described in Appendix 2.)

[November, 2011. As of now, the comparisons involving IMP 8 and Wind magnetic field data are based on now-replaced versions of the data. But the data-comparison software at the URLs identified below addresses the reprocessed IMP 8 and Wind magnetic field data.]

While the key issue for our new products is the extent to which solar wind variations observed remote from the Earth's bow shock may be used to infer variations at the bow shock nose, it is also of interest to review whether there are systematic differences in parameter values between pairs of input data sets. This is largely so that the spacecraft-interspersed data set (i.e., High Resolution OMNI - HRO) does not have excessive parameter changes due to transitions between one source spacecraft and another, and so that the parameter values included in the new HRO are most likely "true" at least at the observation points.

This section discusses our search for systematic differences among input data sets. We expect that any systematic differences, while they may change slowly, will not change on the scale of days or weeks. Thus we assess systematic differences using hourly averaged physical parameter values as built from higher resolution data shifted by the simple technique used in preparing the hourly resolution OMNI 2 data set and discussed here
http://omniweb.gsfc.nasa.gov/html/omni2_doc.html#shift.
Such data, and the tools for comparison, are available at
http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/mag_iwa_s2.html (magnetic field data)
http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/pla_iwa_s2.html (plasma data, linear)
http://omniweb.sci.gsfc.nasa.gov/ftpbrowser/pla_iwa_s3.html (logs of N and T)

These interfaces determine the slopes and intercepts in the linear regressions P1 = a + b*P2, where P represents any of the relevant physical parameters (or, as special cases, log N and log T). The interfaces also determine the uncertainties in the slope and intercept, cross correlation coefficients, and the rms deviations between the data points on the scatter plots and the best fit lines. The "1" and the "2" refer to the members of any spacecraft pair.
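A regression of this form can be reproduced offline with a few lines of Python. The sketch below assumes an ordinary least-squares fit of P1 on P2 (the fitting method used by the interfaces is not stated here) and normalizes the rms deviation by the number of points; the function name is hypothetical, and uncertainties in the slope and intercept are omitted.

```python
import math


def regress(p2, p1):
    """Fit P1 = a + b*P2 by ordinary least squares.

    Returns (a, b, r, rms): intercept, slope, cross correlation coefficient,
    and rms deviation of the points about the best fit line.
    """
    n = len(p1)
    m1, m2 = sum(p1) / n, sum(p2) / n
    s22 = sum((x - m2) ** 2 for x in p2)
    s11 = sum((y - m1) ** 2 for y in p1)
    s12 = sum((x - m2) * (y - m1) for x, y in zip(p2, p1))
    b = s12 / s22
    a = m1 - b * m2
    r = s12 / math.sqrt(s11 * s22)
    rms = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(p2, p1)) / n)
    return a, b, r, rms
```

For densities and temperatures, the same function would be applied to the base-10 logarithms of the hourly values rather than to the values themselves.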

Our work uses linear regressions of logs of densities and temperatures rather than the values of N and T themselves because these parameters are more log-normally distributed than normally distributed.

Note that the documentation of our hourly resolution OMNI 2 data set at http://omniweb.gsfc.nasa.gov/html/omni2_doc.html#comp extensively discusses intercomparisons of hourly ACE, Wind and IMP 8 magnetic field and plasma data. The rationale for the present discussion is to address the significantly extended time span over which data are now available for intercomparison.

In the earlier work, magnetic field cross comparisons were limited to mid-2000 or earlier, and plasma parameter comparisons were limited to mid-2001 or earlier. In particular, the ACE/SWEPAM-to-Wind/SWE plasma density comparison was limited to early 1998 through mid-2001. It did not show any systematic time variation, although the 2001 data comparison was starting to diverge from the earlier relation.

Recall that in the data preparation section (Section 2) of these notes, we said that for Wind/SWE, we would use the Key Parameter (KP) data, but would normalize them, if any normalizations were appropriate, to the nonlinear fit (NLF) data for which admirably small uncertainty estimates had been derived by Kasper et al. (2006).

In particular, the comparisons now span:
```
Magnetic field:
Wind-ACE: 1998-2004
Wind-IMP:  1995-2000
Plasma data
Wind/NLF-ACE:  1998-2004 (but Wind/KP-ACE to 2005)
Wind/NLF-IMP:  1995-2004 (but Wind/KP-IMP to 2005)
```
We have built a series of parameter-specific tables summarizing the results of the annual and multi-year cross correlations. For all analyses, we used the linear expression P(Wind) = a + b * P(2), where P(2) was either ACE or IMP 8 for various runs. More specifically, for plasma comparisons, we used P(Wind/NLF) = a + b * P(2) where now P(2) might be ACE or IMP 8 or Wind/KP.
The tables and their contents are
```

Magnetic field:

Bt.xls: Field magnitude, Wind vs. IMP and vs. ACE
Bxyz.xls: Field components,  Wind vs. IMP and vs. ACE

Plasma data

V2.xls: Flow speed, Wind/NLF vs Wind/KP, vs. IMP, and vs. ACE
ThetaV2.xls: Flow elevation angle, Wind/NLF vs Wind/KP, vs IMP, and vs. ACE
PhiV2.xls:  Flow azimuth angle, Wind/NLF vs. Wind/KP, vs. IMP, and vs. ACE
N2.xls: Number density,   Wind/NLF vs. Wind/KP, vs. IMP, and vs. ACE;
also Wind/KP vs ACE for 2002-2005
T2.xls: Temperature,  Wind/NLF vs Wind/KP, vs. IMP and vs. ACE;
also Wind/KP vs. ACE for 2002-2005.
```
Each table has the following columns: year(s) covered, number of hourly averages used, intercept, slope, cross correlation coefficient, rms difference between points and best fit line, uncertainties in intercept and slope, differences or ratios as described in the next paragraph. The rows in the tables for any given spacecraft pair are for individual years and then for groups of years.
In addition, the following parameters, determined from the best fit lines, are included in certain spreadsheets:
```
field magnitude:  differences at 20 nT
[i.e., values of (P1-P2) = a + (b-1)*P2 at P2 = 20]
field components:  differences at 0, ± 20 nT
flow speed:  differences at 300, 350, 450, 650 km/s
flow angles:  differences at 0, ± 10 deg
densities:  ratios at N(other) = 2, 5, 10, 20/cc
[i.e., N(Wind/NLF) / N(other)]
temperatures:  ratios at T = 25, 63, 160, 400 thousand deg
```
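The differences listed above follow directly from the fitted intercept and slope, as in this short sketch (the function name is hypothetical). The example uses the Wind v.3 vs. ACE Bz coefficients quoted in the magnetic field comparisons of these notes, Bz(Wind) = 1.134 nT + 1.012 * Bz(ACE).

```python
def difference_at(a, b, p2):
    """Return P1 - P2 on the best fit line P1 = a + b*P2, evaluated at p2."""
    return a + (b - 1.0) * p2


# Wind v.3 minus ACE Bz difference at Bz(ACE) = 20 nT, about 1.37 nT.
dbz = difference_at(1.134, 1.012, 20.0)
```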
Further, the density and temperature tables have regression results as combined over all flow speed and over the flow speed bins used in our previous work, V < 350 km/s, 350 < V < 450 km/s, V > 450 km/s. They also contain the slopes and intercepts used in normalizing IMP and ACE data to the Wind/NLF data for OMNI 2 a few years ago.

These regression results tables are further described in Appendix 7.

Some of the more salient points made visible by these tables, as grouped by physical parameters, are given in the following subsections.

Magnetic field comparisons

If the Wind magnetic field data are right, then IMP field magnitude and components (absolute values) would need to be increased by 1.5 to 2 percent to match Wind. Thus there are systematic Wind-IMP magnetic field component differences of ~0.3 to ~0.4 nT at ± 20 nT. Averaged over 1996-2000, when Bz(Wind) = 0, Bz(IMP) = -0.06, indicating good IMP zero level determination. There is no clear evidence of any time dependence in the Wind-IMP relations in magnetic field data.

By contrast, Wind version 4 and ACE magnetic field data agree to within 1 percent for virtually all components and years (through 2004), and to within 0.03 nT in Bz at Bz = 0. But the Wind version 3 data, used in High Resolution OMNI for times past November 28, 2004, have greater differences relative to ACE. Wind field magnitudes and X and Y components differ by up to 2% from ACE values over parts of the ± 20 nT range. Most significant is a ~1 nT offset in the Wind Bz component. [The best fit regression equation for v.4 (v.3) Wind Bz values against ACE Bz values is Bz(Wind) = 0.020 nT (1.134 nT) + 1.001 (1.012) * Bz(ACE).] We will replace the v.3 data with v.4 data as they become available.

The Geotail magnetic field data available as we were creating these new data sets were known to have preliminary and incorrect Bz offsets. See the discussions in Section 2 and in Appendix 2.

Flow speed comparisons

Flow speeds agree to within 1%. That is, |V(Wind/NLF) - V(Z)| / V(Wind/NLF) < 1%, where Z = Wind/KP, ACE, or IMP. For the cases of Z = ACE and IMP, V(Wind/NLF) exceeds V(Z). V(Wind/KP) is virtually identical to V(Wind/NLF).

Flow direction angle comparisons

Flow azimuth angles between any source pair agree to within 1 degree over the ± 10 deg range. Flow elevation angle agreement level depends on the source pair. Wind/NLF and Wind/KP agree to within 1 degree over the ± 10 degree range. The same is true for Wind/NLF vs. ACE except that near +10 deg, Wind/NLF exceeds ACE by ~1.5 deg. The IMP elevation angle exceeds the Wind/NLF elevation angle by an amount ranging from ~1.2 deg at -10 deg to ~4 degrees at +10 degrees. An apparent IMP flow elevation angle offset of ~2 deg has been recognized for many years. The present analysis shows for the first time an elevation angle dependence in this offset. There are no evident time dependences in the relations between any source pair for flow speed or direction angles.

Density Comparisons

The tables at N2.xls contain annual and multi-year results for the three source pairs (Wind/KP, IMP and ACE, each vs. Wind/NLF) for all speeds and in three speed bins (V < 350 km/s, 350 - 450 km/s, > 450 km/s). As noted before, the results consist of the number of points (hours) per run, the slope and intercept and their uncertainties for the run, the cross correlation coefficient, the rms deviation of data points about the best fit regression line, and the ratios, given by the best fit line, of Wind/NLF densities to densities from the other sources at four density values (2, 5, 10, 20 /cc).

The Wind/NLF to Wind/KP density ratios increase from near 0.90 to 0.98 as density increases from 2 to 20 /cc, with only very small (~2%) and quasi-random variations as functions of both time and flow speed bin. The regression equation for this time-independent, speed-independent case is LogN(NLF) = -0.059 + 1.038 * LogN(KP).

The Wind/NLF to IMP density ratios fall from near unity to a value between 0.80 and 0.88 as density increases from 2 to 20 /cc, with no significant systematic time dependence but with a non-negligible dependence on flow speed bin.

There are two sets of Wind-ACE density results in N2.xls, one for Wind/NLF vs. ACE (for 1998-2004, the limit of the Wind/NLF data) and one for Wind/KP vs. ACE for 2001-2006. We will extend the ACE density normalization to Wind/NLF by looking at time variation across 2004/2005 in the Wind/KP - ACE comparison and using the time invariance of the Wind/NLF-Wind/KP comparison across 1995-2004.

The N(NLF)/N(ACE) ratios exhibit irregular year-to-year variations in the V<350 km/s range. This is largely because of poor statistics. When analyzed in multi-year segments, the N(NLF)/N(ACE) ratios are within a few percent of unity (except that in the less populated 2000-2004 interval, the ratio exceeds 1.1 at N values greater than 10/cc). Taking N(ACE) = N(NLF) for V<350 km/s is consistent with the data.

However, in the much better populated 350-450 and >450 km/s ranges, the ratios are significantly different from unity and are largely time-invariant (to within ± 2-3%) in the 1998-2000 interval, and again in the 2001-2004 interval. So we performed 3-year and 4-year analyses over these intervals, also documented in N2.xls. At 350<V<450, the N(NLF)/N(ACE) ratio increases from 0.83 to 0.90 as N increases from 2 to 20/cc during 1998-2000, while this ratio increases from 0.88 to 1.04 during 2001-2004. At V>450 km/s, this ratio is virtually independent of density, being about 0.82 during 1998-2000 and about 0.90 during 2001-2004. The ratio increasing from earlier to later is consistent with higher density determinations by Wind or lower density determinations by ACE in the later interval. (As seen in N2.xls, there is no significant time variation in the N(NLF)/N(IMP) ratios, suggesting that the time dependence in the N(NLF)/N(ACE) ratio is due to N(ACE) variation.)

To extend the N(NLF) - N(ACE) comparison to 2005-2006 for which N(NLF) data are not available, we compare N(KP) and N(ACE) data for the intervals 2001-2004 and 2005-2006. At 350<V<450 km/s, the N(KP)/N(ACE) ratio is about 1.02 (independent of N) for 2001-2004, while this ratio is about 0.90 (also independent of N) for 2005-2006. At V>450 km/s, this ratio decreases from 1.01 to 0.96 as N increases from 2 to 20 during 2001-2004, and decreases from 1.02 to 0.88 over this N range in 2005-2006. Attributing the differences between the earlier and later periods to ACE density data, we are in a position to assert that
N(NLF)/N(ACE) (2005-2006) = [N(NLF)/N(ACE) (2001-2004)] * [N(KP)/N(ACE) (2005-2006)] / [N(KP)/N(ACE) (2001-2004)]
and find that N(NLF)/N(ACE) (2005-2006) = K * N(NLF)/N(ACE) (2001-2004),
where K = 0.88 for 350<V<450 and K = 0.96 for V>450.

Thus at both 350<V<450 and V>450, the N(NLF)/N(ACE) ratio increases from 1998-2000 to 2001-2004, and then decreases again for 2005-2006.

The equations that summarize these comparisons are:

```V < 350 km/s, all years:  N(NLF) = N(ACE)
350 - 450 km/s, 1998-2000:  LogN(NLF) = -0.094 + 1.037 * LogN(ACE)
350 - 450 km/s, 2001-2004:  LogN(NLF) = -0.076 + 1.070 * LogN(ACE)
350 - 450 km/s, 2005-2006:  LogN(NLF) = -0.129 + 1.070 * LogN(ACE)
V > 450 km/s, 1998-2000:  LogN(NLF) = -0.085 + 0.998 * LogN(ACE)
V > 450 km/s, 2001-2004:  LogN(NLF) = -0.031 + 0.980 * LogN(ACE)
V > 450 km/s, 2005-2006:  LogN(NLF) = -0.048 + 0.980 * LogN(ACE)
```
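As a quick consistency check on the 2005-2006 rows above: multiplying a density ratio by a constant K adds log10(K) to the intercept of the log-log regression and leaves the slope unchanged. The sketch below (Python; the function name is ours and is not part of the OMNI processing code) applies the rounded K values quoted earlier to the 2001-2004 intercepts. Because K is rounded to two digits, the results agree with the tabulated 2005-2006 intercepts only to within a few thousandths.

```python
import math

def scaled_intercept(intercept_2001_2004, k):
    """Intercept of LogN(NLF) vs. LogN(ACE) after scaling N(NLF)/N(ACE) by k.

    Scaling the ratio by k multiplies N(NLF) by k at fixed N(ACE), which adds
    log10(k) to the log-log intercept; the slope is unchanged.
    """
    return intercept_2001_2004 + math.log10(k)

# 2001-2004 intercepts and rounded K values quoted in the text above.
mid_speed = scaled_intercept(-0.076, 0.88)   # 350 < V < 450 km/s
high_speed = scaled_intercept(-0.031, 0.96)  # V > 450 km/s
print(round(mid_speed, 3), round(high_speed, 3))
```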
Temperature Comparisons

For temperatures, the tables at T2.xls contain almost identical information as for the density information at N2.xls discussed immediately above. However, the temperature ratios are given at standard values of 25,000, 63,000, 160,000 and 400,000 deg K.

The Wind/NLF to Wind/KP temperature ratios are largely unity independent of V and of time for 1998 and later. But for 1995-1997, there is a speed dependence reflected in the relations
V<350: LogT(NLF) = -0.630 + 1.132 * LogT(KP)
350-450: LogT(NLF) = -0.383 + 1.073 * LogT(KP)
V>450: LogT(NLF) = -0.221 + 1.038 * LogT(KP)

These equations say that during 1995-1997, the T(NLF)/T(KP) ratio grows from about 0.88 to anywhere between 0.98 (V>450) and 1.28 (V<350) as T increases from 25,000 deg K to 400,000 deg K.
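The ratios quoted here follow directly from the regression coefficients: for a fit Log(y) = a + b * Log(x), the ratio y/x at a given x is 10**(a + (b-1) * Log(x)). A minimal Python sketch (the function and variable names are ours) that evaluates the three 1995-1997 fits at the two end-point temperatures:

```python
import math

def regression_ratio(a, b, x):
    """Return y/x implied by the log-log fit Log(y) = a + b * Log(x)."""
    return 10.0 ** (a + (b - 1.0) * math.log10(x))

# 1995-1997 Wind/NLF vs. Wind/KP temperature fits quoted above: (a, b) per speed bin.
fits = {"V<350": (-0.630, 1.132), "350-450": (-0.383, 1.073), "V>450": (-0.221, 1.038)}
for label, (a, b) in fits.items():
    low = regression_ratio(a, b, 25_000.0)
    high = regression_ratio(a, b, 400_000.0)
    print(f"{label}: T(NLF)/T(KP) = {low:.2f} at 25,000 K, {high:.2f} at 400,000 K")
```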

The Wind(NLF) to IMP temperature ratios given for 1995 to 2004 in T2.xls diverge in very recent years from the ratios based on more nearly time invariant 1995-2000 data used previously in preparing the hourly OMNI 2 data set. However, because we include IMP data in our new products only through the June 2000 failure of the IMP magnetometer, we shall ignore the post-2000 behavior included in T2.xls only for completeness.

The annual Wind(NLF) to ACE temperature ratios show significant scatter, especially at V<350 km/s where the statistics are poor. There are no obvious systematic time dependences in these ratios. Performing analyses of data for all years (1998-2004) in the three V bins reveals T(Wind)/T(ACE) ratios which increase with temperature (as T increases from 25,000 deg K to 400,000 deg K), from 0.79 to 1.60 at V<350 km/s, from 1.06 to 1.21 at 350<V<450, and from 0.97 to 1.20 at V>450. The tables of T2.xls contain uncertainties in the slopes and intercepts, but not in the ratios. Since the ratios obey R = T(Wind)/T(ACE) = 10 ** [a + (b-1) * LogT(ACE)], we can estimate the uncertainties in the ratios as (delta-R) = 2.3 * R * SQRT [(delta-a)**2 + (LogT(ACE) * delta-b)**2]. The relative uncertainties, (delta-R)/R, increase with T(ACE) from 0.87 to 0.99 at V<350 km/s, from 0.21 to 0.24 at 350<V<450, and from 0.22 to 0.25 at V>450. These relative uncertainties are larger than the differences in the ratios among the various speed bins. Accordingly, we shall suppress the speed dependence in the temperature normalization and determine it from a single all-years run at V>380 km/s (a lower speed threshold would exceed our 12,000 points/run limit, but we apply the result to all speeds):
LogT(Wind/NLF) = -0.147 + 1.039 * LogT(ACE). We shall also apply this to post-2004 ACE temperature data, for which we do not yet have Wind/NLF temperature data.
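The uncertainty estimate used above is ordinary error propagation through R = 10**[a + (b-1) * LogT(ACE)], with the factor 2.3 standing for ln(10) and the errors in a and b treated as independent. The Python sketch below illustrates the arithmetic only. The function name is ours, and the delta-a and delta-b values are hypothetical placeholders, not numbers taken from T2.xls.

```python
import math

def ratio_and_uncertainty(a, b, delta_a, delta_b, t_ace):
    """Ratio R = T(Wind)/T(ACE) from LogT(Wind) = a + b*LogT(ACE), with delta-R.

    Assumes the errors in a and b are independent, as the quadrature sum in the
    text implies. ln(10) is the factor written as 2.3 in the text.
    """
    log_t = math.log10(t_ace)
    ratio = 10.0 ** (a + (b - 1.0) * log_t)
    delta_ratio = math.log(10.0) * ratio * math.hypot(delta_a, log_t * delta_b)
    return ratio, delta_ratio

# Fit coefficients are the all-years values adopted in the text; the uncertainties
# 0.05 and 0.01 are HYPOTHETICAL placeholders, not values from T2.xls.
ratio, delta_ratio = ratio_and_uncertainty(-0.147, 1.039, 0.05, 0.01, 1.0e5)
print(f"R = {ratio:.3f}, delta-R/R = {delta_ratio / ratio:.3f}")
```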

Using hourly averaged data, the previous section has revealed the mainly small systematic differences for each magnetic field parameter between Wind on the one hand and ACE and IMP 8 on the other hand. It has also revealed systematic differences for each plasma parameter between the nonlinear fit-based Wind/SWE data on the one hand and the Wind/SWE key parameter data, the ACE/ SWEPAM data and the MIT/IMP 8 data on the other hand.

The question is now whether and for which parameters we should cross-normalize the data to be included in the spacecraft-interspersed high resolution OMNI data set. (Note that we do no such normalizations for our new spacecraft-specific data sets.) We choose to minimize cross-normalizations for multiple reasons. First, since we use 3-hour swaths of same-spacecraft data in 1-min OMNI, at most 0.55% (1 in 180) of minute-to-minute transitions can involve a change of source spacecraft. In fact, the actual fraction of transitions between sources is very much less than this. Second, we do not expect this data set to be used for long term solar wind variation studies; the hourly resolution OMNI data set is more appropriate for this.

So, as for the present hourly OMNI data set, we shall cross-normalize only plasma densities and temperatures.

We shall normalize the Wind/KP density data for all time and flow speeds as:
LogN(Wind/KP, norm) = -0.059 + 1.038 * LogN(Wind/KP, obsvd)

We shall normalize the Wind/KP temperature data as follows:

```1995-1997, V<350 km/s:    LogT(norm) = -0.630 + 1.132 * LogT(obsvd)
1995-1997, 350-450 km/s:  LogT(norm) = -0.383 + 1.073 * LogT(obsvd)
1995-1997, V>450 km/s:    LogT(norm) = -0.221 + 1.038 * LogT(obsvd)
1998-2006, all V:                LogT(norm) = LogT(obsvd)

For IMP 8 we shall use the same time-invariant equations we used for
hourly OMNI.

Density:
V<350 km/s:    LogN(norm) = 0.020 + 0.941 * LogN(obsvd)
350-450 km/s:  LogN(norm) = 0.033 + 0.919 * LogN(obsvd)
V>450 km/s:    LogN(norm) = 0.019 + 0.907 * LogN(obsvd)

Temperature:
V<350 km/s:    LogT(norm) = 0.864 + 0.839 * LogT(obsvd)
350-450 km/s:  LogT(norm) = 0.491 + 0.920 * LogT(obsvd)
V>450 km/s:    LogT(norm) = 0.702 + 0.890 * LogT(obsvd)

For ACE, we use

Density:
V < 350 km/s, all years:  N(norm) = N(obsvd)
350 - 450 km/s, 1998-2000:  LogN(norm) = -0.094 + 1.037 * LogN(obsvd)
350 - 450 km/s, 2001-2004:  LogN(norm) = -0.076 + 1.070 * LogN(obsvd)
350 - 450 km/s, 2005-2006:  LogN(norm) = -0.129 + 1.070 * LogN(obsvd)
V > 450 km/s, 1998-2000:    LogN(norm) = -0.085 + 0.998 * LogN(obsvd)
V > 450 km/s, 2001-2004:    LogN(norm) = -0.031 + 0.980 * LogN(obsvd)
V > 450 km/s, 2005-2006:    LogN(norm) = -0.048 + 0.980 * LogN(obsvd)

Temperature:
All V, all years:  LogT(norm) = -0.147 + 1.039 * LogT(obsvd)

For Geotail, we use

Density (all time and all V):  LogN(norm) = -0.072 + 0.980 * LogN(obsvd)

Temperature (1995-1998, all V):  LogT(norm) = 0.166 + 0.925 * LogT(obsvd)
Temperature (1999-2005, all V):  LogT(norm) = -0.362 + 1.052 * LogT(obsvd)

As explained in the Geotail data discussion of Section 2, the Geotail magnetic field
data we worked with had preliminary and incorrect Bz offset values.  Accordingly,
we compared Geotail B data with B data from the other spacecraft and derived the
following "normalizations" of the Geotail B data:

Bx and By, all time, all Bz:

Bx(norm) = 1.02 * Bx(obsvd)
By(norm) = 1.02 * By(obsvd)

Bz (depends on time)

19950101-19951231:  Bz(norm) = -0.490 + 1.004 * Bz(obsvd)
19960101-19991231:  Bz(norm) = -0.597 + 1.017 * Bz(obsvd)
20000101-20040401:  Bz(norm) = -0.149 + 1.019 * Bz(obsvd)
20040402-20050401:  Bz(norm) = -0.461 + 1.020 * Bz(obsvd)
20050402-20051231:  Bz(norm) = -0.663 + 1.023 * Bz(obsvd)

Bt (depends on time and on sign of Bz)

19950101-19991231, Bz<0:  Bt(norm) = 0.123 + 1.022 * Bt(obsvd)
19950101-19991231, Bz>0:  Bt(norm) = -0.180 + 1.012 * Bt(obsvd)
20000101-20040401, Bz<0:  Bt(norm) = 0.052 + 1.016 * Bt(obsvd)
20000101-20040401, Bz>0:  Bt(norm) = -0.021 + 1.014 * Bt(obsvd)
20040402-20051231, Bz<0:  Bt(norm) = 0.123 + 1.022 * Bt(obsvd)
20040402-20051231, Bz>0:  Bt(norm) = -0.180 + 1.012 * Bt(obsvd)

```
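To make the bin logic of the tables above concrete, here is a minimal Python sketch that applies the ACE density normalization. It is an illustration under stated assumptions, not the production code: the function and dictionary names are ours, the text does not say which bin a speed of exactly 350 or 450 km/s belongs to (we put both in the middle bin), and years outside 1998-2006 are rejected because no coefficients are tabulated for them.

```python
import math

# (first year, last year) -> {speed bin: (intercept, slope)} for ACE densities,
# copied from the table above. Bins: "mid" = 350-450 km/s, "high" = V > 450 km/s.
ACE_DENSITY_FITS = {
    (1998, 2000): {"mid": (-0.094, 1.037), "high": (-0.085, 0.998)},
    (2001, 2004): {"mid": (-0.076, 1.070), "high": (-0.031, 0.980)},
    (2005, 2006): {"mid": (-0.129, 1.070), "high": (-0.048, 0.980)},
}

def normalize_ace_density(n_obs, speed, year):
    """Return the ACE proton density (per cc) normalized to Wind/NLF levels."""
    if speed < 350.0:
        return n_obs  # V < 350 km/s: N(norm) = N(obsvd) for all years
    # Assumption: speeds of exactly 350 or 450 km/s fall in the middle bin.
    speed_bin = "mid" if speed <= 450.0 else "high"
    for (first, last), fits in ACE_DENSITY_FITS.items():
        if first <= year <= last:
            intercept, slope = fits[speed_bin]
            return 10.0 ** (intercept + slope * math.log10(n_obs))
    raise ValueError(f"no ACE density normalization tabulated for {year}")

print(normalize_ace_density(10.0, 400.0, 1999))
```

The other per-spacecraft tables have the same intercept-and-slope form, so they could be handled by extending the lookup table with further entries.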
Appendix 3. Despike Algorithms

We have undertaken to eliminate spikes from the Wind and IMP 8 magnetic field and plasma data sets. Owing to their relatively clean state, we have judged it unnecessary to despike the ACE data. Wind magnetic field data were despiked with the simple approach of eliminating any record with a field magnitude or component absolute value in excess of 70 nT. Other data were despiked with the approach described as follows.

We test a point using its two predecessors and two followers. We require that the 1st and last of these 5 points be within 15 mins (for B data) or 60 mins (for plasma data). The first two and last two points in a data segment separated from its neighbors by intervals of >15 min (B) or >60 min (plasma) go untested by the algorithms discussed here. (We visually scanned output data looking for obvious spikes thereby missed, and deleted these.)

Any record having a declared spike in any of its physical parameters is rejected. For a parameter value to be declared a spike, it must satisfy two criteria.

Let P represent the value of the physical parameter being tested. Define <P> as the mean value of parameter P over the 1st, 2nd, 4th, and 5th points of the current set, and let sigma(P) be the RMS deviation in this average. The first test for a spike is to have |P-<P>| > 4 * sigma(P).

The second tests -

IMP IMF data - For P = |B|, require |P-<P>| > 0.2 * <P>. For P = Bx, By, Bz, require |P-<P>| > 1.0 nT.

IMP plasma data - For P = V, N, W [W = thermal speed; T(deg) = 60.5 * W(km/s)**2], require |P-<P>| > k * <P> where k = 0.1, 0.3, 0.3 for P = V, N, W respectively. For P = flow latitude and longitude angles, require |P-<P>| > 4.0 deg. (We have also excluded all IMP plasma records having |flow angle| > 15 deg.)

Wind/SWE plasma data - For P = V, N, T, require |P-<P>| > k * <P> where k = 0.1, 0.3, 0.3 for P = V, N, T respectively. For P = Vx, Vy, Vz, require |P-<P>| > 0.1 * <V>.
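The two-criteria test above can be sketched in Python as follows. This is a hedged illustration, not the code actually used. The function name and arguments are ours. We take sigma(P) to be the RMS deviation of the four neighboring points about their mean, which is our reading of the text. The check that the five points span no more than 15 min (B) or 60 min (plasma) is left to the caller.

```python
import math

def is_spike(window, rel_threshold=None, abs_threshold=None):
    """Apply the two spike criteria to the middle value of a 5-point window.

    window: five consecutive values of one parameter; the third is under test.
    rel_threshold: k in |P-<P>| > k * <P> (e.g. 0.1 for V, 0.3 for N).
    abs_threshold: fixed limit in the parameter's units (e.g. 1.0 nT, 4.0 deg).
    Exactly one of the two thresholds should be given.
    """
    if len(window) != 5:
        raise ValueError("window must hold exactly five points")
    if (rel_threshold is None) == (abs_threshold is None):
        raise ValueError("give exactly one of rel_threshold or abs_threshold")
    neighbors = [window[0], window[1], window[3], window[4]]
    mean = sum(neighbors) / 4.0
    # Assumption: sigma(P) is the RMS deviation of the four neighbors about <P>.
    sigma = math.sqrt(sum((p - mean) ** 2 for p in neighbors) / 4.0)
    deviation = abs(window[2] - mean)
    limit = rel_threshold * abs(mean) if rel_threshold is not None else abs_threshold
    # A spike must satisfy both the 4-sigma test and the parameter-specific test.
    return deviation > 4.0 * sigma and deviation > limit

# Flow speed example (km/s) with k = 0.1: the middle point is 100 km/s off the mean.
print(is_spike([400.0, 402.0, 500.0, 398.0, 400.0], rel_threshold=0.1))
```

A record would then be rejected if this test flags any one of its physical parameters.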

For completeness, we note that the Wind/SWE plasma data came to us already having been run through MIT despike software that required that the relative difference between the point being tested and the median of that point and its immediate predecessor and immediate successor should be less than 0.1, 0.5 and 1.0 for flow speed, density and thermal speed, respectively. Some points accepted by the MIT software were rejected by ours.

We assume the geocentric direction to the bow shock nose is parallel to the (aberrated) solar wind flow direction: Rt = - |Rt| * V/|V|. (V and |V| are determined from the aberration-corrected V values provided in the input plasma data sets, but with 29.8 km/s, the mean orbital speed of the Earth about the sun, added to their Vy values.)

```|Rt| is provided as a function of the geocentric magnetopause nose distance Rmp and the magnetosonic Mach number Mms
by Farris and Russell (1994) as

|Rt| = Rmp * [1.0 + 1.1 * ((2/3)*Mms**2 + 2) / ((8/3) * (Mms**2 - 1))]

where

Mms = Vsw / Vms

Vms**2 = 0.5 * (Va**2 + Vs**2 + SQRT [(Va**2 + Vs**2)**2 - 4 * Va**2 * Vs**2 * (cos th)**2])

Va = B / SQRT (4pi * (4*Na + Np) * Mp) = 20.3 * B / SQRT (Np)   (Alfven speed)

Vs = 0.12 * [Tp (deg K) + 1.28*10**5]**0.5   (sound speed)

and where the magnetopause nose distance is given in terms of
the solar wind pressure P and Bz, by Shue et al (1997) as

Rmp = (11.4 + K * Bz) * P**(-1/6.6)

where P is the pressure defined as a function of Np and V by

P = (2*10**-6)*Np*Vp**2 (N in cm**-3, Vp in km/s; P in nPa)

and where K = 0.013 if Bz > 0 and K = 0.140 if Bz < 0.

Na and Np above refer to alpha particle and proton densities.
The equation for P assumes a constant 4% alpha particle contribution.
```
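The chain of relations above can be evaluated numerically as in the following Python sketch. It is a minimal illustration under stated assumptions. The function name and argument names are ours. The angle "th" is passed in as an input because its definition is not given in the text. The input values in the usage line are arbitrary illustrative solar wind conditions, not data.

```python
import math

def bow_shock_nose_distance(np_cc, v_kms, b_nt, bz_nt, tp_k, theta_deg):
    """Return (Rmp, |Rt|) in Earth radii from the relations above.

    np_cc: proton density (cm**-3); v_kms: flow speed (km/s); b_nt: field
    magnitude (nT); bz_nt: Bz (nT); tp_k: proton temperature (deg K);
    theta_deg: the angle "th" in the Vms expression (its definition is not
    given in the text, so it is left as an input here).
    """
    pressure = 2.0e-6 * np_cc * v_kms ** 2                 # nPa
    k = 0.013 if bz_nt > 0 else 0.140                      # Bz = 0 gives the same Rmp either way
    r_mp = (11.4 + k * bz_nt) * pressure ** (-1.0 / 6.6)   # Shue et al. (1997)

    va = 20.3 * b_nt / math.sqrt(np_cc)                    # Alfven speed, km/s
    vs = 0.12 * math.sqrt(tp_k + 1.28e5)                   # sound speed, km/s
    sum_sq = va ** 2 + vs ** 2
    cos_th = math.cos(math.radians(theta_deg))
    vms_sq = 0.5 * (sum_sq + math.sqrt(sum_sq ** 2 - 4.0 * va ** 2 * vs ** 2 * cos_th ** 2))
    mms_sq = v_kms ** 2 / vms_sq                           # magnetosonic Mach number squared
    if mms_sq <= 1.0:
        raise ValueError("formula requires a magnetosonic Mach number above 1")

    # Farris and Russell (1994)
    r_t = r_mp * (1.0 + 1.1 * ((2.0 / 3.0) * mms_sq + 2.0) / ((8.0 / 3.0) * (mms_sq - 1.0)))
    return r_mp, r_t

r_mp, r_t = bow_shock_nose_distance(5.0, 400.0, 5.0, 0.0, 1.0e5, 90.0)
print(f"Rmp = {r_mp:.2f} Re, |Rt| = {r_t:.2f} Re")
```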

We have input records with (typically shifted) time tags T and parameter values P. The parameters are either ~15-sec magnetic field or ~1-min plasma parameters. Magnetic field parameters are typically averages of yet higher resolution magnetic field parameters that have been obtained between some first time Tf and some last time Tl. Plasma parameters are as derived from some distribution function accumulated between some first time Tf and some last time Tl. The relation between the input record time tag T and the first and last times (Tf & Tl) of the data on which the record's parameter values are based is dataset-specific. The duration Tl-Tf varies between records for some data sets but not for others.

We want to create output records tagged at the start of every minute. The parameter values in the output records should be based, as much as possible, on observations made during that minute. This means that, for a given output minute, we want to do weighted averages over any input values whose underlying data were obtained, in whole or part, during the output record's minute of interest. One weighting factor is the extent to which the parameters of the input record cover the desired output interval. The other factor is the extent to which the parameters of the input record are determined by data taken outside the minute of interest. These weights may be written as follows.

Let Tf* = Tf or Tf* = the first instant of the output record, whichever is later. Let Tl* = Tl or Tl* = the last instant of the output record, whichever is earlier. Then Tl* - Tf* = the part of the duration of the input record which lies within the duration of the output record. Let S = Tl* - Tf*. The fraction of the input record which lies within the output record time span is (Tl* - Tf*)/(Tl - Tf). Let this fraction be F. Note that F = S/(Tl - Tf). For data sets having the same durations [i.e., (Tl - Tf) values] for all records, we have F = constant * S. ACE and Wind field records and plasma records each has a common Tl-Tf, while both IMP8 field and plasma records have varying Tl-Tf values.

To get parameter values <P> for the output records, find all input records whose parameters are based on observations taken within the output minute of interest. Define the weighted averages as <P> = SUM (Si * Fi * Pi)/SUM (Si * Fi), where i indexes the relevant input records and where the sums are over all the relevant input records. There is interest in defining variance measures of the P values. These may be attributed to variances within the contributing Pi values and to the spread of the Pi values about the mean <P> value. We consider below only the variability in our Pi values about <P>.

Since we build the mean using weighting, we do so also for the variance, using the expression

V = SUM ((Si*Fi) * (Pi-<P>)**2) / SUM (Si*Fi) = <P**2> - <P>**2
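A minimal Python sketch of this weighted averaging, for one parameter and one output minute, is given below. The function name and the (Tf, Tl, P) tuple layout are our own choices for illustration. Times are in seconds, and the dataset-specific step of deriving Tf and Tl from each record's time tag is assumed to have been done already.

```python
def minute_average(records, minute_start, minute_length=60.0):
    """Weighted mean and variance of one parameter for one output minute.

    records: iterable of (Tf, Tl, P) with Tf < Tl, times in seconds on the
    same clock as minute_start. Returns (mean, variance), or None if no input
    record overlaps the output minute.
    """
    minute_end = minute_start + minute_length
    weights, values = [], []
    for t_first, t_last, value in records:
        overlap = min(t_last, minute_end) - max(t_first, minute_start)  # S = Tl* - Tf*
        if overlap <= 0.0:
            continue  # record lies wholly outside this minute
        fraction = overlap / (t_last - t_first)                         # F = S / (Tl - Tf)
        weights.append(overlap * fraction)                              # weight S * F
        values.append(value)
    if not weights:
        return None
    total = sum(weights)
    mean = sum(w * p for w, p in zip(weights, values)) / total
    variance = sum(w * (p - mean) ** 2 for w, p in zip(weights, values)) / total
    return mean, variance

# Two 60-s records, each half inside the output minute starting at t = 0.
print(minute_average([(-30.0, 30.0, 10.0), (30.0, 90.0, 20.0)], 0.0))
```

In this example both records get the same weight (S = 30 s, F = 0.5), so the mean is the simple average of the two values.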

Five-minute averages are computed from the 1-min averages. The 5-min averages tagged with minute = 0 are built from 1-min averages tagged as being for minutes 0, 1, 2, 3 and 4. Likewise for 5-min averages tagged with minutes 5, 10 ... 55.

(This Appendix was originally written as we were creating HRO from ACE, Wind and IMP data. The variant used in adding Geotail data to HRO is described near the end of this Appendix.)

There will be many minutes when shifted data are available from multiple spacecraft. In building High Resolution OMNI (HRO), we shall follow the hourly OMNI practice of selecting data from one source when multiple sources are available. However, instead of following the hourly OMNI practice of selecting the source for each unit time increment, for our HRO products we shall select and intersperse 3-hour data segments [both field and plasma data together] from among our multiple sources.

There are three criteria we shall use, namely, (a) the source-Earth Impact Parameter (IP, separation transverse to the flow, with allowance for Earth's orbital motion), (b) the completeness of magnetic field data coverage in the 3-hour interval, (c) source continuity. This latter means that if neither (a) nor (b) provides a strong discriminant between sources, we shall favor using the source used in the previous 3-hour segment.

We make discrimination between spacecraft pairs algorithmically as follows. Let ScX and ScY represent the two spacecraft being compared.

```Let

A = IP (ScX-Earth)
B = IP (ScY-Earth)
C = fractional ScX coverage, this segment
D = fractional ScY coverage, this segment
E = +1 if ScX data used (i.e., if F>0) in prior segment
E = -1 if ScY data used (i.e., if F<0) in prior segment

Let's define F by

F = a * (|B|-|A|) + b*(C-D)/(D+C) + c*E

For the weights, a, b, c, we have experimented a bit and have chosen

a = 1/(30 Re)
b = 2
c = 0.25

If F > 0, use ScX, otherwise use ScY
```
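The pairwise test above can be written as a short Python function. This is a sketch under stated assumptions: the function names are ours, impact parameters are taken to be in Earth radii so that a = 1/30 per Re, and the coverage term is set to zero when neither spacecraft has any coverage, a case the text does not address.

```python
def selection_score(ip_x, ip_y, cov_x, cov_y, prior, a=1.0 / 30.0, b=2.0, c=0.25):
    """Return F for spacecraft ScX vs. ScY; F > 0 favors ScX.

    ip_x, ip_y: impact parameters in Earth radii (A and B above).
    cov_x, cov_y: fractional field coverage in this 3-hour segment (C and D).
    prior: E = +1 if ScX was used in the prior segment, -1 if ScY, 0 to ignore.
    """
    total_cov = cov_x + cov_y
    # Assumption: with no coverage from either source the coverage term is 0;
    # the text does not say how this case is handled.
    coverage_term = (cov_x - cov_y) / total_cov if total_cov > 0 else 0.0
    return a * (abs(ip_y) - abs(ip_x)) + b * coverage_term + c * prior

def choose_source(ip_x, ip_y, cov_x, cov_y, prior):
    """Return "ScX" if F > 0, otherwise "ScY"."""
    return "ScX" if selection_score(ip_x, ip_y, cov_x, cov_y, prior) > 0 else "ScY"

# ScX is 30 Re closer to the Earth-Sun flow line, coverage is equal, ScY was used last.
print(choose_source(10.0, 40.0, 1.0, 1.0, -1))
```

With these weights, a 30 Re advantage in impact parameter contributes 1 to F, four times the continuity term, while a complete coverage advantage contributes 2.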
For 3-hour intervals with some data available from each of three spacecraft (early 1998 through mid-2000), we have determined the favored spacecraft for each of the three possible pairings of spacecraft and then determined by inspection which one spacecraft was preferable to both of the other two spacecraft.

When we added Geotail data to HRO, we treated the 3-spacecraft-based HRO data set as a single data set and the Geotail data set as a second data set, and used the 2-spacecraft algorithm described above for determining whether, for each 3-hour interval, Geotail data should replace the data previously in HRO. We carefully used the Impact Parameter appropriate to the spacecraft used in HRO for the interval. Further, if the spacecraft used in HRO for a given interval is different from the spacecraft used in HRO for the preceding interval, we ignore the "continuity factor" by setting E = 0 in the above algorithm.

Note that upon making extensions to HRO, we frequently have data from one source spacecraft reaching closer to current data than data from other source(s). In such cases, most current data will be used in HRO with no "F tests" relative to other spacecraft. But later, when data from other source(s) become available, inter-spacecraft tests will be performed and the originally included data may be replaced by data from the other source(s).

This note describes the Excel spreadsheets we've created to capture the regression results produced.

The spreadsheet names, the physical parameter(s) covered, the spacecraft pairs included, and the descriptions of the spreadsheet-specific (parameter-specific) columns are given next. Note that "diff" refers to y-x (= a+(b-1)*x) at specified x value(s), while "ratio" (used for densities and temperatures) refers to y/x (= 10**(a+(b-1)*Log x)) at specified x values. Following that, the common columns are described.

```Magnetic field data  (Both have tables for Wind vs IMP and Wind vs. ACE):

Bt.xls, magnetic field magnitude, diff @ 20 nT

Bxyz.xls, magnetic field cartesian components,  diffs @ 0, ± 20 nT

Plasma data.  (All have tables for Wind/NLF vs. Wind/KP,vs. IMP, and vs. ACE;
some have tables for Wind/KP vs. ACE):

V2.xls, plasma flow speed, diffs at 300, 350, 450, 650 km/s

ThetaV2.xls,  plasma flow elevation angle,  diffs at 0, ± 10 deg.

PhiV2.xls,  plasma flow azimuthal angle, diffs at 0, ± 10 deg

N2.xls,  plasma proton density,  all V and in V bins, ratios at
2, 5, 10, 20/cc.

T2.xls,  proton temperature, all V and in V bins, also Wind/KP vs. ACE, but only for 2002-2005;
extra runs for I8FTS (IMP 8 fine scale points per hour threshold) > 15; ratios at 25K, 63K,160K, 400K deg.
```
The standard impact parameter (IP) requirement for all 2-spacecraft runs was IP(Wind-ACE) or IP(Wind-IMP) < 60 Re. For magnetic field, where we used shifted 15.35-s, 1-min and 4-min data for IMP, Wind and ACE hourly averages respectively, we required numbers of fine scale points in hourly averages to be > 180, 45 and 12 for IMP, Wind and ACE respectively. For plasma hourly averages built from ~1-min, 1.5-min and 4-min IMP, Wind and ACE data, we required equivalent numbers >30, 30, 12 for IMP, Wind and ACE, respectively. These constraints are equivalent to requiring at least about 75% coverage in each hourly average, to minimize inclusion in analysis of hourly average pairs whose members are based on non-overlapping parts of an hour.

All spreadsheets have the following columns: year(s) covered, number of hourly averages, intercept, slope, cross correlation coefficient (ccf), rms difference between points and best fit line, uncertainties in intercept and slope. In addition, each spread sheet has either the differences or the ratios indicated above.

We thank all the ACE, Wind, IMP 8 and Geotail data preparers and providers listed in Section 2. We especially thank Dan Weimer for providing the code used in implementing Phase Front Normal determination techniques 1, 2, 3, 5 listed in Section 3a, and for the consulting in the use of these codes.

Acknowledgement to the SPDF OMNIWeb database as the source of data used in publications is requested: "The OMNI data were obtained from the GSFC/SPDF OMNIWeb interface at http://omniweb.gsfc.nasa.gov". Further, for recent years when few sources (IMP8, Wind, ACE, Geotail) contributed to OMNI, it would be appropriate to also cite the PI's who provided the data to OMNI. Copies of preprints or reprints of OMNI-based publications sent to Natalia Papitashvili (address below) would be appreciated for tracking purposes.

Bargatze, L.F., R.L. McPherron, J. Minamora and D.Weimer, A new interpretation of Weimer et al’s solar wind propagation delay technique, J. Geophys. Res., 110, A07105, doi:10.1029/2004JA010902, 2005.

Farris, M.H. and C.T. Russell, Determining the standoff distance of the bow shock: Mach number dependence and use of models, J. Geophys. Res., 99, 17681-17689, 1994.

Haaland, S., G. Paschmann, and B. U. Ö. Sonnerup (2006), Comment on “A new interpretation of Weimer et al.'s solar wind propagation delay technique” by Bargatze et al., J. Geophys. Res., 111, A06102, doi:10.1029/2005JA011376.

Kasper J. C., A. J. Lazarus, J. T. Steinberg, K. W. Ogilvie, A. Szabo (2006), Physics-based tests to identify the accuracy of solar wind ion measurements: A case study with the Wind Faraday Cups, J. Geophys. Res., 111, A03105, doi:10.1029/2005JA011442.

Knetter, T., F.M. Neubauer, T. Horbury and A. Balogh, Four point discontinuity observations using Cluster magnetic field data: A statistical survey, J. Geophys. Res., 109, A06102, doi:10.1029/2003JA010099, 2004.

Richardson, J.D., and K.I. Paularena, Plasma and magnetic field correlations in the solar wind, J. Geophys. Res., 106, 239-251, 2001.

Shue, J.-H., J.K. Chao, H.C. Fu, C.T. Russell, P. Song, K.K. Khurana and H.J. Singer, A new functional form to study the solar wind control of the magnetopause size and shape, J. Geophys. Res., 102, 9497-9511, 1997.

Sonnerup, B.U.O and L.J. Cahill, Explorer 12 observations of the magnetopause current layer, J. Geophys. Res., 73, 1757, 1968.

Sonnerup, B.U.O. and M. Scheible, Minimum and maximum variance analysis, in Analysis Methods for Multi-Spacecraft Data, edited by G. Paschmann and P.W. Daly, Int. Space Sci. Inst., Bern, 1998.

Weimer, D.R., and J.H. King, Improved calculations of interplanetary magnetic field phase front angles and propagation time delays, J. Geophys. Res., 113, A01105, doi:10.1029/2007JA012452, 2008.

Weimer, D.R., D.M. Ober, N.C. Maynard, W.J. Burke, M.R. Collier, D.J. McComas, N.F. Ness and C.W. Smith, Variable time delays in the propagation of the interplanetary magnetic field, J. Geophys. Res., 107(A8), 10.1029/2001JA009102, 2002.

Weimer, D.R., D.M. Ober, N.C. Maynard, M.R. Collier, D.J. McComas, N.F. Ness, C.W. Smith and J. Watermann, Predicting interplanetary magnetic field (IMF) propagation delay times using the minimum variance delay technique, J. Geophys. Res., 108(A1), 1026, doi:10.1029/2002JA009405, 2003.

If you have any questions or comments about the OMNIWeb system, contact: Dr. Natalia Papitashvili, Mail Code 672, NASA/Goddard Space Flight Center, Greenbelt, MD 20771