On Adding Wind and ACE Data to OMNI

JOSEPH H. KING AND NATALIA E. PAPITASHVILI

ACE IMF Data - March, 2001

These paragraphs were appended to the front of a pre-existing file which describes in great detail the addition of Wind plasma data to the OMNI data set and then, much more briefly, the addition of Wind IMF data to OMNI. In early 2001 ACE IMF data were added to OMNI for 1998 through July, 2000. Later ACE IMF data will be added as they, and concurrent ACE plasma data, become available.

NSSDC downloaded 4-min-averaged ACE level-2 IMF data from the ACE Science Center (ASC) at http://www.srl.caltech.edu/ACE/ASC/level2/index.html. We also downloaded ACE ion-based 64-sec plasma parameters from ASC. The IMF data are from the MAG instrument (N.F. Ness, PI), and the plasma data are from the SWEPAM instrument (D. McComas, PI). An intermediate 4-min resolution set of ACE IMF, plasma, and position data at ACE times is accessible at http://nssdc.gsfc.nasa.gov/ftphelper/ace_merge.html. We used ACE position information from the IMF data records and the solar wind speed values from SWEPAM to time shift the 4-min IMF data to Earth using the approach described below in the Wind plasma section. We then created hourly IMF averages of all time-shifted 4-min values whose new time tags fell within each hour of interest. We then merged these with concurrent IMP and Wind IMF data to look for systematic offsets or other problems and found none in the ACE IMF data. [This IMP-Wind-ACE hourly merged data set is accessible at http://nssdc.gsfc.nasa.gov/ftphelper/imp_wind_ace.html.] Finally, we added to OMNI the newly created ACE IMF hourly averages for those hours not previously having IMF data from IMP or Wind. The numbers of ACE IMF hours added to OMNI for 1998, 1999 and 2000 were 194, 2780 and 3483.

ACE plasma data will be included in a later OMNI version which may normalize multi-source plasma parameters differently than has been done since 1973 and which may prioritize spacecraft for OMNI inclusion by the continuity of their data rather than by their proximity to the Earth's magnetosphere.

Wind IMF Data - January, 2001

This January 2001 addendum briefly describes the addition of November 1994 - July 1999 IMF data from the GSFC MFI instrument on Wind. The data supersede the May 1995 - June 1996 MFI data previously included in OMNI. One-min IMF averages were time-shifted to Earth and hourly averages were built at Earth (as described in more detail for plasma parameters below). An IMP/Wind merged hourly data set was built from which overlapping plots and lists are available at http://nssdc.gsfc.nasa.gov/ftphelper/imp_wind.html. IMP-Wind scatter plots revealed close adherence to y=x for all IMF parameters. Wind values were added to OMNI for hours not having IMP data or having IMP data but only as parts of segments shorter than 4 hours.


Wind Plasma Data - September, 1999

EXECUTIVE SUMMARY

This report describes the addition to NSSDC's OMNI data set of solar wind plasma data from the Solar Wind Experiment (SWE) on the NASA Wind spacecraft. The SWE Principal Investigator is K.W. Ogilvie of Goddard. The Wind data were obtained from the Co-Investigator, A.J. Lazarus of MIT, and the IMP data from A.J. Lazarus (P.I.) and his colleague K. Paularena.

Of special note herein are: the time shifting of ~1-hour-upstream Wind data for interspersing with IMP 8 data; the comparisons of Wind plasma data and IMP/MIT data; and the rationale for continuing to use the IMP/LANL data set as the fiducial data set to which others are normalized (at least for a short while longer).

WIND DATA INCLUSION AND TIME SHIFTING

Because most spacecraft contributing to the OMNI data set, e.g., IMP 8, are typically within 12 minutes or so of the Earth (for typical solar wind flow speeds), we have not time shifted such data in building hourly averages. However, the Wind spacecraft has a much larger orbit which frequently puts it an hour or so upstream of the Earth. As such, it is important to time shift the ~90 sec. resolution Wind data to Earth, and then to build hourly averages for Earth hours before interspersing with IMP hourly averages in OMNI.

Numerous analyses have been performed on various approaches to time shifting. Wind is up to several tens of Earth radii from the Earth-Sun line, and phase fronts of solar wind variations have a range of orientations. Additionally, some solar wind variations propagate relative to the ambient solar wind.

As is described in detail in Interplanetary Medium Data Book Supplement 3 (http://nssdc.gsfc.nasa.gov/omniweb/om_book/sup3/s3_main.html), we found upon adding ISEE 3 data to OMNI some years ago that best agreement was obtained, statistically, when it was assumed that the variation phase fronts were aligned in a direction intermediate between being normal to the Earth-Sun line and being along the Parker IMF spiral direction. However, the distribution of agreement levels vs. assumed phase front orientation was broad, and agreement was almost as good by assuming either the normal-to-radial or spiral angle alignments. We somewhat arbitrarily chose to time shift ISEE 3 data assuming alignment of variation phase fronts with the spiral angle. That is, we used a corotation delay rather than a radial convection delay.

A recent analysis of Wind and IMP data by Richardson and Paularena (1998) confirms the intermediate alignment of plasma features and thus supports the statistical equivalence (equal goodness) of the radial convection and corotation delay approaches.

For consistency with our earlier ISEE work, we have time shifted Wind plasma and magnetic field data to Earth, using observed Wind speeds and Wind locations. The Wind average tagged in OMNI as being for hour 2 of a given day (hour tags run from 0 to 23) is built from 1-min values whose time-shifted time tags fall in the range 02:00-03:00. The standard deviations in Wind-based parameters given in OMNI are those obtained in the process of creating the hour averages from the time-shifted 1-min values.
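
The shift-then-average step described above can be sketched as follows. This is a minimal illustration, not the production OMNI code. The function names, the (time, delay, value) tuple layout, and the use of a population standard deviation are assumptions. The delay helper uses the simpler radial convection delay X/V as a stand-in; the report itself used a corotation delay, whose formula is not reproduced here.

```python
import math
from collections import defaultdict

def radial_convection_delay(x_km, v_kms):
    """Seconds for plasma moving at v_kms to cover x_km along the Earth-Sun line.

    Stand-in only: the report used a corotation delay instead.
    """
    return x_km / v_kms

def hourly_averages(samples):
    """Average time-shifted fine-scale values into Earth hours.

    samples: iterable of (t_obs_s, delay_s, value), times in seconds from
    the start of the day (hypothetical layout).
    Returns {hour_tag: (mean, standard_deviation, number_of_points)}.
    """
    bins = defaultdict(list)
    for t_obs, delay, value in samples:
        t_earth = t_obs + delay               # time tag shifted to Earth
        bins[int(t_earth // 3600)].append(value)
    result = {}
    for hour, values in sorted(bins.items()):
        n = len(values)
        mean = sum(values) / n
        # Population standard deviation (an assumption of this sketch).
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        result[hour] = (mean, std, n)
    return result
```

A value observed at 01:56:40 with a 300 s delay receives the shifted tag 02:01:40 and therefore contributes to the hour-2 average, not the hour-1 average.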

Owing to its greater proximity to the Earth, IMP provides a more reliable measure of the state of the solar wind at Earth than does Wind. Therefore for any hours when data from both IMP and Wind are available, IMP data are selected for inclusion in OMNI. There is one exception, however. When the IMP hourly plasma averages are based on five or fewer points (out of about 45 possible), and the Wind averages for that hour are based on more than 5 points (out of 60 possible), Wind plasma data are used rather than IMP data.
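
The selection rule in the preceding paragraph can be written as a short decision function. This is a sketch under stated assumptions: the function name and the use of a zero point count to mean "no data for this hour" are illustrative and do not come from the OMNI processing code.

```python
def choose_plasma_source(imp_points, wind_points, sparse_limit=5):
    """Pick the spacecraft whose hourly plasma average goes into OMNI.

    imp_points, wind_points: number of fine-scale points in each hourly
    average (0 is taken to mean that no average exists; an assumption).
    Returns "IMP", "Wind", or None.
    """
    has_imp = imp_points > 0
    has_wind = wind_points > 0
    if has_imp and has_wind:
        # Exception: a sparse IMP hour yields to a better-sampled Wind hour.
        if imp_points <= sparse_limit and wind_points > sparse_limit:
            return "Wind"
        return "IMP"                          # IMP is closer to Earth
    if has_imp:
        return "IMP"
    if has_wind:
        return "Wind"
    return None
```

With this rule, an hour with 3 IMP points and 50 Wind points takes Wind data, while an hour with 3 IMP points and 4 Wind points still takes IMP data.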

PLASMA DATA COMPARISONS: IMP/MIT AND WIND

Over its history, OMNI's multi-source data have been normalized to a fiducial data set. This is best described in the original 1977 Data book. What was found was that plasma densities and temperatures needed to be normalized to achieve a level of uniformity limited only by random differences in pairs of observed values, whereas plasma flow speed and magnetic field components and magnitude had no systematic differences between source spacecraft pairs large enough to warrant cross normalization.

We have done cross normalization between IMP hour averages and concurrent (time-shifted) Wind hour averages for plasma flow speed, density, and temperature. For the latter two parameters, we cross normalize logs of densities and temperatures (as in all our previous density and temperature cross-normalizations), since these parameters tend to be more nearly log-normally than normally distributed.

To minimize effects of cases where IMP averages and time shifted Wind averages may be based on different parts of an hour (given the reality of telemetry data gaps, etc.), we have required at least 30 fine scale points be included in both IMP and Wind averages before including that hour's values in the cross normalization analysis.

We have looked for both time dependencies, within the 1995-1998 interval, and flow speed dependencies in cross normalization. Searching for flow speed dependencies, not done in prior OMNI data preparation work, was stimulated by recent work of others (e.g., Russell and Petrinec, 1993; Lazarus and Paularena, 1998).

Cross-normalizations were done by searching for the parameters a and b in the linear equation P(IMP) = a + b*P(Wind) (where P = V, log N or log T) that minimize the sum of squares of the perpendicular distances between the data points (in P(IMP), P(Wind) space) and the best fit line.
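
A minimal sketch of such a perpendicular-distance (orthogonal) fit is given below, using the standard closed-form slope derived from the sample variances and covariance. The function name and argument order are assumptions of this example, and the sketch does not estimate the uncertainties in a and b.

```python
import math

def perpendicular_fit(p_wind, p_imp):
    """Fit P(IMP) = a + b * P(Wind), minimizing perpendicular distances.

    Returns (a, b). Inputs are equal-length sequences of V, log N or log T.
    """
    n = len(p_wind)
    if n < 2 or n != len(p_imp):
        raise ValueError("need two equal-length sequences of at least 2 points")
    mx = sum(p_wind) / n
    my = sum(p_imp) / n
    sxx = sum((x - mx) ** 2 for x in p_wind)
    syy = sum((y - my) ** 2 for y in p_imp)
    sxy = sum((x - mx) * (y - my) for x, y in zip(p_wind, p_imp))
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    # Closed-form slope of the line through the centroid that minimizes
    # the summed squared perpendicular distances.
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    a = my - b * mx
    return a, b
```

Unlike ordinary least squares, this fit treats the two spacecraft symmetrically: exchanging the roles of IMP and Wind gives the reciprocal slope.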

Results of these analyses, by parameter and by year, for all flow speeds combined, are given in Table 1.

Table 1. Best-fit lines P(IMP) = a + b * P(Wind) by year, all flow speeds combined.

Parameter   Year         a        b
V           1995         4     0.98
V           1996         1     0.99
V           1997         8     0.97
V           1998         4     0.98

log-N       1995     -.048    1.072
log-N       1996     -.053    1.080
log-N       1997     -.075    1.092
log-N       1998     -.096    1.121

log-T       1995     -.732    1.117
log-T       1996     -.708    1.113
log-T       1997    -1.012    1.180
log-T       1998     -.461    1.070

The numbers of hours (points) in the four annual analyses range between 1600 and 2000. The statistical uncertainties in the density intercepts and slopes are in the range .010-.015. The equivalent uncertainties in the temperature intercepts are in the range .088-.126 and in the temperature slopes are .018-.026.

The flow speed results are close enough to Y=X to warrant no speed normalization, as for all our prior data sources. The density and temperature lines diverge sufficiently from Y=X to require normalization. The density result shows a small steepening of the regression line with time; however, the smallness of this steepening suggests that we can neglect this time dependence for any practical purpose, and we shall do so. The temperature result shows a larger but nonuniform time variation. The nonuniformity suggests randomness rather than systematic effects, so again we shall neglect the time dependence in temperature cross-normalizations.

We next proceed to search for flow speed dependence in the cross normalization. For this, we combine all our 1995-1998 data, and then separate it into three speed bins: <380 km/s (3045 hours), 380-450 km/s (2193 hours), >450 km/s (1752 hours). The results are given in Table 2.

Table 2. Best-fit lines P(IMP) = a + b * P(Wind) by flow speed bin, 1995-1998 data combined.

Parameter   Speed (km/s)        a        b
V           All data            4     0.98
V           <380               13     0.96
V           380-450             4     0.98
V           >450                6     0.98

log-N       All data        -.063    1.086
log-N       <380            -.081    1.100
log-N       380-450         -.098    1.121
log-N       >450            -.071    1.122

log-T       All data        -.733    1.120
log-T       <380           -1.184    1.219
log-T       380-450         -.771    1.131
log-T       >450            -.823    1.129

The statistical uncertainties in the speed-binned intercepts are in the ranges 1-4 (V), .012-.015 (log N), and .110-.221 (log T). The equivalent uncertainties in the slopes are .003-.009 (V), .011-.020 (log N), and .023-.042 (log T).

Gathering the "all data" results from above, and adding uncertainties to them, we have:

      (1)         V(IMP) = 4.245 (+/-.640) + 0.983 (+/-.001) * V(Wind)
      (2)     log-N(IMP) = -.063 (+/-.006) + 1.086 (+/-.006) * log-N(Wind)
      (3)     log-T(IMP) = -.733 (+/-.056) + 1.120 (+/-.012) * log-T(Wind)

Figures 1-3 show the scatter plots of these parameters. In each of these figures, we have plotted <380 km/s points in blue, 380-450 km/s points in red, and >450 km/s points in green. Note that for the log-N and log-T scatter plots, the points were laid down in a blue, red, green sequence, with later colors hiding earlier colors for cases of overlap. Nevertheless it is clear that fast (slow) flows are associated with hot (cool) and dilute (dense) plasmas, as has long been known.

The black lines on Figures 1-3 are the best fit lines using all data in their determination (i.e., equations 1-3). In Figures 4-6 we show families of best fit lines determined from each of the three speed intervals, using the same color convention as for Figures 1-3, together with the all-data best fit lines already shown on Figures 1-3. On the scale of the plot, the four velocity best fit lines in Figure 4 are indistinguishable.

The density lines are virtually indistinguishable except that the high speed (green) line stands apart at high density values. But from the combination of Figures 2 and 5 it is clear that the green and black lines both pass through the low and middle density points where most of the data points are; the difference between the green and black lines there is clearly less than the width of the main distribution of points.

The log-temperature points of Figure 3 show a broader distribution than either the flow speed or the log-density points. The regression lines of Figure 6 also stand apart from each other more clearly than for speed or log-density, but again their separation is significantly less than the width in the distributions of Figure 3.

Because the differences in the regression lines are in all cases significantly less than the widths of the underlying distributions, we will consider that all our Wind density and temperature data may be normalized to the IMP/MIT data with the single, all data regression lines, the black lines of Figures 1-6 and equations 1-3.

Note that previous authors who have found speed dependencies in density correlation have used fractional density differences and have not used logarithms of densities. Our use of

      log-N(IMP) = a + b * log-N(Wind),

with a and b independent of speed, yields

      N(IMP)/N(Wind) = (10**a) * [N(Wind) ** (b-1)].

The ratio is itself a function of N(Wind) which, as is evident from Figure 2, is statistically related to the flow speed.

PLASMA DATA NORMALIZATION: IMP/MIT AND WIND

In the initial OMNI creation, we opted, somewhat arbitrarily, to normalize IMP8/MIT data to a merged IMP 6-7-8 data set provided by Los Alamos National Laboratory (LANL). This data set had been internally normalized by LANL to match the highest plasma densities (IMP 7). Addition of more IMP 8 data in the late 1970's revealed that the MIT-LANL best fit regression lines were nearly time invariant, so the same normalization of MIT densities and temperatures was used. Subsequently all MIT data added to OMNI have been subjected to the same normalization.

At this point, we must address the possibility of renormalizing the data.

In the near future, we expect to perform more extensive MIT-LANL cross-normalization over the 25+ year IMP life, looking for time dependencies therein. In addition, within the next year, we expect to receive from MIT both reprocessed IMP data and more definitive Wind data than the Key Parameter data used to date. Given these two facts, we expect to be in a much better position to address renormalizations of data in about a year's time. As such, we shall maintain the present normalization of OMNI data to IMP/LANL for the present time, rather than making one change now and possibly another one in a year. However, it did not seem useful either to delay the addition of Wind data to OMNI, given that Wind data addition provides a nearly 100%-complete data record relative to the 40%-complete record achieved in recent years with IMP 8 alone, or to insert Wind data into OMNI with no normalization at all.

The normalization of IMP/MIT data to IMP/LANL data uses the same equations as have been used for years:

      log-N(norm) = 0.121 + 0.89 * log-N(impmit)
      log-T(norm) = -0.62 + 1.11 * log-T(impmit)

The normalization of Wind data to IMP/LANL data is obtained by chaining these LANL/MIT equations with the MIT/Wind equations above to give:

      log-N(norm) =  0.065 + 0.967 * log-N(wind)
      log-T(norm) = -1.434 + 1.243 * log-T(wind)

EFFECTS OF OMNI NORMALIZATION ON PLASMA DENSITY VALUES

The ratios N(norm)/N(obsv) and T(norm)/T(obsv) for obsv = Wind and IMP/MIT are given in Figure 7 and Figure 8. These figures also show the ratios N(MIT)/N(Wind) and T(MIT)/T(Wind) obtained earlier in this analysis. These ratios show the fractional changes upon normalization of observed densities and temperatures, as functions of the observed value. The MIT/Wind ratios show the fractional changes that would be made in the Wind values if they were normalized to MIT values rather than to LANL values.

Consider first the density normalization. Normalizing MIT to LANL data (as has been done for 25 years for OMNI) implies an increase of MIT observations by ~20% at lowest densities (~2/cm**3), no change in the ~12/cm**3 range, and a decrease of 10% or so at high densities (~30/cm**3). (Please recall that the normalizing of spacecraft A data to spacecraft B values is arbitrary; we do not assert the data of spacecraft B is more likely to be true than the data of spacecraft A although we hope to have more basis for such assertions (and renormalizations) in a year or so.)

Normalizing Wind data to LANL data involves smaller changes than for the MIT data discussed in the preceding paragraph. Here, we increase Wind data by ~12% at the lowest densities and increase Wind data ~2% at the highest densities. We do not decrease the Wind densities at any density level.

Finally, normalizing Wind data to MIT data would imply <10% decrease of Wind densities at low densities (<5/cm**3) and increasing Wind densities above 5/cm**3 by amounts increasing to 20% for the highest densities encountered.
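
The density ratios discussed in the last three paragraphs follow directly from the log-linear normalization equations, and can be checked with a short calculation. The sketch below is illustrative only: the coefficient pairs are taken from the equations in this report, while the names chain and ratio and the choice of sample densities are assumptions of the example.

```python
# (a, b) pairs for log-N(target) = a + b * log-N(source), from this report.
MIT_TO_LANL_N = (0.121, 0.89)
WIND_TO_MIT_N = (-0.063, 1.086)

def chain(outer, inner):
    """Compose two log-linear normalizations: apply inner, then outer."""
    a1, b1 = outer
    a2, b2 = inner
    return (a1 + b1 * a2, b1 * b2)

def ratio(coeffs, observed):
    """Normalized/observed value: (10**a) * observed**(b - 1)."""
    a, b = coeffs
    return 10.0 ** a * observed ** (b - 1.0)

# Chaining reproduces the Wind-to-LANL coefficients to three decimals.
WIND_TO_LANL_N = chain(MIT_TO_LANL_N, WIND_TO_MIT_N)

for density in (2.0, 12.0, 30.0):             # sample densities, per cm**3
    print(density,
          round(ratio(MIT_TO_LANL_N, density), 3),
          round(ratio(WIND_TO_LANL_N, density), 3),
          round(ratio(WIND_TO_MIT_N, density), 3))
```

The same two functions apply to the temperature equations by substituting the log-T coefficient pairs.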

It is clear that normalizing densities can give fractional density changes of up to 20% at the extreme ends of the solar wind density range encountered. A very useful question to ask is: given the distribution of solar wind density values actually encountered, what is the effect of these normalizations on time-integrated density averages, which may be more significant than hourly averages in, for example, determining the distance to the heliopause?

Figure 9 shows 27-day averaged densities for 1995-1998 obtained three ways. First, the actual OMNI data (IMP/MIT and Wind interspersed) as normalized to the LANL data are shown in red. Second, a hypothetical OMNI data set (IMP/MIT and Wind data interspersed as in the actual case) normalized to the IMP/MIT data is shown in green. Third, the black trace shows unnormalized Wind data only. 27-day averaged densities of >~9/cm**3 are virtually indistinguishable for normalization to either IMP/MIT or IMP/LANL. This is consistent with the LANL/MIT trace in Figure 7 being close to unity in the 9-15/cm**3 range and with there being no 27-day averages above this range. On the other hand, at lower densities, the LANL-normalized 27-day averages exceed the MIT-normalized averages by a few percent, again consistent with the LANL/MIT trace in Figure 7.

More significant is the fact that the 27-day averages of unnormalized Wind densities are 10-15% lower than the LANL-normalized OMNI data. This points out the importance of trying to determine the normalization most likely to give absolutely true plasma densities. As indicated above, with the arrival of newly processed IMP and Wind plasma densities from MIT in the coming year, with a study of the IMP/LANL data of the 1980's and 1990's, and with a review of others' plasma cross comparison work of recent years (including comparisons of in situ plasma density determinations with densities inferred from plasma wave cutoff frequencies as on Wind), we will assess whether renormalizations will yield OMNI density values more likely to be absolutely true.

EFFECTS OF OMNI NORMALIZATION ON PLASMA TEMPERATURE ESTIMATES

Figure 8, introduced above, shows the fractional changes of observed IMP/MIT and Wind temperature values when normalized to IMP/LANL and the fractional change of Wind temperatures if normalized to IMP/MIT. Virtually everywhere, MIT and Wind temperatures are decreased when normalized to LANL for inclusion in OMNI. The decrease in Wind temperatures at temperatures below 10**5 deg K is by a factor of 2 to 3! Fortunately many fewer science analyses depend on reliable values of solar wind temperatures than depend on good density values, but those researchers who are using temperature values quantitatively in their analyses should be cognizant of the fairly broad discrepancy between data sources.

DATA COVERAGE AT THIS TIME

As of the date of this report (September 1999), the fractional overall coverage for OMNI's field and plasma data is given in Figure 10 at annual resolution. The benefit to OMNI and its users of having Wind data available is dramatically obvious in the plasma coverage values of the 1995-1998 period. We expect to add additional Wind magnetic field data in the coming months, such that the magnetic field and plasma coverage after 1994 will more closely resemble each other.

A MINOR Y2K-DRIVEN FORMAT CHANGE

Since OMNI's inception decades ago, the first word (one character) in each hourly record was a flag indicating whether magnetic field and/or plasma data were present in the record and, if both were, whether they were from the same spacecraft or not. It is our impression that this word has not been used by OMNI users. Its information content is redundant relative to information elsewhere in the data records.

The next three characters of each OMNI record have been a blank and a 2-digit year designator. In order to be more Y2K-compliant via the use of 4-digit year fields, and taking advantage of the lack of need for, and perceived non-use of, the initial flag, we henceforth will use the first 4 characters for a 4-digit year field and we will abandon the flag.
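
For users whose reading software must handle both layouts, the change affects only the first four characters of each record. The sketch below is a hypothetical helper, not part of any NSSDC software. It assumes that every old-layout record belongs to the 1900s, and the sample records shown after it are invented for illustration; the remainder of the record layout is not described here.

```python
def record_year(record, four_digit_year=True):
    """Return the year encoded at the start of an OMNI hourly record.

    four_digit_year=True  : new layout, characters 1-4 hold the year.
    four_digit_year=False : old layout, character 1 is the flag,
                            character 2 is blank, characters 3-4 hold a
                            2-digit year (assumed here to mean 19xx).
    """
    if four_digit_year:
        return int(record[0:4])
    return 1900 + int(record[2:4])
```

For example, record_year("1999 267 12") and record_year("1 99 267 12", four_digit_year=False) both return 1999.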

REFERENCES

1. Lazarus and Paularena, A comparison of solar wind parameters from experiments on the IMP 8 and Wind spacecraft, pp 85-90, "Measurement Techniques in Space Plasmas (Particles)", ed. Pfaff, Borovsky, and Young, AGU, 1998.

2. Richardson and Paularena, The orientation of plasma structure in the solar wind, GRL, 25, 2097-2100, 1998.

3. Russell and Petrinec, Intercalibration of solar wind instruments during the International Magnetospheric Study, JGR, 98, 18,963 - 18,970, 1993. See also Comment on that paper: Paularena and Lazarus, JGR, 99, 14,777 - 14,778, 1994. 


NASA Official: J. H. King, king@nssdca.gsfc.nasa.gov
Version 1.0, 24 September, 1999
Last Updated: 24 September, 1999, NEP