Unreplicated early generation variety trial - Wheat

Introduction

To further illustrate the approaches presented in the previous section, we consider an unreplicated field experiment conducted at Tullibigeal in south-western NSW. The trial was an S1 (early stage) wheat variety evaluation trial and consisted of 525 test lines which were randomly assigned to plots in a 67 by 10 array. A check variety, numbered 526, was sown every sixth plot within each column, that is, on rows 1,7,13,...,67 of each column. A further 6 replicated commercially available varieties (numbered 527 to 532) were also randomly assigned to plots, with between 3 and 5 plots of each. The aim of these trials is to identify and retain the top, say, 20% of lines for further testing.

Cullis et al. (1989) considered the analysis of early generation variety trials, and presented a one-dimensional spatial analysis which was an extension of the approach developed by Gleeson and Cullis (1987). In that approach the test line effects are assumed random, while the check variety effects are considered fixed. This may not be sensible or justifiable for most trials and can lead to inconsistent comparisons between check varieties and test lines. Fitting the check varieties as random costs little: given the large amount of replication afforded to them, there will be very little shrinkage of their effects irrespective of the realised heritability.
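
The plot counts implied by this layout can be checked with a few lines of arithmetic. The sketch below is illustrative only and assumes that every one of the 670 plots carries exactly one entry; the variable names are made up for this example.

```python
# Plot bookkeeping for the 67-row by 10-column layout described above.
# Assumption: all 670 plots are sown, each to exactly one entry.
n_rows, n_cols = 67, 10
total_plots = n_rows * n_cols

# Check variety 526 is sown on rows 1, 7, 13, ..., 67 of every column.
check_rows = list(range(1, n_rows + 1, 6))
check_plots = len(check_rows) * n_cols

# Each of the 525 test lines is unreplicated, so it occupies one plot.
test_plots = 525

# Whatever remains must belong to the six replicated varieties 527 to 532.
replicated_plots = total_plots - check_plots - test_plots

print("check rows per column:", len(check_rows))
print("total, check, replicated:", total_plots, check_plots, replicated_plots)
```

Under that assumption the six commercial varieties share 25 plots, which is consistent with 3 to 5 plots of each.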

We begin with an analysis that has spatial correlation in one direction and fits all variety effects (check, replicated and unreplicated lines) as random. We then present three further spatial models for comparison. The ASReml input file is
 Tullibigeal trial
   linenum
   yield
   weed
   column 10
   row 67
   variety 532   # testlines 1:525, check lines 526:532
 wheat.asd !skip 1 !DOPATH 1
 !PATH 1                       # AR1 x I
 y ~ mu  weed mv !r variety
 1 2
 67 row AR1 0.1
 10 column I 0

 !PATH 2                       # AR1 x AR1
 y ~ mu  weed mv !r variety
 1 2
 67 row AR1 0.1
 10 column AR1 0.1

 !PATH 3                       # AR1 x AR1 + column trend
 y ~ mu weed pol(column,-1) mv !r variety
 1 2
 67 row AR1 0.1
 10 column AR1 0.1

 !PATH 4                       # AR1 x AR1 + Nugget + column trend
 y ~ mu weed pol(column,-1) mv !r variety units
 1 2
 67 row AR1 0.1
 10 column AR1 0.1
 predict var
The data fields represent the factors variety, row and column, a covariate weed and the plot yield (yield). There are four paths in the ASReml file. We begin with the one-dimensional spatial model, which assumes that the variance model for the plot effects within columns is described by a first order autoregressive process. The abbreviated output file is
    1 LogL=-4280.75     S2= 0.12850E+06    666 df   0.1000      1.000     0.1000
    2 LogL=-4268.57     S2= 0.12138E+06    666 df   0.1516      1.000     0.1798
    3 LogL=-4255.89     S2= 0.10968E+06    666 df   0.2977      1.000     0.2980
    4 LogL=-4243.76     S2=  88033.        666 df   0.7398      1.000     0.4939
    5 LogL=-4240.59     S2=  84420.        666 df   0.9125      1.000     0.6016
    6 LogL=-4240.01     S2=  85617.        666 df   0.9344      1.000     0.6428
    7 LogL=-4239.91     S2=  86032.        666 df   0.9474      1.000     0.6596
    8 LogL=-4239.88     S2=  86189.        666 df   0.9540      1.000     0.6668
    9 LogL=-4239.88     S2=  86253.        666 df   0.9571      1.000     0.6700
   10 LogL=-4239.88     S2=  86280.        666 df   0.9585      1.000     0.6714
  Final parameter values                       0.95918     1.0000    0.67205

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  variety                 532    532  0.959184       82758.6       8.98   0 P
  Variance                670    666   1.00000       86280.2       9.12   0 P
  Residual            AR=AutoR    67  0.672052      0.672052      16.04   1 U

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      83.6  9799.18            <.001
    3 weed                              1     477.0   109.33            <.001
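
As a reading aid for the variance component table above, the Component column for variety appears to be the Gamma ratio multiplied by the residual variance. The following is a minimal sketch of that check using the values printed in the table; the variable names are illustrative.

```python
# Values copied from the AR1 x I output table above.
sigma2 = 86280.2           # "Variance" row, Component column
gamma_variety = 0.959184   # "variety" row, Gamma column
reported_component = 82758.6

# A gamma is a variance ratio relative to the residual variance,
# so the component on the variance scale is gamma * sigma2.
component = gamma_variety * sigma2

print(f"gamma * sigma2 = {component:.1f}, reported = {reported_component}")
```

The small discrepancies seen when the same check is applied to the later tables are of the size expected from the rounding of the printed values.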
The iterative sequence converged; the REML estimate of the autoregressive parameter (0.67) indicates substantial within-column heterogeneity. The abbreviated output from the two-dimensional AR1 cross AR1 spatial model is
    1 LogL=-4277.99     S2= 0.12850E+06    666 df
    2 LogL=-4266.13     S2= 0.12097E+06    666 df
    3 LogL=-4253.05     S2= 0.10777E+06    666 df
    4 LogL=-4238.72     S2=  83156.        666 df
    5 LogL=-4234.53     S2=  79868.        666 df
    6 LogL=-4233.78     S2=  82024.        666 df
    7 LogL=-4233.67     S2=  82725.        666 df
    8 LogL=-4233.65     S2=  82975.        666 df
    9 LogL=-4233.65     S2=  83065.        666 df
   10 LogL=-4233.65     S2=  83100.        666 df

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  variety                 532    532   1.06038       88117.5       9.92   0 P
  Variance                670    666   1.00000       83100.1       8.90   0 P
  Residual            AR=AutoR    67  0.685387      0.685387      16.65   0 U
  Residual            AR=AutoR    10  0.285909      0.285909       3.87   0 U

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      41.7  6248.65            <.001
    3 weed                              1     491.2    85.84            <.001
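
To give a feel for what the two autoregressive estimates above imply, the sketch below evaluates the separable AR1 cross AR1 correlation between two plots as a function of their row and column separation. It is an illustration of the correlation structure, not of how ASReml computes it, and the function name is made up for this example.

```python
# REML estimates from the AR1 x AR1 output table above.
rho_row = 0.685387   # autoregressive parameter for rows (67 levels)
rho_col = 0.285909   # autoregressive parameter for columns (10 levels)

def ar1xar1_corr(d_row, d_col, rho_r=rho_row, rho_c=rho_col):
    """Correlation between two plots d_row rows and d_col columns apart
    under a separable first order autoregressive model."""
    return rho_r ** abs(d_row) * rho_c ** abs(d_col)

for d_row, d_col in [(1, 0), (2, 0), (0, 1), (1, 1)]:
    print(d_row, d_col, round(ar1xar1_corr(d_row, d_col), 3))
```

The correlation decays much more slowly along rows than across columns, which matches the relative size of the two estimates.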
The change in REML LogL is significant (χ²₁ = 12.46, p<.001) with the inclusion of the autoregressive parameter for columns. Figure 1 presents the sample variogram of the residuals for the AR1 cross AR1 model. There is an indication that a linear drift from column 1 to column 10 is present. We include a linear regression coefficient pol(column,-1) in the model to account for this. Note that we use the '-1' option in the pol term to exclude the overall constant in the regression, as it is already fitted. The linear regression of yield on column number is significant (t=-2.96). The sample variogram (Figure 2) is more satisfactory, though interpretation of variograms is often difficult, particularly for unreplicated trials. This is an issue for further research.
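
The likelihood ratio statistic quoted for the column autoregressive parameter can be reproduced from the final LogL values of the first two models. This is a minimal sketch using only the Python standard library; it relies on the identity that the upper tail of a chi-squared variate on 1 degree of freedom is erfc(sqrt(x/2)).

```python
import math

# Final REML log-likelihoods from the two outputs above.
logl_ar1_id = -4239.88    # AR1 x I
logl_ar1_ar1 = -4233.65   # AR1 x AR1

# The models differ by one parameter (the column autocorrelation).
lrt = 2.0 * (logl_ar1_ar1 - logl_ar1_id)

# P(chi-squared on 1 df > lrt)
p_value = math.erfc(math.sqrt(lrt / 2.0))

print(f"LRT = {lrt:.2f}, p = {p_value:.5f}")
```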


Figure 1. Sample variogram of the residuals from the AR1 cross AR1 model for the Tullibigeal data


Figure 2. Sample variogram of the residuals from the AR1 cross AR1 + pol(column,-1) model for the Tullibigeal data

The abbreviated output for this model and the final model in which a nugget effect has been included is
 #AR1xAR1 + pol(column,-1)
    1 LogL=-4270.99     S2= 0.12730E+06    665 df
    2 LogL=-4258.95     S2= 0.11961E+06    665 df
    3 LogL=-4245.27     S2= 0.10545E+06    665 df
    4 LogL=-4229.50     S2=  78387.        665 df
    5 LogL=-4226.02     S2=  75375.        665 df
    6 LogL=-4225.64     S2=  77373.        665 df
    7 LogL=-4225.60     S2=  77710.        665 df
    8 LogL=-4225.60     S2=  77786.        665 df
    9 LogL=-4225.60     S2=  77806.        665 df

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  variety                 532    532   1.14370       88986.3       9.91   0 P
  Variance                670    665   1.00000       77806.0       8.79   0 P
  Residual            AR=AutoR    67  0.671436      0.671436      15.66   0 U
  Residual            AR=AutoR    10  0.266088      0.266088       3.53   0 U

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      42.5  7073.70            <.001
    3 weed                              1     457.4    91.91            <.001
    8 pol(column,-1)                    1      50.8     8.73            0.005

 #
 #AR1xAR1 + units + pol(column,-1)
 #
    1 LogL=-4272.74     S2= 0.11683E+06    665 df  :   1 components constrained
    2 LogL=-4266.07     S2=  50207.        665 df  :   1 components constrained
    3 LogL=-4228.96     S2=  76724.        665 df
    4 LogL=-4220.63     S2=  55858.        665 df
    5 LogL=-4220.19     S2=  54431.        665 df
    6 LogL=-4220.18     S2=  54732.        665 df
    7 LogL=-4220.18     S2=  54717.        665 df
    8 LogL=-4220.18     S2=  54715.        665 df

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  variety                 532    532   1.34824       73769.0       7.08   0 P
  units                   670    670  0.556400       30443.6       3.77   0 P
  Variance                670    665   1.00000       54715.2       5.15   0 P
  Residual            AR=AutoR    67  0.837503      0.837503      18.67   0 U
  Residual            AR=AutoR    10  0.375382      0.375382       3.26   0 U

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      13.6  4241.53            <.001
    3 weed                              1     469.0    86.39            <.001
    8 pol(column,-1)                    1      18.5     4.84            0.040
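
The two outputs above differ only by the units (nugget) term, so their final LogL values can be compared directly. The sketch below computes the likelihood ratio statistic; the halving of the p-value is a commonly used adjustment for testing a variance component on the boundary of its parameter space and is an assumption added here, not something stated in the output.

```python
import math

# Final REML log-likelihoods from the two outputs above.
logl_no_nugget = -4225.60   # AR1 x AR1 + pol(column,-1)
logl_nugget = -4220.18      # AR1 x AR1 + units + pol(column,-1)

lrt = 2.0 * (logl_nugget - logl_no_nugget)

# Unadjusted p-value from a chi-squared distribution on 1 df.
p_naive = math.erfc(math.sqrt(lrt / 2.0))

# Assumed boundary adjustment: 50:50 mixture of chi-squared on 0 and 1 df.
p_mixture = 0.5 * p_naive

print(f"LRT = {lrt:.2f}, unadjusted p = {p_naive:.5f}, adjusted p = {p_mixture:.5f}")
```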

The increase in REML LogL from including the nugget effect (from -4225.60 to -4220.18) is significant. The predicted means for the varieties are written to the .pvs file as
  Warning: mv_estimates       is ignored for prediction
  Warning: units                is ignored for prediction

  ---- ---- ---- ---- ---- ---- ----   1 ---- ---- ---- ---- ---- ---- ---- ----
  column               evaluated at       5.5000
  weed                 is evaluated at average value of       0.4597
  Predicted values of yield

  variety             Predicted_Value Standard_Error Ecode
        1.0000              2917.1782       179.2881 E
        2.0000              2957.7405       178.7688 E
        3.0000              2872.7615       176.9880 E
        4.0000              2986.4725       178.7424 E
          .                     .               .
      522.0000              2784.7683       179.1541 E
      523.0000              2904.9421       179.5383 E
      524.0000              2740.0330       178.8465 E
      525.0000              2669.9565       179.2444 E
      526.0000              2385.9806        44.2159 E
      527.0000              2697.0670       133.4406 E
      528.0000              2727.0324       112.2650 E
      529.0000              2699.8243       103.9062 E
      530.0000              3010.3907       112.3080 E
      531.0000              3020.0720       112.2553 E
      532.0000              3067.4479       112.6645 E
  SED: Overall Standard Error of Difference   245.8
Note that the (replicated) check lines have lower SEs than the (unreplicated) test lines. There will also be large differences in SEDs. Rather than obtaining the large table of all SEDs, you could do the prediction in parts
predict var 1:525 column 5.5
predict var 526:532 column 5.5 !SED
to examine the matrix of pairwise prediction errors of variety differences.
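
Returning to the aim stated in the introduction, the predicted values for the test lines can be ranked to pick out the top 20%. The sketch below is a hypothetical post-processing step rather than an ASReml feature: it uses only the eight test lines shown in the abbreviated listing, and the function name and the dictionary are made up for this example.

```python
# Predicted values for the eight test lines shown in the abbreviated listing.
predicted = {
    1: 2917.1782, 2: 2957.7405, 3: 2872.7615, 4: 2986.4725,
    522: 2784.7683, 523: 2904.9421, 524: 2740.0330, 525: 2669.9565,
}

def top_percent(preds, percent=20):
    """Return the line numbers of the top `percent` of lines,
    ranked by predicted value, rounding the count up."""
    n_keep = -(-len(preds) * percent // 100)   # integer ceiling
    ranked = sorted(preds, key=preds.get, reverse=True)
    return ranked[:n_keep]

selected = top_percent(predicted)
print(selected)
```

Applied to all 525 test lines, the same rule would retain 105 of them. Integer arithmetic is used for the count so that the result does not depend on floating point rounding.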