Qualifiers affecting data input
Datafile line
The purpose of the datafile line is to nominate the data file and to specify qualifiers that modify the reading of the data, the output produced, or the operation of ASReml.
The datafile line appears in the ASReml command file in the form
datafile [ qualifiers ]
datafile is the path name of the file that contains the variates, factors, covariates, traits (response variates) and weight variables represented as data fields; enclose the path name in quotes if it contains embedded blanks,
the qualifiers tell ASReml to modify either the reading of the data and/or the output produced (see below), or the operation of ASReml (see Common job qualifiers),
the data file related qualifiers must appear on the data file line,
the job control qualifiers may appear on the data file line or on following lines,
the arguments to qualifiers are represented by the following symbols
f --- a filename,
n --- an integer number, typically a count,
p --- a vector of real numbers, typically in increasing order,
r --- a real number,
s --- a character string,
t --- a model term label,
v --- the number or label of a data variable,
vlist --- a list of variable labels.
Data input qualifiers
Frequently used data file qualifiers
!SKIP n
causes the first n records of the (non-binary) data file to be ignored. Typically these lines contain column headings for the data fields.
Other data file qualifiers
!CSV
makes consecutive commas imply a missing value; it is set automatically if the file name ends with .csv or .CSV.
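The effect of treating consecutive commas as a missing value can be illustrated with a minimal Python sketch. This is not ASReml code and not how ASReml is implemented; the function name parse_csv_record is hypothetical, and the sketch assumes every field is numeric and represents a missing value as NaN.

```python
import math


def parse_csv_record(line):
    """Split one comma-delimited record into floats.

    Assumption for this sketch: all fields are numeric, and an empty
    field (two consecutive commas) is stored as NaN to mark it missing.
    """
    fields = []
    for raw in line.rstrip("\n").split(","):
        token = raw.strip()
        fields.append(math.nan if token == "" else float(token))
    return fields


# The empty second field between the two commas becomes a missing value.
record = parse_csv_record("1.5,,3")
```

Here record has three fields, and the second one is NaN rather than being silently dropped, so later fields stay aligned with their columns.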
!DATAFILE s
specifies a data file name that replaces the one obtained from the datafile line. It is required when different !PARTS of a job must read different files. The !SKIP qualifier, if specified, is applied when reading the file.
!FILTER v [ !SELECT n ]
enables a subset of the data to be analysed; v is the number or name of a data field. When reading data, the value in field v is checked after any transformations are performed. If !SELECT is omitted, records with zero in field v are omitted from the analysis. Otherwise, records with n in field v are retained and all other records are omitted.
Warning
If the filter column contains a missing value, the value from the previous non-missing record is assumed in that position.
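The selection rule and the warning above can be sketched in Python as follows. This only illustrates the behaviour described in the text and is not ASReml's implementation: the function apply_filter is hypothetical, records are plain lists that are assumed to be already transformed, the field is addressed by a zero-based index, and None stands for a missing value. What happens when the very first record has a missing filter value is not stated in the text, so this sketch simply drops such records.

```python
def apply_filter(records, field, select=None):
    """Return the records retained under the !FILTER / !SELECT rule.

    records : list of lists, one list per (already transformed) record
    field   : zero-based index of the filter field (sketch convention)
    select  : value to retain; if None, records with zero are omitted
    """
    kept = []
    previous = None  # last non-missing filter value seen so far
    for rec in records:
        value = rec[field]
        if value is None:
            # Missing filter value: assume the previous non-missing one.
            value = previous
        else:
            previous = value
        if value is None:
            # No earlier value to fall back on; unspecified in the text,
            # dropped here as an assumption of this sketch.
            continue
        keep = (value != 0) if select is None else (value == select)
        if keep:
            kept.append(rec)
    return kept


data = [[1, 10], [None, 11], [0, 12], [None, 13], [2, 14]]
without_select = apply_filter(data, 0)        # zero rows omitted
with_select = apply_filter(data, 0, select=1)  # only rows with 1 retained
```

In this example the second record is retained in both calls because its missing filter value inherits the 1 from the record before it, while the fourth record inherits the 0 and is dropped. This is the reason a missing value in the filter column deserves attention.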
!FORMAT s
supplies a Fortran-like FORMAT statement for reading fixed-format files.
!MERGE c f [ !SKIP n !MATCH a b ]
may be specified on a line following the datafile line. Its purpose is to combine data fields from the (primary) data file with data fields from a secondary file (f).
!READ n
formally instructs ASReml to read n data fields from the data file. It is needed when there are extra columns in the data file that must be read but are only required for combination into earlier fields in transformations, or when ASReml attempts to read more fields than it needs to.
!RECODE
is required when reading a binary data file with pedigree identifiers that have not been recoded according to the pedigree file. It is not needed when the file was formed using the !SAVE qualifier but will be needed if it was formed in some other way.
!RREC [n]
causes ASReml to read n data records or, if n is omitted, to read until a data reading error occurs, and then to process the records it has. Without this qualifier, ASReml aborts the job when it encounters a data error. This allows data to be extracted from a file that contains trailing non-data records (for example, extracting the predicted values from a .pvs file). See !RSKIP.
!RSKIP n [s]
allows ASReml to skip lines at the head of a file down to (and including) the nth instance of string s. For example, to read back the third set of predicted values in a .pvs file, you would specify
!RREC !RSKIP 4 ' Ecode'
since the line containing the 4th instance of Ecode immediately precedes the predicted values. The keyword Ecode, which occurs once at the beginning of the .pvs file and then immediately before each block of data, is used to count the sections. Used with the !RREC qualifier, ASReml will read until the end of the predict table.
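The combined effect of skipping to the nth instance of a string and then reading until a data error can be sketched in Python. This is an illustration under stated assumptions, not ASReml's implementation: the function read_block and its parameters are hypothetical, the sample lines are a made-up, purely numeric layout rather than a real .pvs file, and a "data reading error" is taken to mean a blank line or a line that does not parse as numbers.

```python
def read_block(lines, n, s, max_records=None):
    """Skip down to and including the nth line containing s, then read
    numeric records until the first line that fails to parse.

    max_records plays the role of the optional record count: if given,
    reading stops after that many records.
    """
    seen = 0
    start = None
    for i, line in enumerate(lines):
        if s in line:
            seen += 1
            if seen == n:
                start = i + 1  # first line after the nth instance
                break
    if start is None:
        raise ValueError(f"fewer than {n} lines contain {s!r}")

    records = []
    for line in lines[start:]:
        if max_records is not None and len(records) >= max_records:
            break
        try:
            row = [float(tok) for tok in line.split()]
        except ValueError:
            break  # treated as the data reading error that ends the block
        if not row:
            break  # blank line also ends the block (sketch assumption)
        records.append(row)
    return records


# Made-up layout: the keyword appears once at the top, then before each block.
sample = [
    "title line mentioning Ecode",
    "level value Ecode",
    "1 2.0",
    "2 3.0",
    "summary text",
    "level value Ecode",
    "1 4.0",
    "2 5.0",
    "trailing notes",
]
second_block = read_block(sample, 3, "Ecode")
```

Because the keyword also occurs once at the top of the sample, the second block follows the third instance, which mirrors why the example in the text uses 4 to reach the third set of predicted values.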