Content | ASCII | REAL_S | REAL_R
G | .grm | .bgrm, .sgrm | .rgrm
G-1 | .giv | .bgiv, .sgiv | .rgiv
The binary forms are preferred because the files are smaller and accuracy is greater.
Providing the inverse of the matrix together with its log-determinant saves the processing time otherwise needed to calculate them.
REAL_S refers to the Fortran sequential binary file structure in which each 'record' is enclosed in a 'wrapper' indicating the record size in bytes.
When ASReml writes a binary file, it uses the REAL_S file structure.
REAL_R refers to the R (C) binary form which does not include any 'record size' information.
Typically, G and G-1 are dense matrices ((nearly) all cells non-zero) and are half-stored.
However, some very large matrices have significant sparsity (most cells are zero).
Some common layouts are now discussed with respect to R and an NRM matrix of order 10
based on the pedigree:
ID Sire Dam
1 0 0
2 0 0
3 0 0
4 1 1
5 1 1
6 2 2
7 4 6
8 5 6
9 7 8
10 9 9
GRM matrices in ASReml are half-stored row-wise in either dense or sparse layouts.
The dense layout, for NR rows, means each of the NR*(NR+1)/2 cells is explicitly stored row-wise,
so a particular cell I,J is found at position I*(I-1)/2 + J (counting from 1), where I >= J.
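The position formula can be checked in R with a minimal sketch. The function name half_index and the 3 x 3 matrix are illustrative assumptions, not part of ASReml.

```r
# Position of cell (i, j) in a half-stored, row-wise dense vector (1-based).
half_index <- function(i, j) {
  lo <- pmin(i, j)          # swap if needed so that row >= column
  hi <- pmax(i, j)
  hi * (hi - 1) / 2 + lo
}

# Small symmetric example matrix (illustrative values only).
G <- matrix(c(1.0, 0.5,  0.0,
              0.5, 1.5,  0.25,
              0.0, 0.25, 2.0), nrow = 3, byrow = TRUE)

# Row-wise lower triangle: G11, G21, G22, G31, G32, G33.
half <- t(G)[upper.tri(G, diag = TRUE)]

half[half_index(3, 2)]      # same value as G[3, 2]
```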
The sparse layout collapses the vector of values, dropping (non-diagonal) values of zero.
It uses a second vector, parallel to the values vector to specify the column (J) values for each cell;
row (I) values are implicit in the order of values with the diagonal cells always retained.
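As an illustration of the sparse layout, the following sketch builds the two parallel vectors from a small matrix and then recovers the implicit row numbers. The 4 x 4 matrix and the names col, val and row are assumptions made for the example; this shows the idea in memory only, not ASReml's own code.

```r
# Small symmetric example with some zero off-diagonal cells (illustrative values).
G <- matrix(c( 5, 0, 0, -2,
               0, 3, 0,  0,
               0, 0, 1,  0,
              -2, 0, 0,  3), nrow = 4, byrow = TRUE)
NR <- nrow(G)

col <- integer(0)   # column (J) of each retained cell
val <- numeric(0)   # value of each retained cell
for (i in 1:NR) {
  for (j in 1:i) {
    # keep non-zero cells, and always keep the diagonal
    if (j == i || G[i, j] != 0) {
      col <- c(col, j)
      val <- c(val, G[i, j])
    }
  }
}

# Recover the implicit row numbers: each half row ends at its diagonal cell,
# so the row counter advances whenever the column equals the current row.
row <- integer(length(col))
r <- 1L
for (k in seq_along(col)) {
  row[k] <- r
  if (col[k] == r) r <- r + 1L
}

data.frame(row, col, val)
```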
ASCII half-stored G-1 matrix cell-wise
Processing the small pedigree shown above, with the !SAVE qualifier to form the inverse relationship matrix,
creates the ped_A.giv file shown below. It contains qualifiers for the log determinant and the number of genetic groups.
The first field is the row number, the second is the column number and the third is the matrix cell value.
ASReml can read this .giv file back in as a G-1 matrix.
!LDET -6.6130181 !GROUPSDF 0
1 1 5.000000000
2 2 3.000000000
3 3 1.000000000
4 1 -2.000000000
4 4 3.000000000
5 1 -2.000000000
5 5 3.000000000
6 2 -2.000000000
6 4 1.000000000
6 5 1.000000000
6 6 4.000000000
7 4 -2.000000000
7 6 -2.000000000
7 7 4.500000000
8 5 -2.000000000
8 6 -2.000000000
8 7 0.5000000000
8 8 4.500000000
9 7 -1.000000000
9 8 -1.000000000
9 9 4.909090909
10 9 -2.909090909
10 10 2.909090909
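The cell-wise listing above can be expanded into a full symmetric matrix in R. The following is a minimal sketch: the file contents are embedded as text so that it runs on its own, and the names giv_text, SAI, Ainv and ldet are illustrative. It also compares the log determinant implied by the matrix with the !LDET value, on the assumption that !LDET holds the log determinant of the relationship matrix (that is, minus the log determinant of its inverse).

```r
# Contents of ped_A.giv as listed above, embedded so the sketch is self-contained.
giv_text <- "!LDET -6.6130181 !GROUPSDF 0
1 1 5.000000000
2 2 3.000000000
3 3 1.000000000
4 1 -2.000000000
4 4 3.000000000
5 1 -2.000000000
5 5 3.000000000
6 2 -2.000000000
6 4 1.000000000
6 5 1.000000000
6 6 4.000000000
7 4 -2.000000000
7 6 -2.000000000
7 7 4.500000000
8 5 -2.000000000
8 6 -2.000000000
8 7 0.5000000000
8 8 4.500000000
9 7 -1.000000000
9 8 -1.000000000
9 9 4.909090909
10 9 -2.909090909
10 10 2.909090909"

# Skip the qualifier line; the columns are row, column, value.
SAI <- read.table(text = giv_text, skip = 1,
                  col.names = c("row", "col", "value"))

NR <- max(SAI$row)
Ainv <- matrix(0, NR, NR)
Ainv[cbind(SAI$row, SAI$col)] <- SAI$value   # lower triangle and diagonal
Ainv[cbind(SAI$col, SAI$row)] <- SAI$value   # mirror into the upper triangle

# log|A| = -log|A^-1|; compare with the !LDET value in the file.
ldet <- -as.numeric(determinant(Ainv, logarithm = TRUE)$modulus)
ldet
```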
ASCII half-stored G matrix row-wise
The information in the .giv file was used to create a matrix (Ainv) in R, which was inverted to form an NRM matrix.
> NRM = solve(Ainv)
> write.table(round(NRM,5),'NRM.grm')
This produces NRM.grm containing
"V1" "V2" "V3" "V4" "V5" "V6" "V7" "V8" "V9" "V10"
"1" 1 0 0 1 1 0 0.5 0.5 0.5 0.5
"2" 0 1 0 0 0 1 0.5 0.5 0.5 0.5
"3" 0 0 1 0 0 0 0 0 0 0
"4" 1 0 0 1.5 1 0 0.75 0.5 0.625 0.625
"5" 1 0 0 1 1.5 0 0.5 0.75 0.625 0.625
"6" 0 1 0 0 0 1.5 0.75 0.75 0.75 0.75
"7" 0.5 0.5 0 0.75 0.5 0.75 1 0.625 0.8125 0.8125
"8" 0.5 0.5 0 0.5 0.75 0.75 0.625 1 0.8125 0.8125
"9" 0.5 0.5 0 0.625 0.625 0.75 0.8125 0.8125 1.3125 1.3125
"10" 0.5 0.5 0 0.625 0.625 0.75 0.8125 0.8125 1.3125 1.65625
ASReml can read this file, and three variations of it, back in as a G matrix: without labels, without the elements above the diagonal, or without either.
Both these layouts (cell-wise and row-wise) can be used for G and G-1 files, according to the file extension.
If the !LDET qualifier is omitted in a .giv file, ASReml will calculate the log-determinant.
The !GROUPSDF qualifier is only needed when the G-1 matrix is actually an A-1 formed with genetic groups.
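A sketch of writing the variation with no labels and no elements above the diagonal follows. It assumes one matrix row per line; the 3 x 3 matrix and the file name NRM_lower.grm are illustrative, and the result has not been checked against ASReml itself.

```r
# Small symmetric example matrix (illustrative values only).
NRM <- matrix(c(1.0, 0.0, 0.5,
                0.0, 1.0, 0.5,
                0.5, 0.5, 1.0), nrow = 3, byrow = TRUE)
NR <- nrow(NRM)

out <- file.path(tempdir(), "NRM_lower.grm")   # hypothetical file name
con <- file(out, "w")
for (i in 1:NR) {
  # line i holds cells (i,1) .. (i,i): no labels, nothing above the diagonal
  writeLines(paste(round(NRM[i, 1:i], 5), collapse = " "), con)
}
close(con)

readLines(out)
```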
REAL_S BINARY half-stored files ( .bgiv, .sgiv, .bgrm, .sgrm)
Usually these files will be formed by ASReml and, given that ASReml can read them back, the details are not important.
A Fortran binary sequential file of 4w bytes contains w 4-byte words. A word is either a 32-bit integer value or a 32-bit real value.
The 32bit integer values are either record wrappers specifying the number of bytes in the record, or an integer that is part of the record.
You can use the R readBin() function to examine a binary file.
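For example, the sketch below writes a single record in the wrapped form described above and then reads the same bytes back twice, once as integers and once as reals. The scratch file and variable names are assumptions; the native byte order of the machine is used for both writing and reading.

```r
# Write one Fortran-style sequential record holding the single real 3.0:
# [4] 3.0 [4]  (byte count, payload, byte count).
tmp <- tempfile(fileext = ".bin")   # hypothetical scratch file
con <- file(tmp, "wb")
writeBin(4L, con, size = 4)         # leading record wrapper: 4 bytes follow
writeBin(3.0, con, size = 4)        # payload as a 32-bit real
writeBin(4L, con, size = 4)         # trailing record wrapper
close(con)

# Read the same 12 bytes twice: once as integers, once as reals.
nwords <- file.size(tmp) / 4
con <- file(tmp, "rb")
as_int <- readBin(con, what = "integer", n = nwords, size = 4)
close(con)
con <- file(tmp, "rb")
as_real <- readBin(con, what = "numeric", n = nwords, size = 4)
close(con)

# Wrappers make sense as integers, cell values as reals.
data.frame(word = seq_len(nwords), as_int, as_real)
```

Words that look sensible as small integers are likely to be wrappers or counts, while those that look sensible as reals are likely to be matrix values.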
ASReml looks at the leading 12 or so words in the file and will read the file if it appears to match one of the following patterns.
In the header line, [..] is a record wrapper, G11 is the first cell of the matrix, Ldet is the log determinant,
NG is the number of degrees of freedom associated with genetic groups, NR is the number of rows in the matrix,
and 7/77 specifies a particular sparse layout.
For the '7' layout, the file begins
[20] G11 Ldet NG NR 7 [20] [8] 1 2 [8] [4] 3. [4] ...
and matrix rows 2:NR are written as two records: NV, COL(ROW(I):ROW(I)+NV-1) and VAL(ROW(I):ROW(I)+NV-1)
where I is the half row being written, ROW(I) points to the first cell of that row,
NV is the number of nonzero cells in the row ending at the diagonal element,
COL(...) is the list of column numbers and VAL(...) is the list of matrix values.
For the '77' layout, the file begins
[20] G11 Ldet NG NR 77 [20] [12] 1 2 3. [12] ...
and matrix rows 2:NR are written as one record each: NV, (COL(K),VAL(K),K=ROW(I),ROW(I)+NV-1)
A third 'cell-wise' layout with no header begins
[12] 1 1 G1,1 [12] [12] 2 1|2 G2,1|2 [12] [12] ...
and every non-zero cell is specified in a separate record with its row and column index given.
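The record structure of this cell-wise layout can be imitated from R as a rough sketch. It assumes that the row and column indices are 32-bit integers and the cell value a 32-bit real, uses the first few cells of the ped_A.giv listing, and writes to a scratch file; whether ASReml accepts a file written this way has not been verified here.

```r
# First few cells of the ped_A.giv listing: row, column, value.
cells <- data.frame(row   = c(1L, 2L, 3L, 4L),
                    col   = c(1L, 2L, 3L, 1L),
                    value = c(5, 3, 1, -2))

out <- tempfile(fileext = ".bin")   # hypothetical scratch file
con <- file(out, "wb")
for (k in seq_len(nrow(cells))) {
  writeBin(12L, con, size = 4)               # wrapper: 3 words = 12 bytes
  writeBin(cells$row[k], con, size = 4)      # row index (assumed integer)
  writeBin(cells$col[k], con, size = 4)      # column index (assumed integer)
  writeBin(cells$value[k], con, size = 4)    # cell value as a 32-bit real
  writeBin(12L, con, size = 4)               # closing wrapper
}
close(con)

# Read the records back, checking that the two wrappers of each record agree.
con <- file(out, "rb")
back <- list()
repeat {
  nbytes <- readBin(con, "integer", n = 1, size = 4)
  if (length(nbytes) == 0) break             # end of file
  ij <- readBin(con, "integer", n = 2, size = 4)
  v  <- readBin(con, "numeric", n = 1, size = 4)
  stopifnot(readBin(con, "integer", n = 1, size = 4) == nbytes)
  back[[length(back) + 1]] <- data.frame(row = ij[1], col = ij[2], value = v)
}
close(con)
back <- do.call(rbind, back)
back
```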
A 'cell-wise' layout with header begins
[12] NR NG Ldet [12] [12] 1 1 G11 [12] [12] ...
and every non-zero cell is specified in a separate record with its row and column index given.
A 'dense' row-wise layout with header begins
[12] NR NG Ldet [12] [4] G11 [4] [8] G21 G22 [8] ...
or
[12] NR Ldet NG [12] [4] G11 [4] [8] G21 G22 [8] ...
A 'dense' row-wise layout without header begins
[4] G1,1 [4] [8] G2,1 G2,2 [8] ...
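A sketch of the dense row-wise layout without header, written and read back from R, is given here to make the role of the wrappers concrete. The 3 x 3 matrix, the scratch file and the names G and H are illustrative assumptions; native byte order is assumed throughout.

```r
# Small symmetric example matrix (illustrative values only).
G <- matrix(c(1.0, 0.5,  0.0,
              0.5, 1.5,  0.25,
              0.0, 0.25, 2.0), nrow = 3, byrow = TRUE)
NR <- nrow(G)

out <- tempfile(fileext = ".bin")   # hypothetical scratch file
con <- file(out, "wb")
for (i in 1:NR) {
  nbytes <- as.integer(4 * i)           # half row i holds i 4-byte reals
  writeBin(nbytes, con, size = 4)       # leading wrapper
  writeBin(G[i, 1:i], con, size = 4)    # G[i,1] .. G[i,i]
  writeBin(nbytes, con, size = 4)       # trailing wrapper
}
close(con)

# Read back: each leading wrapper says how many bytes of reals follow.
con <- file(out, "rb")
H <- matrix(0, NR, NR)
for (i in 1:NR) {
  nbytes <- readBin(con, "integer", n = 1, size = 4)
  H[i, 1:i] <- readBin(con, "numeric", n = nbytes / 4, size = 4)
  readBin(con, "integer", n = 1, size = 4)   # skip the trailing wrapper
}
close(con)
H[upper.tri(H)] <- t(H)[upper.tri(H)]        # mirror the lower triangle
```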
Note that a qualifier !SGIV has been added to the pedigree file line to write back A-1 as a sparse binary .sgiv file.
To read A-1 back as a G-1 in a subsequent run,
several changes will be required to the command file coding.
For example, if the original job (say PED.as) included lines
Animal !P
...
Pedigree.csv !SGIV !AIF
...
Y ~ ... !r nrm(Animal)
Copy it as, say, GIV.as and change it to, say,
Animal !A !L PED.aif # Animal !P
...
Pedigree_A.sgiv #Pedigree.csv !SGIV !DIAG
...
Y ~ ... !r grm1(Animal) #Y ~ ... !r nrm(Animal)
REAL_R BINARY half-stored files ( .rgiv, .rgrm)
ASReml can also read binary files formed using the R writeBin() function (or by a C program),
identified as such by the r in the file extension.
Unlike the Fortran sequential binary files described above,
these have no record markers and all values are 32bit real values.
.rgiv files need the header to specify the log determinant.
Given a matrix called GRM held in R, it can be written to a binary file that ASReml can read by the R code:
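NR <- dim(GRM)[1] # dimension (number of rows)
Tfile <- file("My.rgrm", "wb") # open for binary writing
for (i in 1:NR)
  writeBin(GRM[1:i,i], Tfile, size=4) # rows 1..i of column i, i.e. half row i of the symmetric matrix
close(Tfile)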
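To confirm that such a file holds what was intended, it can be read back with readBin(). The sketch below is self-contained: the small GRM, the temporary file location and the name back are assumptions made for the example.

```r
# Small symmetric example matrix (illustrative values only).
GRM <- matrix(c(1.0, 0.5,  0.0,
                0.5, 1.5,  0.25,
                0.0, 0.25, 2.0), nrow = 3, byrow = TRUE)
NR <- dim(GRM)[1]

out <- file.path(tempdir(), "My.rgrm")
Tfile <- file(out, "wb")
for (i in 1:NR) writeBin(GRM[1:i, i], Tfile, size = 4)
close(Tfile)

# Read all NR*(NR+1)/2 values back as 32-bit reals.
Tfile <- file(out, "rb")
half <- readBin(Tfile, what = "numeric", n = NR * (NR + 1) / 2, size = 4)
close(Tfile)

# Rebuild the full matrix: the values were written as rows 1..i of column i.
back <- matrix(0, NR, NR)
back[upper.tri(back, diag = TRUE)] <- half
back[lower.tri(back)] <- t(back)[lower.tri(back)]

max(abs(back - GRM))   # rounding error from storing 32-bit reals
```

Because values are stored as 32-bit reals, a matrix read back this way agrees with the original to roughly seven significant digits.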
If writing the inverse GRM matrix in R, use the filename extension .rgiv and include a header line in the file by inserting the R code line
writeBin (c(NR, 0, Ldet), Tfile, size=4)
after the Tfile line, where Ldet is the log determinant of the GRM matrix, usually obtained while inverting the GRM matrix.
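One way to obtain Ldet while inverting is through a Cholesky factorization, sketched here under the assumption that the GRM is positive definite. The small matrix, the temporary file location and the names U and GIV are illustrative; the 0 in the header is copied from the header line given above.

```r
# Small positive definite example matrix (illustrative values only).
GRM <- matrix(c(1.0, 0.5,  0.0,
                0.5, 1.5,  0.25,
                0.0, 0.25, 2.0), nrow = 3, byrow = TRUE)
NR <- dim(GRM)[1]

U <- chol(GRM)                    # GRM = t(U) %*% U, U upper triangular
Ldet <- 2 * sum(log(diag(U)))     # log determinant of GRM
GIV <- chol2inv(U)                # inverse of GRM from the same factor

out <- file.path(tempdir(), "My.rgiv")
Tfile <- file(out, "wb")
writeBin(c(NR, 0, Ldet), Tfile, size = 4)   # header: NR, 0, Ldet as 32-bit reals
for (i in 1:NR) writeBin(GIV[1:i, i], Tfile, size = 4)
close(Tfile)
```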
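For a sparse stored matrix read into a data.frame SAI from the cell-wise .giv file shown earlier, use the R code:
SAI <- read.table('ped_A.giv', skip=1) # skip the qualifier line; columns are row, column, value
NV <- dim(SAI)[1] # number of stored cells
NR <- SAI[NV,1] # row number of the last cell, i.e. the order of the matrix
Tfile <- file("SAI.rgiv", "wb")
writeBin (c(NR, 0, -6.61302), Tfile, size=4) # header: NR, 0, log determinant
for (i in 1:NV) {writeBin (c(SAI[i,2],SAI[i,3]), Tfile, size=4)} # column and value of each cell
close(Tfile)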