H inverse

H inverse and HBLUP

The H matrix is a particular form of a G matrix, obtained by merging an A (numerator relationship) matrix with a G (genomic relationship) matrix that covers a subset of the genotypes in A.

$H^{-1}_{\tau,\omega} = A^{-1} + P\,(\tau G^{-1} - \omega A_{22}^{-1})\,P'$

where $A_{22}^{-1}$ is the inverse of the block of A for the genotypes with genomic information in G, and P is a permutation matrix which maps the rows of $(\tau G^{-1} - \omega A_{22}^{-1})$ back to A.
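The formula can be checked by hand on a tiny example. The following is a minimal sketch in plain Python, not ASReml's implementation: the function names `invert` and `h_inverse`, the three-animal pedigree, the G values, and the default values τ = ω = 1 are all assumptions made for illustration. The permutation step P(...)P' is carried out by adding each element of the bracketed block at the row and column positions of the genotyped individuals in A.

```python
def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment m with the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def h_inverse(a_inv, g_inv, a22_inv, genotyped, tau=1.0, omega=1.0):
    """Return A^-1 + P (tau G^-1 - omega A22^-1) P'.

    `genotyped` lists the positions in A of the rows of G, in G order
    (hypothetical argument name; it plays the role of P).
    """
    h_inv = [row[:] for row in a_inv]
    for i, gi in enumerate(genotyped):
        for j, gj in enumerate(genotyped):
            h_inv[gi][gj] += tau * g_inv[i][j] - omega * a22_inv[i][j]
    return h_inv


# Illustrative pedigree: animals 0 and 1 are unrelated founders, 2 is their offspring.
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5],
     [0.5, 0.5, 1.0]]
genotyped = [1, 2]                      # animals 1 and 2 have genomic data
A22 = [[A[i][j] for j in genotyped] for i in genotyped]
G = [[1.0, 0.6],                        # made-up genomic relationships
     [0.6, 1.0]]

H_inv = h_inverse(invert(A), invert(G), invert(A22), genotyped)
for row in H_inv:
    print(["%.4f" % v for v in row])
```

A useful sanity check on any such implementation: when G equals A22 and τ = ω = 1, the bracketed term vanishes and the result is A^-1 unchanged.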

The GRM line
  GRM.sgiv !HINV GRM_ID_file.txt [ !Hskip h !OMEGA ω !TAU τ ]
creates an H-1 from the pedigree-based A-1 (just defined in the job) and the G-1 defined on this GRM line.

GRM.sgiv is the file containing the G matrix or its inverse; GRM_ID_file.txt is a file containing the list of genotype identifiers for the G matrix, which must be a subset of the pedigree file identifiers. Use !Hskip h to skip h header lines in GRM_ID_file.txt. !OMEGA ω and !TAU τ are optional tuning parameters discussed below.
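Because the identifiers in GRM_ID_file.txt must be a subset of the pedigree identifiers, it can be worth checking this before running the job. The sketch below is one possible check in plain Python, not part of ASReml. It assumes that both files are whitespace-delimited with the identifier in the first field; the function names and the sample identifiers are hypothetical, and the `grm_skip` argument only mimics the role of !Hskip.

```python
def first_fields(lines, skip=0):
    """Return the first whitespace-separated field of each non-blank line,
    after skipping `skip` header lines."""
    ids = []
    for line in lines[skip:]:
        parts = line.split()
        if parts:
            ids.append(parts[0])
    return ids


def missing_from_pedigree(grm_id_lines, pedigree_lines, grm_skip=0, ped_skip=0):
    """List GRM identifiers (in GRM order) that do not occur in the pedigree."""
    pedigree_ids = set(first_fields(pedigree_lines, ped_skip))
    return [g for g in first_fields(grm_id_lines, grm_skip) if g not in pedigree_ids]


# Made-up file contents; in practice read them with open(path).read().splitlines().
grm_lines = ["ID", "S1", "S2", "S9"]          # one header line, then identifiers
ped_lines = ["A1 S1 D1", "A2 S2 D1", "S1 0 0", "S2 0 0", "D1 0 0"]

missing = missing_from_pedigree(grm_lines, ped_lines, grm_skip=1)
print(missing)   # identifiers that would need adding to the pedigree
```

An empty list means every genotyped individual is present in the pedigree.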

If the H matrix has been formed outside of ASReml, it can be used as a GRM matrix.

To form the H-1 matrix, first specify the pedigree file from which the A-1 matrix is formed, then specify the G (inverse) matrix with the !HINV qualifier. ASReml saves the H-1 matrix as a binary file (filename given in the output) which can subsequently be used directly (saving the setup time in the subsequent runs). The GRM line with the !HINV qualifier actually defines two inverse matrices available for use in the model: the G-1 and the H-1.

Warning: ASReml does not hold the list of genotype names as an attribute of the G matrix, so it cannot automatically align the data order with the G-1 matrix order as ASReml-R does. Therefore, genotype names must be specified in the G-1 matrix order in the definition of the genotype factor:
  genotype !A !L Gorder.txt
This instructs ASReml to code the variable genotype in the order of the level names in the first field of the ASCII file Gorder.txt. Otherwise, genotype is coded in the order in which level names are encountered in the data file, which may not match the G matrix.

The GRM file (and hence 'Gorder.txt') may include genotypes not in the data file; fitting grm(genotype) in the model will predict effects for all the genotypes. If the data file includes genotypes not in the GRM file (i.e. not in Gorder.txt), these are appended to the genotype list and the G-1 matrix is extended with an identity matrix to cover the extra genotypes.
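The level coding and the identity extension described above can be sketched in plain Python. This is an illustration of the described behaviour under stated assumptions, not ASReml's code: the function names, the genotype labels, and the numeric values of the inverse matrix are all made up.

```python
def code_levels(order_ids, data_ids):
    """Levels listed in the order file come first; genotypes seen only in
    the data are appended in the order they are first encountered."""
    levels = list(order_ids)
    seen = set(levels)
    for g in data_ids:
        if g not in seen:
            levels.append(g)
            seen.add(g)
    return levels


def extend_with_identity(g_inv, n_extra):
    """Return the block-diagonal matrix diag(g_inv, I) with n_extra extra rows."""
    n = len(g_inv)
    size = n + n_extra
    out = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            out[i][j] = g_inv[i][j]
    for k in range(n, size):
        out[k][k] = 1.0          # extra genotypes are treated as independent
    return out


gorder = ["g1", "g2", "g3"]              # hypothetical contents of Gorder.txt
data = ["g3", "g5", "g1", "g5"]          # genotype column of the data file
levels = code_levels(gorder, data)

g_inv = [[2.0, -1.0, 0.0],               # made-up 3 x 3 inverse
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
extended = extend_with_identity(g_inv, len(levels) - len(gorder))
print(levels)
for row in extended:
    print(row)
```

Here g2 appears in the order file but not in the data, so it keeps its level and its effect is still predicted, while g5 is appended as a fourth level with a unit diagonal and no covariance with the other genotypes.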

So we could have the following:
  Animal !P
  Sire !A !L SireGRM.txt
  ...
  pedigree.txt
  SireGRM.grm !HINV SireGRM.txt
then we could fit any of nrm(Animal), grm1(Sire), grm2(Animal) in the model.

This assumes that the Sires are included in the pedigree. When the pedigree file is processed, the list of animals is retained and subsequently used when the data is read in, to assign the correct levels to Animal (specified by the !P).

When SireGRM.grm !HINV SireGRM.txt is read/processed, grm1 is formed directly and grm2 is formed by merging G-1 into A-1. ASReml needs the identities (in SireGRM.txt) to merge G-1 into A-1.

When the data is read, ASReml knows the pedigree identifiers (and !P says use the pedigree identifiers when coding the levels for Animal), but it does not know that Sire will be associated with SireGRM.grm. The link is not implied to ASReml by any similarity in the names (although it is helpful to us to use labels/filenames that suggest what the variable/matrix represents). By providing !L SireGRM.txt on the Sire !A !L SireGRM.txt line, Sire is coded in the order of class names in SireGRM.txt (which was constructed to match the row order in SireGRM.grm).

The !P makes a strong link between Animal and A-1 so that if you specify Animal in the model, ASReml treats it as nrm(Animal) (and uses the A-1). Hence the need to use ide(Animal) to get independent animal effects.

There is no automatic link between Sire and any particular GRM. If you specify Sire in the model, it is equivalent to idv(Sire). Specify grm1(Sire) to associate 'Sire' with the first GRM inverse. If you want to use H-1 in the model, you need to specify grm2(Animal).
