The header always contains the following lines, where <para1>, ..., <para12>
are the places for you to specify various parameters.
<para1> specifies the type of mapping population being used.
Possible values are DH and RILd,
where d is any natural number. For example, RIL6 means
a RIL population at generation 6. Use DH for
BC1, DH and Hap populations.
<para2> gives a name for the mapping population. It can be
any string of letters (a-z, A-Z) or digits (0-9).
<para3> specifies the distance function to be used.
Possible choices are kosambi and haldane,
which refer to the commonly used Kosambi and Haldane distance functions, respectively.
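For reference, the two distance functions can be sketched as follows. This is
only an illustration of the textbook Kosambi and Haldane formulas, not MSTmap's
own code; the function names haldane_cm and kosambi_cm are made up for this example.

```python
import math


def haldane_cm(r):
    """Haldane map distance in centimorgans for a recombination fraction r, 0 <= r < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in centimorgans for a recombination fraction r, 0 <= r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))
```

For small recombination fractions the two functions give nearly the same
distance. For larger fractions, haldane yields the larger distance.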
<para4> specifies the p-value threshold to be used for clustering the markers into
linkage groups (LGs). A reasonable choice is 0.000001.
Alternatively, the user can turn off this feature by setting <para4> to
any number larger than 1. If the user does so, our software tool assumes that
all markers belong to one single linkage group.
<para5> and <para6>
together allow one to detect bad markers.
In high-density genetic linkage mapping, bad markers appear to be isolated from the others.
MSTmap will detect isolated marker groups and will place them in separate LGs.
An isolated marker group is a small set of markers whose size is less than or equal to <para6>
and which is more than <para5> centimorgans away from the rest of the markers.
A reasonable choice for <para6> is 1 or 2.
To disable this feature, simply set <para6> to 0.
For example, if <para5>=15
and <para6>=2, then any group of at most 2 markers that is more than 15 centimorgans away
from the rest of the markers will be placed in a linkage group by itself.
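The rule above can be written down as a small predicate. This is a minimal
sketch of the rule as described here, not MSTmap's actual implementation;
the function name is_isolated_group and its arguments are hypothetical.

```python
def is_isolated_group(group_size, gap_cm, para5, para6):
    """Return True if a marker group counts as isolated under the rule above.

    group_size: number of markers in the group
    gap_cm:     distance in centimorgans from the group to the rest of the markers
    para5:      distance threshold in centimorgans
    para6:      maximum group size; 0 disables the feature
    """
    if para6 <= 0:
        return False  # feature disabled
    return group_size <= para6 and gap_cm > para5
```

With <para5>=15 and <para6>=2, a group of 2 markers that is 20 centimorgans
away is isolated, while a group of 3 markers at the same distance is not.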
Occasionally there are markers with an excessive number of missing observations. Those markers can be eliminated by
setting <para7> to a proper value. For example,
if <para7>=0.25, then any marker with more than 25% missing observations will
be removed completely without being mapped.
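The effect of <para7> can be illustrated with a short filter. This is a sketch
only: the function name filter_markers is hypothetical, and the use of "-" to
mark a missing observation is an assumption made for this example.

```python
def filter_markers(genotypes, para7, missing="-"):
    """Keep only the markers whose fraction of missing observations is at most para7.

    genotypes: dict mapping marker name -> list of genotype calls, one per line
    missing:   symbol marking a missing observation (assumed here to be "-")
    """
    kept = {}
    for name, calls in genotypes.items():
        if not calls:
            continue  # a marker with no observations cannot be mapped
        fraction_missing = sum(1 for c in calls if c == missing) / len(calls)
        if fraction_missing <= para7:
            kept[name] = calls
    return kept
```

With <para7>=0.25, a marker with exactly 25% missing observations is kept,
because only markers with more than 25% missing are removed.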
<para8> is a binary flag which can be set to yes or no.
If <para8> is set to yes, then our software tool
will try to estimate missing data before clustering the markers into linkage groups.
<para9> is a binary flag which can be set to yes or no.
If <para9> is set to yes, then our software tool
will try to detect bad data during the map construction process. Those suspicious genotype data will be
printed to the console for user inspection. The error detection feature can be
turned off by setting <para9> to no.
<para10> specifies the objective function to be used.
Possible choices are COUNT and ML.
COUNT refers to the commonly used sum of recombination events objective function
and ML refers to the commonly used maximum likelihood objective function.
<para11> specifies the total number of markers in the data set.
<para12> specifies the total number of mapping lines in the data set.
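Before running the tool, it can help to check the twelve values against the
descriptions above. The following is a minimal sketch of such a check. The
function check_header_values is hypothetical and only encodes the rules stated
in this document; MSTmap itself may be stricter or more lenient.

```python
import re


def _is_number(text):
    """Return True if text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_nonneg_int(text):
    """Return True if text consists only of the digits 0-9."""
    return re.fullmatch(r"[0-9]+", text) is not None


def check_header_values(values):
    """Check <para1>..<para12>, given as 12 strings in order; return a list of problems."""
    if len(values) != 12:
        return ["expected 12 values, got %d" % len(values)]
    p = [None] + list(values)  # shift so that p[1] is <para1>
    errors = []
    if not re.fullmatch(r"DH|RIL[0-9]+", p[1]):
        errors.append("para1 must be DH or RILd")
    if not re.fullmatch(r"[A-Za-z0-9]+", p[2]):
        errors.append("para2 must contain only letters and digits")
    if p[3] not in ("kosambi", "haldane"):
        errors.append("para3 must be kosambi or haldane")
    for i in (4, 5, 7):
        if not _is_number(p[i]):
            errors.append("para%d must be a number" % i)
    if not _is_nonneg_int(p[6]):
        errors.append("para6 must be a non-negative integer")
    for i in (8, 9):
        if p[i] not in ("yes", "no"):
            errors.append("para%d must be yes or no" % i)
    if p[10] not in ("COUNT", "ML"):
        errors.append("para10 must be COUNT or ML")
    for i in (11, 12):
        if not _is_nonneg_int(p[i]) or int(p[i]) == 0:
            errors.append("para%d must be a positive integer" % i)
    return errors
```

An empty list means that no problem was found.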