Help Information


MSTmap was developed by Yonghui Wu (Ph.D. student, Computer Science), Dr. Prasanna Bhat (post-doc, Botany and Plant Sciences), Dr. Timothy J. Close (Professor, Botany and Plant Sciences), and Dr. Stefano Lonardi (Professor, Computer Science), at the University of California, Riverside (UCR). Please forward any inquiry to Stefano Lonardi. The web server is currently hosted by the Department of Botany and Plant Sciences, UCR.

Assemble the input file and specify various options for MSTmap

  • The genotype file is a tab-delimited text file. It contains a table of dimension (m+1)*(n+1), where m is the total number of markers and n is the total number of mapping lines. The first row gives the ids of the mapping lines, and the first column gives the ids of the genetic markers. Each id is a string of letters (a-z, A-Z) or digits (0-9); no space is allowed within an id. Each cell in the table gives the genotype state of a particular mapping line at a particular marker locus. The genotype states can be specified with the letters 'A', 'a', 'B', 'b', '-', 'U' or 'X'. 'A' and 'a' are equivalent, 'B' and 'b' are equivalent, and so are '-' and 'U'. 'U' and '-' indicate a missing genotype call. If the data set is from a RIL population, you can use 'X' to indicate that the corresponding genotype is heterozygous. Please refer to example.txt for an example.

  • Grouping LOD Criteria specifies the criterion used to group markers into linkage groups (LGs). If you wish to put all the markers in a single LG regardless of the pair-wise LOD scores, choose "Single LG".

  • Population type can be set to either "DH, BC1 or Hap" or "RIL at generation 2-10". Use generation 10 if your RIL population is beyond F10. The generation level is counted as follows: it is 1 for the F1 generation, and each additional generation of inbreeding increments the value by 1 (so F2 is 2, F3 is 3, and so on).

  • Number of markers is simply the number of markers included in the genotype data file.

  • Number of mapping lines is simply the number of mapping lines included in the genotype data file.

  • No mapping distance threshold and No mapping size threshold together allow one to detect bad markers. In high density genetic linkage mapping, bad markers appear to be isolated from the others. MSTmap will detect isolated marker groups and place them in separate LGs. An isolated marker group is a small set of markers whose size is less than or equal to No mapping size threshold and that is more than No mapping distance threshold away from the rest of the markers. A reasonable choice for No mapping size threshold is 1 or 2. To disable this feature, simply set No mapping size threshold to 0.

  • No mapping missing threshold specifies the maximum percentage of missing observations allowed per marker locus. MSTmap will remove all markers that contain more than No mapping missing threshold percentage of missing observations.

  • MSTmap is able to detect erroneous genotype calls during the mapping process. To turn on this feature, set Try to detect genotyping errors to yes. The default is to have this feature turned off. If this feature is turned on, rare recombination events will be treated as errors. As a consequence, fewer bins will be produced.
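Before uploading, it can help to check the genotype file locally against the format rules above. The following Python sketch is not part of MSTmap; it is only an illustration under stated assumptions. The function names (parse_genotype_table, missing_percentage, drop_sparse_markers) are hypothetical, the missing-data threshold is assumed to be given as a percentage between 0 and 100, and the first cell of the header row is assumed to be an arbitrary corner label. The filter only mimics the rule described for No mapping missing threshold; MSTmap's own handling may differ in detail.

```python
import re

VALID_STATES = set("AaBb-UX")      # genotype states listed in the help text
MISSING_STATES = {"-", "U"}        # both denote a missing genotype call
ID_PATTERN = re.compile(r"[A-Za-z0-9]+")  # ids: letters or digits, no spaces


def parse_genotype_table(text):
    """Parse a tab-delimited genotype table.

    Returns (line_ids, markers), where markers maps each marker id to its
    list of genotype states, one per mapping line.
    """
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    line_ids = header[1:]  # assumption: header[0] is a corner label, ignored
    for ident in line_ids:
        if not ID_PATTERN.fullmatch(ident):
            raise ValueError(f"invalid mapping line id: {ident!r}")
    markers = {}
    for row in body:
        marker_id, states = row[0], row[1:]
        if not ID_PATTERN.fullmatch(marker_id):
            raise ValueError(f"invalid marker id: {marker_id!r}")
        if len(states) != len(line_ids):
            raise ValueError(
                f"{marker_id}: expected {len(line_ids)} calls, got {len(states)}"
            )
        bad = [s for s in states if s not in VALID_STATES]
        if bad:
            raise ValueError(f"{marker_id}: invalid genotype state(s) {bad}")
        markers[marker_id] = states
    return line_ids, markers


def missing_percentage(states):
    """Percentage of calls for one marker that are missing ('-' or 'U')."""
    return 100.0 * sum(s in MISSING_STATES for s in states) / len(states)


def drop_sparse_markers(markers, threshold_percent):
    """Keep markers with at most threshold_percent missing calls."""
    return {
        marker_id: states
        for marker_id, states in markers.items()
        if missing_percentage(states) <= threshold_percent
    }


# Tiny made-up table: 2 markers (M1, M2) by 4 mapping lines (L1..L4).
example = "id\tL1\tL2\tL3\tL4\nM1\tA\tB\t-\tA\nM2\tU\t-\tb\tU\n"
line_ids, markers = parse_genotype_table(example)
kept = drop_sparse_markers(markers, 25.0)
print(len(markers), "markers,", len(line_ids), "mapping lines; kept:", sorted(kept))
```

In this made-up table M1 has one missing call out of four (25%) and is kept, while M2 has three out of four (75%) and is dropped. The marker and line counts printed at the end are the values to enter for Number of markers and Number of mapping lines.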


Y. Wu, P. Bhat, T. J. Close, S. Lonardi. 2007. Efficient and Accurate Construction of Genetic Linkage Maps from Noisy and Missing Genotyping Data. WABI 2007 - Workshop on Algorithms in Bioinformatics, LNBI 4645, pp. 395-406, Philadelphia PA.

Y. Wu, P. R. Bhat, T. J. Close, S. Lonardi. 2008. Efficient and Accurate Construction of Genetic Linkage Maps from the Minimum Spanning Tree of a Graph. PLoS Genetics 4:e1000212.