QC Metrics
From wgs-assembler
The N50 size of a set of entities (e.g., contigs or scaffolds) is the size of the largest entity E such that at least half of the total size of all entities is contained in entities of size E or larger. For example, for a collection of contigs with sizes 7, 4, 3, 2, 2, 1, and 1 kb (20 kb in total), the N50 length is 4 kb, because the contigs of 4 kb or larger (7 + 4 = 11 kb) cover at least 10 kb, whereas the 7 kb contig alone does not.
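The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the assembler's own code: the function names `nx` and `size_at` are hypothetical, and `size_at` assumes that the `ScaffoldAt...` and `ContigAt...` metrics in the table follow the same largest-first running total.

```python
def nx(sizes, fraction=0.5):
    """Size of the entity at which the running total, taken from the
    largest entity to the smallest, first reaches `fraction` of the
    total size (fraction=0.5 gives N50, 0.25 gives N25, and so on)."""
    if not sizes:
        raise ValueError("sizes must be non-empty")
    threshold = fraction * sum(sizes)
    running = 0
    for size in sorted(sizes, reverse=True):
        running += size
        if running >= threshold:
            return size


def size_at(sizes, target):
    """Size of the entity at which the largest-first running total
    first reaches `target` bases, or None if it is never reached."""
    running = 0
    for size in sorted(sizes, reverse=True):
        running += size
        if running >= target:
            return size
    return None


# The worked example from the text: contig sizes in kb.
contigs_kb = [7, 4, 3, 2, 2, 1, 1]
print(nx(contigs_kb, 0.5))  # 4
```

Sorting largest-first and stopping at the first entity that crosses the threshold gives the largest entity that satisfies the definition, since every smaller entity also satisfies it.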
[Scaffolds] | |
TotalScaffolds = 26 | the total number of scaffolds in the assembly. |
TotalContigsInScaffolds = 38 | the total number of contigs that made it into scaffolds. |
MeanContigsPerScaffold = 1.46 | the average number of contigs in a scaffold. |
MinContigsPerScaffold = 1 | the minimum number of contigs in a scaffold. |
MaxContigsPerScaffold = 4 | the maximum number of contigs in a scaffold. |
TotalBasesInScaffolds = 2365510 | the sum of all contig sizes for the contigs in scaffolds. |
MeanBasesInScaffolds = 90981 | the average scaffold size. The size of a scaffold is the sum of all contigs contained in that scaffold. |
MinBasesInScaffolds = 485 | the minimum size of a scaffold. |
MaxBasesInScaffolds = 745692 | the maximum size of a scaffold. |
N25ScaffoldBases = 745692 | the length of the largest scaffold for which the following is true: the sum of its length and the lengths of all larger scaffolds is at least 25% of the total assembly length. |
N50ScaffoldBases = 460070 | the length of the largest scaffold for which the following is true: the sum of its length and the lengths of all larger scaffolds is at least 50% of the total assembly length. |
N75ScaffoldBases = 190089 | the length of the largest scaffold for which the following is true: the sum of its length and the lengths of all larger scaffolds is at least 75% of the total assembly length. |
ScaffoldAt1000000 = 460070 | summing the lengths of all scaffolds in decreasing order, this is the length of the scaffold at which the running total reaches 1,000,000 bp. |
ScaffoldAt2000000 = 49941 | summing the lengths of all scaffolds in decreasing order, this is the length of the scaffold at which the running total reaches 2,000,000 bp. |
TotalSpanOfScaffolds = 2366881 | the sum of all contig sizes and gaps in all scaffolds. |
MeanSpanOfScaffolds = 91034 | the average span of a scaffold. |
MinScaffoldSpan = 485 | the minimum span of a scaffold. |
MaxScaffoldSpan = 745672 | the maximum span of a scaffold. |
IntraScaffoldGaps = 12 | the number of sequencing gaps in all scaffolds. |
2KbScaffolds = 20 | the count of scaffolds whose span >= 2kbp. |
2KbScaffoldSpan = 2361688 | the cumulative span of scaffolds whose span >= 2kbp. |
MeanSequenceGapLength = 114 | the average length of a sequencing gap. |
[Top5Scaffolds=contigs,size,span,avgContig,avgGap,EUID] | |
0 = 2,745692,745672,372846,-20,9604 | |
1 = 4,460070,460310,115018,80,9602 | |
2 = 2,221845,221825,110922,-20,9599 | |
3 = 1,190326,190326,190326,0,9601 | |
4 = 3,190089,190089,63363,0,9595 | |
total = 12,1808022,1808222,150668,29 | |
[Contigs] | |
TotalContigsInScaffolds = 38 | the total number of contigs that made it into scaffolds. |
TotalBasesInScaffolds = 2365510 | the sum of all contig sizes for the contigs in scaffolds. |
TotalVarRecords = 4452 | the total number of var records in the contigs. Each var record indicates a possible SNP or high quality difference between the underlying reads. |
MeanContigLength = 62250 | the average contig length. |
MinContigLength = 485 | the minimum contig length. |
MaxContigLength = 744546 | the maximum contig length. |
N25ContigBases = 744546 | the length of the largest contig for which the following is true: the sum of its length and the lengths of all larger contigs is at least 25% of the total contig length.
N50ContigBases = 190326 | the length of the largest contig for which the following is true: the sum of its length and the lengths of all larger contigs is at least 50% of the total contig length.
N75ContigBases = 63054 | the length of the largest contig for which the following is true: the sum of its length and the lengths of all larger contigs is at least 75% of the total contig length.
ContigAt1000000 = 274067 | summing the lengths of all contigs in decreasing order, this is the length of the contig at which the running total reaches 1,000,000 bp. |
ContigAt2000000 = 38509 | summing the lengths of all contigs in decreasing order, this is the length of the contig at which the running total reaches 2,000,000 bp. |
[BigContigs_greater_10000] | |
TotalBigContigs = 28 | the number of contigs bigger than 10kb. |
BigContigLength = 2354038 | the sum of the sizes of all contigs bigger than 10kb. |
MeanBigContigLength = 84073 | the average contig length in contigs over 10kb. |
MinBigContig = 12338 | the minimum contig size in contigs over 10kb. |
MaxBigContig = 744546 | the maximum contig size in contigs over 10kb. Should be the same as MaxContigLength. |
BigContigsPercentBases = 99.52 | the percentage of TotalBasesInScaffolds contained in contigs over 10kb. |
[SmallContigs] | |
TotalSmallContigs = 10 | the number of contigs smaller than 10kb. |
SmallContigLength = 11472 | the sum of the sizes of all contigs smaller than 10kb. |
MeanSmallContigLength = 1147 | the average length of contigs under 10kb. |
MinSmallContig = 485 | the minimum contig size in contigs under 10kb. Should be the same as MinContigLength. |
MaxSmallContig = 2522 | the maximum contig size in contigs under 10kb. |
SmallContigsPercentBases = 0.48 | the percentage of TotalBasesInScaffolds contained in contigs under 10kb. |
[DegenContigs] | |
TotalDegenContigs = 3226 | the number of degenerate contigs (contigs that do not appear in scaffolds). |
DegenContigLength = 1461756 | the sum of the sizes of all degenerate contigs. |
MeanDegenContigLength = 453 | the average length of degenerate contigs. |
MinDegenContig = 68 | the minimum size of a degenerate contig. |
MaxDegenContig = 7612 | the maximum size of a degenerate contig. |
DegenPercentBases = 61.79 | the ratio, expressed as a percentage, of DegenContigLength to TotalBasesInScaffolds. Note that degenerate contigs are not counted as part of TotalBasesInScaffolds, so this value can exceed 100. |
[Top5Contigs=reads,bases,EUID] | |
0 = 182445,744546,9578 | |
1 = 63719,274067,9574 | |
2 = 44864,190326,9572 | |
3 = 33161,131399,9570 | |
4 = 26026,118015,9573 | |
total = 350215,1458353 | |
[UniqueUnitigs] | |
TotalUUnitigs = 2100 | total number of unitigs with A-stats higher than 5 (unique unitigs) |
MinUUnitigLength = 64 | the minimum unique unitig length |
MaxUUnitigLength = 14743 | the maximum unique unitig length |
MeanUUnitigLength = 1400 | the average unique unitig length |
SDUUnitigLength = 2608 | the standard deviation of unique unitig lengths
[Surrogates] | |
TotalSurrogates = 988 | total number of surrogates in the assembly. A surrogate is a contig containing repetitive or ambiguous reads. |
SurrogateInstances = 1182 | number of instances in contigs where surrogate reads are placed. One surrogate may be placed in multiple locations. |
SurrogateLength = 658224 | sum of all surrogate contig lengths. |
SurrogateInstanceLength = 841946 | sum of all surrogate instance lengths. |
UnPlacedSurrReadLen = 22302386 | sum of all unplaced surrogate read lengths. |
PlacedSurrReadLen = 408114 | sum of all placed surrogate read lengths. |
MinSurrogateLength = 112 | size of smallest surrogate. |
MaxSurrogateLength = 10121 | size of largest surrogate. |
MeanSurrogateLength = 666 | mean size of a surrogate. |
SDSurrogateLength = 1034 | standard deviation of surrogate sizes assuming a normal distribution. |
[Mates] | |
ReadsWithNoMate = 572634(88.78%) | number of reads (out of TotalReads) that did not have a mate |
ReadsWithGoodMate = 52160(8.09%) | number of reads (out of TotalReads) that had a good mate |
ReadsWithBadShortMate = 0(0.00%) | number of reads (out of TotalReads) placed too close to their mate |
ReadsWithBadLongMate = 214(0.03%) | number of reads (out of TotalReads) placed too far from their mate |
ReadsWithSameOrientMate = 416(0.06%) | number of reads (out of TotalReads) oriented in the same direction as their mate |
ReadsWithOuttieMate = 220(0.03%) | number of reads (out of TotalReads) that point away from their mate |
ReadsWithBothChaffMate = 32(0.00%) | number of reads where both reads in a mate pair are chaff (singletons) |
ReadsWithChaffMate = 906(0.14%) | number of reads where the mate is chaff (singleton) |
ReadsWithBothDegenMate = 2690(0.42%) | number of reads where both reads in a mate are degenerates |
ReadsWithDegenMate = 2122(0.33%) | number of reads where its mate is a degenerate |
ReadsWithBothSurrMate = 0(0.00%) | number of reads where both reads are surrogates |
ReadsWithSurrogateMate = 9436(1.46%) | number of reads where the mate is a surrogate |
ReadsWithDiffScafMate = 4206(0.65%) | number of reads where the mate resides in a different scaffold |
ReadsWithUnassignedMate = 0(0.00%) | number of reads where the mate is unassigned |
TotalScaffoldLinks = 0 | number of links between scaffolds. These represent linking information currently conflicting with the existing scaffolds. The lower this number the better. |
MeanScaffoldLinkWeight = 0.00 | average weight (# of mate pairs) of links between scaffolds. |
[Reads] | |
TotalReadsInput = NA | the total number of reads supplied to the assembler. Paired end SFF files are NOT accurately accounted. |
TotalUsableReads = 645036 | the total number of reads included in the assembly. |
AvgClearRange = 327 | the average read clear range (i.e., the usable portion of each read, clear of vector and bad quality bases). |
ContigReads = 545225(84.53%) | the number of reads that belong to contigs. |
BigContigReads = 543465(84.25%) | number of reads that belong to contigs over 10kb in size. |
SmallContigReads = 1760(0.27%) | number of reads that belong to contigs under 10kb in size. |
DegenContigReads = 25606(3.97%) | number of reads in degenerate contigs. |
SurrogateReads = 70715(10.96%) | number of reads in surrogates - potentially repetitive or ambiguously placed contigs. |
PlacedSurrogateReads = 3526(0.55%) | number of placed reads in surrogates. |
SingletonReads = 7016(1.09%) | number of reads that are neither in contigs, nor surrogates, nor degenerate contigs. |
ChaffReads = 7007(1.09%) | number of reads that are neither in contigs, nor surrogates, nor degenerate contigs. |
[Coverage] | |
ContigsOnly = 75.28 | coverage (redundancy) of all contigs in scaffolds: length of all the reads in contigs divided by the size of all scaffolds. |
Contigs_Surrogates = 84.71 | coverage of all contigs and surrogates: length of all the reads in contigs and surrogates divided by the size of all scaffolds. |
Contigs_Degens_Surrogates = 54.57 | coverage of all contigs, degenerates, and surrogates: length of all the reads in contigs, surrogates, and degenerates divided by the size of all scaffolds and degenerates. |
AllReads = 89.14 | coverage you paid for: length of all the reads divided by the size of the scaffolds. |
[TotalBaseCounts] | |
BasesCount = NA | Total count of all bases for all reads (includes vector and bad quality regions) |
ClearRangeLengthFRG = NA | Total clear range for all input reads (from frg file) |
ClearRangeLengthASM = 210856220 | Total clear range for all used reads (per asm file). This excludes reads trimmed by OBT |
SurrogateBaseLength = 22710500 | Total length of surrogate reads. (Same as UnPlacedSurrReadLen + PlacedSurrReadLen) |
ContigBaseLength = 178069925 | Total length of contig reads. |
DegenBaseLength = 8471349 | Total length of degenerate reads |
SingletonBaseLength = 2012560 | Total length of singleton reads |
Contig_SurrBaseLength = 200372311 | Total length of contig reads and unplaced surrogate reads. (Same as UnPlacedSurrReadLen + ContigBaseLength) |
[gcContent] | |
Content = 42.34 | The percentage of GC content in all the scaffold contigs. |
[Unitig Consensus] | |
NumColumnsInUnitigs = 21322792 | |
NumGapsInUnitigs = 1256051 | |
NumRunsOfGapsInUnitigReads = 95810082 | |
[Contig Consensus] | |
NumColumnsInUnitigs = 4213072 | |
NumGapsInUnitigs = 385852 | |
NumRunsOfGapsInUnitigReads = 31174926 | |
NumColumnsInContigs = 4184982 | |
NumGapsInContigs = 357749 | |
NumRunsOfGapsInContigReads = 27813096 | |
NumAAMismatches = 6059 | |
NumVARRecords = 4452 | |
NumVARStringsWithFlankingGaps = 3538 | |
[Read Depth Histogram] | |
d < 3Kbp < 10Kbp < 1Mbp < inf | |
0 0 0 0 0 | |
1 477 0 55458 0 | |
2 168 0 34337 0 | |
(and so on) |
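The [Coverage] values can be reproduced from the base counts listed in [TotalBaseCounts] and the scaffold totals. The sketch below is an illustration only: the formulas are inferred from the descriptions and sample numbers on this page rather than taken from the assembler's source, and the `qc` dictionary and `coverage` function are hypothetical names.

```python
# Values copied from the sample report on this page.
qc = {
    "TotalBasesInScaffolds": 2365510,
    "DegenContigLength": 1461756,
    "ContigBaseLength": 178069925,
    "UnPlacedSurrReadLen": 22302386,
    "DegenBaseLength": 8471349,
    "ClearRangeLengthASM": 210856220,
}


def coverage(qc):
    """Recompute the [Coverage] metrics (assumed formulas, see lead-in)."""
    scaffold_bases = qc["TotalBasesInScaffolds"]
    # Reads in contigs plus unplaced surrogate reads (Contig_SurrBaseLength).
    contig_surr = qc["ContigBaseLength"] + qc["UnPlacedSurrReadLen"]
    return {
        "ContigsOnly": qc["ContigBaseLength"] / scaffold_bases,
        "Contigs_Surrogates": contig_surr / scaffold_bases,
        # Degenerates add to both the read bases and the assembled size.
        "Contigs_Degens_Surrogates": (contig_surr + qc["DegenBaseLength"])
        / (scaffold_bases + qc["DegenContigLength"]),
        "AllReads": qc["ClearRangeLengthASM"] / scaffold_bases,
    }


for name, value in coverage(qc).items():
    print(f"{name} = {value:.2f}")
```

With the sample values this prints 75.28, 84.71, 54.57, and 89.14, matching the report. Contigs_Degens_Surrogates is lower than the other values because the degenerate contigs enlarge the denominator by much more, proportionally, than their reads enlarge the numerator.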