ANI interpretation w/ span-length coverage

| posted 26 Feb, 2020 00:53
(Sorry if this isn't the right forum - not sure where to post bioinformatics analysis questions!)

We were wondering how to interpret ANI values with respect to average genome coverage/span-length of alignments. We haven't noticed any mention of this in papers which compare large numbers of phages, and would like to make sure that we are correctly interpreting high ANI values given low span-lengths.

For example, some phages may share 90% ANI with span-lengths of 100%, whereas other phages may share 90% ANI with span-lengths of only 10%. While both circumstances certainly indicate some degree of sequence similarity, an ANI value of 90% seems inflated given that the aligned sequences only account for 10% of each genome. Is it appropriate to compare ANI values without regarding span-length/average coverage, or should span-length also be accounted for in our analyses?
| posted 26 Feb, 2020 19:07
ANI in DNA Master is calculated between pairs of genes that share protein sequence similarity at a preset threshold. Not all ANI calculators do this; some simply compare whole genomes without regard to gene content.
So yes, span length matters for ANI in DNA Master.
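
One rough way to put the two example cases from the question on the same scale is to multiply ANI by the fraction of each genome that was actually aligned. The sketch below is only an illustration of that arithmetic, not how DNA Master or any particular ANI tool reports results; the function name `coverage_adjusted_ani` and the choice to average the two span fractions are assumptions made for this example.

```python
def coverage_adjusted_ani(ani_percent, span_fraction_a, span_fraction_b):
    """Scale an ANI value by the mean aligned fraction of the two genomes.

    ani_percent:      ANI over the aligned regions, 0-100.
    span_fraction_a:  fraction of genome A covered by alignments, 0-1.
    span_fraction_b:  fraction of genome B covered by alignments, 0-1.

    Hypothetical helper: averaging the two fractions is an assumption,
    not a published standard.
    """
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ani_percent must be between 0 and 100")
    for frac in (span_fraction_a, span_fraction_b):
        if not 0.0 <= frac <= 1.0:
            raise ValueError("span fractions must be between 0 and 1")
    mean_span = (span_fraction_a + span_fraction_b) / 2
    return ani_percent * mean_span


# The two cases from the question: same ANI, very different span-lengths.
full_span = coverage_adjusted_ani(90.0, 1.0, 1.0)
low_span = coverage_adjusted_ani(90.0, 0.10, 0.10)
print(f"90% ANI, 100% span: {full_span:.1f}")
print(f"90% ANI,  10% span: {low_span:.1f}")
```

With this kind of adjustment the two pairs no longer look equally similar, which is why it helps to report span-length alongside ANI rather than ANI alone.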
