Very low gCF values on species trees #468
-
I am working on an empirical dataset of 2750 ultraconserved elements (UCEs) from mammals. I have ASTRAL species trees built with dense taxon sampling and well-vetted sequence data. However, gCF values calculated on this tree are routinely low. Among ingroup samples, most branch values are <60, a few are <10, and several are even less than 1. While the tree is known to contain many short internodes (likely due to rapid radiation), these values feel surprisingly low and widespread. I know that Minh et al. present some instances of very low gCFs despite many available loci, but I'm wondering whether what I describe seems reasonable.
Replies: 2 comments 2 replies
-
I moved this to the iqtree2 discussion forum, because it's not really an issue.
The answer is: yes, this is very much expected.
The gCF measures the percentage of gene trees that contain a split seen on the focal tree (e.g. a species tree, however it was estimated). If you are using UCEs, gene tree estimation error will be huge, because a single UCE often contains very limited information. This is of course compounded by short branches in the species tree.
In this case, concordance vectors are likely to be useful, because the three ways of measuring concordance (genes, sites, and quartets) have different pros and cons, and can all give you useful and somewhat different information. Despite gene tree estimation error, all three measures can still tell you useful things!
Matt Hahn and I wrote a review on this here: https://academic.oup.com/mbe/article/41/11/msae214/7824840 and there's an IQ-TREE tutorial on it here: https://iqtree.github.io/doc/recipes/concordance-vector
Happy to keep the discussion going if you have more questions, but I'll close the issue for now. (And for future general discussion items, just stick with using the IQTREE2 discussion forum. We're about to fix it; we know it's a bit confusing right now, and that's our fault!)
Rob
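To make the "how many gene trees share a split" idea concrete, here is a minimal Python sketch. It is not IQ-TREE's implementation. It assumes every gene tree contains every taxon, so it ignores how gene trees with missing taxa are handled. It also represents trees as nested tuples rather than parsing Newick. The function names (`leaves`, `splits`, `simple_gcf`) and the five-taxon toy data are made up for illustration.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    out = frozenset()
    for child in tree:
        out |= leaves(child)
    return out


def splits(tree):
    """Return the non-trivial bipartitions of a tree.

    Each split is stored as the side that does NOT contain the
    alphabetically smallest taxon, so the same split always gets the
    same representation regardless of where the tree is rooted.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset()
        for child in node:
            clade |= walk(child)
        side = clade if anchor not in clade else all_taxa - clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    walk(tree)
    return found


def simple_gcf(focal_side, gene_trees):
    """Percentage of gene trees containing the focal split.

    Simplifying assumption: all gene trees share the same taxon set.
    """
    taxa = leaves(gene_trees[0])
    side = frozenset(focal_side)
    if min(taxa) in side:
        side = taxa - side  # use the same canonical side as splits()
    concordant = sum(1 for t in gene_trees if side in splits(t))
    return 100.0 * concordant / len(gene_trees)


# Hypothetical toy data: focal branch separates {A, B} from {C, D, E}.
gene_trees = [
    (("A", "B"), ("C", ("D", "E"))),   # contains AB|CDE
    (("A", "C"), ("B", ("D", "E"))),   # conflicts
    ((("A", "B"), "C"), ("D", "E")),   # contains AB|CDE
    (("A", "D"), ("B", ("C", "E"))),   # conflicts
]
gcf = simple_gcf({"A", "B"}, gene_trees)
print(f"gCF for AB|CDE: {gcf:.1f}")
```

Two of the four toy gene trees contain the split, so the sketch reports 50%. With short, low-information loci such as UCEs, estimation error alone can stop a gene tree from recovering a true split, which pushes this percentage down even when the underlying gene histories agree with the species tree.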
-
Gotcha. Thanks for the helpful perspective!