Bin taxonomy

Bin taxonomy was already determined by CheckM, which searches for certain marker genes. An alternative method is to BLAST all protein-coding genes against the NCBI nr protein database and then retrieve the last common ancestor of the good hits. This sounds like a lot of work, but luckily there is a tool that does just this: CAT, the Contig Annotation Tool, can determine the taxonomy of a contig or a bin. This is a very computationally intensive task, and you need to download a huge database to your local computer.

CAT is not installed in the default conda environment, so open a terminal and install it:

conda install -c bioconda cat

If you run into trouble, creating a separate environment worked for me:

conda create -n cat -c bioconda CAT diamond=0.9.21
conda activate cat

[DO:] Download a CAT database here: tbb.bio.uu.nl/bastiaan/CAT_prepare/
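One way to fetch and unpack the database from a terminal is sketched below. This is only a sketch: `fetch_cat_db` is a hypothetical helper name, the archive file name has to be copied from the directory listing at the address above, and the sketch assumes the archive is a gzipped tarball. `wget -c` resumes an interrupted download, which is useful for a file this large.

```shell
# Hypothetical helper: download one archive from the CAT_prepare directory
# and unpack it in the current folder.
# Assumption: the archive is a .tar.gz file; check the directory listing.
fetch_cat_db() {
    archive="${1:-}"
    if [ -z "$archive" ]; then
        echo "usage: fetch_cat_db <archive file name from the directory listing>"
        return 1
    fi
    # -c continues a partial download instead of starting over.
    wget -c "tbb.bio.uu.nl/bastiaan/CAT_prepare/$archive" && tar -xzf "$archive"
}
```

Call it as `fetch_cat_db` followed by the file name you picked from the listing, and make sure the disk has enough free space before you start.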

In [ ]:

[DO:] Once that's set up, read the CAT manual and go ahead!
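As a starting point, a run on a folder of bins might look roughly like the sketch below. The folder names, the `.fna` suffix, and the output prefix are placeholders, not values from this tutorial, and the `run` helper is hypothetical: it only prints each command unless `DRY_RUN=0` is set, so you can check the paths before committing to hours of compute. Verify the options against the CAT manual for the version you installed.

```shell
# Placeholder paths: point these at your bins and the unpacked database.
BIN_DIR="${BIN_DIR:-bins}"                 # folder with one FASTA file per bin
BIN_SUFFIX="${BIN_SUFFIX:-.fna}"           # file extension of the bin FASTA files
DB_DIR="${DB_DIR:-CAT_database}"           # database folder from the download
TAX_DIR="${TAX_DIR:-CAT_taxonomy}"         # taxonomy folder from the download
OUT_PREFIX="${OUT_PREFIX:-bat_out}"        # prefix for all output files

# Hypothetical helper: print the command, and execute it only if DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Classify every bin in the folder.
run CAT bins -b "$BIN_DIR" -s "$BIN_SUFFIX" -d "$DB_DIR" -t "$TAX_DIR" -o "$OUT_PREFIX"

# Translate the taxonomy IDs in the classification table into names.
run CAT add_names -i "$OUT_PREFIX.bin2classification.txt" \
    -o "$OUT_PREFIX.bin2classification.names.txt" -t "$TAX_DIR"
```

Once the printed commands look right, rerun with `DRY_RUN=0` inside the environment where CAT is installed.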

In [ ]: