# cd-hit

## Summary

+ **Module Name:** cd-hit
+ **Support Level:** Secondary Support
+ **Software Access Level:** Open Access
+ **Home Page:** [http://weizhongli-lab.org/cd-hit/](http://weizhongli-lab.org/cd-hit/)

## Software Description

CD-HIT is a program for clustering large protein databases at a high sequence identity threshold. The program removes redundant sequences and generates a database containing only the representatives. It can be applied to protein family classification, domain analysis, organizing large protein databases, improving the performance of database searches, and more.

## General Linux

To load this module in a Linux environment, run:

    module load cd-hit

Depending on where you are working, more than one version of cd-hit may be available. To see which versions can be loaded, run:

    module avail cd-hit
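
Once the module is loaded, a typical clustering run might look like the following minimal sketch. The file names `proteins.fasta` and `proteins_nr90` and the three short sequences are made up for illustration. The 90% identity threshold is only an example value, so adjust it to suit your data. The script checks that `cd-hit` is on the `PATH` before calling it, because the module may not be loaded in every shell.

```shell
#!/bin/sh
# Hypothetical file names for illustration; replace with your own data.
input="proteins.fasta"
output="proteins_nr90"

# Write a tiny made-up FASTA file so the sketch is self-contained.
# seq1 and seq2 are identical, so one of them is redundant.
cat > "$input" <<'EOF'
>seq1
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV
>seq2
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV
>seq3
MSDNELKQRLAEVEAKLAEAEKRLAELEKKLAEAEARYQ
EOF

if command -v cd-hit >/dev/null 2>&1; then
    # -i: input FASTA, -o: output file of representative sequences,
    # -c: sequence identity threshold, -n: word length (5 suits -c 0.7 to 1.0)
    cd-hit -i "$input" -o "$output" -c 0.9 -n 5
else
    echo "cd-hit not found; run 'module load cd-hit' first" >&2
fi
```

If the run succeeds, cd-hit writes the representative sequences to the output file and the cluster membership to a companion file with the `.clstr` extension.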