ncibtep@nih.gov

Bioinformatics Training and Education Program

BTEP Question Forum

BTEP maintains several Question and Answer Forums of interest to the NCI/CCR community.

What package do you use to perform a partial duplicate removal?

1 Answer:


We use two different strategies for duplicate removal, depending on whether the input FASTQ files are single-end (SE) or paired-end (PE). For PE reads, Picard Tools MarkDuplicates is used to find read pairs with exactly the same start and end mapping coordinates; these duplicates are then marked for removal. For SE reads, the MACS2 filterdup command is used to remove duplicates (partial removal). - answered by Tovah Markowitz, Paul Schaughency, Vishal Koparde.
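
To make the difference between the two strategies concrete, here is a minimal Python sketch of the underlying logic. It is not the code of Picard or MACS2, and it is not the pipeline used by the answerers. The function names, the tuple layouts, and the `keep_dup` parameter are hypothetical and chosen only for illustration. The PE function flags fragments that share chromosome, start, and end, as described above; the real MarkDuplicates also considers read orientation and chooses which copy to keep based on base quality. The SE function assumes that "partial removal" means keeping up to a fixed number of reads at each position and strand, which is roughly how MACS2 filterdup behaves when its `--keep-dup` option is given an integer.

```python
from collections import defaultdict


def mark_pe_duplicates(pairs):
    """Flag paired-end fragments that share chromosome, start and end.

    `pairs` is a list of (name, chrom, start, end) tuples (hypothetical layout).
    The first fragment seen at a coordinate is kept; later ones are flagged.
    Returns a list of (name, is_duplicate) tuples in input order.
    """
    seen = set()
    flagged = []
    for name, chrom, start, end in pairs:
        key = (chrom, start, end)
        flagged.append((name, key in seen))
        seen.add(key)
    return flagged


def filter_se_duplicates(reads, keep_dup=1):
    """Keep at most `keep_dup` single-end reads per (chrom, position, strand).

    `reads` is a list of (name, chrom, pos, strand) tuples (hypothetical layout).
    Returns the names of the reads that survive filtering, in input order.
    """
    counts = defaultdict(int)
    kept = []
    for name, chrom, pos, strand in reads:
        key = (chrom, pos, strand)
        if counts[key] < keep_dup:
            kept.append(name)
        counts[key] += 1
    return kept


if __name__ == "__main__":
    pe = [("a", "chr1", 100, 250), ("b", "chr1", 100, 250), ("c", "chr1", 100, 300)]
    print(mark_pe_duplicates(pe))

    se = [("r1", "chr1", 100, "+"), ("r2", "chr1", 100, "+"),
          ("r3", "chr1", 100, "+"), ("r4", "chr1", 100, "-")]
    print(filter_se_duplicates(se, keep_dup=2))
```

With `keep_dup=2`, the third read at the same position and strand is dropped while the first two are retained, which is why the removal is only partial.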


Answered on June 5th, 2020 by