publication . Article . Preprint . Other literature type . 2020

ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data

Egor Dolzhenko; Mark F. Bennett; Phillip A. Richmond; Brett Trost; Sai Chen; Joke J.F.A. van Vugt; Charlotte Nguyen; Giuseppe Narzisi; Vladimir G. Gainullin; Andrew M. Gross; ...
Open Access English
  • Published: 01 Apr 2020 Journal: Genome Biology, volume 21, issue 1, pages 1-14
  • Publisher: BMC
Abstract
Expansions of short tandem repeats are responsible for over 40 monogenic disorders, and undoubtedly many more pathogenic repeat expansions (REs) remain to be discovered. Existing methods for detecting REs in short-read sequencing data require predefined repeat catalogs. However, recent discoveries have emphasized the need for detection methods that do not require candidate repeats to be specified in advance. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide detection of REs. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, i...
Subjects
Fields of Science and Technology classification: 03 medical and health sciences; 0302 clinical medicine; 0303 health sciences; 030217 neurology & neurosurgery; 030304 developmental biology
Sustainable Development Goals: 3. Good health
free text keywords: Repeat expansions, Short tandem repeats, Whole-genome sequencing data, Genome-wide analysis, Friedreich ataxia, Myotonic dystrophy type 1, Method, Huntington disease, Fragile X syndrome, lcsh:Biology (General), lcsh:QH301-705.5, lcsh:Genetics, lcsh:QH426-470, Short read, Sequencing data, Microsatellite, Computational biology, Computer science, Genome, Trinucleotide repeat expansion, DNA Repeat Expansion, Human genome, Whole genome sequencing, Biology, Simulated data
Funded by
EC| EScORIAL
Project
EScORIAL
Emerging Simplex ORigins In ALS
  • Funder: European Commission (EC)
  • Project Code: 772376
  • Funding stream: H2020 | ERC | ERC-COG
Validated by funder
CIHR
Project
  • Funder: Canadian Institutes of Health Research (CIHR)
NHMRC| Computational and statistical bioinformatics for medical “omics”
Project
  • Funder: National Health and Medical Research Council (NHMRC)
  • Project Code: 1054618
  • Funding stream: Program Grants
NHMRC| Discovery and translation of disease causing mutations with genomic and transcriptomic data
Project
  • Funder: National Health and Medical Research Council (NHMRC)
  • Project Code: 1102971
  • Funding stream: Research Fellowships