Embodiments of the present invention provide for improving the quality of cell-free DNA for analysis. Cell-free DNA may include DNA with defects that do not allow for analysis of those DNA with techniques such as sequencing and targeted capture enrichment. These defects may be defects within the strands of the DNA and not present at the ends of the DNA. Embodiments of the present invention repair these intrastrand defects in cell-free DNA. The repair of the defects in cell-free DNA may then allow for these repaired cell-free DNA to be analyzed by techniques, including sequencing and targeted capture enrichment.
Representative Claim
1. A method of improving analysis of a first biological sample, the first biological sample including cell-free nucleic acid molecules, the method comprising: obtaining a plurality of double-stranded nucleic acid molecules from the cell-free nucleic acid molecules to produce a second biological sample, wherein: one or more double-stranded nucleic acid molecules of the plurality of double-stranded nucleic acid molecules each have one or more defects, and for each of the one or more double-stranded nucleic acid molecules having a defect of the one or more defects: the defect is present in the respective double-stranded nucleic acid molecule at a location at least one nucleotide away from a closest end of the respective double-stranded nucleic acid molecule; adding a mixture comprising an enzyme to the second biological sample; and repairing the one or more defects in each of the one or more double-stranded nucleic acid molecules using the enzyme to produce a repaired set of double-stranded nucleic acid molecules.
2. The method of claim 1, further comprising: producing a sequencing library using the repaired set of double-stranded nucleic acid molecules, and detecting an aneuploidy or a sequence imbalance using the sequencing library.
3. The method of claim 2, further comprising: determining whether a sequence imbalance exists in the second biological sample by: obtaining a plurality of sequence reads from the repaired set of double-stranded nucleic acid molecules, analyzing, by a computer system, the plurality of sequence reads, wherein analyzing a sequence read includes: identifying a location of the sequence read in a reference genome by aligning the sequence read to the reference genome, determining, by a computer system, an amount of sequence reads from a genomic region using the identified locations, obtaining a value of a normalized parameter for the amount of sequence reads from the genomic region, comparing the value of the normalized parameter to a cutoff value, determining whether the sequence imbalance exists based on the comparison.
4. The method of claim 3, wherein the genomic region is a chromosome, and the sequence imbalance is an aneuploidy.
5. The method of claim 2, further comprising sequencing the sequencing library or performing targeted capture enrichment of the sequencing library to determine a set of reads.
6. The method of claim 5, wherein the set of reads does not comprise reads from any strand of the one or more double-stranded nucleic acid molecules having one or more defects.
7. The method of claim 1, wherein the one or more defects are selected from the group consisting of a nick, gap, abasic site, thymidine dimer, oxidized pyrimidine, deaminated cytosine, blocked 3′ end defect, or a combination thereof.
8. The method of claim 1, wherein the repaired set of double-stranded nucleic acid molecules are free of a nick, gap, abasic site, thymidine dimer, oxidized pyrimidine, deaminated cytosine, blocked 3′ end defect, or a combination thereof.
9. The method of claim 1, wherein the enzyme comprises at least one of a polymerase, a ligase, an endonuclease, or a glycosylase.
10. The method of claim 1, wherein the mixture comprises at least one of Taq DNA polymerase, Bst DNA polymerase, Taq DNA ligase, endonuclease VIII, endonuclease IV, T4 endonuclease V (PDG), 8-oxoguanine glycosylase (FPG), or uracil-DNA glycosylase (UDG).
11. The method of claim 1, wherein the mixture comprises Taq DNA polymerase, Bst DNA polymerase, Taq DNA ligase, endonuclease VIII, endonuclease IV, T4 endonuclease V (PDG), 8-oxoguanine glycosylase (FPG), and uracil-DNA glycosylase (UDG).
12. The method of claim 1, wherein the first biological sample is maternal plasma from a female subject pregnant with a fetus.
13. The method of claim 1, wherein the one or more double-stranded nucleic acid molecules have lengths in a range from 251 to 600 bp, from 251 to 450 bp, or from 451 bp to 600 bp.
14. The method of claim 1, wherein the one or more double-stranded nucleic acid molecules have lengths associated with dinucleosomal nucleic acid molecules or trinucleosomal nucleic acid molecules.
15. The method of claim 1, wherein repairing the one or more defects in each of the one or more double-stranded nucleic acid molecules comprises repairing a greater number of nucleic acid molecules having lengths from 251 to 600 bp than nucleic acid molecules having lengths from 0 to 250 bp.
16. The method of claim 1, wherein: the first biological sample is obtained from a female subject pregnant with a fetus, the one or more double-stranded nucleic acid molecules comprise a plurality of fetal-derived nucleic acid molecules, after repairing the one or more defects, the second biological sample is characterized by a fetal fraction calculated from reads obtained from analyzing the repaired set of double-stranded nucleic acid molecules, and the fetal fraction is greater than 0.05.
17. The method of claim 1, further comprising: performing blunt-end ligation of the plurality of double-stranded nucleic acid molecules or of the repaired set of double-stranded nucleic acid molecules.
18. The method of claim 1, further comprising: haplotyping the HLA gene using the repaired set of double-stranded nucleic acid molecules.
19. The method of claim 1, further comprising: haplotyping a monogenic disease by: obtaining a plurality of sequence reads from the enriched set of double-stranded nucleic acid molecules, aligning the plurality of sequence reads to a reference genome, and identifying a mutation at a locus between two proximal SNP using the aligned plurality of sequence reads.
20. The method of claim 19, wherein the monogenic disease is congenital adrenal hyperplasia.
21. A method of determining a classification of whether an individual has a disorder, the method comprising: receiving a first sample comprising a first set of double-stranded nucleic acid molecules derived from cell-free nucleic acid molecules in a biological sample, wherein: one or more double-stranded nucleic acid molecules of the first set of double-stranded nucleic acid molecules each have one or more defects, and the one or more defects are present in the respective double-stranded nucleic acid molecule at a location at least one nucleotide away from a closest end of the respective double-stranded nucleic acid molecule; receiving a second sample comprising a second set of double-stranded nucleic acid molecules derived from the cell-free nucleic acid molecules in the biological sample; adding a first mixture comprising an enzyme to the first sample; repairing the one or more defects in each of the one or more double-stranded nucleic acid molecules of the first set of double-stranded nucleic acid molecules using the enzyme to produce a repaired first set of double-stranded nucleic acid molecules; determining a value of a parameter characterizing a difference in defects between the repaired first set of double-stranded nucleic acid molecules and the second set of double-stranded nucleic acid molecules; comparing the value of the parameter to a reference value; and determining the classification of whether the individual has the disorder based on the comparison of the value of the parameter to the reference value.
22. The method of claim 21, wherein: the first mixture comprises a plurality of enzymes.
23. The method of claim 21, wherein the first sample and the second sample are equal in volume.
24. The method of claim 21, further comprising: adding a second mixture to a third sample comprising a third set of double-stranded nucleic acid molecules derived from the cell-free nucleic acid molecules in the biological sample to produce the second sample comprising the second set of double-stranded nucleic acid molecules, the second mixture excluding the enzyme.
25. The method of claim 24, wherein: the first mixture comprises a plurality of enzymes, and the second mixture excludes the plurality of enzymes.
26. The method of claim 21, further comprising: sequencing or performing target capture enrichment of the repaired first set of double-stranded nucleic acid molecules to determine a first set of reads, and sequencing or performing targeted capture enrichment of the second set of double-stranded nucleic acid molecules to determine a second set of reads, wherein: the first set of reads does not comprise reads from double-stranded nucleic acid molecules having one or more defects, the second set of reads does not comprise reads from double-stranded nucleic acid molecules having one or more defects, the first set of reads has a first amount of reads, the second set of reads has a second amount of reads, and the value of the parameter is determined using the first amount of reads and the second amount of reads.
27. The method of claim 26, further comprising: calculating a first statistical value characterizing the sizes of the first set of reads, calculating a second statistical value characterizing the sizes of the second set of reads, wherein: the value of the parameter is determined using the first statistical value and the second statistical value.
28. The method of claim 26, wherein: the value of the parameter is determined using a difference between the first amount of reads and the second amount of reads, or the value of the parameter is determined using a ratio of the first amount of reads and the second amount of reads.
29. The method of claim 26, wherein: the first amount of reads is a normalized amount of reads, and the second amount of reads is a normalized amount of reads.
30. The method of claim 21, wherein: the parameter is multidimensional, a first dimension of the parameter is based on a size of a nucleic acid molecule, and a second dimension of the parameter comprises an amount of the nucleic acid molecules.
31. The method of claim 30, wherein the first dimension is a range of sizes or a range of ratios of the size of the nucleic acid molecule to a reference size.
32. The method of claim 30, wherein: the reference value is multidimensional, and the reference value specifies amounts of nucleic acid molecules at different sizes of the nucleic acid molecules.
33. The method of claim 21, wherein: the parameter is multidimensional, a first dimension of the parameter comprises a location in a reference genome of a nucleic acid molecule, and a second dimension of the parameter comprises an amount of the nucleic acid molecule.
34. The method of claim 21, wherein the reference value is determined using one or more subjects identified to have the disorder or from one or more subjects identified to not have the disorder.
35. The method of claim 34, wherein comparing the value of the parameter to the reference value comprises determining whether the value of the parameter exceeds the reference value.
36. The method of claim 21, wherein the disorder is systemic lupus erythematosus, congenital adrenal hyperplasia (CAH), Fragile-X Syndrome, graft-versus-host (GVHD) disease, a pregnancy-related disorder, an aneuploidy, or a sequence imbalance.
37. The method of claim 35, wherein: determining the classification is based on a difference between the value of the parameter and the reference value, and the classification comprises a severity of the disorder, the classification indicating a more severe disorder when the difference is greater than a cutoff value.
38. A computer product comprising a computer readable medium storing a plurality of instructions for controlling a computer system to perform a method comprising: determining a value of a parameter characterizing a difference in defects between a repaired first set of double-stranded nucleic acid molecules and a second set of double-stranded nucleic acid molecules, wherein the repaired first set of double-stranded nucleic acid molecules and the second set of double stranded nucleic acid molecules are obtained by: receiving a first sample comprising a first set of double-stranded nucleic acid molecules derived from cell-free nucleic acid molecules in a biological sample, wherein: one or more double-stranded nucleic acid molecules of the first set of double-stranded nucleic acid molecules each have one or more defects, and the one or more defects are present in the respective double-stranded nucleic acid molecule at a location at least one nucleotide away from a closest end of the respective double-stranded nucleic acid molecule; receiving a second sample comprising the second set of double-stranded nucleic acid molecules derived from the cell-free nucleic acid molecules in the biological sample; adding a first mixture comprising an enzyme to the first sample; repairing the one or more defects in each of the one or more double-stranded nucleic acid molecules of the first set of double-stranded nucleic acid molecules using the enzyme to produce the repaired first set of double-stranded nucleic acid molecules; comparing the value of the parameter to a reference value; and determining a classification of whether an individual has a disorder based on the comparison of the value of the parameter to the reference value.
39. A system comprising: the computer product of claim 38; and one or more processors for executing instructions stored on the computer readable medium.
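
The computational steps recited in claims 3 and 26 to 28 can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the use of a z-score as the "normalized parameter", the two-sided test, and the default cutoff of 3.0 are all assumptions made for illustration, and alignment is assumed to have already assigned each read to a named genomic region.

```python
from collections import Counter
from typing import Iterable


def region_fraction(read_regions: Iterable[str], region: str) -> float:
    """Share of aligned reads whose identified location falls in `region`.

    `read_regions` holds one region label (e.g. a chromosome name) per read;
    producing these labels by alignment is outside this sketch.
    """
    counts = Counter(read_regions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no aligned reads")
    return counts[region] / total


def has_sequence_imbalance(fraction: float, ref_mean: float, ref_sd: float,
                           cutoff: float = 3.0) -> bool:
    """Compare a normalized parameter to a cutoff (claim 3).

    Assumption: the normalized parameter is a z-score against reference
    samples, and an imbalance in either direction counts.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (fraction - ref_mean) / ref_sd
    return abs(z) > cutoff


def repair_parameter(repaired_amount: float, unrepaired_amount: float,
                     mode: str = "ratio") -> float:
    """Parameter from the two read amounts, as a ratio or a difference (claim 28)."""
    if mode == "ratio":
        if unrepaired_amount == 0:
            raise ValueError("unrepaired_amount must be non-zero for a ratio")
        return repaired_amount / unrepaired_amount
    if mode == "difference":
        return repaired_amount - unrepaired_amount
    raise ValueError("mode must be 'ratio' or 'difference'")


def exceeds_reference(value: float, reference_value: float) -> bool:
    """Comparison step of claim 35: does the parameter exceed the reference?"""
    return value > reference_value
```

In this sketch the amounts passed to `repair_parameter` would be normalized read counts (claim 29) from the enzyme-treated and untreated aliquots; how a given reference value maps to a disorder classification is left to the reference data, which the claims do not specify numerically.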