
Find all the sequences that occur more than once in a DNA molecule.

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Input: 
s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"

Output: ["AAAAACCCCC","CCCCCAAAAA"]
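
def DNAseq(seqce):
    n = len(seqce)
    d = {}  # maps each 10-letter substring to its number of occurrences
    temp = []
    # Stop at index n - 10 so that every slice is exactly 10 letters long
    for i in range(n - 9):
        sub = seqce[i:i+10]
        d[sub] = d.get(sub, 0) + 1
    # Collect the substrings that were seen more than once
    for key, val in d.items():
        if val > 1:
            temp.append(key)
    return temp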

seq=input('Enter the Sequence : ')
print(DNAseq(seq))

Input_1:
Enter the Sequence : AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT

Output:
['AAAAACCCCC', 'CCCCCAAAAA']


Input_2:
Enter the Sequence : AAAAAAAAAAAAA

Output:
['AAAAAAAAAA']
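
The same counting idea can also be written more compactly with collections.Counter from the standard library. The sketch below is one possible variant, not the solution given above; the function name find_repeated_sequences and the length parameter are hypothetical names introduced only for illustration.

```python
from collections import Counter


def find_repeated_sequences(dna, length=10):
    """Return the substrings of the given length that occur more than once."""
    # Slide a window of `length` letters over the string and count each window.
    # If the string is shorter than `length`, the range is empty and so is the result.
    counts = Counter(dna[i:i + length] for i in range(len(dna) - length + 1))
    # Keep only the windows that were counted at least twice.
    return [seq for seq, count in counts.items() if count > 1]


if __name__ == "__main__":
    print(find_repeated_sequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
```

Making the window length a parameter lets the same function look for repeats of other sizes, and a string shorter than the window simply yields an empty list.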


ILLUSTRATION OF THE OUTPUT

Executed using Python 3.
