-
Notifications
You must be signed in to change notification settings - Fork 368
/
burrow_wheeler_transform.py
72 lines (59 loc) · 2.48 KB
/
burrow_wheeler_transform.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
''' =============== PROBLEM STATEMENT =============
Implement the Burrows-Wheeler Transform (BWT) and its inverse.
Given an input string s, the program should compute its BWT and then reconstruct the original string using the inverse BWT.
============= SOLUTION ==============
The bwt function takes a string and returns its Burrows-Wheeler transform bwt.
1. The input string is appended with a special sentinel character ('$') to mark the end of the string.
2. Generate all possible cyclic permutations of the input string are generated by shifting the characters.
3, The cyclic permutations are sorted lexicographically, and the last character of each sorted permutation is extracted.
The inverse_bwt function takes the transformed string bwt and returns the original string.
1. Count the occurrences of each character.
2. Generate the sorted list of characters.
3. Construct the first column of the transformation matrix.
4. Perform the reverse transformation.
Sample Input:
banana
Sample Output:
BWT of banana$ is annb$aa
Inverse BWT of annb$aa is banana
'''
def bwt(s):
# add sentinel character $
s = s + '$'
# generate all cyclic permutations of the input string
perms = [s[i:] + s[:i] for i in range(len(s))]
# sort the cyclic permutations lexicographically
sorted_perms = sorted(perms)
# extract the last characters of each sorted permutation to construct the transformed string
bwt = ''.join([p[-1] for p in sorted_perms])
return bwt
def inverse_bwt(bwt_string):
# Count the occurrences of each character
counts = {}
for char in bwt_string:
counts[char] = counts.get(char, 0) + 1
# Generate the sorted list of characters
sorted_chars = sorted(counts.keys())
# Construct the first column of the transformation matrix
first_column = []
for char in sorted_chars:
count = counts[char]
first_column.extend([char] * count)
# Perform the reverse transformation
index = 0
original_string = ""
for _ in bwt_string:
char = bwt_string[index]
original_string = char + original_string
index = first_column.index(char)
first_column[index] = None
return original_string[1:]
if __name__ == '__main__':
# get input string from user
s = input("Enter a string: ")
# calculates BWT of input string
bw = bwt(s)
print("BWT of", s, "is", bw)
# calculates inverse BWT from the transformed string
ibw = inverse_bwt(bw)
print("Inverse BWT of", bw, "is", ibw)