
Commit 41254e9

kescobo (Kevin Bonham) and Kevin Bonham authored
More rosalind.info problems (#14)
* add fib problem
* finish up fibonacci
* get started on GC content
* finish up gc
* add input file
* remember to save

---------

Co-authored-by: Kevin Bonham <[email protected]>
1 parent 28974b4 commit 41254e9

File tree

5 files changed

+777
-1
lines changed


docs/src/rosalind/01-dna.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11

2-
## 🧬 Problem 1: Counting DNA nucleotides
2+
# 🧬 Problem 1: Counting DNA nucleotides
33

44
🤔 [Problem link](https://rosalind.info/problems/dna/)
55

docs/src/rosalind/04-fib.md

+279
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,279 @@
1+
# ♻️ 🐇 Rabbits and Recurrence Relations
2+
3+
[Original Problem](https://rosalind.info/problems/fib/)
4+
5+
!!! warning "The Problem"
6+
_Given_: Positive integers ``n≤40``
7+
and ``k≤5``.
8+
9+
_Return_: The total number of rabbit pairs
10+
that will be present after ``n`` months,
11+
if we begin with 1 pair and in each generation,
12+
every pair of reproduction-age rabbits produces a litter of ``k``
13+
rabbit pairs (instead of only 1 pair).
14+
15+
Sample Dataset
16+
```txt
17+
5 3
18+
```
19+
Sample Output
20+
```
21+
19
22+
```
23+
24+
This is a classic computer science problem,
25+
but not _directly_ biology focused.
26+
Instead, we'll use it to showcase some other julia features.
27+
28+
## Recursion in julia
29+
30+
Recursion in julia is pretty easy -
31+
we simply have a function call itself.
32+
As in any language, the key is to have a bail-out condition
33+
to avoid infinite recursion.
34+
35+
In this case, the bail-out is when ``n`` reaches ``1`` or ``0``,
36+
and it's also helpful to avoid invalid inputs.
37+
38+
```@example fib
39+
function fib(n::Int, k::Int)
40+
# validate inputs
41+
n >= 0 || error("N must be greater than or equal to 0")
42+
k > 1 || error("K must be at least 2")
43+
44+
# once we reach 1 or 0, we just return 1 or 0
45+
if n <= 1
46+
return n
47+
else
48+
# otherwise, recursively call on the previous 2 integers
49+
return fib(n - 1, k) + k * fib(n - 2, k)
50+
end
51+
end
52+
```
53+
54+
Let's go through each piece:
55+
56+
```julia
57+
function fib(n::Int, k::Int)
58+
```
59+
60+
is the function definition.
61+
It takes two arguments, `n` and `k`, both of type `Int`.
62+
63+
```julia
64+
n >= 0 || error("N must be greater than or equal to 0")
65+
k > 1 || error("K must be at least 2")
66+
```
67+
68+
are examples of what we call "short-circuit" evaluation.
69+
The `||` operator is a logical OR operator,
70+
and so if the first condition is true, the second condition is not evaluated
71+
(because `true` OR anything is true).
72+
The same things could have been written with the short-circuiting `&&`
73+
or as an `if` statement:
74+
75+
```julia
76+
!(n >= 0) && error("N must be greater than or equal to 0")
77+
# or
78+
n < 0 && error("N must be greater than or equal to 0")
79+
# or
80+
if n < 0
81+
error("N must be greater than or equal to 0")
82+
end
83+
```
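
To see short-circuiting in action, here is a minimal sketch. The `noisy` function and the `calls` counter are hypothetical names introduced only for this illustration; they count how often the right-hand side actually runs.

```julia
# Hypothetical counter and helper, used only to observe evaluation.
calls = Ref(0)
function noisy()
    calls[] += 1
    return true
end

true || noisy()    # left side is true, so noisy() is never called
false && noisy()   # left side is false, so noisy() is never called
false || noisy()   # left side is false, so noisy() runs once

calls[]            # number of times noisy() actually ran
```

Only the last expression needs its right-hand side to decide the result, which is why `condition || error(...)` raises the error only when the condition is false.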
84+
85+
```julia
86+
if n <= 1
87+
return n
88+
```
89+
90+
This is our recursion bail-out or "base case".
91+
Once we reach ``1`` or ``0``, we don't want to recurse any more,
92+
we just return that value.
93+
94+
```julia
95+
else
96+
return fib(n - 1, k) + k * fib(n - 2, k)
97+
end
98+
```
99+
100+
If we're not at the base case, we recurse.
101+
We call the function again, but with ``n - 1``
102+
for the previous generation,
103+
and ``n - 2`` for the generation before that.
104+
We also multiply the second call by ``k``
105+
to handle the fact that each pair of rabbits
106+
from two generations ago (the ones that have matured)
107+
produces ``k`` pairs of rabbits when they breed.
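
For the sample input (``n = 5``, ``k = 3``), the recurrence gives 1, 1, 4, 7, 19 for months 1 through 5, matching the sample output. The recursive version recomputes the same values many times, so as a rough alternative (not part of the original solution), the same recurrence can be written as a loop. The name `fib_iter` is an assumption made for this sketch.

```julia
# Hypothetical iterative counterpart to the recursive `fib` above.
function fib_iter(n::Int, k::Int)
    n >= 0 || error("N must be greater than or equal to 0")
    prev, curr = 0, 1              # values for month 0 and month 1
    n == 0 && return prev
    for _ in 2:n
        # new total = last month's pairs + k offspring per mature pair
        prev, curr = curr, curr + k * prev
    end
    return curr
end

fib_iter(5, 3)
```

Each loop step keeps only the two most recent values, so the work grows linearly with ``n`` rather than with the number of recursive calls.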
108+
109+
## Multiple dispatch
110+
111+
So that solves the recursion problem,
112+
but one thing you might notice
113+
if you try to read the inputs from rosalind.info directly
114+
is that we read the files as `String`s,
115+
but we've defined our function to operate on `Int`s.
116+
117+
```julia
118+
function fib(n::Int, k::Int)
119+
```
120+
121+
Here, we're defining a function with 2 arguments,
122+
``n`` and ``k``.
123+
The `::Int` syntax restricts this method to arguments of type `Int`,
124+
which is an alias for `Int64` on 64-bit systems (and `Int32` on 32-bit ones).
125+
126+
In something like python,
127+
you would probably write the function without type annotations,
128+
and then check inside the function
129+
with an `if` statement to deal with
130+
the case where the arguments are not integers:
131+
132+
```python
133+
def fib(n, k):
134+
if not isinstance(n, int):
135+
n = int(n)
136+
if not isinstance(k, int):
137+
k = int(k)
138+
139+
# do stuff
140+
```
141+
142+
In julia, we can handle this instead by
143+
defining different "methods" of `fib`, each of which
144+
takes different types of arguments.
145+
In a previous problem,
146+
when we defined a function to operate on `String`s or on `BioSequence`s,
147+
we saw that julia makes use of "multiple dispatch",
148+
though we didn't name it as such.
149+
150+
In this case,
151+
if we try to call the function with different argument types,
152+
say `String`s, we'll get a `MethodError`, telling us that there isn't
153+
a version that works on that combination of types:
154+
155+
```julia-repl
156+
julia> fib("5", "3")
157+
ERROR: MethodError: no method matching fib(::String, ::String)
158+
The function `fib` exists, but no method is defined for this combination of argument types.
159+
160+
Closest candidates are:
161+
fib(::Int64, ::Int64)
162+
```
163+
164+
If we'd like, we could define a version
165+
that takes `String`s as arguments,
166+
and tries to convert them into integers.
167+
To do that, we use the `parse()` function.
168+
For example:
169+
170+
```@example fib
171+
parse(Int, "5")
172+
```
173+
174+
So, here's a version of `fib` that takes `String`s as arguments:
175+
176+
```@example fib
177+
function fib(n::String, k::String)
178+
nint = parse(Int, n)
179+
kint = parse(Int, k)
180+
fib(nint, kint)
181+
end
182+
183+
fib("5", "3")
184+
```
185+
186+
Ok, but this still doesn't work:
187+
188+
```julia-repl
189+
julia> fib("2", 3)
190+
ERROR: MethodError: no method matching fib(::String, ::Int64)
191+
The function `fib` exists, but no method is defined for this combination of argument types.
192+
193+
Closest candidates are:
194+
fib(::String, ::String)
195+
@ Main REPL[2]:1
196+
fib(::Int64, ::Int64)
197+
@ Main REPL[1]:1
198+
```
199+
200+
We could now go through and define methods
201+
of `fib` that take different types of arguments.
202+
For example:
203+
204+
```@example fib
205+
function fib(n::String, k::Int)
206+
nint = parse(Int, n)
207+
fib(nint, k)
208+
end
209+
```
210+
211+
or
212+
213+
```@example fib
214+
function fib(n::Int, k::String)
215+
kint = parse(Int, k)
216+
fib(n, kint)
217+
end
218+
```
219+
220+
Now, we can call `fib` with an `Int` and a `String`:
221+
222+
```@example fib
223+
fib(2, "3")
224+
```
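
Writing one method per combination of argument types grows quickly as arguments are added. One possible alternative, sketched here under assumed names (`to_int`, `fib_core`, and `fib_any` are not part of the original code), is to dispatch on each argument separately through a small conversion helper.

```julia
# Hypothetical helper: normalize a single argument to an Int.
to_int(x::Integer) = Int(x)
to_int(x::AbstractString) = parse(Int, x)

# Same recurrence as in the text, repeated so this block runs on its own.
function fib_core(n::Int, k::Int)
    n <= 1 && return n
    return fib_core(n - 1, k) + k * fib_core(n - 2, k)
end

# One generic entry point covers every mix of strings and integers.
fib_any(n, k) = fib_core(to_int(n), to_int(k))

fib_any("5", 3)
```

Using `AbstractString` rather than `String` means the helper also accepts the `SubString`s that `split()` returns.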
225+
226+
## Parsing files
227+
228+
When you do a problem on Rosalind,
229+
the file you download contains your problem input.
230+
Up to now,
231+
I've just assumed you'll be copy-pasting
232+
the input into your Julia REPL.
233+
234+
However, it's often more convenient to read the input from a file,
235+
and in this case, a little additional parsing is required,
236+
since the input comes in the form of two integers separated by a space,
237+
and when you read in the file, it will be a `String`.
238+
239+
There are a number of ways to read files in julia,
240+
including the `readlines()` function,
241+
which loads all of the lines of the file into a `Vector{String}`.
242+
243+
When you have really large files, holding every line in memory at once can be a problem,
244+
and you can instead use the `eachline()` function,
245+
so you can do something like `for line in eachline("input.txt")`
246+
and deal with each line separately.
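
As a small sketch of the difference, the following uses a temporary file created with `mktemp()` as a stand-in for a real input file; the file contents are made up for the illustration.

```julia
# Create a throwaway file so the example is self-contained.
path, io = mktemp()
write(io, "5 3\n4 2\n")
close(io)

all_lines = readlines(path)    # whole file in memory as a Vector{String}

streamed = String[]
for line in eachline(path)     # yields one line at a time
    push!(streamed, line)
end

streamed == all_lines          # both approaches see the same lines
```

Both functions drop the trailing newline from each line by default; the difference is only in how much of the file is held in memory at once.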
247+
248+
But in this case (and the previous rosalind.info problems we've looked at),
249+
each problem comes in as a single line,
250+
so we'll use the `read()` function instead.
251+
By default, `read()` reads the entire file into a vector of bytes,
252+
that is, a `Vector{UInt8}`.
253+
This can be converted to a `String` using the `String()` function,
254+
but this is a common enough use-case that we can just tell `read()` to return a `String`
255+
by passing the `String` type as an argument.
256+
257+
```julia
258+
read("rosalind_fib.txt", String)
259+
```
260+
261+
One tricky gotcha is that the rosalind text files
262+
have a trailing newline character at the end of the file.
263+
This can be removed using the `strip()` function.
264+
265+
Finally, because of the format of this input in particular,
266+
we'll use the `split()` function to split the string into two parts,
267+
and then convert each part to an integer using the `parse()` function.
268+
269+
```@example fib
270+
function read_fib(file)
271+
numbers = read(file, String)
272+
numbers = strip(numbers) # remove trailing newline
273+
n_str, k_str = split(numbers, ' ') # split on spaces
274+
return parse(Int, n_str), parse(Int, k_str)
275+
end
276+
277+
(n,k) = read_fib("problem_inputs/rosalind_fib.txt")
278+
fib(n, k)
279+
```
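
The individual parsing steps can also be tried without a file. This sketch reads from an `IOBuffer` holding the sample input as a stand-in for the downloaded file, and calls `split()` without a delimiter, which splits on any whitespace; that is a small variation on `read_fib`, not the original code.

```julia
raw = read(IOBuffer("5 3\n"), String)   # stand-in for read("rosalind_fib.txt", String)
cleaned = strip(raw)                    # drops the trailing newline
parts = split(cleaned)                  # no delimiter: split on any whitespace
n, k = parse.(Int, parts)               # broadcast parse over both pieces
(n, k)
```

Note that `strip()` and `split()` return `SubString`s rather than `String`s, which is one reason to convert to `Int` before calling a method that is only defined for `String` or `Int` arguments.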
