Commit dda2403

Rename to schemarecomb.
1 parent 5764960 commit dda2403

31 files changed, +205 -198 lines changed

README.rst

+8-9
@@ -1,8 +1,8 @@
-With ggrecomb, you can design easy-to-use recombinant protein libraries, even if you aren't a computational expert.
+With schemarecomb, you can design easy-to-use chimeric protein libraries, even if you aren't a computational expert.

 Here's a simple example::

-    >>> import ggrecomb as sr
+    >>> import schemarecomb as sr
     >>> from Bio import SeqIO
     >>>
     >>> # pytest stuff, you can ignore this.
@@ -24,11 +24,10 @@ Here's a simple example::
     >>> best_lib = max(libraries, key=lambda x: x.mutation_rate - x.energy)
     >>>
     >>> # Save the generated DNA fragments.
-    >>> # out_fn = tempdir + '/' + 'bgl3_dna_frags.fasta'
     >>> SeqIO.write(best_lib.dna_blocks, out_fn, 'fasta')
     42

-With this simple script, we generated a six parent, seven block chimeric beta-glucosidase library. The saved DNA fragments can be ordered directly from a DNA synthesis provider and assembled with `NEB's Golden Gate Assembly Kit <https://www.neb.com/products/e1601-neb-golden-gate-assembly-mix>`_. There's no worrying about adding restriction sites since ggrecomb automatically adds BsaI sites.
+With this simple script, we generated a six parent, seven block chimeric beta-glucosidase library. The saved DNA fragments can be ordered directly from a DNA synthesis provider and assembled with `NEB's Golden Gate Assembly Kit <https://www.neb.com/products/e1601-neb-golden-gate-assembly-mix>`_. There's no worrying about adding restriction sites since schemarecomb automatically adds BsaI sites.


 Why Recombinant Proteins?
@@ -38,23 +37,23 @@ Engineering proteins with recombinant libraries has a number of advantages over

 So why doesn't everybody do recombinant protein engineering? Historically, there's a number of technical challenges that made recombinant libraries impractical for general use. Namely, a great deal of computational expertise and time is needed to manually generate and select suitable libraries. Even with the required computational resources, mutagenesis was significantly easier than assembling 8+ DNA fragments when protein engineering was developing its fundamentals, so nearly everybody opted for traditional directed evolution and passed that practice down to their students and mentees.

-The goal of this software package is to make recombinant library design accessible and convenient for protein engineers of all computational skill levels. ggrecomb designs libraries that come ready to order and construct with a simple Golden Gate Assembly reaction. To learn more, read the `ggrecomb documentation <https://ggrecomb.readthedocs.io/en/latest/>`_.
+The goal of this software package is to make recombinant library design accessible and convenient for protein engineers of all computational skill levels. schemarecomb designs libraries that come ready to order and construct with a simple Golden Gate Assembly reaction. To learn more, read the `schemarecomb documentation <https://schemarecomb.readthedocs.io/en/latest/>`_.


 Installation
 ------------

 .. code-block:: bash

-    $ pip install ggrecomb
+    $ pip install schemarecomb


 Documentation
 -------------

 Package reference material and helpful guides can be found at:

-https://ggrecomb.readthedocs.io/en/latest/
+https://schemarecomb.readthedocs.io/en/latest/


 Citing
@@ -63,6 +62,6 @@ Citing
 ..
     https://www.software.ac.uk/how-cite-software?_ga=1.54830891.1882560887.1489012280

-If you use ggrecomb in a scientific publication, please cite it as::
+If you use schemarecomb in a scientific publication, please cite it as::

-    Bremer, B. & Romero, P. (2021). ggrecomb [Software]. Available from https://github.com/RomeroLab/ggrecomb.
+    Bremer, B. & Romero, P. (2021). schemarecomb [Software]. Available from https://github.com/RomeroLab/schemarecomb.

bin/ggrecomb bin/schemarecomb

File renamed without changes.

conftest.py

+16-13
@@ -8,7 +8,7 @@
 from Bio import SeqIO
 import pytest

-import ggrecomb
+import schemarecomb


 @pytest.fixture
@@ -199,7 +199,8 @@ def rid_to_database(http_dir):
     for database, date in database_query_dates.items():
         fn = http_dir / f'blast_{database}_put_{date}.txt'
         with open(fn, 'rb') as handle:
-            rid, _ = ggrecomb.parent_alignment._parse_qblast_ref_page(handle)
+            rid, _ = schemarecomb.parent_alignment._parse_qblast_ref_page(
+                handle)
         if database == 'refseq':
             # Patch because refseq filename does not match database query name
             database = 'refseq_protein'
@@ -266,9 +267,9 @@ def bgl3_records_aln(fixture_dir):

 @pytest.fixture
 def bgl3_parents_aln(bgl3_pdb_filename, bgl3_records_aln):
-    pdb = ggrecomb.PDBStructure.from_pdb_file(bgl3_pdb_filename)
-    parents = ggrecomb.ParentSequences(bgl3_records_aln, pdb_structure=pdb,
-                                       prealigned=True)
+    pdb = schemarecomb.PDBStructure.from_pdb_file(bgl3_pdb_filename)
+    parents = schemarecomb.ParentSequences(bgl3_records_aln, pdb_structure=pdb,
+                                           prealigned=True)
     return parents


@@ -292,8 +293,9 @@ def bgl3_single_aln_str(fixture_dir):

 @pytest.fixture
 def bgl3_parent_alignment(bgl3_records_aln, bgl3_pdb_filename):
-    pdb = ggrecomb.PDBStructure.from_pdb_file(bgl3_pdb_filename)
-    parents = ggrecomb.ParentSequences(bgl3_records_aln, pdb, prealigned=True)
+    pdb = schemarecomb.PDBStructure.from_pdb_file(bgl3_pdb_filename)
+    parents = schemarecomb.ParentSequences(bgl3_records_aln, pdb,
+                                           prealigned=True)
     return parents


@@ -306,8 +308,9 @@ def mock_bgl3_blast_query(bgl3_records, mocker, blast_http_responses,

     This:
         fake_urlopen, fake_efetch = mock_bgl3_blast_query
-        mocker.patch('ggrecomb.parent_alignment.urlopen', fake_urlopen)
-        mocker.patch('ggrecomb.parent_alignment.Entrez.efetch', fake_efetch)
+        mocker.patch('schemarecomb.parent_alignment.urlopen', fake_urlopen)
+        mocker.patch('schemarecomb.parent_alignment.Entrez.efetch',
+                     fake_efetch)

     is equivalent to this:
         query_seq = str(bgl3_records[0].seq)
@@ -316,7 +319,7 @@ def mock_bgl3_blast_query(bgl3_records, mocker, blast_http_responses,
         responses = {'refseq_protein': refseq_responses,
                      'pdbaa': pdb_responses}
         mocker.patch(
-            'ggrecomb.parent_alignment.urlopen',
+            'schemarecomb.parent_alignment.urlopen',
             wrap_urlopen(
                 query_seq=query_seq,
                 parents_aln_str=bgl3_parents_aln_str,
@@ -326,7 +329,7 @@ def mock_bgl3_blast_query(bgl3_records, mocker, blast_http_responses,
         )
         # also need to patch Entrez.efetch for BLAST runs
         mocker.patch(
-            'ggrecomb.parent_alignment.Entrez.efetch',
+            'schemarecomb.parent_alignment.Entrez.efetch',
             wrap_efetch(
                 responses=responses,
                 acc_to_database=acc_to_database
@@ -356,8 +359,8 @@ def mock_bgl3_blast_query(bgl3_records, mocker, blast_http_responses,
 @pytest.fixture
 def bgl3_mock_namespace(doctest_namespace, mocker, mock_bgl3_blast_query):
     fake_urlopen, fake_efetch = mock_bgl3_blast_query
-    mocker.patch('ggrecomb.parent_alignment.urlopen', fake_urlopen)
-    mocker.patch('ggrecomb.parent_alignment.Entrez.efetch', fake_efetch)
+    mocker.patch('schemarecomb.parent_alignment.urlopen', fake_urlopen)
+    mocker.patch('schemarecomb.parent_alignment.Entrez.efetch', fake_efetch)


 @pytest.fixture
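
Review note on the patch targets above: the `mocker.patch(...)` calls take dotted-path strings, so a package rename has to touch every one of them, and a stale path only fails when the patch is applied. The sketch below illustrates that behaviour with the standard library's `unittest.mock` rather than the `mocker` fixture used in conftest.py. The module name `hypothetical_old_package` and the helper `patch_works` are made up for illustration and are not part of this commit.

```python
import urllib.request
from unittest import mock


def fake_urlopen(*args, **kwargs):
    """Stand-in for a network call; returns a fixed marker string."""
    return "fake response"


def patch_works(target):
    """Return True if `target` can be patched, False if its module is missing."""
    try:
        with mock.patch(target, fake_urlopen):
            return True
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False


# A valid dotted path is resolved and replaced for the duration of the block.
with mock.patch("urllib.request.urlopen", fake_urlopen):
    patched_result = urllib.request.urlopen("http://example.invalid")

# A path naming a module that does not exist fails when the patch is applied.
# 'hypothetical_old_package' is an assumed name standing in for a stale import path.
stale_ok = patch_works("hypothetical_old_package.parent_alignment.urlopen")
current_ok = patch_works("urllib.request.urlopen")
```

Because the failure is raised at patch time, running the fixtures once after a rename is usually enough to surface any target string that was missed.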

doc/source/conf.py

+2-2
@@ -17,7 +17,7 @@

 # -- Project information -----------------------------------------------------

-project = 'ggrecomb'
+project = 'schemarecomb'
 copyright = '2021, Bennett Bremer'
 author = 'Bennett Bremer'

@@ -69,7 +69,7 @@


 def setup(app):
-    import ggrecomb as sr
+    import schemarecomb as sr
     # need to assign the names here, otherwise autodoc won't document these
     # classes, and will instead just say 'alias of ...'
     sr.generate_libraries.__name__ = 'generate_libraries'

doc/source/index.rst

+31-26
@@ -1,44 +1,49 @@
-.. ggrecomb documentation master file, created by
+.. schemarecomb documentation master file, created by
    sphinx-quickstart on Thu Jul 29 16:15:20 2021.
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.

-Welcome to ggrecomb's documentation!
-====================================
+Welcome to schemarecomb's documentation!
+========================================

-With ggrecomb, you can design easy-to-use recombinant protein libraries, even if you aren't a computational expert.
+With schemarecomb, you can design easy-to-use recombinant protein libraries, even if you aren't a computational expert.


 Usage
 -----

 ..
-    ggrecomb provides the flexibility to customize recombinant library design with a number of options. Check out the :ref:`Biologist's Guide<biologists>` if you're comfortable with recombinant protein library design but want to be guided through using the library. Conversely, users with computational expertise but limited biological experience should read the :ref:`Programmer's Guide<programmers>`. Finally, if you know what you're doing, glance at the :ref:`Quickstart <quickstart>` document and look at the :ref:`Reference Manual<reference>` as needed.
+    schemarecomb provides the flexibility to customize recombinant library design with a number of options. Check out the :ref:`Biologist's Guide<biologists>` if you're comfortable with recombinant protein library design but want to be guided through using the library. Conversely, users with computational expertise but limited biological experience should read the :ref:`Programmer's Guide<programmers>`. Finally, if you know what you're doing, glance at the :ref:`Quickstart <quickstart>` document and look at the :ref:`Reference Manual<reference>` as needed.


 Here's a simple example::

-    import ggrecomb
-
-    # Specify library parameters.
-    parent_fn = 'P450_sequences.fasta' # parent_file: 3 parents
-    num_blocks = 8 # number of blocks in chimeras
-    min_block_len = 40 # min length of chimeric blocks
-    max_block_len = 80 # max length of chimeric blocks
-
-    # Create a parent alignment and get the closest PDB structure.
-    parents = ggrecomb.ParentSequences.from_fasta(parent_fn, auto_align=True)
-    parents.get_PDB()
-
-    # Run SCHEMA-RASPP to get libraries.
-    raspp = ggrecomb.RASPP(parents, num_blocks)
-    libraries = raspp.vary_m_proxy(min_block_len, max_block_len)
-
-    # Auto-select the best library and save the resulting DNA fragments.
-    best_lib = ggrecomb.LibSelector.auto(libraries)
-    best_lib.save('library_dna_fragments.fasta')
-
-With this simple script, we generated a three parent, eight block chimeric P450 library. The saved DNA fragments can be ordered directly from a DNA synthesis provider and assembled with `NEB's Golden Gate Assembly Kit <https://www.neb.com/products/e1601-neb-golden-gate-assembly-mix>`_. There's no worrying about adding restriction sites since ggrecomb automatically adds BsaI sites.
+    >>> import schemarecomb as sr
+    >>> from Bio import SeqIO
+    >>>
+    >>> # pytest stuff, you can ignore this.
+    >>> getfixture('bgl3_mock_namespace')
+    >>> tempdir = getfixture('tmpdir') #doctest: +ELLIPSIS
+    >>> out_fn = tempdir / 'bgl3_dna_frags.fasta'
+    >>>
+    >>> # Create a parent alignment and get the closest PDB structure.
+    >>> fn = 'tests/fixtures/bgl3_1-parent/bgl3_p0.fasta'
+    >>> parents = sr.ParentSequences.from_fasta(fn)
+    >>> parents.obtain_seqs(6, 0.7) # BLAST takes about 10 minutes.
+    >>> parents.align() # MUSCLE takes about a minute.
+    >>> parents.get_PDB() # BLAST takes about 10 minutes.
+    >>>
+    >>> # Run SCHEMA-RASPP to get libraries.
+    >>> libraries = sr.generate_libraries(parents, 7)
+    >>>
+    >>> # Auto-select the best library and save the resulting DNA fragments.
+    >>> best_lib = max(libraries, key=lambda x: x.mutation_rate - x.energy)
+    >>>
+    >>> # Save the generated DNA fragments.
+    >>> SeqIO.write(best_lib.dna_blocks, out_fn, 'fasta')
+    42
+
+With this simple script, we generated a six parent, seven block chimeric beta-glucosidase library. The saved DNA fragments can be ordered directly from a DNA synthesis provider and assembled with `NEB's Golden Gate Assembly Kit <https://www.neb.com/products/e1601-neb-golden-gate-assembly-mix>`_. There's no worrying about adding restriction sites since schemarecomb automatically adds BsaI sites.

 View the :ref:`Quickstart Guide <quickstart>` for more example scripts and the :ref:`Reference Manual <reference>` for more details on specific classes and modules.


doc/source/quickstart.rst

+21-21
@@ -10,25 +10,25 @@ Quickstart

 To get started, you'll need a FASTA file with one or more parent amino acid sequences. If desired, we can find additional parents with BLAST.

-Optionally, you can provide a PDB structure file, but otherwise we'll find one that matches the first parent sequence you provided. See :class:`ggrecomb.PDBStructure` for more details.
+Optionally, you can provide a PDB structure file, but otherwise we'll find one that matches the first parent sequence you provided. See :class:`schemarecomb.PDBStructure` for more details.

 .. note::

-    This guide assumes you're using Python 3.9 on Linux. If you use MacOS, things will probably work the same, but no guarantees. If you have Windows, I recommend you use the `Windows Subsystem for Linux <https://docs.microsoft.com/en-us/windows/wsl/install-win10>`_, but again, no guarantees. Please raise an issue on the ggrecomb GitHub page if you have OS difficulty.
+    This guide assumes you're using Python 3.9 on Linux. If you use MacOS, things will probably work the same, but no guarantees. If you have Windows, I recommend you use the `Windows Subsystem for Linux <https://docs.microsoft.com/en-us/windows/wsl/install-win10>`_, but again, no guarantees. Please raise an issue on the schemarecomb GitHub page if you have OS difficulty.


-1. Install ggrecomb
--------------------
+1. Install schemarecomb
+-----------------------

-ggrecomb is available on pip::
+schemarecomb is available on pip::

-    $ pip install ggrecomb
+    $ pip install schemarecomb

-Or you can install ggrecomb from source. See :ref:`Installation<install>` for more information.
+Or you can install schemarecomb from source. See :ref:`Installation<install>` for more information.

-In a Python script, import ggrecomb::
+In a Python script, import schemarecomb::

-    import ggrecomb
+    import schemarecomb


 2. Make a ParentSequences
@@ -37,35 +37,35 @@ In a Python script, import ggrecomb::
 Load your parent FASTA file and find additional parents if needed. For this example, we'll use beta-glucosidase (bgl3, PDB ID 1GNX)::

     parent_fn = 'bgl3.fasta'
-    p_aln = ggrecomb.ParentSequences.from_fasta(parent_fn)
-    p_aln.obtain_seqs(num_final_seqs=4, desired_indentity=0.7)
+    p_aln = schemarecomb.ParentSequences.from_fasta(parent_fn)
+    p_aln.obtain_seqs(num_final_seqs=4, desired_identity=0.7)
+    p_aln.get_PDB()

-After running, p_aln is a ParentSequences with four parents that have about 70% pairwise idenity.
+After running, p_aln is a ParentSequences with four parents that have about 70% pairwise identity and the closest PDB structure.

-See :class:`ggrecomb.ParentSequences` for more options.
+See :class:`schemarecomb.ParentSequences` for more options. Viewing :class:`schemarecomb.PDBStructure` may also be helpful.


 3. Run the SCHEMA-RASPP algorithm
 ---------------------------------

 SCHEMA-RASPP finds potential libraries and calculates the probability of Golden Gate assembly for each::

-    raspp = ggrecomb.RASPP(p_aln, 5)
-    libraries = raspp.vary_m_proxy(60, 100)
+    libraries = schemarecomb.generate_libraries(p_aln, 6)

-This finds libraries with six blocks (five breakpoints) with block sizes between 60 and 100 amino acids.
+This finds libraries with six blocks (five breakpoints).

-See ggrecomb.RASPP (TODO: make this link after refactoring PA docstring) for more options.
+See :func:`schemarecomb.generate_libraries` for more options.


 4. Select and save a library
 ----------------------------

-Let RASPP automatically select a library::
+Select the library with highest mutation_rate - energy and save generated DNA blocks::

-    best_lib = ggrecomb.LibSelector.auto(libraries)
-    best_lib.save('bgl3_library_dna.fasta')
+    best_lib = max(libraries, key=lambda x: x.mutation_rate - x.energy)
+    SeqIO.write(best_lib.dna_blocks, 'bgl3_library_dna.fasta', 'fasta')

 The DNA fragments in FASTA format in a file named "bgl3_library_dna.fasta". These fragments are ready to order and assemble with `NEB's Golden Gate Assembly Kit <https://www.neb.com/products/e1601-neb-golden-gate-assembly-mix>`_. You can simulate the Golden Gate reaction using SnapGene.

-See ggrecomb.LibSelector for more options.
+See :class:`schemarecomb.Library` for more options.
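
Review note on step 4 of the new quickstart text: the selection rule is now an explicit `max(...)` over `mutation_rate - energy` instead of a `LibSelector` call. The sketch below shows how that rule behaves on a few candidates. `FakeLibrary` is a hypothetical stand-in that exposes only the two attributes the rule reads, and the numbers are invented for illustration; neither comes from schemarecomb.

```python
from typing import NamedTuple


class FakeLibrary(NamedTuple):
    """Hypothetical stand-in exposing only the attributes the rule uses."""
    name: str
    mutation_rate: float
    energy: float


def select_library(libraries):
    """Pick the library with the largest mutation_rate - energy."""
    return max(libraries, key=lambda lib: lib.mutation_rate - lib.energy)


# Invented values, chosen so the scores are easy to check by hand.
libraries = [
    FakeLibrary("a", mutation_rate=60.0, energy=25.0),  # score 35.0
    FakeLibrary("b", mutation_rate=72.0, energy=30.0),  # score 42.0
    FakeLibrary("c", mutation_rate=80.0, energy=45.0),  # score 35.0
]
best = select_library(libraries)
```

Note that the library with the highest mutation rate ("c") does not win here, because its higher energy cancels the gain. When two candidates tie, `max` returns the first one it encounters.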

doc/source/reference.rst

+1-1
@@ -5,7 +5,7 @@
 Reference Manual
 ****************

-.. currentmodule:: ggrecomb
+.. currentmodule:: schemarecomb


 Primary Classes and Functions

pyproject.toml

+8-8
@@ -1,5 +1,5 @@
 [tool.poetry]
-name = 'ggrecomb'
+name = 'schemarecomb'
 version = '0.0.4'
 description = 'Design recombinant protein libraries for Golden Gate Assembly.'
 authors = ['Bennett Bremer <[email protected]>']
@@ -18,23 +18,23 @@ sphinx = '^4.0.1'
 sphinx-autodoc-typehints = '^1.12.0'

 [tool.poetry.urls]
-"Download" = "https://pypi.org/project/ggrecomb"
-"Source Code" = "https://github.com/RomeroLab/ggrecomb"
-"Documentation" = "https://ggrecomb.readthedocs.io/en/latest"
+"Download" = "https://pypi.org/project/schemarecomb"
+"Source Code" = "https://github.com/RomeroLab/schemarecomb"
+"Documentation" = "https://schemarecomb.readthedocs.io/en/latest"

 [tool.pytest.ini_options]
 addopts = "--doctest-modules"
 doctest_optionflags = "ELLIPSIS"
 testpaths = [
     "tests/unit/",
-    "src/ggrecomb/libraries.py",
-    "src/ggrecomb/pdb_structure.py",
-    "src/ggrecomb/parent_alignment.py",
+    "src/schemarecomb/libraries.py",
+    "src/schemarecomb/pdb_structure.py",
+    "src/schemarecomb/parent_alignment.py",
     "README.rst"
 ]

 # [tool.poetry.scripts]
-# ggrecomb = {reference = "bin/ggrecomb", type="file"}
+# schemarecomb = {reference = "bin/schemarecomb", type="file"}

 [build-system]
 requires = ["poetry-core>=1.1.0a5"]

src/ggrecomb/__init__.py src/schemarecomb/__init__.py

+4-4
@@ -1,7 +1,7 @@
-from ggrecomb.pdb_structure import _PDBStructure as PDBStructure
-from ggrecomb.parent_alignment import _ParentSequences as ParentSequences
-from ggrecomb.libraries import _Library as Library
-from ggrecomb.optimizers import _generate_libraries as generate_libraries
+from schemarecomb.pdb_structure import _PDBStructure as PDBStructure
+from schemarecomb.parent_alignment import _ParentSequences as ParentSequences
+from schemarecomb.libraries import _Library as Library
+from schemarecomb.optimizers import _generate_libraries as generate_libraries

 from . import energy_functions
 from . import restriction_enzymes

src/ggrecomb/breakpoints.py src/schemarecomb/breakpoints.py

+2-2
@@ -1,10 +1,10 @@
-"""Recombination site discovery in :class:`~ggrecomb.ParentSequences`."""
+"""Recombination site discovery in :class:`~schemarecomb.ParentSequences`."""

 import itertools
 from typing import NamedTuple
 from typing import Optional

-from ggrecomb import ParentSequences
+from schemarecomb import ParentSequences


 class Overhang(NamedTuple):
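
Review note on the rename as a whole: with 31 files touched, a leftover reference to the old name is easy to miss, especially in strings and docs that no import check will catch. The sketch below is one possible way to scan a checkout for remaining occurrences of `ggrecomb`. The helper `find_leftovers` and the list of file suffixes are assumptions made for illustration, not part of this commit, and the demonstration runs on a throwaway directory rather than the real repository.

```python
import tempfile
from pathlib import Path

OLD_NAME = "ggrecomb"


def find_leftovers(root, old_name=OLD_NAME, suffixes=(".py", ".rst", ".toml")):
    """Return (relative_path, line_number, line) for lines containing old_name."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                rel = path.relative_to(root).as_posix()
                hits.append((rel, number, line.strip()))
    return hits


# Self-contained demonstration on a temporary directory with two small files.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "ok.py").write_text("import schemarecomb\n", encoding="utf-8")
    (demo_root / "stale.py").write_text(
        "import pytest\nimport ggrecomb\n", encoding="utf-8"
    )
    leftovers = find_leftovers(demo_root)
```

Pointing `find_leftovers` at the repository root after the rename should return an empty list if every reference was updated. Intentional mentions of the old name, such as a changelog entry, would need to be filtered out by hand.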
