Systematics Normalization Project #291

Closed
wants to merge 135 commits into from
135 commits
308ccba
Beginning of new OEE tracker
emilydolson May 10, 2018
904516d
Actually track stats; not compiling yet
emilydolson May 10, 2018
8ff0fa4
OEE stats work
emilydolson May 11, 2018
ea10926
OEE output file works
emilydolson May 11, 2018
778ba78
Fixed pointer problem
emilydolson May 12, 2018
2cd6791
Add skeletonize helper
emilydolson May 12, 2018
817d845
Compiles with Avida
emilydolson May 14, 2018
06bafe8
Merge branch 'master' of github.com:devosoft/Empirical into OEE
emilydolson May 24, 2018
b452d29
Added a few math tools + updates to work with latest version of emscr…
emilydolson Aug 30, 2018
b4b1559
Pull in latest changes from master
emilydolson Aug 30, 2018
3cda74e
Preliminary fix for synchronous generations (there were problems when…
emilydolson Sep 2, 2018
d027295
Improving OEE tests
emilydolson Sep 3, 2018
466ddec
finished OEE metric tests
emilydolson Sep 3, 2018
2d215f7
Fixed diversity
emilydolson Sep 4, 2018
3c3351c
Time and memory improvements to OEE
emilydolson Sep 4, 2018
8357dd8
Make sure to lump identical skeletons together
emilydolson Sep 5, 2018
1dd6b02
Added destruction time/canopy filter
emilydolson Sep 8, 2018
e287dae
Fix diversity
emilydolson Sep 8, 2018
8f251d9
Use bloom filter for novelty to keep RAM under control
emilydolson Sep 25, 2018
5171c87
Prune top of phylogeny
emilydolson Sep 26, 2018
e1999c5
Added snapshoting to Systematics class.
amlalejini Oct 8, 2018
7ed2026
Fixed Systematics snapshot field descriptions.
amlalejini Oct 8, 2018
8eb4f70
Merge pull request #2 from amlalejini/sys-snapshot
emilydolson Oct 8, 2018
09a45e5
Skeletonize during taxon's lifetime to make sure we use the right fit…
emilydolson Oct 8, 2018
343080f
Merge branch 'OEE' of github.com:emilydolson/Empirical into OEE
emilydolson Oct 8, 2018
338a892
Improve interface for MODES
emilydolson Oct 12, 2018
644d333
Add methods to weighted graph
emilydolson Oct 18, 2018
c3758d4
Began library of distance metrics
emilydolson Oct 18, 2018
8b53070
Temporary work-around
emilydolson Oct 18, 2018
d36befd
Added fixes to graph and OEE
emilydolson Oct 19, 2018
fe6c43b
Add ability to display current value of range inputs
emilydolson Oct 30, 2018
3257a04
Improvemetns to sytematics
emilydolson Nov 12, 2018
69e8cb9
Fix bug when orgs die of natural causes
emilydolson Nov 19, 2018
2a9e18b
Little clean-ups
emilydolson Jan 17, 2019
40a62a1
Add not equals operator
emilydolson Jan 25, 2019
59379bd
Added bounds-checking on histograms
emilydolson Jan 25, 2019
6270797
Made niche width explicit
emilydolson Jan 26, 2019
ae2aa0d
Being allowing the use of WorldPosition with systematics
emilydolson Jan 26, 2019
642acd5
Merge in master
emilydolson Jan 26, 2019
aa99dcb
Fix discrepancy in data node name
emilydolson Jan 27, 2019
8ec1ced
Fixed circular dependency
emilydolson Jan 27, 2019
f96e934
Fix const fitness function issue
emilydolson Jan 27, 2019
1e59808
Little fixes to make everything work efficiently
emilydolson Jan 28, 2019
9b96455
Data tracking improvements
emilydolson Jan 29, 2019
14fd55b
Start actually using WorldPosition in systematics
emilydolson Apr 10, 2019
3bc36e8
Merge
emilydolson Apr 10, 2019
fbdb99e
Actually save merge fix
emilydolson Apr 10, 2019
5ae6826
Fix syntax errors ; add max depth
emilydolson Apr 23, 2019
e861c75
Fix missing braces
emilydolson Apr 23, 2019
1e3782d
Merge
emilydolson Apr 23, 2019
b986cfb
It would help if I actually saved the file
emilydolson Apr 23, 2019
f16e8fa
Need to be explicit about position tracking
emilydolson Apr 23, 2019
38a6dc1
Styles can also control CSS classes
emilydolson Apr 23, 2019
9627273
Merge in new argmanager
emilydolson Apr 24, 2019
d17e407
Add preliminary automatic config web interface
emilydolson Apr 24, 2019
79fe0c2
Remove uneccesary prin statements
emilydolson Apr 24, 2019
5fc16fb
Make it possible to exclude settings
emilydolson Apr 25, 2019
3df1199
Print group labels
emilydolson Apr 25, 2019
1fa815c
Improve guess at good parameter ranges
emilydolson May 10, 2019
d814bf7
Add preliminary spatial stats
emilydolson May 10, 2019
c42deb6
Add Sackin index
emilydolson May 10, 2019
98ad638
Made systematics manager doubly-linked, added CollessLike metric
emilydolson May 13, 2019
5a0d2fd
Not storing ancestors no longer breaks everything
emilydolson May 14, 2019
0d23ad4
Make phylogeny tests more sstringent
emilydolson May 15, 2019
b1673db
CollessLike index currently doesn't agree with value reported in paper
emilydolson May 15, 2019
94e26a1
More precise e
emilydolson May 20, 2019
2e3e824
Fixed CollessLike metric
emilydolson May 21, 2019
dbdd023
Remove print statements
emilydolson May 21, 2019
454d092
mrca can get pruned if tree is being cleared
emilydolson May 23, 2019
12ec2d4
Add method to get all lines from a file
emilydolson Jun 13, 2019
00fbd16
Register new systematics metrics
emilydolson Jan 29, 2020
273eb7a
Minor bug fixes
emilydolson Jan 30, 2020
df82c1b
Can't remove parent until repro done
emilydolson Feb 11, 2020
9684946
Merge branch 'memic_model' of github.com:emilydolson/Empirical into m…
emilydolson Feb 11, 2020
402bb05
Fix seg-fault on exctinction in synchronous world
emilydolson Feb 13, 2020
514fa39
Merge in updates from my postdoc year
emilydolson May 26, 2020
8f30304
Fix vector being passed by reference
emilydolson May 26, 2020
a398dfd
Revert World_structure.h
emilydolson May 26, 2020
a195f40
Revert World_select.h
emilydolson May 26, 2020
47e6bef
Add comment back in
emilydolson May 26, 2020
9300561
Revert .gitignore
emilydolson May 26, 2020
ca6e845
change name of count to avoid conflict with Lexer
emilydolson May 26, 2020
72e96cb
Revert Ptr.h
emilydolson May 26, 2020
83245fa
Revert assert.h
emilydolson May 26, 2020
c2cb58f
Revert init.h
emilydolson May 26, 2020
0da2d16
Put systematics tests in the right place
emilydolson May 26, 2020
8cb86d8
Merge the two test_systematics.cc files
emilydolson May 26, 2020
e445f36
Switch has back to pass-by-reference
emilydolson May 27, 2020
c34c09f
Fix image example to work in light Slider.h not existing anymore
emilydolson May 27, 2020
a19359d
Fix web tests to work with latest version of all dependencies
emilydolson May 27, 2020
35f7ecf
Merge branch 'memic_merge' of github.com:emilydolson/Empirical into m…
emilydolson May 27, 2020
a3b7049
Source emscripten in travis so it can actually be used
emilydolson May 27, 2020
390645a
whether this is 5 or 5px seems to vary by browser
emilydolson May 27, 2020
0b12e3e
Does emcc not compile to native anymore?
emilydolson May 27, 2020
4a66b34
Fix typo
emilydolson May 27, 2020
fc1d8ae
Split Travis config into multiple parallel builds
emilydolson May 28, 2020
7172d0c
Compile web examples with web tests and native with native
emilydolson May 28, 2020
ee67fd7
if WASM=0 we can't use g4 flag
emilydolson May 28, 2020
02db9e6
Merge branch 'master' into memic_merge
emilydolson May 29, 2020
9c15c57
Make show value default to false
emilydolson May 30, 2020
c53aae8
Merge branch 'memic_merge' of github.com:emilydolson/Empirical into m…
emilydolson May 30, 2020
f728a3f
Merge in testing changes
emilydolson May 30, 2020
4898614
Swap in updated systematics tests
emilydolson May 30, 2020
c33e776
Add additional tests that didn't transfer over; fix resource tests
emilydolson May 30, 2020
0885e50
Add more tests
emilydolson May 30, 2020
cd675dd
Merge branch 'master' into memic_merge
emilydolson May 30, 2020
351b8cb
Add more vecter_utils tests
emilydolson May 31, 2020
56ad817
Add tests for new DataNode features
emilydolson May 31, 2020
bcd70e0
Add skew, kurtosis, and stdev tests
emilydolson May 31, 2020
7c18a0c
Merge branch 'memic_merge' of github.com:emilydolson/Empirical into m…
emilydolson May 31, 2020
85401ee
Fix missing template
emilydolson May 31, 2020
54a8c79
Removed duplicate count function
emilydolson May 31, 2020
9ad6112
Add more tests
emilydolson May 31, 2020
f9a434b
fix tempalte
emilydolson May 31, 2020
3a424d3
fix tempalte
emilydolson May 31, 2020
0b182dc
GetNodes should be in base graph class
emilydolson May 31, 2020
94055a7
Rename get site-specific fitness function to avoid conflict
emilydolson May 31, 2020
173eee1
Add more tests
emilydolson May 31, 2020
e0a54c2
adding systematics documentation
abbywlsn Jun 4, 2020
1c14505
miniphylotrees makes very simple trees, still needs to incorporate sy…
abbywlsn Jun 16, 2020
2824062
miniphylotrees prints data to csv file now
abbywlsn Jun 18, 2020
d0b9be3
added systematics manager tools, still broken though
abbywlsn Jun 18, 2020
01e3808
new changes to miniphylotrees
abbywlsn Jun 22, 2020
5b47ce7
miniphylotrees works and generates data, concerned about compiler dif…
abbywlsn Jun 25, 2020
10ff45e
new csv with percentile data and miniphylotrees is correctly outputti…
abbywlsn Jun 26, 2020
cf2bbe7
original file with no systematics implementation for example purposes
abbywlsn Jun 26, 2020
7e22970
new miniphylo changes
abbywlsn Jul 1, 2020
c9ad62a
changes made to systematics with percentile functions
abbywlsn Jul 1, 2020
1836ef4
new improvements and updated systematics functions
abbywlsn Jul 23, 2020
14dcc46
cleaned up miniphylotrees
abbywlsn Jul 28, 2020
28294af
updated systematics function, still needs work
abbywlsn Jul 30, 2020
72c442e
cleaning up directories
abbywlsn Jul 31, 2020
516735b
more directory cleaning
abbywlsn Jul 31, 2020
f1c61a3
cleaned up systematics.h
abbywlsn Jul 31, 2020
d0aca02
cleaning up some stuff in GenTrees
abbywlsn Jul 31, 2020
8 changes: 8 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions .idea/.name

2 changes: 2 additions & 0 deletions .idea/Empirical.iml

9 changes: 9 additions & 0 deletions .idea/misc.xml

8 changes: 8 additions & 0 deletions .idea/modules.xml

9 changes: 9 additions & 0 deletions .idea/vcs.xml

202 changes: 202 additions & 0 deletions doc/library/Evolve/systematics.rst
@@ -0,0 +1,202 @@
.. SystematicsDocumentation documentation master file, created by
   sphinx-quickstart on Thu May 28 16:40:07 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.


Documentation for Systematics
====================================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   modules

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

Systematics
===========

Systematics is the classification of organisms based on their evolutionary (phylogenetic) relationships.

***************
Systematics.h
***************

This file is part of Empirical and is located in ``Empirical/source/Evolve/Systematics.h``.

The systematics manager is used to track genotypes, species, clades, or lineages of organisms in a world.

Systematics allows a user to generate data to form phylogenetic trees.

The systematics manager can track lineages at different levels of abstraction: the data can be generated by position,
by phenotype, or even by genotype if you have a lot of RAM.

**Note**: You are responsible for filling in templates! Adding the template just gives you a place to store your data.

Taxon Specifics
===============

* Taxon - a group of species with similar characteristics
* Genotypes are the most commonly used taxon

A user can see the type and number of mutations that occurred to bring about a taxon.

Some information that can be accessed is:

* taxon ID# ``GetID()``
* details of organisms in the taxon ``GetInfo()``
* pointer to the parent group (will return a null pointer if the species was injected) ``GetParent()``
* how many organisms currently exist in the group and how many total organisms have ever existed in the group ``GetNumOrgs()`` or ``GetTotOrgs()``
* how many direct offspring groups exist from this group and how many total extant offspring exist from this taxon ``GetTotalOffspring()``
* how deep in the tree the node you are examining is ``GetDepth()``
* when did this taxon first appear in the population ``GetOriginationTime()``
* when did the taxon leave the population ``GetDestructionTime()``

New organisms are added to the taxon using ``AddOrg()``.
New offspring are added to the taxon with ``AddOffspring()``.

Organisms are removed with ``RemoveOrg()``.
Offspring are removed with ``RemoveOffspring()``.

If no organisms or offspring remain, the taxon will deactivate.
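
The bookkeeping described above can be sketched as a pair of counters. This is only an illustration of the rule as stated here, not the Empirical ``Taxon`` class: ``TaxonSketch`` and ``IsActive()`` are hypothetical names, and the real class stores more state.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a taxon's bookkeeping; not the Empirical class.
struct TaxonSketch {
  std::size_t num_orgs = 0;       // living organisms currently in this taxon
  std::size_t num_offspring = 0;  // offspring taxa that still point at this one

  void AddOrg() { ++num_orgs; }
  void AddOffspring() { ++num_offspring; }

  // A taxon stays active while it has any organisms or offspring left.
  bool IsActive() const { return num_orgs > 0 || num_offspring > 0; }

  // Each Remove* call returns true while the taxon should remain tracked
  // and false when it is now empty.
  bool RemoveOrg() {
    assert(num_orgs > 0);
    --num_orgs;
    return IsActive();
  }
  bool RemoveOffspring() {
    assert(num_offspring > 0);
    --num_offspring;
    return IsActive();
  }
};
```

In this sketch a taxon with one organism and one offspring stays active after ``RemoveOrg()`` and only deactivates once ``RemoveOffspring()`` is called as well.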


General Systematics Data
=========================

Things that systematics can tell you about a phylogeny and how to access them:

* Are we tracking a synchronous population? ``GetTrackSynchronous()`` ``SetTrackSynchronous()``
* Are we storing all taxa that are still alive in the population? ``GetStoreActive()`` ``SetStoreActive()``
* Are we storing all taxa that are ancestors of the living organisms in the population? ``GetStoreAncestors()`` ``SetStoreAncestors()``
* Are we storing all taxa that have died out, as have all of their descendants? ``GetStoreOutside()`` ``SetStoreOutside()``
* Are we storing any taxa types that have died out? ``GetArchive()`` ``SetArchive()``
* Are we storing the positions of taxa? ``GetStorePosition()`` ``SetStorePosition()``
* How many living organisms are currently being tracked? ``GetTotalOrgs()``
* How many independent trees are being tracked? ``GetNumRoots()``
* What ID will the next taxon have? ``GetNextID()``
* What is the average phylogenetic depth of organisms in the population? ``GetAveDepth()``
* Use ``GetMRCA()`` to find the most recent common ancestor (MRCA), or ``GetMRCADepth()`` to find the distance to the MRCA.

**The systematics class tracks the relationships among all organisms based on the INFO_TYPE
provided. If an offspring has the same value for INFO_TYPE as its parent, it is grouped into
the same taxon. Otherwise a new taxon is created and the old one is used as its parent in
the phylogeny. If the provided INFO_TYPE is the organism's genome, a traditional phylogeny
is formed, with genotypes. If the organism's behavior/task set is used, then organisms are
grouped by phenotypes. If the organism's position is used, the evolutionary path through
space is tracked. Any other aspect of organisms can be tracked this way as well.**
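
The grouping rule above can be sketched in a few lines. This is a simplified illustration, assuming the tracked information is a string; ``GroupingSketch``, ``TaxonRecord``, ``NumTaxa()`` and ``ParentOf()`` are hypothetical names and not part of Empirical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one taxon; the names here are illustrative only.
struct TaxonRecord {
  std::string info;  // the tracked value (genome, phenotype, position, ...)
  int parent;        // index of the parent taxon, or -1 if it was injected
};

class GroupingSketch {
public:
  // Place a new organism and return the index of the taxon it lands in.
  // An offspring whose info matches its parent's joins the parent's taxon;
  // otherwise a new taxon is created with the old one as its parent.
  int AddOrg(const std::string& info, int parent_taxon) {
    assert(parent_taxon >= -1 && parent_taxon < NumTaxa());
    if (parent_taxon >= 0 &&
        taxa_[static_cast<std::size_t>(parent_taxon)].info == info) {
      return parent_taxon;
    }
    taxa_.push_back(TaxonRecord{info, parent_taxon});
    return NumTaxa() - 1;
  }

  int NumTaxa() const { return static_cast<int>(taxa_.size()); }

  int ParentOf(int taxon) const {
    assert(taxon >= 0 && taxon < NumTaxa());
    return taxa_[static_cast<std::size_t>(taxon)].parent;
  }

private:
  std::vector<TaxonRecord> taxa_;
};
```

Swapping the string for a phenotype or a position changes what counts as "the same taxon" without changing the grouping logic, which is the point of making the tracked information a template parameter.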


**Generally, all living organisms' taxa should be tracked and ancestral organisms' taxa should be maintained for lineage.
However, not all dead taxa should be maintained; the phylogeny gets too big.**

***************************
Diversity and Distinction
***************************

Systematics.h can also be used to find phylogenetic diversity for all extant taxa in the tree,
assuming all edges from parent to child have a length of one.

When all branch lengths are equal, the phylogenetic diversity is the number of internal nodes plus the number of
extant taxa minus 1.
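
As a worked example of the formula above, the sketch below counts nodes in a small hand-built tree. It assumes that "internal nodes" means extinct ancestors that are still kept in the tree, and that the tree has a single root; ``TreeNode``, ``PhylogeneticDiversity`` and ``CountEdges`` are hypothetical names, not Empirical functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tree node; not an Empirical type.
struct TreeNode {
  int parent;   // index of the parent node, or -1 for the root
  bool extant;  // true if the taxon still has living organisms
};

// Diversity under unit-length branches:
// retained extinct ancestors + extant taxa - 1.
int PhylogeneticDiversity(const std::vector<TreeNode>& tree) {
  assert(!tree.empty());
  int ancestors = 0;
  int extant = 0;
  for (const TreeNode& node : tree) {
    if (node.extant) {
      ++extant;
    } else {
      ++ancestors;
    }
  }
  return ancestors + extant - 1;
}

// Cross-check: count parent-to-child edges directly.
int CountEdges(const std::vector<TreeNode>& tree) {
  int edges = 0;
  for (const TreeNode& node : tree) {
    if (node.parent >= 0) ++edges;
  }
  return edges;
}
```

For a single-rooted tree with two extinct ancestors and three extant taxa, both functions give 4: with every branch of length one, the diversity is simply the number of edges in the tree.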

You can also find how distinct a specific taxon is from the rest of the population
based on the amount of unique evolutionary history that it represents.

*****************************
Synchronous Populations
*****************************

A synchronous population is a population in which each generation is a discrete time point
and a completely new set of individual organisms is created for each generation. This means that
an organism and its parent can never exist at the same time.

An asynchronous population is the opposite, where generations overlap and organisms reproduce
when they are ready.

In the systematics manager, synchronicity is controlled with
``GetTrackSynchronous()``, which returns true or false, and
``SetTrackSynchronous()``, which takes true or false and lets you choose between a synchronous and an asynchronous population.


Using the Systematics Manager
==============================

The Systematics.h file alone will not give you any useful information. You must use a test file in conjunction with the systematics manager
in order to see output.

To retrieve some results we will use the file ``Systematics.cc``,
which is located at ``Empirical/tests/Evolve/Systematics.cc``.

To compile the code, use this command in the tests directory::

    make test-Systematics


**********
Output
**********

Terminal Output::

    AddOrg 25 (id1, no parent)

    AddOrg -10 (id2; parent id1)

    AddOrg 26 (id3, parent id1)

    AddOrg 27 (id4, parent id2)

The first line of output shows the first organism in the examined phylogeny. This organism is added with ``AddOrg``
and is assigned an ID of id1. The parentheses at the end of the line show that the organism has no parent, meaning that
organism id1 will be the root of the phylogeny and produce offspring.

If we then look at the parentheses on the second line, we see the second organism, with an ID of id2. Id2 is a direct descendant of the id1 organism.

Lastly, if we look at id4, we see that its parent is id2, meaning that we have created another node in the tree
as the organisms move through generations, producing new offspring.

The terminal output should also include this section::

    Active count: 11 [18|1,0|17] [17|1,2|11] [15|1,0|null] [12|1,1|11] [16|1,0|11] [11|1,3|null] [6|1,0|5] [19|1,0|17] [5|1,1|null] [4|1,0|null] [3|1,0|null]


The 11 at the front is the active count: the number of active taxa in the phylogeny.

If we look at the first set of numbers: ``[18|1,0|17]``

The first number in brackets, 18 in this case, is the taxon of the organism where
a mutation occurred. 1, the next number, is the number of mutations that led to this branch.
0 is the number of offspring from this organism. Lastly, 17 is the ID of the parent organism.

As for the second set, ``[17|1,2|11]``: this is taxon 17, one mutation occurred,
id17 had 2 offspring, and its parent is id11.
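
If you want to post-process this part of the output, each bracketed entry can be split on ``|`` and ``,``. The sketch below is a hypothetical helper, not part of Empirical; ``EntrySketch``, ``FieldToInt`` and ``ParseEntry`` are made-up names, the second field is stored as a plain count following the description above, and a parent of ``null`` is stored as -1.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical holder for one "[id|count,offspring|parent]" entry.
struct EntrySketch {
  int id = 0;
  int count = 0;          // second field, as described in the text above
  int num_offspring = 0;
  int parent = -1;        // -1 stands in for "null"
};

// Convert body[begin, end) to an int; throws if the text is not a number.
int FieldToInt(const std::string& body, std::size_t begin, std::size_t end) {
  assert(begin <= end && end <= body.size());
  return std::stoi(body.substr(begin, end - begin));
}

// Parse one entry such as "[18|1,0|17]" or "[11|1,3|null]".
// Returns false (leaving out untouched) if the text has a different shape.
bool ParseEntry(const std::string& text, EntrySketch& out) {
  if (text.size() < 2 || text.front() != '[' || text.back() != ']') return false;
  const std::string body = text.substr(1, text.size() - 2);

  const std::size_t bar1 = body.find('|');
  if (bar1 == std::string::npos) return false;
  const std::size_t comma = body.find(',', bar1 + 1);
  if (comma == std::string::npos) return false;
  const std::size_t bar2 = body.find('|', comma + 1);
  if (bar2 == std::string::npos) return false;

  try {
    EntrySketch parsed;
    parsed.id = FieldToInt(body, 0, bar1);
    parsed.count = FieldToInt(body, bar1 + 1, comma);
    parsed.num_offspring = FieldToInt(body, comma + 1, bar2);
    const std::string parent = body.substr(bar2 + 1);
    parsed.parent = (parent == "null") ? -1 : std::stoi(parent);
    out = parsed;
    return true;
  } catch (const std::logic_error&) {
    // std::stoi reports bad input with exceptions derived from logic_error.
    return false;
  }
}
```

Parsing ``[18|1,0|17]`` with this helper fills in 18, 1, 0 and 17 in that order, while ``[11|1,3|null]`` yields a parent of -1.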

The last portion of the output has several lines of 3 numbers.

It should look like this::

    1 : 0 : -1
    2 : 0 : -1
    3 : 0 : 0
    4 : 0 : 0
    5 : 0 : 0
    6 : 0 : 0
    7 : 0 : 0
    8 : 0 : 987
    9 : 0 : 986
    10 : 0 : 987
    11 : 0 : 988
    12 : 0 : 987
    13 : 0 : 988

The first number is the organism number. The second number is the position of the organism.
The third number is the fitness of the organism at position 0.
15 changes: 15 additions & 0 deletions source/Evolve/NK.h
@@ -141,6 +141,21 @@ namespace emp {
      return total;
    }

    /// Get the fitness of a site in a bitstring
    // (pass by value so can be modified.)
    double GetSiteFitness(size_t n, BitVector genome) const {
      emp_assert(genome.GetSize() == N, genome.GetSize(), N);

      // Use a double-length genome to easily handle wrap-around.
      genome.Resize(N*2);
      genome |= (genome << N);

      size_t mask = emp::MaskLow<size_t>(K+1);
      const size_t cur_val = (genome >> n).GetUInt(0) & mask;
      return GetFitness(n, cur_val);
    }


    void SetState(size_t n, size_t state, double in_fit) { landscape[n][state] = in_fit; }

    void RandomizeStates(Random & random, size_t num_states=1) {
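
The wrap-around step in GetSiteFitness can be illustrated without BitVector. The sketch below is a standalone approximation using a plain 64-bit integer, so it assumes N is at most 32; SiteState is a hypothetical name and is not part of NK.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Return the (K+1)-bit neighborhood state of site n in an N-bit genome,
// wrapping around the end of the genome. Standalone sketch; assumes N <= 32.
std::uint64_t SiteState(std::uint64_t genome, std::size_t N, std::size_t K,
                        std::size_t n) {
  assert(N > 0 && N <= 32);
  assert(K < N);
  assert(n < N);
  assert(genome < (std::uint64_t{1} << N));

  // Append a copy of the genome above itself so that reading past bit N-1
  // continues from bit 0.
  const std::uint64_t doubled = genome | (genome << N);

  // Keep only the K+1 bits that start at site n.
  const std::uint64_t mask = (std::uint64_t{1} << (K + 1)) - 1;
  return (doubled >> n) & mask;
}
```

With N = 4, K = 1 and genome bits 1001, site 3 reads bit 3 followed by the wrapped-around bit 0, giving state 3. Doubling the genome avoids a separate branch for sites near the end, at the cost of one extra shift and OR.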