Python code for solving the graph partition problem, applied in particular to assigning individuals to rooms based on their cohabitation preferences: vertices are individuals, edges are cohabitation preferences, and the sets of the partition are rooms.
We assume that we have a range of candidate partitions, each with its own maximal size, but that only a limited number of them can be used. A specific use case is when we have
The preference an individual has for sharing a room with another individual is encoded in a networkx digraph, where edges reflect preferences and edge weights reflect the importance of those preferences. Each individual therefore needs a unique identifier (such as an email address) that corresponds to a unique node in the digraph.
The problem is set up so that individuals who want to be placed together will be, as far as possible. This does not mean that individuals who want to be placed exclusively together will be, as there is no component for exclusivity. Exclusive room assignments need to be made manually and left out of the algorithm.
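As a rough illustration of this encoding, the sketch below builds a small weighted digraph and scores a room assignment by the total weight of the preferences it fails to satisfy. The email addresses, the weights, and the helper `cut_weight` are hypothetical and are not taken from the repository.

```python
import networkx as nx

# Each node is one individual, identified by a unique email (made-up examples).
preferences = nx.DiGraph()
preferences.add_nodes_from(["ana@example.org", "ben@example.org", "cy@example.org"])

# A directed edge u -> v means "u would like to share a room with v";
# the weight says how much that preference matters.
preferences.add_edge("ana@example.org", "ben@example.org", weight=2.0)
preferences.add_edge("ben@example.org", "ana@example.org", weight=1.0)
preferences.add_edge("cy@example.org", "ana@example.org", weight=1.0)


def cut_weight(graph, rooms):
    """Total weight of preference edges whose endpoints are in different rooms.

    `rooms` maps each node to a room label. Hypothetical helper for illustration.
    """
    return sum(
        weight
        for u, v, weight in graph.edges(data="weight", default=1.0)
        if rooms[u] != rooms[v]
    )
```

Placing Ana and Ben together and Cy elsewhere leaves only Cy's preference unsatisfied, so the score is the weight of that single edge. Note that nothing here rewards exclusivity: putting all three in one room scores zero.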
- `initialise.py` contains two helper functions to find initial candidate solutions:
  - `initial_random` allocates individuals to sets in the partition fairly randomly, with a bias towards using the larger sets and towards allocating individuals from larger connected components first (it is easier to respect the restrictions that way). This is the default initialisation used in the evolutionary algorithm if no other initial solution is given.
  - `initials_estimates` recursively uses the Fiedler vector to compute sparse cuts of the preferences graph, before assigning groups of vertices to sets within the partition. In general this should perform better than random.
- `mip_solve.py` implements the mixed integer program in PuLP and uses its default solver to find a solution. This is implemented in the function `solve_mip`, which can be seen in the `if __name__ == "__main__"` block at the end of the file.
- `ea_solve.py` implements a simple evolutionary algorithm to solve the partition problem. This is implemented in the function `solve_ea`, which can be seen in the `if __name__ == "__main__"` block at the end of the file.
- `main.py` solves the problem by finding initial estimates, then running `solve_mip`, and finally potentially improving the solution by running `solve_ea`. It operates on a dataset given as a TSV in which nodes are specified as emails in column 0 and targets of outgoing edges are specified as emails in column 3 (for the particular use case of the 2024 Student's Retreat dataset).
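To make the expected input layout concrete, here is a minimal sketch of reading such a TSV into a node list and a directed edge list using only the standard library. The function name `read_preference_edges`, the header row, the comma separator for multiple targets in column 3, and the lower-casing of emails are all assumptions for illustration; `main.py` may handle these details differently.

```python
import csv


def read_preference_edges(lines, has_header=True, target_sep=","):
    """Parse TSV lines into (nodes, edges).

    Column 0 holds the individual's email; column 3 holds the emails of the
    people they would like to room with (assumed separated by `target_sep`).
    """
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    nodes, edges = [], []
    for row in reader:
        if not row or not row[0].strip():
            continue  # ignore blank rows
        source = row[0].strip().lower()
        nodes.append(source)
        targets = row[3] if len(row) > 3 else ""
        for target in targets.split(target_sep):
            target = target.strip().lower()
            if target and target != source:  # drop empty cells and self-loops
                edges.append((source, target))
    return nodes, edges
```

The function accepts any iterable of lines, so an open file object works as well as a list of strings. The resulting pairs can be passed to `networkx.DiGraph.add_edges_from` to build the preference digraph.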
The graph partition problem is, in general, NP-hard. If the graph is small enough, we can nevertheless pose it as an integer program (itself NP-hard) in PuLP and use an exact solver to find a solution. If the graph is too large, the solver may not finish before the heat death of the universe, and we may prefer to use simulated annealing via SciPy or an evolutionary algorithm with DEAP.
We formulate it as follows. We have a set
Going through line by line, the first line is our optimisation criterion. In it,
We define
The remaining lines constrain these variables to represent what we desire:
- each vertex $v$ must be assigned to exactly one partition $i$
- each partition $i$ must have at most $C_i$ vertices assigned to it
- each edge-is-cut variable $E_e$, where $e=(u,v)$, must be $1$ if the edge is cut, i.e. if $X_{u,i} \neq X_{v,i}$ for some partition $i$
- the partition-is-occupied variable $Y_i$ must be $1$ if there is any vertex in its partition $i$
- the partition-is-occupied variable $Y_i$ must be $0$ if there is no vertex in partition $i$
- the number of occupied partitions must not exceed our threshold $K$.
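For reference, one integer program consistent with the six constraints listed above can be written as follows. This is a sketch, not necessarily the exact formulation used in `mip_solve.py`: the edge weights $w_e$ and the binary domains are assumptions, and $C_i$ is read as the maximal size of partition $i$.

$$
\begin{aligned}
\min \quad & \sum_{e} w_e E_e \\
\text{s.t.} \quad & \sum_{i} X_{v,i} = 1 && \forall v \\
& \sum_{v} X_{v,i} \le C_i && \forall i \\
& E_e \ge X_{u,i} - X_{v,i}, \quad E_e \ge X_{v,i} - X_{u,i} && \forall e=(u,v),\ \forall i \\
& Y_i \ge X_{v,i} && \forall v,\ \forall i \\
& Y_i \le \sum_{v} X_{v,i} && \forall i \\
& \sum_{i} Y_i \le K \\
& X_{v,i},\ Y_i,\ E_e \in \{0, 1\}
\end{aligned}
$$

The pair of inequalities on $E_e$ forces it to $1$ whenever $u$ and $v$ differ in any partition $i$. Because the objective minimises the weighted sum of the $E_e$, an uncut edge will have $E_e = 0$ at the optimum without needing an explicit upper bound.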
We can also solve the task with an evolutionary algorithm (EA); a simple one is implemented here. The EA mutates candidate partitions by moving a Poisson-sampled number of individuals between different sets in the partition, subject to the constraint that the number of occupied sets does not go above a specified threshold (
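The mutation step described here might look roughly like the sketch below, which uses only the standard library. The names `sample_poisson`, `mutate`, `capacities`, `max_occupied`, and `lam` are hypothetical; `ea_solve.py` may differ in how it samples, which moves it allows, and whether it also enforces room capacities during mutation (this sketch does).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def mutate(assignment, capacities, max_occupied, lam=1.0, rng=None):
    """Return a mutated copy of `assignment` (person -> room).

    A Poisson-sampled number of moves is attempted; a move is only made if the
    target room has space and the number of occupied rooms stays <= max_occupied.
    """
    rng = rng or random.Random()
    child = dict(assignment)
    for _ in range(sample_poisson(lam, rng)):
        person = rng.choice(sorted(child))
        sizes = {}
        for room in child.values():
            sizes[room] = sizes.get(room, 0) + 1
        source = child[person]
        # Occupied rooms once this person has left their current room.
        occupied_after_leaving = len(sizes) - (1 if sizes[source] == 1 else 0)
        options = []
        for room, capacity in capacities.items():
            if room == source or sizes.get(room, 0) >= capacity:
                continue  # same room, or no space left
            opens_new_room = room not in sizes
            if occupied_after_leaving + (1 if opens_new_room else 0) <= max_occupied:
                options.append(room)
        if options:
            child[person] = rng.choice(options)
    return child
```

Rejecting infeasible moves at mutation time keeps every candidate feasible, so the fitness function only needs to score the cut preferences rather than penalise constraint violations. A sampled count of zero simply returns an unchanged copy of the parent.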