Simplify MalSimulator and create wrappers for other interfaces #87
Merged
aebfcb8 to 753ce55
502b688 to fe27e46
ce89c22 to 950fed4
kasanari reviewed Jan 15, 2025
Hoclor reviewed Jan 17, 2025
nkakouros reviewed Jan 17, 2025
d1b0cf8 to 2fe03ed
Throwing this into the mix:

# Author: Jakob Nyberg, 2025
from collections.abc import Callable
import logging
from typing import Any

import numpy as np
import numpy.typing as npt

logger = logging.getLogger(__name__)

Array = npt.NDArray[np.int32]
BoolArray = npt.NDArray[np.bool_]

def get_new_targets(
    discovered_targets: npt.NDArray[np.int32],
    mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
) -> tuple[npt.NDArray[np.int32], npt.NDArray[np.int32]]:
    attack_surface = mask[1]
    surface_indexes = np.flatnonzero(attack_surface)
    new_targets = np.array(
        [idx for idx in surface_indexes if idx not in discovered_targets],
        dtype=np.int32,
    )
    return new_targets, surface_indexes

def move_target_to_back(
    current_target: np.int32 | None,
    targets: npt.NDArray[np.int32],
    attack_surface: npt.NDArray[np.int32],
) -> tuple[npt.NDArray[np.int32], np.int32 | None]:
    """
    If the current target was not compromised this turn, put it
    on the bottom of the stack and focus on next target instead
    """
    # Compare against None explicitly: node index 0 is a valid target but falsy
    if current_target is None:
        return targets, current_target

    if current_target in attack_surface:
        # np.concatenate needs 1-D inputs, so wrap the scalar in a list
        targets = np.concatenate(([current_target], targets[:-1]))
        return targets, targets[-1]

    return targets, current_target

def choose_target(
    targets: npt.NDArray[np.int32],
    attack_surface: npt.NDArray[np.int32],
) -> tuple[npt.NDArray[np.int32], np.int32, bool]:
    # targets that have not been removed from the attack surface by another agent
    valid_targets = np.array(
        [t for t in targets if t in attack_surface], dtype=np.int32
    )

    if len(valid_targets) == 0:
        return valid_targets, np.int32(0), True

    return valid_targets[:-1], valid_targets[-1], False

def compute_action(
    permute_func: Callable[[Array], Array],
    action_func: Callable[
        [np.int32 | None, Array, Array], tuple[Array, np.int32 | None, bool]
    ],
    add_targets_func: Callable[[Array, Array], Array],
):
    def _compute_action(
        targets: Array,
        mask: tuple[BoolArray, BoolArray],
        current_target: np.int32 | None,
    ):
        new_targets, surface_indexes = get_new_targets(targets, mask)
        targets, current_target, done = action_func(
            current_target,
            add_targets_func(targets, permute_func(new_targets)),
            surface_indexes,
        )
        action = 0 if done else 1
        if action == 0:
            logger.debug(
                'Attacker agent does not have any valid targets and will terminate'
            )
        logger.debug(f'Attacker targets: {targets}')
        logger.debug(f'Attacker current target: {current_target}')
        logger.debug(f'Attacker action: {action}')
        return targets, action, current_target

    return _compute_action

def create_permute_func(seed: int | None, randomize: bool) -> Callable[[Array], Array]:
    # Compare against None explicitly so that seed=0 is honoured as a real seed
    s = seed if seed is not None else np.random.SeedSequence().entropy
    rng = np.random.default_rng(s) if randomize else None
    return rng.permutation if rng else lambda x: x

class BreadthFirstAttacker:
    def __init__(self, agent_config: dict[str, Any]) -> None:
        self.current_target: np.int32 | None = None
        self.targets: npt.NDArray[np.int32] = np.array([], dtype=np.int32)
        permute_func = create_permute_func(
            agent_config.get('seed', None), agent_config.get('randomize', False)
        )
        self.compute_action = compute_action(
            permute_func, self._action_func, self._add_new_targets_func
        )

    def __call__(
        self,
        _: dict[str, Any],
        mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
    ):
        self.targets, action, self.current_target = self.compute_action(
            self.targets, mask, self.current_target
        )
        return (action, self.current_target)

    @staticmethod
    def _action_func(
        current_target: np.int32 | None,
        targets: Array,
        surface_indexes: Array,
    ):
        targets, current_target = move_target_to_back(
            current_target, targets, surface_indexes
        )
        return choose_target(targets, surface_indexes)

    @staticmethod
    def _add_new_targets_func(targets: Array, new_targets: Array):
        new_targets = np.flip(new_targets)  # to comply with the original implementation
        return np.concatenate([new_targets, targets])

class DepthFirstAttacker:
    def __init__(self, agent_config: dict[str, Any]) -> None:
        self.current_target: np.int32 | None = None
        self.targets: npt.NDArray[np.int32] = np.array([], dtype=np.int32)
        permute_func = create_permute_func(
            agent_config.get('seed', None), agent_config.get('randomize', False)
        )
        self.compute_action = compute_action(
            permute_func, self._action_func, self._add_new_targets_func
        )

    def __call__(
        self,
        _: dict[str, Any],
        mask: tuple[npt.NDArray[np.bool_], npt.NDArray[np.bool_]],
    ):
        self.targets, action, self.current_target = self.compute_action(
            self.targets, mask, self.current_target
        )
        return (action, self.current_target)

    @staticmethod
    def _action_func(
        current_target: np.int32 | None,
        targets: Array,
        surface_indexes: Array,
    ):
        # keep working on a target unless it has been removed from the attack surface
        return (
            choose_target(targets, surface_indexes)
            if current_target not in surface_indexes
            else (targets, current_target, False)
        )

    @staticmethod
    def _add_new_targets_func(targets: Array, new_targets: Array):
        # append new targets to the end of the array, which is where the next
        # target is taken from, so that the agent works on the latest targets first
        return np.concatenate([targets, new_targets])
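
For readers skimming the snippet: the only structural difference between the two attackers is where new targets are placed relative to the end of the array, which is where `choose_target` takes the next target from. A minimal, self-contained sketch of that ordering follows. The helper names `add_targets_bfs`, `add_targets_dfs` and `pop_next` are made up for illustration and simply mirror the two `_add_new_targets_func` bodies above.

```python
import numpy as np


def add_targets_bfs(targets: np.ndarray, new_targets: np.ndarray) -> np.ndarray:
    # Mirrors BreadthFirstAttacker._add_new_targets_func: new targets go to the front
    return np.concatenate([np.flip(new_targets), targets])


def add_targets_dfs(targets: np.ndarray, new_targets: np.ndarray) -> np.ndarray:
    # Mirrors DepthFirstAttacker._add_new_targets_func: new targets go to the end
    return np.concatenate([targets, new_targets])


def pop_next(targets: np.ndarray) -> tuple[np.ndarray, int]:
    # Hypothetical helper: the next target is always taken from the end of the array
    return targets[:-1], int(targets[-1])


pending = np.array([1, 2], dtype=np.int32)  # targets already queued
new = np.array([3, 4], dtype=np.int32)      # targets discovered this step

bfs_queue = add_targets_bfs(pending, new)
dfs_stack = add_targets_dfs(pending, new)

_, bfs_next = pop_next(bfs_queue)  # an already-queued target is served first
_, dfs_next = pop_next(dfs_stack)  # the most recently discovered target is served first
```

With these inputs the breadth-first ordering picks an older target (2) while the depth-first ordering picks the newest one (4).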
nkakouros reviewed Jan 20, 2025
nkakouros reviewed Jan 20, 2025
This was referenced Jan 22, 2025
d208cd3 to 45a1182
kasanari reviewed Jan 29, 2025
nkakouros reviewed Feb 6, 2025
nkakouros reviewed Feb 6, 2025
nkakouros reviewed Feb 6, 2025
nkakouros reviewed Feb 6, 2025
nkakouros reviewed Feb 6, 2025
- test_step: required attack surface to be set
- test_pz: had to make sure ParallelEnv returned only alive agents on .agents
…emovals separately. Have both Attacker and Defender AgentStates use the common performed_nodes property. Cleanup return values of attacker and defenders steps and update the AgentState within the _attacker/_defender_step as much as possible.
…ict with agent states - All are private - @property MalSimulator.agent_states is still the public interface to fetch AgentStates
…r_agents only return alive agents
This is a rather big redesign of the simulator.
MalSimulator is now very much simplified:
Everything related to building up observations (specifically the ParallelEnv part that used to live in MalSimulator) has now been moved to a wrapper, malsim_vectorized_obs_env.py (the naming here is an open question). This wrapper builds up state step by step from the performed actions returned by MalSimulator.step(actions).
Logging for that wrapper was also factored out into its own module (should it be?).
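
To make the wrapper idea concrete, here is a rough sketch of the pattern, not the actual code in this PR. `FakeSimulator` and `VectorizedObsWrapper` are hypothetical names, and the shape of what `step` returns (a dict mapping agent name to the node indexes it performed) is an assumption made only for this example.

```python
import numpy as np


class FakeSimulator:
    """Hypothetical stand-in for MalSimulator, used only for this sketch."""

    def __init__(self, num_nodes: int) -> None:
        self.num_nodes = num_nodes

    def step(self, actions: dict[str, list[int]]) -> dict[str, list[int]]:
        # Assumption: every requested action succeeds and is reported back
        return {agent: list(nodes) for agent, nodes in actions.items()}


class VectorizedObsWrapper:
    """Hypothetical wrapper that accumulates one observation vector per agent."""

    def __init__(self, sim: FakeSimulator, agents: list[str]) -> None:
        self.sim = sim
        self.obs = {a: np.zeros(sim.num_nodes, dtype=np.int8) for a in agents}

    def step(self, actions: dict[str, list[int]]) -> dict[str, np.ndarray]:
        performed = self.sim.step(actions)
        for agent, nodes in performed.items():
            # Mark every performed node as observed; earlier steps are kept
            self.obs[agent][np.asarray(nodes, dtype=np.intp)] = 1
        return {a: o.copy() for a, o in self.obs.items()}


env = VectorizedObsWrapper(FakeSimulator(num_nodes=5), ["attacker"])
env.step({"attacker": [0]})
obs = env.step({"attacker": [3]})
print(obs["attacker"])  # [1 0 0 1 0]
```

The point of the sketch is that the simulator only reports what happened in a step, while the wrapper owns the accumulated, vectorized view.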
Added the class DecisionAgent (these are agents like BFS/DFS and keyboard, plus more advanced heuristics in the future).
This creates a common interface for working with DecisionAgents.
Our DecisionAgents are not finished yet: they are still tailored to the ParallelEnv/MalSim Vectorized Obs Env, but the goal is to have them work with the regular MalSimulator (on attack graph state).
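
As an illustration of what a common interface could look like, here is a minimal sketch. The method name `get_next_action`, its signature, and the two example agents are assumptions made for this example and are not taken from the PR.

```python
from abc import ABC, abstractmethod
from typing import Optional


class DecisionAgent(ABC):
    """Sketch of a shared interface; the method name and signature are hypothetical."""

    @abstractmethod
    def get_next_action(self, action_surface: list[int]) -> Optional[int]:
        """Return the node index to act on, or None to do nothing."""


class FirstAvailableAgent(DecisionAgent):
    """Hypothetical agent that always picks the first available node."""

    def get_next_action(self, action_surface: list[int]) -> Optional[int]:
        return action_surface[0] if action_surface else None


class ScriptedAgent(DecisionAgent):
    """Hypothetical agent that replays a fixed plan, skipping unavailable nodes."""

    def __init__(self, plan: list[int]) -> None:
        self.plan = list(plan)

    def get_next_action(self, action_surface: list[int]) -> Optional[int]:
        while self.plan:
            node = self.plan.pop(0)
            if node in action_surface:
                return node
        return None


def run(agent: DecisionAgent, surfaces: list[list[int]]) -> list[Optional[int]]:
    # The caller only depends on the interface, not on the concrete agent
    return [agent.get_next_action(surface) for surface in surfaces]


first_choices = run(FirstAvailableAgent(), [[2, 5], [], [7]])
scripted_choices = run(ScriptedAgent([9, 5, 7]), [[2, 5], [7]])
```

Any code that drives agents, such as `run` here, can then treat BFS, DFS, keyboard, or heuristic agents interchangeably.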
Any number of agents can now be specified in the scenario file. This is a cleaner solution in my opinion.
Let me know if something does not make sense.