UNCLASSIFIED//APPROVED FOR PUBLIC RELEASE
U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND
Bonware to the Rescue: The Future Autonomous Cyber Defense Agents
Dr. Alexander Kott, Chief Scientist
Presented at the Conference on Applied Machine Learning for Information Security October 12, 2018, Washington DC
INTELLIGENT THINGS AND HUMANS IN A VERY COMPLEX WORLD
INTELLIGENT THINGS WILL BE DIVERSE
Munitions • Sensors • Weapons • Wearable Devices • Vehicles • Robots
Intelligent Things and Humans will Operate in Teams
Joint Human-Intelligent Agent Decision Making
AI and Cyber Multiply Threats and Vulnerabilities
INTELLIGENT THINGS ARE THREATS AND TARGETS
Pervasive connectivity and intelligence open opportunities for cross-domain attack and defense
• Kinetic attacks
• Directed energy
• Electronic attacks against its Things
• Jamming RF channels
• Destroying fiber channels
• Depriving Things of their power sources
• Electronic eavesdropping
• Deploying malware
HUMANS ARE A CLASS OF INTELLIGENT THINGS
Perhaps most importantly, the enemy attacks the cognition of human Soldiers.
Humans will be the “Intelligent Things” most susceptible to deception.
Humans will be handicapped when they are concerned (even if incorrectly) that the information is untrustworthy.
AI will Fight Cyber Attacks
NATO IST-152 Research Group Agent Concept: AICA
IST-152 OVERVIEW
NATO Research Group IST-152: Intelligent, Autonomous and Trusted Agents for Cyber Defense and Resilience
NATO-designated as Unclassified, Publicly Releasable
Motivation:
• Growing focus on AI, autonomy, and issues of human trust in AI
• Cyber is exceptionally ripe for strong AI: autonomous yet human-managed agents for cyber operations
• Malware is growing in autonomy and sophistication
• Current manual and semi-manual approaches are grossly inadequate
• Needed: autonomous agents that actively patrol the friendly network
• Needed: detection of and reaction to hostile activities far faster than human reaction time
• Needed: trust and control by humans
CONTEXT ASSUMPTIONS
• In a conflict with a technically sophisticated adversary, enemy software cyber agents (malware) will infiltrate the friendly network.
• To limit the scope, consider a single military vehicle with one or more computers.
• One or more of these computers are assumed to have been compromised.
• Communications between the vehicle and other elements of the friendly force are limited and intermittent at best.
• Conventional centralized cyber defense is often infeasible.
• The human warfighters aboard the vehicle will not have the necessary skills or time to perform cyber defense functions locally, even more so if the vehicle is unmanned.
AUTONOMY BY NECESSITY
The agent (or multiple agents per thing) will:
• stealthily observe and patrol the networks;
• detect the enemy agents while remaining concealed;
• destroy or degrade the enemy malware;
• do so mostly autonomously, without support or guidance of a human expert.
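The observe/detect/act cycle above amounts to a sense-decide-act loop that runs without a human in it. A minimal sketch follows; the host model, whitelist, and function names are all illustrative assumptions, not part of any AICA implementation.

```python
# Hypothetical sketch of an autonomous defense agent's main loop:
# observe, decide, act -- without waiting for a human operator.

def observe(host_state):
    """Collect percepts; here, just flag processes not on the whitelist."""
    whitelist = {"sshd", "logger", "nav_service"}
    return [p for p in host_state["processes"] if p not in whitelist]

def decide(percepts):
    """Map percepts to actions; unknown processes get quarantined."""
    return [("quarantine", p) for p in percepts]

def act(host_state, actions):
    for verb, target in actions:
        if verb == "quarantine":
            host_state["processes"].remove(target)
            host_state["quarantined"].append(target)

host = {"processes": ["sshd", "nav_service", "dropper.exe"], "quarantined": []}
for _ in range(3):                 # a few autonomous cycles
    act(host, decide(observe(host)))
print(host["quarantined"])         # ['dropper.exe']
```

In a real agent the observe and decide steps would be the learned components; the loop structure itself stays this simple.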
LIMITED CONTROL
Provisions are made to enable a remote or local human controller to observe, direct and modify the actions of the agent. However, human control is often impossible. The agent has to plan, analyze and perform most or all of its actions autonomously.
Provisions are made for the agent to collaborate with other agents. However, when communications are impaired or observed by the enemy, the agent operates alone.
ACCEPTANCE OF RISK
The agent has to take destructive actions, such as deleting or quarantining certain software, autonomously. Destructive actions are controlled by the rules of engagement and allowed only on the computer where the agent resides. Still, in general, they cannot be guaranteed to preserve the availability or integrity of the functions and data of friendly computers.
This risk, in a military environment, has to be balanced against the death or destruction caused by the enemy if the agent’s action is not taken.
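The rules-of-engagement constraint (destructive actions only on the agent's own host) can be sketched as a simple guard that every action passes through before execution. The action names and host identifiers here are hypothetical.

```python
# Hypothetical sketch: destructive actions are gated by rules of engagement
# that allow them only on the host where the agent itself resides.

DESTRUCTIVE = {"delete", "quarantine", "wipe"}

def roe_permits(action, target_host, agent_host):
    """Destructive actions only on the agent's own computer."""
    if action in DESTRUCTIVE:
        return target_host == agent_host
    return True          # observation/monitoring actions are unrestricted

assert roe_permits("delete", "vehicle-1", "vehicle-1")        # own host: allowed
assert not roe_permits("delete", "vehicle-2", "vehicle-1")    # remote host: denied
assert roe_permits("scan", "vehicle-2", "vehicle-1")          # non-destructive: allowed
```

Centralizing the check in one guard makes the rules of engagement auditable and easy to tighten or loosen per deployment.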
OTHER REQUIREMENTS
The enemy malware, and its capabilities and TTPs, evolve rapidly. Therefore, the agent is capable of autonomous learning.
The enemy malware seeks to find and destroy the agent. Therefore, the agent has means for stealth, camouflage and concealment, and takes measures that reduce the probability that the enemy malware will detect it.
The agent is originally installed by a human controller or by an authorized process. Self-propagation is assumed to occur only under exceptional and well-specified conditions of military necessity.
AGENT ARCHITECTURE OVERVIEW
[Architecture diagram: Sensors feed Percepts into a World State Identifier, which maintains the World Model (goals, current state and history, world dynamics). Goal Management, a Planner/Predictor, and an Action Selector drive Action Execution and Collaboration and Negotiation, producing Actions on the Environment. Cross-cutting components: Learning, Stealth and Security, and Self Assurance.]
EXAMPLES OF PERCEPTS
• Report from Nmap probing;
• Observation of a change to the file system;
• A signal that someone has interacted with a fake webpage (honey-page) or fake service.
EXAMPLES OF ACTIONS
• Remap ports;
• Check integrity of the file system;
• Create and deploy a fake password file, with an alarm mechanism activated when the file is accessed;
• Create and deploy a fake web page or web service;
• Deposit a file with a “poison pill”;
• Identify a suspicious file;
• Sandbox a suspicious file;
• Analyze the behavior of software in the sandbox.
PLANNING
ACTION SELECTION
COLLABORATION AND NEGOTIATION
LEARNING – WHAT?
• The World Dynamics model.
• Reward for an action, possibly as a function of a sequence of prior actions and observations, or regardless of the prior sequence.
• Reward for a sequence of actions (a plan), possibly as a function of a sequence of prior actions and observations, or regardless of the prior sequence.
• Recommendation(s) of suitable action(s), i.e., a plan, possibly as a function of state, or of a sequence of prior actions and observations, or regardless of the prior sequence.
• Classification of a sequence of observations as evidence of malicious activities of a certain type.
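Learning a reward for an action as a function of state is the classic reinforcement-learning value estimate. A minimal tabular Q-learning sketch illustrates the idea; the states, actions, and rewards below are invented for illustration, not drawn from the agent design.

```python
from collections import defaultdict

# Minimal tabular Q-learning sketch: the agent learns a reward estimate
# for each (state, action) pair from experienced transitions.
ALPHA, GAMMA = 0.5, 0.9     # learning rate and discount factor
Q = defaultdict(float)

def update(state, action, reward, next_state, actions):
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

actions = ["sandbox", "ignore"]
# Invented experience: sandboxing a suspicious file pays off, ignoring it does not.
for _ in range(20):
    update("suspicious_file", "sandbox", +1.0, "clean", actions)
    update("suspicious_file", "ignore", -1.0, "compromised", actions)

assert Q[("suspicious_file", "sandbox")] > Q[("suspicious_file", "ignore")]
```

The other items on the slide (rewards for plans, recommendations conditioned on histories) generalize this same update from single actions to sequences.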
LEARNING VIA CASE-BASED REASONING
[Diagram: percepts and actions flow into a State Assessor; assessed states accumulate in a Collection of Experiences; a CBR Plan Proposer retrieves similar past cases and outputs a proposed plan.]
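The CBR plan proposer retrieves the stored experience most similar to the current state and reuses its plan. A nearest-neighbor sketch follows; the state vectors, plans, and distance metric are illustrative assumptions.

```python
# Case-based reasoning sketch: retrieve the stored experience whose state
# vector is nearest to the current state, and propose its plan.

def distance(a, b):
    """Squared Euclidean distance between two state vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

experiences = [                      # (state vector, plan that worked there)
    ((1, 0, 0), ["remap_ports"]),
    ((0, 1, 0), ["sandbox_file", "analyze_behavior"]),
    ((0, 0, 1), ["deploy_honey_page"]),
]

def propose_plan(state):
    _, plan = min(experiences, key=lambda case: distance(case[0], state))
    return plan

assert propose_plan((0.1, 0.9, 0.0)) == ["sandbox_file", "analyze_behavior"]
```

A production CBR system would also adapt the retrieved plan to the new situation and store the outcome back into the experience collection.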
LEARNING VIA NN
[Diagram: percepts and actions flow into a State Assessor and a Collection of Experiences; an NN Learner trains on the experiences, and an NN Action Selector chooses actions.]
LEARNING BEST ACTION FROM EXPERIENCES
[Diagram: the input layer encodes one-hot vectors for which action was taken most recently, which action was taken previously, which percepts were seen most recently, and which percepts were seen previously, e.g. a1=0, a2=1, a3=0 || a1=0, a2=0, a3=1 || e1=0, e2=0, e3=1, e4=0 || e1=1, e2=0, e3=0, e4=0. Multiple hidden layers lead to an output layer giving the predicted reward for taking each action as the next, e.g. a1=0.07, a2=0.23, a3=0.79.]
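The figure's encoding (one-hot vectors of recent actions and percepts in, a predicted reward per candidate action out) can be sketched as a tiny numpy forward pass. The weights below are random, so the outputs are untrained placeholders, not the values in the figure; layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Input: one-hot encodings of (last action, previous action,
# last percepts, previous percepts) -- 3 + 3 + 4 + 4 = 14 features,
# matching the figure's a1..a3 / e1..e4 layout.
x = np.concatenate([
    [0, 1, 0],          # action taken most recently: a2
    [0, 0, 1],          # action taken previously:    a3
    [0, 0, 1, 0],       # percepts seen most recently: e3
    [1, 0, 0, 0],       # percepts seen previously:    e1
]).astype(float)

# One hidden layer stands in for the "multiple hidden layers".
W1, b1 = rng.normal(size=(16, 14)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)

h = np.maximum(0.0, W1 @ x + b1)       # ReLU hidden layer
rewards = W2 @ h + b2                  # predicted reward for each next action
best_action = int(np.argmax(rewards))  # action with the highest predicted reward
print(rewards.shape, best_action)
```

Training would fit the weights so that `rewards` matches rewards observed in the collection of experiences, after which the agent simply acts greedily on the output.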
ARL Active Computer Network Defense Agent From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker
OBJECTIVES
• Model the adversary decision process
• Explore defender effects that can increase costs to the adversary
• Develop a methodology to deny, degrade, and disrupt adversary actions and expend their resources
• Increase non-recoupable costs for the adversary:
  – Expend adversary time and effort
  – Force premature deployment of novel exploits
ACTIONS AND RESPONSES
APPROACH
• Model strategic interactions at the level of a single attacker
  – Different adversarial actions provoke different responses from the system
• Focus on decision making under uncertainty of both attacker actions and attacker payoffs
  – Learn attacker behavior in response to defender actions
  – Assume that the attacker is acting to maximize their own payoffs
• Initially assume a finite portfolio of attacker/defender actions, to be loosened in future work
• Attempt to learn a policy that maximizes attacker costs (i.e., time) across an interaction while simultaneously maximizing attacker coverage during the learning process
  – Managing the exploration-exploitation tradeoff
• Implement selective effects:
  – Deny or redirect the adversary only when an attack is predicted to succeed against an asset
  – Disguise our defensive capabilities from the adversary
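The exploration-exploitation tradeoff over a finite portfolio of defender actions is the standard multi-armed bandit setting. An epsilon-greedy sketch follows; the response names and payoffs are invented, and the "reward" stands in for the time cost imposed on the attacker.

```python
import random

random.seed(1)

# Epsilon-greedy sketch: the defender learns which response imposes the most
# cost (time) on the attacker, while still occasionally exploring the others.
responses = ["deny", "redirect", "allow"]
true_cost = {"deny": 2.0, "redirect": 5.0, "allow": 0.0}   # invented payoffs
estimate = {r: 0.0 for r in responses}
count = {r: 0 for r in responses}
EPSILON = 0.1

for _ in range(2000):
    if random.random() < EPSILON:                     # explore
        r = random.choice(responses)
    else:                                             # exploit current best
        r = max(responses, key=estimate.get)
    reward = true_cost[r] + random.gauss(0, 0.5)      # noisy attacker cost
    count[r] += 1
    estimate[r] += (reward - estimate[r]) / count[r]  # incremental mean

assert max(estimate, key=estimate.get) == "redirect"
```

The full problem is sequential rather than one-shot, so the poster's policy learning generalizes this bandit update to states and action sequences.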
ARCHITECTURE
RESULTS
• Functional Active Defense System prototypes are currently operational on testbed networks.
• The Active Defense System has successfully defended against a modeled data exfiltration attempt.
• Active Defense System framework components have been employed to research the targeting of malicious network traffic.
THEORY OF MOVING TARGET DEFENSE
Adaptive Cyber Defense; lead: George Mason University
[Diagram: ACD Scenarios (driving problems, decision contexts) supply issues, metrics, methods, and tasks to three interacting areas: ACD Technologies (attack surface, goals and objectives, scheduling), Quantitative Models (control-theoretic techniques, reinforcement learning decision models, game-theoretic techniques), and Adversarial Reasoning. Together these yield ACD mechanisms, evaluated for effectiveness, cost, and resources, and ACD tactics and strategies.]
Modeling the Effects
ASSESSING IMPACT VIA MODELS
[Figure: defender incident-response workflow. Alerts are triaged by type (AvailabilityAlert, WipeAlert, InfectedAlert, IntegrityAlert, ConfidentialityAlert, ForensicAlert) and handled by timed steps such as Restore Functionality (between 1 and 3 h), Take Offline (5 m), Wipe and Restore (between 1 and 3 h), Put Online (5 m), Trace Attack Source (between 2 and 6 h), Get Signature, and Find Other Infections (between 3 and 9 h), issuing new alerts for any additional infections found.]
Making sense of C2 means relating it to mission impact.
A mission is a nexus of numerous physical assets, information, activities, friendly, enemy…
AMICA explored comprehensive models that cover infrastructure, missions, defenders, and attackers. Results are insightful, but modeling is labor intensive.
S. Noel, J. Ludwig, P. Jain, D. Johnson, R. Thomas, J. McFarland, B. King, S. Webster and B. Tello, "Analyzing Mission Impacts of Cyber Actions," in Proceedings of the NATO IST-128 Workshop on Cyber Attack Detection, Forensics and Attribution for Assessment of Mission Impact, Istanbul, 2015.
EXAMPLE: MODEL-DRIVEN MISSION IMPACT ASSESSMENT
Analyzing Mission Impacts of Cyber Actions (AMICA)
Mission: the Joint Targeting Process
Team: MITRE, MIT-LL, IDA, CMU SEI
Questions it can answer:
• How long an attack can the mission withstand without impact?
• How long does it take the mission to recover from an attack?
• What is more damaging to the mission: loss of reach-back availability, or degradation of Air & Space Operations Center (AOC) system assets?
• How many targets can be impacted by confidentiality/integrity attacks before the mission is affected?
AMICA CONNECTS KINETIC MISSION TO CYBER ACTIONS
[Diagram: inputs (mission scenario, cyber scenario, attacker capabilities, defender capabilities) feed the AMICA simulation, which produces outputs (mission metrics, event logs, visualization).]
Adapted by permission from the paper by S. Noel et al., "Analyzing Mission Impacts of Cyber Actions," presented at the NATO IST-128 Workshop on Assessment of Mission Impact, Istanbul, Turkey, June 15-17, 2015.
EXTENSIBLE M&S LIBRARIES TO QUICKLY CREATE THE NEEDED ANALYSIS ENVIRONMENT
[Diagram: Library of Mission Models (targeting, BMD, etc.); Library of Infrastructure Models (covering multiple missions); Library of Defender Models (workflows); Library of Attacker Models (attack graphs).]
• Developing parameterized libraries of models
• Each piece of AMICA is designed to be modular and extensible to support future mission areas, cyber dependencies, attack patterns, and defenses
• Well-defined interfaces
MISSION MODEL
Process model capturing workflow, timing, and resources for the DoD kinetic targeting process (from CJCSI 3370.01)
Originally developed for EUCOM as part of Austere Challenge 10 and selected due to pedigree and maturity
– 200+ steps with timing & resources (dependent on target complexity)
– Covers the targeting process from basic targeting development through MAAP/ATO & BDA
Modified for AMICA by breaking it into modules and connecting them to CyCS nodes
ATTACKER MODEL
Modeled as a process simulation that captures the steps the attacker follows:
– Assumes the attacker has some knowledge of the mission and access on the secure network
– Responsive to defense actions
– Adjusts sophistication through the probability of success/detection on attack steps
– Conceptually follows “Cyber:14” threat models
  • The Cyber:14 study (ARCYBER, defense of the Department of Defense Information Network (DODIN)) contains 1000s of nodes (mainly system-steps) of integrated attacker and defender/sensor actions for server-, host-, and email-based attacks
[Flowchart: timed attacker process, e.g. getting spear-phishing targets (between 1 and 3 d), infecting targets and performing network scans (between 30 and 90 m), checking goal-node reachability, launching confidentiality, integrity, or availability attacks gated by time, and periodically checking for detection.]
Initial Foothold: initial access via a spear-phishing campaign; includes time for research to find targets.
Lateral Movement: scan the network for goal-node (e.g., database) reachability; infect laterally until the target node is reachable.
Achieve Goal: realize an effect on confidentiality, integrity, or availability on the goal node; maintain presence and re-infect as necessary.
DEFENDER MODEL
Process simulation of reactive (not proactive) defender actions
Multi-tiered incident response model
– The defender can impact the mission (by alerts, taking down machines)
– Includes defender resource/personnel constraints
Conceptually follows “Cyber:14” defense models
[Flowchart: the defender gets the next alert, branches by alert type (AvailabilityAlert, WipeAlert, InfectedAlert, IntegrityAlert, ConfidentialityAlert, ForensicAlert), and executes timed steps such as Restore Functionality (between 1 and 3 h), Take Offline (5 m), Wipe and Restore (between 1 and 3 h), Put Online (5 m), Trace Attack Source (between 2 and 6 h), Get Signature, and Find Other Infections (between 3 and 9 h), issuing new alerts for any additional infections found.]
Triage: defender response is triggered by an IT alert; IT alerts are prioritized by expected impact.
Reboot, Restore, Rebuild: mitigation based on alert type (crash, infection, corruption); more aggressive responses may impose greater mission impact.
Forensics: for more serious threats; trace the attack to its source, build signatures, and submit new alerts for all compromised machines.
Overcoming the Limitations of Today’s AI
GAPS IN TODAY'S AI
Learning in Complex Data Environments:
• AI & ML with small samples, dirty data, high clutter
• AI & ML with highly heterogeneous data
• Adversarial AI & ML in contested, deceptive environments
Resource-constrained AI Processing at the Point-of-Need:
• Distributed AI & ML with limited communications
• AI & ML computing with extremely low size, weight, power, and time available (SWaPT)
Generalizable & Predictable AI:
• Explainability & programmability for AI & ML
• AI & ML with integrated quantitative models
ADAPTIVE REAL-TIME LEARNING DURING A MISSION
Outcomes:
• Theory of learning under non-stationary distributions with limited a priori training and data
• Working concepts of control and perception tailored to online learning
Research Areas:
• Learning for High-Speed Navigation in Unknown Environments
• Online Unsupervised Percept Modeling
• Stable and Risk-Aware Learning and Adaptation
Payoff:
• Dynamics and perceptual modeling in real time to support maneuver in complex, dynamic environments
• Stable training/learning in the field in the presence of noise/adversaries
HUMANS HELPING MACHINES LEARN
A human-in-the-loop reinforcement learning system to improve decision-making in dynamically changing environments where data availability and computational resources are limited.
Key Technical Demonstration: a reinforcement-learning AI solution using real-time human input to solve the previously unsolved Atari™ Bowling task
• Converged on a state-of-the-art solution for the previously unsolved Atari™ Bowling task in 15 minutes
• Learned AI policy outperforms expert humans
[Chart: human-guided learning vs. traditional approaches.]
“Philosophical similarities with OpenAI/DeepMind research, but with ability to run in real-time and claimed significant improvements in sample efficiency” – Jack Clark, Director of Strategy and Communications at OpenAI
Reconceiving human-technology roles in the future battlefield
Key Publications:
• Warnell et al. “Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces.” AAAI-18.
• Koppel et al. “Policy Evaluation in Infinite MDPs: Efficient Kernel Gradient Temporal Difference.” AAAI-18.
Potential:
• Broader applications of AI through solutions for unstructured environments with ill-defined rewards
• Faster, more optimal AI solutions with less data
• Rapidly adaptable human-AI teams that learn from human understanding of high-level goals
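The core TAMER idea (learn a model of the *human trainer's* feedback rather than the environment's reward, then act greedily on it) can be caricatured in a few lines. This is not the published Deep TAMER algorithm; the actions and the simulated trainer below are invented for illustration.

```python
import random

random.seed(0)

# TAMER-style sketch: the agent learns a per-action estimate of HUMAN
# feedback (not the game score) and acts greedily on it. The "human"
# here is simulated as approving action "b" and disapproving the rest.
actions = ["a", "b", "c"]
H = {a: 0.0 for a in actions}     # learned model of human feedback
n = {a: 0 for a in actions}

def human_feedback(action):
    return 1.0 if action == "b" else -1.0   # simulated trainer

for _ in range(50):
    act = random.choice(actions)            # explore while training
    n[act] += 1
    H[act] += (human_feedback(act) - H[act]) / n[act]   # incremental mean

policy = max(actions, key=H.get)            # greedy w.r.t. predicted feedback
assert policy == "b"
```

Because the human signal is dense and immediate, this converges in far fewer samples than learning from a sparse environment reward, which is the sample-efficiency claim above.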
SEMANTIC CONCEPTS LEARNING
Outcomes:
• Algorithms for rapid learning of semantic understanding of the environment from visual information
• Techniques to train tactical robot behaviors from human demonstrations
Research Areas:
• Online unsupervised scene segmentation algorithms
• Inverse optimal control to enable generalizable learning-from-demonstration of robot behaviors
Payoff:
• Learning behaviors that cue from semantic representations of the environment will yield better adaptation to new environments/scenarios
• Rapid training of new autonomous vehicle behaviors in the field
Robust Human State Detection
Developed a robust Deep Learning (DL) system for detecting human states across large, diverse data collections
Key Technical Demonstration:
• Uncovering task-dependent structure in data automatically from the DL model
• 1st place in an international machine learning challenge to detect human interest from EEG across multiple domains
• State-of-the-art performance across 5 different BCI tasks
[Diagram: task-driven human state identification in closed-loop systems; neurophysiological interpretation of task-based states; discovering task-based human states; using task-derived human state information to improve human-machine collaboration.]
Key Publications:
• Lawhern et al. (2016). “EEGNet: A Compact Convolutional Network for EEG-based Brain-Computer Interfaces.”
• Gordon et al. (2017). “Real-World BCI: Cross-Domain Learning and Practical Applications.”
• Solon et al. (2017). “A Generalized Deep Learning Framework for Cross-Domain Learning in Brain-Computer Interfaces.”
• Solon et al. (2017). “Deep Learning Approaches for P300 Classification in Image Triage: Applications to the NAILS Task.”
Potential:
• Detecting high-level and potentially sub-conscious cognitive constructs and incorporating this information in closed-loop systems for improved human-machine collaboration
ROBUST INTERACTIONS VIA NATURAL LANGUAGE DIALOGUE
Outcomes:
• Dialogue clarification techniques enabling agents to prompt for clarification and ask for help
• Models of human-robot team communication in uncertain environments
Research Areas:
• Explainable Dialogue Interaction in Resource-Constrained Environments
• Language Grounding for Robotics (Symbol Grounding)
• Multimedia Exploitation of Multimodal Information Streams (Speech, Video, Mapping)
[Example dialogue: H: “What do you see in front of you?” R: “I think I see some tables.” The figure contrasts the robot as tool vs. teammate.]
Payoff:
• Naturalistic, bi-directional communication between Soldiers and semi-autonomous agents
• Reduced time for reconnaissance missions with a “heads-up, hands-free” language interface
The Science of Adversarial Interactions
SCIENCE OF CYBER DECEPTION
Goal: create leap-ahead cyber deception methods that improve cyber-security certainty and resource utilization through a joint computational-cognitive model that incorporates an adversary's cognitive state in order to successfully manipulate and mislead them.
Current State:
• Passive honeypots with no adversarial knowledge
• No formal way to create an optimized deception scheme based on learning the target's cognitive state and capability in order to manipulate effectively
New Approaches:
• Establish a model for estimating and tracking adversarial cognitive states and decision processes
• Define metrics quantifying the effectiveness of information in driving cognitive-state change in the deception context
• Build an integrated framework of deception composition and projection methods to manipulate adversaries' cognitive state and decision-making process to our advantage
THEORY OF MOVING TARGET DEFENSE
Goal: establish methods of cyber adaptation that provide both resiliency and survivability by minimizing an attacker's ability to understand and attack our systems, through proactive multi-level obfuscation and migration strategies.
Current State: ad-hoc approaches as point solutions with no adversarial modeling
New Approach:
• Analytical models and performance metrics capturing the dynamics between cyber attack and defense
• Unified framework for analyzing resiliency, agility, and performance trade-offs
• Redefine system resiliency and robustness under an adversarial setting, incorporating attack/response dynamics
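One concrete form of the proactive migration strategy described above is port hopping: defenders and legitimate clients derive the live service port from a shared secret and the current time epoch, so an attacker's reconnaissance goes stale every epoch. This toy sketch uses invented parameters (secret, epoch length, port range) purely for illustration.

```python
import hashlib

# Toy moving-target-defense sketch: the service port is re-derived from a
# shared secret every epoch, so scan results become stale quickly.
SECRET = b"shared-secret"          # assumed pre-shared with legitimate clients
EPOCH_SECONDS = 300                # migration period (invented)
PORT_BASE, PORT_RANGE = 20000, 10000

def current_port(now_seconds):
    """Deterministically map (secret, epoch) to a port in the allowed range."""
    epoch = now_seconds // EPOCH_SECONDS
    digest = hashlib.sha256(SECRET + str(epoch).encode()).digest()
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_RANGE

p_now = current_port(0)       # epoch 0
p_same = current_port(299)    # still epoch 0: same port
p_next = current_port(300)    # epoch 1: port is re-derived
assert p_now == p_same and PORT_BASE <= p_now < PORT_BASE + PORT_RANGE
print(p_now, p_next)
```

Real moving-target defenses migrate far more than ports (addresses, stacks, whole VMs), but the analytical question is the same one the slide raises: how fast must the configuration move, at what cost, to outpace the attacker's reconnaissance?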
References
REFERENCES
Boddy, M. S., Gohde, J., Haigh, T., & Harp, S. A. (2005, June). Course of Action Generation for Cyber Security Using Classical Planning. In ICAPS (pp. 12-21).
Chen, Jessie Y. C., Shan G. Lakhmani, Kimberly Stowers, Anthony R. Selkowitz, Julia L. Wright, and Michael Barnes. "Situation awareness-based agent transparency and human-autonomy teaming effectiveness." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 259-282.
Evans, A. William, Matthew Marge, Ethan Stump, Garrett Warnell, Joseph Conroy, Douglas Summers-Stay, and David Baran. "The future of human robot teams in the army: factors affecting a model of human-system dialogue towards greater team collaboration." In Advances in Human Factors in Robots and Unmanned Systems, pp. 197-209. Springer, Cham, 2017.
De Gaspari, Fabio, Sushil Jajodia, Luigi V. Mancini, and Agostino Panico. "AHEAD: A New Architecture for Active Defense." SafeConfig'16, October 24, 2016, Vienna, Austria.
Kott, Alexander, David S. Alberts, and Cliff Wang. "Will Cybersecurity Dictate the Outcome of Future Wars?" Computer 48, no. 12 (2015): 98-101.
Kott, A., Singh, R., McEneaney, W. M., & Milks, W. (2011). Hypothesis-driven information fusion in adversarial, deceptive environments. Information Fusion, 12(2), 131-144.
Kott, Alexander, Ananthram Swami, and Bruce J. West. "The internet of battle things." Computer 49, no. 12 (2016): 70-75.
Kott, Alexander. "Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments." arXiv preprint arXiv:1803.11256 (2018).
Kott, Alexander, Luigi V. Mancini, Paul Théron, Martin Drašar, Edlira Dushku, Heiko Günther, Markus Kont et al. "Initial Reference Architecture of an Intelligent Autonomous Agent for Cyber Defense." arXiv preprint arXiv:1803.10664 (2018).
Lawhern, Vernon, Amelia Solon, Nicholas Waytowich, Stephen M. Gordon, Chou Hung, and Brent J. Lance. "EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces." Journal of Neural Engineering (2018).
Marathe, Amar R., Jason S. Metcalfe, Brent J. Lance, Jamie R. Lukos, David Jangraw, Kuan-Ting Lai, Jonathan Touryan et al. "The privileged sensing framework: A principled approach to improved human-autonomy integration." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 283-320.
REFERENCES (CONT.)
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Muttik, I. Good Viruses. Evaluating the Risks. Online at https://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-muttik.pdf
Rasch, Robert, Alexander Kott, and Kenneth D. Forbus. "AI on the battlefield: An experimental exploration." AAAI/IAAI, 2002.
Sarraute, Carlos, Gerardo Richarte, and Jorge Lucángeli Obes. "An algorithm to find optimal attack paths in nondeterministic scenarios." Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence. ACM, 2011.
Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. "POMDPs make better hackers: Accounting for uncertainty in penetration testing." Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.
Stytz, Martin R., Dale E. Lichtblau, and Sheila B. Banks. Toward Using Intelligent Agents to Detect, Assess, and Counter Cyberattacks in a Network-Centric Environment. Institute for Defense Analyses, Alexandria, VA, 2005.
Théron, P., Alexander Kott, Martin Drašar, Krzysztof Rzadca, Benoît LeBlanc, Mauno Pihelgas, Luigi Mancini, and Agostino Panico. "Towards an Active, Autonomous and Intelligent Cyber Defense of Military Systems: the NATO AICA Reference Architecture." In Proceedings of the ICMCIS Conference, Warsaw, Poland, May 2018.
Warnell, Garrett, Nicholas Waytowich, Vernon Lawhern, and Peter Stone. "Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces." arXiv preprint arXiv:1709.10163 (2017).
Wigness, Maggie, and John G. Rogers. "Unsupervised Semantic Scene Labeling for Streaming Data." In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 4612-4621. 2017.
Questions?