UNCLASSIFIED//APPROVED FOR PUBLIC RELEASE

U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND

Bonware to the Rescue: The Future Autonomous Cyber Defense Agents

Dr. Alexander Kott, Chief Scientist

Presented at the Conference on Applied Machine Learning for Information Security October 12, 2018, Washington DC


INTELLIGENT THINGS AND HUMANS IN A VERY COMPLEX WORLD


INTELLIGENT THINGS WILL BE DIVERSE

• Munitions
• Sensors
• Weapons
• Wearable Devices
• Vehicles
• Robots


Intelligent Things and Humans will Operate in Teams


Joint Human-Intelligent Agent Decision Making


AI and Cyber Multiply Threats and Vulnerabilities


INTELLIGENT THINGS ARE THREATS AND TARGETS

Pervasive connectivity and intelligence open opportunities for cross-domain attack and defense

• Kinetic attacks
• Directed energy
• Electronic attacks against intelligent things
• Jamming RF channels
• Destroying fiber channels
• Depriving things of their power sources
• Electronic eavesdropping
• Deploying malware


HUMANS ARE A CLASS OF INTELLIGENT THINGS

Perhaps most importantly, the enemy attacks the cognition of human Soldiers.
Humans will be the “Intelligent Things” most susceptible to deception.
Humans will be handicapped when they are concerned (even if incorrectly) that their information is untrustworthy.


AI will Fight Cyber Attacks


NATO IST-152 Research Group Agent Concept: AICA


IST-152 OVERVIEW

NATO Research Group IST-152: Intelligent, Autonomous and Trusted Agents for Cyber Defense and Resilience. NATO-designated as Unclassified, Publicly Releasable.

Motivation:
• Growing focus on AI, autonomy, and issues of human trust in AI
• Cyber is exceptionally ripe for strong AI: autonomous yet human-managed agents for cyber operations
• Malware is growing in autonomy and sophistication
• Current manual and semi-manual approaches are grossly inadequate
• Needed: autonomous agents that actively patrol the friendly network
• Needed: detection of and reaction to hostile activities far faster than human reaction time
• Needed: trust and control by humans


CONTEXT ASSUMPTIONS

• A conflict with a technically sophisticated adversary
• Enemy software cyber agents -- malware -- will infiltrate the friendly network
• To limit the scope, consider a single military vehicle with one or more computers
• One or more of the computers are assumed to have been compromised
• Communications between the vehicle and other elements of the friendly force are limited and intermittent at best
• Conventional centralized cyber defense is often infeasible
• The human war-fighters residing on the vehicle will not have the necessary skills or time available to perform cyber defense functions locally -- even more so if the vehicle is unmanned


AUTONOMY BY NECESSITY

The agent (or multiple agents per thing) will:
• stealthily observe and patrol the networks,
• detect the enemy agents while remaining concealed,
• destroy or degrade the enemy malware,
• do so mostly autonomously, without support or guidance from a human expert.


LIMITED CONTROL

Provisions are made to enable a remote or local human controller to observe, direct and modify the actions of the agent. However, human control is often impossible; the agent has to plan, analyze and perform most or all of its actions autonomously.

Provisions are made for the agent to collaborate with other agents. However, when communications are impaired or observed by the enemy, the agent operates alone.


ACCEPTANCE OF RISK

The agent has to take destructive actions, such as deleting or quarantining certain software, autonomously. Destructive actions are controlled by the rules of engagement and allowed only on the computer where the agent resides. Still, in general, they cannot be guaranteed to preserve the availability or integrity of the functions and data of friendly computers.

This risk, in a military environment, has to be balanced against the death or destruction caused by the enemy if the agent’s action is not taken.


OTHER REQUIREMENTS

The enemy malware, and its capabilities and TTPs, evolve rapidly. Therefore, the agent is capable of autonomous learning.

The enemy malware seeks to find and destroy the agent. Therefore, the agent has means for stealth, camouflage and concealment, and takes measures that reduce the probability that the enemy malware will detect it.

The agent is originally installed by a human controller or by an authorized process. Self-propagation is assumed to occur only under exceptional and well-specified conditions of military necessity.


AGENT ARCHITECTURE OVERVIEW

[Architecture diagram: within its Environment, the agent's Sensors produce Percepts for a World State Identifier, which maintains a World Model (Goals, Current State and History, World Dynamics). A Planner/Predictor and Action Selector choose Actions, carried out by Action Execution with Collaboration and Negotiation. Cross-cutting functions: Agent Goal Management, Learning, Stealth and Security, Self Assurance.]


EXAMPLES OF PERCEPTS

• Report from Nmap probing;
• Observation of a change to the file system;
• A signal that someone has interacted with a fake web page (honey-page) or fake service.
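For concreteness, a percept can be represented as a small tagged record. This is an illustrative sketch only; the `Percept` class and type names below are hypothetical, not drawn from the AICA reference architecture:

```python
from dataclasses import dataclass
from enum import Enum, auto

class PerceptType(Enum):
    NMAP_REPORT = auto()        # report from Nmap probing
    FILESYSTEM_CHANGE = auto()  # observed change to the file system
    HONEYPOT_TRIGGER = auto()   # interaction with a fake page or service

@dataclass(frozen=True)
class Percept:
    kind: PerceptType
    source: str   # e.g., host or path the observation came from
    detail: str   # free-form payload for downstream classifiers

def is_high_priority(p: Percept) -> bool:
    """Honeypot interactions are unambiguous signals of hostile activity."""
    return p.kind is PerceptType.HONEYPOT_TRIGGER
```

Records like these would feed the World State Identifier in the architecture above.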


EXAMPLES OF ACTIONS

• Remap ports;
• Check integrity of the file system;
• Create and deploy a fake password file, with an alarm mechanism activated when the file is accessed;
• Create and deploy a fake web page or web service;
• Deposit a file with a “poison pill”;
• Identify a suspicious file;
• Sandbox a suspicious file;
• Analyze the behavior of software in the sandbox.


PLANNING


ACTION SELECTION


COLLABORATION AND NEGOTIATION


LEARNING – WHAT?

• The World Dynamics model.
• Reward for an action, possibly as a function of a sequence of prior actions and observations, or regardless of prior sequence.
• Reward for a sequence of actions (a plan), possibly as a function of a sequence of prior actions and observations, or regardless of prior sequence.
• Recommendation(s) of suitable action(s), i.e., a plan, possibly as a function of state, or of a sequence of prior actions and observations, or regardless of prior sequence.
• Classification of a sequence of observations as evidence of malicious activities of a certain type.
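The "reward for an action as a function of prior actions and observations" variant can be sketched as one step of tabular Q-learning. This is a minimal sketch, not the agent's actual learning algorithm; the state here is simplified to the most recent percept, and the state and action names are illustrative:

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One temporal-difference update of the action-value table."""
    best_next = max(Q[next_state].values(), default=0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# Q maps state -> action -> estimated value, defaulting to 0.0
Q = defaultdict(lambda: defaultdict(float))

# the agent sandboxed a suspicious file after a filesystem-change percept
# and was rewarded for confirming malware
q_update(Q, "fs_change", "sandbox_file", reward=1.0, next_state="malware_confirmed")
```

With an empty table, the update yields 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5 for that state-action pair.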


LEARNING VIA CASE-BASED REASONING

[Diagram: percepts enter a State assessor; assessed states and executed actions accumulate in a Collection of experiences; a CBR plan proposer retrieves similar experiences and emits a Proposed plan, which drives actions.]
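A minimal case-based plan proposer, assuming experiences are stored as (state vector, plan, reward) triples. Retrieval by Euclidean distance over numeric state features is one common CBR choice, not necessarily the one used here, and the plans and state features are illustrative:

```python
import math

def propose_plan(experiences, state, k=3):
    """Retrieve the k most similar past cases and reuse the best-rewarded plan.

    experiences: list of (state_vector, plan, reward) triples.
    """
    ranked = sorted(experiences, key=lambda e: math.dist(e[0], state))
    best = max(ranked[:k], key=lambda e: e[2])  # highest reward among nearest
    return best[1]

experiences = [
    ((1.0, 0.0), ["remap_ports"], 0.2),
    ((0.9, 0.1), ["sandbox_file", "wipe_host"], 0.8),
    ((0.0, 1.0), ["deploy_honeypage"], 0.9),
]
plan = propose_plan(experiences, (1.0, 0.05), k=2)
```

Here the two nearest cases are the first two; the second has the higher reward, so its plan is reused.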


LEARNING VIA NN

[Diagram: percepts enter a State assessor; assessed states and executed actions accumulate in a Collection of experiences; an NN learner trains on the experiences, and an NN action selector chooses actions.]


LEARNING BEST ACTION FROM EXPERIENCES

Input layer: one-hot encodings of recent history, concatenated -- (which action was taken most recently) || (which action was taken previously) || (which percepts were seen most recently) || (which percepts were seen previously):

a1=0, a2=1, a3=0 || a1=0, a2=0, a3=1 || e1=0, e2=0, e3=1, e4=0 || e1=1, e2=0, e3=0, e4=0

Multiple hidden layers.

Output layer: predicted rewards for taking each action as the next, e.g. a1=0.07, a2=0.23, a3=0.79.
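The input encoding can be reproduced directly. With three candidate actions and four percept types, the example row on the slide corresponds to most recent action a2, previous action a3, most recent percept e3, previous percept e1 (a sketch of the encoding step only, not the network itself):

```python
def one_hot(index, size):
    """Zero vector of the given size with a 1 at the chosen index."""
    v = [0] * size
    v[index] = 1
    return v

def encode(last_action, prev_action, last_percept, prev_percept,
           n_actions=3, n_percepts=4):
    """Concatenate one-hot codes of recent history into one NN input vector."""
    return (one_hot(last_action, n_actions)
            + one_hot(prev_action, n_actions)
            + one_hot(last_percept, n_percepts)
            + one_hot(prev_percept, n_percepts))

# a2 most recent, a3 previous, e3 most recent percept, e1 previous (0-indexed)
x = encode(1, 2, 2, 0)
# x == [0,1,0, 0,0,1, 0,0,1,0, 1,0,0,0]
```

A network trained on such vectors maps each 14-dimensional history code to one predicted reward per candidate next action.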


ARL Active Computer Network Defense Agent

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker


OBJECTIVES

• Model the adversary decision process
• Explore defender effects that can increase costs to the adversary
• Develop a methodology that will deny, degrade, and disrupt adversary actions and expend their resources
• Increase non-recoupable costs for the adversary:
  – Expend adversary time and effort
  – Force premature deployment of novel exploits

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker


ACTIONS AND RESPONSES

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker


APPROACH

• Model strategic interactions at the level of a single attacker
  – Different adversarial actions provoke different responses from the system
• Focus on decision making under uncertainty of both attacker actions and attacker payoffs
  – Learn attacker behavior in response to defender actions
  – Assume that the attacker is acting to maximize their own payoffs
• Initially assume a finite portfolio of attacker/defender actions, to be loosened in future work
• Attempt to learn a policy to maximize attacker costs (i.e., time) across an interaction while simultaneously maximizing attacker coverage during the learning process
  – Managing the exploration-exploitation tradeoff
• Implement selective effects:
  – Deny or redirect the adversary only when an attack is predicted to succeed against an asset
  – This disguises our defensive capabilities from the adversary

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker
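The exploration-exploitation tradeoff over a finite defender-action portfolio can be managed with a simple epsilon-greedy bandit, where the reward is the attacker time cost each action imposes. A sketch under those assumptions; the action names are illustrative and this is not the ARL system's actual policy learner:

```python
import random

class EpsilonGreedyDefender:
    """Choose defender actions to maximize observed attacker time cost."""

    def __init__(self, actions, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # mean attacker cost so far

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)       # explore
        return max(self.actions, key=self.values.get)  # exploit

    def update(self, action, attacker_cost):
        """Incremental mean of observed attacker cost for this action."""
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (attacker_cost - self.values[action]) / n
```

The "selective effects" bullet corresponds to gating `select()` behind a predictor of attack success, so the defender reveals a capability only when it matters.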


ARCHITECTURE

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker


RESULTS

• Functional Active Defense System prototypes are currently operational on testbed networks.
• The Active Defense System has successfully defended against a modeled data exfiltration attempt.
• Active Defense System framework components have been employed to research the targeting of malicious network traffic.

From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker


THEORY OF MOVING TARGET DEFENSE

Adaptive Cyber Defense project; lead: George Mason University

[Diagram: ACD Scenarios (driving problems, decision contexts) supply issues, metrics, methods and tasks to Quantitative Models and Adversarial Reasoning. Quantitative Models capture ACD mechanisms (effectiveness, cost, resources), the attack surface, goals and objectives, and scheduling. Adversarial Reasoning spans control-theoretic techniques, game-theoretic techniques, and reinforcement-learning decision models, yielding ACD Tactics and Strategies.]


Modeling the Effects


ASSESSING IMPACT VIA MODELS

[Diagram: a timed defender workflow model -- alerts are created and submitted, triaged by type (Availability, Wipe, Infected, Integrity, Confidentiality, Forensic), and serviced by activities such as Restore Functionality, Take Offline, Wipe and Restore, Put Online, Trace Attack Source, Get Signature, and Find Other Infections, each with durations like Between(1,3)h or 5m.]

Making sense of C2 means relating it to mission impact.

A mission is a nexus of numerous physical assets, information, activities, friendly, enemy...

AMICA explored comprehensive models that cover infrastructure, missions, defenders and attackers. The results are insightful, but the modeling is labor intensive.

S. Noel, J. Ludwig, P. Jain, D. Johnson, R. Thomas, J. McFarland, B. King, S. Webster and B. Tello, "Analyzing Mission Impacts of Cyber Actions," in Proceedings of the NATO IST-128 Workshop on Cyber Attack Detection, Forensics and Attribution for Assessment of Mission Impact, Istanbul, 2015.


EXAMPLE: MODEL-DRIVEN MISSION IMPACT ASSESSMENT

Analyzing Mission Impacts of Cyber Actions (AMICA). Mission: the Joint Targeting Process. Team: MITRE, MIT-LL, IDA, CMU SEI.

Questions it can answer:
• How long an attack can the mission withstand without impact?
• How long does it take the mission to recover from an attack?
• What is more damaging to the mission: loss of reach-back availability, or degradation of Air & Space Operations Center (AOC) system assets?
• How many targets can be impacted by confidentiality/integrity attacks before the mission is impacted?


AMICA CONNECTS KINETIC MISSION TO CYBER ACTIONS

Inputs: Mission Scenario, Cyber Scenario, Attacker Capabilities, Defender Capabilities
Outputs: Mission Metrics, Visualization, Event Logs

Adapted by permission from the paper by S. Noel et al., “Analyzing Mission Impacts of Cyber Actions,” presented at the NATO IST-128 Workshop on Assessment of Mission Impact, Istanbul, Turkey, June 15-17, 2015.


EXTENSIBLE M&S LIBRARIES TO QUICKLY CREATE THE NEEDED ANALYSIS ENVIRONMENT

• Library of Mission Models (Targeting, BMD, etc.)
• Library of Infrastructure Models (covering multiple missions)
• Library of Defender Models (workflows)
• Library of Attacker Models (attack graphs)

Developing parameterized libraries of models. Each piece of AMICA is designed to be modular and extensible, with well-defined interfaces, to support future mission areas, cyber dependencies, attack patterns, and defenses.


MISSION MODEL

Process model capturing workflow, timing, and resources for the DoD kinetic targeting process (from CJCSI 3370.01). Originally developed for EUCOM as part of Austere Challenge 10 and selected due to pedigree and maturity:
– 200+ steps with timing and resources (dependent on target complexity)
– Covers the targeting process from basic target development through MAAP/ATO and BDA
Modified for AMICA by breaking it into modules and connecting them to CyCS nodes.


ATTACKER MODEL

Modeled as a process simulation that captures the steps the attacker follows:
– Assumes the attacker has some knowledge of the mission and access on the secure network
– Responsive to defense actions
– Sophistication adjusted through probability of success/detection on attack steps

Conceptually follows the ‘Cyber:14’ threat models. The Cyber:14 study (ARCYBER, defense of the Department of Defense Information Network (DODIN)) contains thousands of nodes (mainly system-steps) of integrated attacker and defender/sensor actions for server-, host-, and email-based attacks.

Attack phases:
• Initial Foothold: initial access via a spear-phishing campaign; includes time for research to find targets
• Lateral Movement: scan the network for goal-node (e.g., database) reachability; infect laterally until the target node is reachable
• Achieve Goal: realize an effect on confidentiality, integrity, or availability on the goal node; maintain presence and re-infect as necessary

[Diagram: timed attacker workflow with steps such as Get Spear Phishing Targets Between(1,3)d, Infect Target, Perform Network Scan Between(30,90)m, Choose & Infect Target, Compromise Goal Node, and Perform Attack by type (Confidentiality, Integrity, Availability), with time gates and periodic checks for detection.]


DEFENDER MODEL

Process simulation of reactive (not proactive) defender actions. Multi-tiered incident response model:
– The defender can impact the mission (by alerts, taking down machines)
– Includes defender resource/personnel constraints
Conceptually follows the ‘Cyber:14’ defense models.

Response tiers:
• Triage: defender response triggered by an IT alert; IT alerts prioritized by expected impact
• Reboot, Restore, Rebuild: mitigation based on alert type (crash, infection, corruption); more aggressive responses may impose greater mission impact
• Forensics: for more serious threats; trace the attack to its source, build signatures, and submit new alerts for all compromised machines

[Diagram: the same timed defender workflow shown on the ASSESSING IMPACT VIA MODELS slide.]
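The triage tier, alerts serviced in order of expected mission impact, maps naturally onto a priority queue. A sketch with illustrative alert types and impact scores, not the AMICA implementation:

```python
import heapq
import itertools

class AlertQueue:
    """Pop alerts in order of expected mission impact (highest first)."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # FIFO among equal-impact alerts

    def submit(self, alert_type, host, expected_impact):
        # negate impact because heapq is a min-heap
        heapq.heappush(self._heap,
                       (-expected_impact, next(self._tie), alert_type, host))

    def next_alert(self):
        _, _, alert_type, host = heapq.heappop(self._heap)
        return alert_type, host

q = AlertQueue()
q.submit("AvailabilityAlert", "host-3", expected_impact=0.2)
q.submit("IntegrityAlert", "host-1", expected_impact=0.9)
q.submit("WipeAlert", "host-2", expected_impact=0.5)
```

The integrity alert, with the highest expected impact, is serviced first regardless of arrival order.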


Overcoming the Limitations of Today’s AI

UNCLASSIFIED//APPROVED FOR PUBLIC RELEASE

42

UNCLASSIFIED//APPROVED FOR PUBLIC RELEASE

GAPS IN TODAY’S AI

Learning in Complex Data Environments:
• AI & ML with small samples, dirty data, high clutter
• AI & ML with highly heterogeneous data
• Adversarial AI & ML in contested, deceptive environments

Resource-Constrained AI Processing at the Point of Need:
• Distributed AI & ML with limited communications
• AI & ML computing with extremely low size, weight, power, and time available (SWaPT)

Generalizable & Predictable AI:
• Explainability & programmability for AI & ML
• AI & ML with integrated quantitative models


ADAPTIVE REAL-TIME LEARNING DURING A MISSION

Outcomes:
• Theory of learning under non-stationary distributions with limited a priori training and data
• Working concepts of control and perception tailored to online learning

Research Areas:
• Learning for high-speed navigation in unknown environments
• Online unsupervised percept modeling
• Stable and risk-aware learning and adaptation

Payoff:
• Dynamics and perceptual modeling in real time to support maneuver in complex, dynamic environments
• Stable training/learning in the field in the presence of noise/adversaries


HUMANS HELPING MACHINES LEARN

Human-in-the-loop reinforcement learning system to improve decision-making in dynamically changing environments where data availability and computational resources are limited. Reconceiving human-technology roles on the future battlefield.

Key Technical Demonstration: a reinforcement-learning AI solution using real-time human input to solve the previously unsolved Atari™ Bowling task
• Converged on a state-of-the-art solution in 15 minutes
• The learned AI policy outperforms expert humans

"Philosophical similarities with OpenAI/DeepMind research, but with the ability to run in real-time and claimed significant improvements in sample efficiency" - Jack Clark, Director of Strategy and Communications at OpenAI

Key Publications:
• Warnell et al. Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces. AAAI-18.
• Koppel et al. Policy Evaluation in Infinite MDPs: Efficient Kernel Gradient Temporal Difference. AAAI-18.

Potential:
• Broader applications of AI through solutions for unstructured environments with ill-defined rewards
• Faster, more optimal AI solutions with less data
• Rapidly adaptable human-AI teams that learn from human understanding of high-level goals
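The human-shaping idea can be illustrated with a toy TAMER-style update: regress a reward model toward the human's scalar feedback for the state-action pair just taken, then act greedily with respect to the learned model. This tabular sketch is far simpler than the Deep TAMER network in the cited paper; the state and action names are illustrative:

```python
from collections import defaultdict

class TamerLite:
    """Learn a human-reward model H(s, a) from scalar human feedback."""

    def __init__(self, lr=0.2):
        self.H = defaultdict(float)  # (state, action) -> predicted feedback
        self.lr = lr

    def feedback(self, state, action, h):
        """Move the prediction toward the human's scalar signal h."""
        key = (state, action)
        self.H[key] += self.lr * (h - self.H[key])

    def act(self, state, actions):
        """Greedy with respect to predicted human feedback."""
        return max(actions, key=lambda a: self.H[(state, a)])
```

Unlike environment reward, the human signal arrives immediately after each action, which is what makes the approach so sample efficient.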


SEMANTIC CONCEPTS LEARNING

Outcomes:
• Algorithms for rapid learning of semantic understanding of the environment from visual information
• Techniques to train tactical robot behaviors from human demonstrations

Research Areas:
• Online unsupervised scene segmentation algorithms
• Inverse optimal control to enable generalizable learning-from-demonstration of robot behaviors

Payoff:
• Learning behaviors that cue from semantic representations of the environment will yield better adaptation to new environments/scenarios
• Rapid training of new autonomous vehicle behaviors in the field


Robust Human State Detection

Developed a robust Deep Learning (DL) system for detecting human states across large, diverse data collections.

Key Technical Demonstration: task-driven human state identification in closed-loop systems; using task-derived human state information to improve human-machine collaboration; neurophysiological interpretation of task-based states; discovering task-based human states
• Uncovering task-dependent structure in data automatically from the DL model
• 1st place in an international machine learning challenge to detect human interest from EEG across multiple domains
• State-of-the-art performance across 5 different BCI tasks

Key Publications:
• Lawhern et al. (2016). EEGNet: A Compact Convolutional Network for EEG-based Brain-Computer Interfaces.
• Gordon et al. (2017). Real-World BCI: Cross-Domain Learning and Practical Applications.
• Solon et al. (2017). A Generalized Deep Learning Framework for Cross-Domain Learning in Brain-Computer Interfaces.
• Solon et al. (2017). Deep Learning Approaches for P300 Classification in Image Triage: Applications to the NAILS Task.

Potential:
• Detecting high-level and potentially sub-conscious cognitive constructs and incorporating that information in closed-loop systems for improved human-machine collaboration


ROBUST INTERACTIONS VIA NATURAL LANGUAGE DIALOGUE

Outcomes:
• Dialogue clarification techniques enabling agents to prompt for clarification & ask for help
• Models of human-robot team communication in uncertain environments

Research Areas:
• Explainable dialogue interaction in resource-constrained environments
• Language grounding for robotics (symbol grounding)
• Multimedia exploitation of multimodal information streams (speech, video, mapping)

Example exchange -- H: “What do you see in front of you?” R: “I think I see some tables.”

[Figure: progression from tool to teammate.]

Payoff:
• Naturalistic, bi-directional communication between Soldiers and semi-autonomous agents
• Reduced time for reconnaissance missions with a “heads-up, hands-free” language interface


The Science of Adversarial Interactions


SCIENCE OF CYBER DECEPTION

Goal: to create leap-ahead cyber deception methods that will improve cyber security certainty and resource utilization through the establishment of a joint computational-cognitive model that incorporates an adversary's cognitive state in order to successfully manipulate and mislead them.

Current State:
• Passive honeypots with no adversarial knowledge
• Lack of a formal way of creating an optimized deception scheme based on learning the target's cognitive state and capability in order to manipulate effectively

New Approaches:
• Establish a model for estimating and tracking adversarial cognitive states and decision processes
• Define metrics quantifying information effectiveness in driving cognitive state change in the deception context
• Build an integrated framework of deception composition and projection methods to manipulate adversaries’ cognitive state and decision-making process to our advantage


THEORY OF MOVING TARGET DEFENSE

Goal: to establish methods of cyber adaptation that provide both resiliency and survivability by minimizing an attacker's ability to understand and attack our systems, through the employment of proactive multi-level obfuscation and migration strategies.

Current State: ad-hoc approaches as point solutions with no adversarial modeling.

New Approach:
• Analytical models and performance metrics capturing the dynamics between cyber attack and defense
• A unified framework for analyzing resiliency, agility, and performance trade-offs
• Redefine system resiliency and robustness in an adversarial setting, with the incorporation of attack/response dynamics
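One concrete proactive migration strategy is periodic randomized re-mapping of service ports, so the attack surface the adversary mapped in one epoch is stale in the next. A minimal sketch; the service names and port pool are illustrative, and real moving-target defenses remap far more than ports:

```python
import random

def remap_schedule(services, ports, epochs, seed):
    """Per-epoch bijection service -> port, refreshed each epoch.

    Reproducible given the seed, so cooperating defenders that share the
    seed can agree on each epoch's mapping without extra communication.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(epochs):
        shuffled = list(ports)
        rng.shuffle(shuffled)  # fresh random permutation per epoch
        schedule.append(dict(zip(services, shuffled)))
    return schedule

sched = remap_schedule(["ssh", "http", "telemetry"],
                       [40001, 40002, 40003, 40004], epochs=3, seed=7)
```

Each epoch's mapping assigns every service a distinct port from the pool, and regenerating the schedule from the same seed reproduces it exactly.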


References


REFERENCES

Boddy, M. S., Gohde, J., Haigh, T., & Harp, S. A. (2005, June). Course of Action Generation for Cyber Security Using Classical Planning. In ICAPS (pp. 12-21).

Chen, Jessie YC, Shan G. Lakhmani, Kimberly Stowers, Anthony R. Selkowitz, Julia L. Wright, and Michael Barnes. "Situation awareness-based agent transparency and human-autonomy teaming effectiveness." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 259-282.

Evans, A. William, Matthew Marge, Ethan Stump, Garrett Warnell, Joseph Conroy, Douglas Summers-Stay, and David Baran. "The future of human robot teams in the army: factors affecting a model of human-system dialogue towards greater team collaboration." In Advances in Human Factors in Robots and Unmanned Systems, pp. 197-209. Springer, Cham, 2017.

De Gaspari, Fabio, Sushil Jajodia, Luigi V. Mancini, and Agostino Panico. "AHEAD: A New Architecture for Active Defense." SafeConfig'16, October 24, 2016, Vienna, Austria.

Kott, Alexander, David S. Alberts, and Cliff Wang. "Will Cybersecurity Dictate the Outcome of Future Wars?" Computer 48, no. 12 (2015): 98-101.

Kott, A., Singh, R., McEneaney, W. M., & Milks, W. (2011). Hypothesis-driven information fusion in adversarial, deceptive environments. Information Fusion, 12(2), 131-144.

Kott, Alexander, Ananthram Swami, and Bruce J. West. "The internet of battle things." Computer 49, no. 12 (2016): 70-75.

Kott, Alexander. "Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments." arXiv preprint arXiv:1803.11256 (2018).

Kott, Alexander, Luigi V. Mancini, Paul Théron, Martin Drašar, Edlira Dushku, Heiko Günther, Markus Kont et al. "Initial Reference Architecture of an Intelligent Autonomous Agent for Cyber Defense." arXiv preprint arXiv:1803.10664 (2018).

Lawhern, Vernon, Amelia Solon, Nicholas Waytowich, Stephen M. Gordon, Chou Hung, and Brent J. Lance. "EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces." Journal of Neural Engineering (2018).

Marathe, Amar R., Jason S. Metcalfe, Brent J. Lance, Jamie R. Lukos, David Jangraw, Kuan-Ting Lai, Jonathan Touryan et al. "The privileged sensing framework: A principled approach to improved human-autonomy integration." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 283-320.


REFERENCES (CONT.)

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.

Muttik, I. Good Viruses. Evaluating the Risks. Online at https://www.defcon.org/images/defcon-16/dc16presentations/defcon-16-muttik.pdf

Rasch, Robert, Alexander Kott, and Kenneth D. Forbus. "AI on the battlefield: An experimental exploration." AAAI/IAAI. 2002.

Sarraute, Carlos, Gerardo Richarte, and Jorge Lucángeli Obes. "An algorithm to find optimal attack paths in nondeterministic scenarios." Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence. ACM, 2011.

Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. "POMDPs make better hackers: Accounting for uncertainty in penetration testing." Twenty-Sixth AAAI Conference on Artificial Intelligence. 2012.

Stytz, Martin R., Dale E. Lichtblau, and Sheila B. Banks. Toward Using Intelligent Agents to Detect, Assess, and Counter Cyberattacks in a Network-Centric Environment. Institute for Defense Analyses, Alexandria, VA, 2005.

Theron, P., Alexander Kott, Martin Drašar, Krzysztof Rzadca, Benoît LeBlanc, Mauno Pihelgas, Luigi Mancini, and Agostino Panico. "Towards an Active, Autonomous and Intelligent Cyber Defense of Military Systems: the NATO AICA Reference Architecture." In Proceedings of the ICMCIS Conference, Warsaw, Poland, May 2018.

Warnell, Garrett, Nicholas Waytowich, Vernon Lawhern, and Peter Stone. "Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces." arXiv preprint arXiv:1709.10163 (2017).

Wigness, Maggie, and John G. Rogers. "Unsupervised Semantic Scene Labeling for Streaming Data." In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 4612-4621. 2017.


Questions?
