UNCLASSIFIED//APPROVED FOR PUBLIC RELEASE
U.S. ARMY RESEARCH, DEVELOPMENT AND ENGINEERING COMMAND
Bonware to the Rescue: The Future Autonomous Cyber Defense Agents
Dr. Alexander Kott, Chief Scientist
Presented at the Conference on Applied Machine Learning for Information Security October 12, 2018, Washington DC
INTELLIGENT THINGS AND HUMANS IN A VERY COMPLEX WORLD
INTELLIGENT THINGS WILL BE DIVERSE
Munitions • Sensors • Weapons • Wearable Devices • Vehicles • Robots
Intelligent Things and Humans will Operate in Teams
Joint Human-Intelligent Agent Decision Making
AI and Cyber Multiply Threats and Vulnerabilities
INTELLIGENT THINGS ARE THREATS AND TARGETS
Pervasive connectivity and intelligence open opportunities for cross-domain attack and defense
• Kinetic attacks
• Directed energy
• Electronic attacks against its Things
• Jamming RF channels
• Destroying fiber channels
• Depriving Things of their power sources
• Electronic eavesdropping
• Deploying malware
HUMANS ARE A CLASS OF INTELLIGENT THINGS
Perhaps most importantly, the enemy attacks the cognition of human Soldiers.
Humans will be the “Intelligent Things” most susceptible to deception.
Humans will be handicapped when they are concerned (even if incorrectly) that the information is untrustworthy.
AI will Fight Cyber Attacks
NATO IST-152 Research Group Agent Concept: AICA
IST-152 OVERVIEW
NATO Research Group IST-152: Intelligent, Autonomous and Trusted Agents for Cyber Defense and Resilience
NATO-designated as Unclassified, Publicly Releasable
Motivation:
• Growing focus on AI, autonomy, and issues of human trust in AI
• Cyber is exceptionally ripe for strong AI: autonomous yet human-managed agents for cyber operations
• Malware is growing in autonomy and sophistication
• Current manual and semi-manual approaches are grossly inadequate
• Needed: autonomous agents that actively patrol the friendly network
• Needed: detection of and reaction to hostile activities far faster than human reaction time
• Needed: trust and control by humans
CONTEXT ASSUMPTIONS
• In a conflict with a technically sophisticated adversary, enemy software cyber agents (malware) will infiltrate the friendly network.
• To limit the scope, consider a single military vehicle with one or more computers.
• One or more of these computers are assumed to have been compromised.
• Communications between the vehicle and other elements of the friendly force are limited and intermittent at best.
• Conventional centralized cyber defense is often infeasible.
• The human warfighters aboard the vehicle will not have the necessary skills or time to perform cyber defense functions locally, even more so if the vehicle is unmanned.
AUTONOMY BY NECESSITY
The agent (or multiple agents per thing) will:
• stealthily observe and patrol the networks;
• detect the enemy agents while remaining concealed;
• destroy or degrade the enemy malware;
• do so mostly autonomously, without support or guidance of a human expert.
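The observe/detect/act cycle above amounts to a sense-decide-act loop that runs without a human in it. A minimal sketch follows; the host model, whitelist, and function names are all illustrative assumptions, not part of any AICA implementation.

```python
# Hypothetical sketch of an autonomous defense agent's main loop:
# observe, decide, act -- without waiting for a human operator.

def observe(host_state):
    """Collect percepts; here, just flag processes not on the whitelist."""
    whitelist = {"sshd", "logger", "nav_service"}
    return [p for p in host_state["processes"] if p not in whitelist]

def decide(percepts):
    """Map percepts to actions; unknown processes get quarantined."""
    return [("quarantine", p) for p in percepts]

def act(host_state, actions):
    for verb, target in actions:
        if verb == "quarantine":
            host_state["processes"].remove(target)
            host_state["quarantined"].append(target)

host = {"processes": ["sshd", "nav_service", "dropper.exe"], "quarantined": []}
for _ in range(3):                 # a few autonomous cycles
    act(host, decide(observe(host)))
print(host["quarantined"])         # ['dropper.exe']
```

In a real agent the observe and decide steps would be the learned components; the loop structure itself stays this simple.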
LIMITED CONTROL
Provisions are made to enable a remote or local human controller to observe, direct and modify the actions of the agent. However, human control is often impossible. The agent has to plan, analyze and perform most or all of its actions autonomously.
Provisions are made for the agent to collaborate with other agents. However, when communications are impaired or observed by the enemy, the agent operates alone.
ACCEPTANCE OF RISK
The agent has to take destructive actions, such as deleting or quarantining certain software, autonomously. Destructive actions are controlled by the rules of engagement and allowed only on the computer where the agent resides. Still, in general, they cannot be guaranteed to preserve the availability or integrity of the functions and data of friendly computers.
This risk, in a military environment, has to be balanced against the death or destruction caused by the enemy if the agent’s action is not taken.
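The rules-of-engagement constraint (destructive actions only on the agent's own host) can be sketched as a simple guard that every action passes through before execution. The action names and host identifiers here are hypothetical.

```python
# Hypothetical sketch: destructive actions are gated by rules of engagement
# that allow them only on the host where the agent itself resides.

DESTRUCTIVE = {"delete", "quarantine", "wipe"}

def roe_permits(action, target_host, agent_host):
    """Destructive actions only on the agent's own computer."""
    if action in DESTRUCTIVE:
        return target_host == agent_host
    return True          # observation/monitoring actions are unrestricted

assert roe_permits("delete", "vehicle-1", "vehicle-1")        # own host: allowed
assert not roe_permits("delete", "vehicle-2", "vehicle-1")    # remote host: denied
assert roe_permits("scan", "vehicle-2", "vehicle-1")          # non-destructive: allowed
```

Centralizing the check in one guard makes the rules of engagement auditable and easy to tighten or loosen per deployment.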
OTHER REQUIREMENTS
The enemy malware, and its capabilities and TTPs, evolve rapidly. Therefore, the agent is capable of autonomous learning.
The enemy malware seeks to find and destroy the agent. Therefore, the agent has means for stealth, camouflage and concealment, and takes measures that reduce the probability that the enemy malware will detect it.
The agent is originally installed by a human controller or by an authorized process. Self-propagation is assumed to occur only under exceptional and well-specified conditions of military necessity.
AGENT ARCHITECTURE OVERVIEW
[Architecture diagram: Sensors feed Percepts into a World State Identifier, which maintains the World Model (goals, current state and history, world dynamics). Goal Management, a Planner/Predictor, and an Action Selector drive Action Execution and Collaboration and Negotiation, producing Actions on the Environment. Cross-cutting components: Learning, Stealth and Security, and Self Assurance.]
EXAMPLES OF PERCEPTS
• Report from Nmap probing;
• Observation of a change to the file system;
• A signal that someone has interacted with a fake webpage (honey-page) or fake service.
EXAMPLES OF ACTIONS
• Remap ports;
• Check integrity of the file system;
• Create and deploy a fake password file, with an alarm mechanism activated when the file is accessed;
• Create and deploy a fake web page or web service;
• Deposit a file with a “poison pill”;
• Identify a suspicious file;
• Sandbox a suspicious file;
• Analyze the behavior of software in the sandbox.
PLANNING
ACTION SELECTION
COLLABORATION AND NEGOTIATION
LEARNING – WHAT?
• The World Dynamics model.
• Reward for an action, possibly as a function of a sequence of prior actions and observations, or regardless of the prior sequence.
• Reward for a sequence of actions (a plan), possibly as a function of a sequence of prior actions and observations, or regardless of the prior sequence.
• Recommendation(s) of suitable action(s), i.e., a plan, possibly as a function of state, or of a sequence of prior actions and observations, or regardless of the prior sequence.
• Classification of a sequence of observations as evidence of malicious activities of a certain type.
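Learning a reward for an action as a function of state is the classic reinforcement-learning value estimate. A minimal tabular Q-learning sketch illustrates the idea; the states, actions, and rewards below are invented for illustration, not drawn from the agent design.

```python
from collections import defaultdict

# Minimal tabular Q-learning sketch: the agent learns a reward estimate
# for each (state, action) pair from experienced transitions.
ALPHA, GAMMA = 0.5, 0.9     # learning rate and discount factor
Q = defaultdict(float)

def update(state, action, reward, next_state, actions):
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

actions = ["sandbox", "ignore"]
# Invented experience: sandboxing a suspicious file pays off, ignoring it does not.
for _ in range(20):
    update("suspicious_file", "sandbox", +1.0, "clean", actions)
    update("suspicious_file", "ignore", -1.0, "compromised", actions)

assert Q[("suspicious_file", "sandbox")] > Q[("suspicious_file", "ignore")]
```

The other items on the slide (rewards for plans, recommendations conditioned on histories) generalize this same update from single actions to sequences.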
LEARNING VIA CASE-BASED REASONING
[Diagram: percepts and actions flow into a State Assessor; assessed states accumulate in a Collection of Experiences; a CBR Plan Proposer retrieves similar past cases and outputs a proposed plan.]
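The CBR plan proposer retrieves the stored experience most similar to the current state and reuses its plan. A nearest-neighbor sketch follows; the state vectors, plans, and distance metric are illustrative assumptions.

```python
# Case-based reasoning sketch: retrieve the stored experience whose state
# vector is nearest to the current state, and propose its plan.

def distance(a, b):
    """Squared Euclidean distance between two state vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

experiences = [                      # (state vector, plan that worked there)
    ((1, 0, 0), ["remap_ports"]),
    ((0, 1, 0), ["sandbox_file", "analyze_behavior"]),
    ((0, 0, 1), ["deploy_honey_page"]),
]

def propose_plan(state):
    _, plan = min(experiences, key=lambda case: distance(case[0], state))
    return plan

assert propose_plan((0.1, 0.9, 0.0)) == ["sandbox_file", "analyze_behavior"]
```

A production CBR system would also adapt the retrieved plan to the new situation and store the outcome back into the experience collection.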
LEARNING VIA NN
[Diagram: percepts and actions flow into a State Assessor and a Collection of Experiences; an NN Learner trains on the experiences, and an NN Action Selector chooses actions.]
LEARNING BEST ACTION FROM EXPERIENCES
[Diagram: the input layer encodes one-hot vectors for which action was taken most recently, which action was taken previously, which percepts were seen most recently, and which percepts were seen previously, e.g. a1=0, a2=1, a3=0 || a1=0, a2=0, a3=1 || e1=0, e2=0, e3=1, e4=0 || e1=1, e2=0, e3=0, e4=0. Multiple hidden layers lead to an output layer giving the predicted reward for taking each action as the next, e.g. a1=0.07, a2=0.23, a3=0.79.]
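The figure's encoding (one-hot vectors of recent actions and percepts in, a predicted reward per candidate action out) can be sketched as a tiny numpy forward pass. The weights below are random, so the outputs are untrained placeholders, not the values in the figure; layer sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Input: one-hot encodings of (last action, previous action,
# last percepts, previous percepts) -- 3 + 3 + 4 + 4 = 14 features,
# matching the figure's a1..a3 / e1..e4 layout.
x = np.concatenate([
    [0, 1, 0],          # action taken most recently: a2
    [0, 0, 1],          # action taken previously:    a3
    [0, 0, 1, 0],       # percepts seen most recently: e3
    [1, 0, 0, 0],       # percepts seen previously:    e1
]).astype(float)

# One hidden layer stands in for the "multiple hidden layers".
W1, b1 = rng.normal(size=(16, 14)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)

h = np.maximum(0.0, W1 @ x + b1)       # ReLU hidden layer
rewards = W2 @ h + b2                  # predicted reward for each next action
best_action = int(np.argmax(rewards))  # action with the highest predicted reward
print(rewards.shape, best_action)
```

Training would fit the weights so that `rewards` matches rewards observed in the collection of experiences, after which the agent simply acts greedily on the output.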
ARL Active Computer Network Defense Agent From the poster at the ARL Technical Advisory Board Review; POC – Travis Parker
OBJECTIVES
• Model the adversary decision process
• Explore defender effects that can increase costs to the adversary
• Develop a methodology to deny, degrade, and disrupt adversary actions and expend their resources
• Increase non-recoupable costs for the adversary:
  – Expend adversary time and effort
  – Force premature deployment of novel exploits
ACTIONS AND RESPONSES
APPROACH
• Model strategic interactions at the level of a single attacker
  – Different adversarial actions provoke different responses from the system
• Focus on decision making under uncertainty of both attacker actions and attacker payoffs
  – Learn attacker behavior in response to defender actions
  – Assume that the attacker is acting to maximize their own payoffs
• Initially assume a finite portfolio of attacker/defender actions, to be loosened in future work
• Attempt to learn a policy that maximizes attacker costs (i.e., time) across an interaction while simultaneously maximizing attacker coverage during the learning process
  – Managing the exploration-exploitation tradeoff
• Implement selective effects:
  – Deny or redirect the adversary only when an attack is predicted to succeed against an asset
  – Disguise our defensive capabilities from the adversary
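The exploration-exploitation tradeoff over a finite portfolio of defender actions is the standard multi-armed bandit setting. An epsilon-greedy sketch follows; the response names and payoffs are invented, and the "reward" stands in for the time cost imposed on the attacker.

```python
import random

random.seed(1)

# Epsilon-greedy sketch: the defender learns which response imposes the most
# cost (time) on the attacker, while still occasionally exploring the others.
responses = ["deny", "redirect", "allow"]
true_cost = {"deny": 2.0, "redirect": 5.0, "allow": 0.0}   # invented payoffs
estimate = {r: 0.0 for r in responses}
count = {r: 0 for r in responses}
EPSILON = 0.1

for _ in range(2000):
    if random.random() < EPSILON:                     # explore
        r = random.choice(responses)
    else:                                             # exploit current best
        r = max(responses, key=estimate.get)
    reward = true_cost[r] + random.gauss(0, 0.5)      # noisy attacker cost
    count[r] += 1
    estimate[r] += (reward - estimate[r]) / count[r]  # incremental mean

assert max(estimate, key=estimate.get) == "redirect"
```

The full problem is sequential rather than one-shot, so the poster's policy learning generalizes this bandit update to states and action sequences.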
ARCHITECTURE
RESULTS
• Functional Active Defense System prototypes are currently operational on testbed networks.
• The Active Defense System has successfully defended against a modeled data exfiltration attempt.
• Active Defense System framework components have been employed to research the targeting of malicious network traffic.
THEORY OF MOVING TARGET DEFENSE
Adaptive Cyber Defense; lead: George Mason University
[Diagram: ACD Scenarios (driving problems, decision contexts) supply issues, metrics, methods, and tasks to three interacting areas: ACD Technologies (attack surface, goals and objectives, scheduling), Quantitative Models (control-theoretic techniques, reinforcement learning decision models, game-theoretic techniques), and Adversarial Reasoning. Together these yield ACD mechanisms, evaluated for effectiveness, cost, and resources, and ACD tactics and strategies.]
Modeling the Effects
ASSESSING IMPACT VIA MODELS
[Figure: defender incident-response workflow. Alerts are triaged by type (AvailabilityAlert, WipeAlert, InfectedAlert, IntegrityAlert, ConfidentialityAlert, ForensicAlert) and handled by timed steps such as Restore Functionality (between 1 and 3 h), Take Offline (5 m), Wipe and Restore (between 1 and 3 h), Put Online (5 m), Trace Attack Source (between 2 and 6 h), Get Signature, and Find Other Infections (between 3 and 9 h), issuing new alerts for any additional infections found.]
Making sense of C2 means relating it to mission impact.
A mission is a nexus of numerous physical assets, information, activities, friendly, enemy…
AMICA explored comprehensive models that cover infrastructure, missions, defenders, and attackers. Results are insightful, but modeling is labor intensive.
S. Noel, J. Ludwig, P. Jain, D. Johnson, R. Thomas, J. McFarland, B. King, S. Webster and B. Tello, "Analyzing Mission Impacts of Cyber Actions," in Proceedings of the NATO IST-128 Workshop on Cyber Attack Detection, Forensics and Attribution for Assessment of Mission Impact, Istanbul, 2015.
EXAMPLE: MODEL-DRIVEN MISSION IMPACT ASSESSMENT
Analyzing Mission Impacts of Cyber Actions (AMICA)
Mission: the Joint Targeting Process
Team: MITRE, MIT-LL, IDA, CMU SEI
Questions it can answer:
• How long an attack can the mission withstand without impact?
• How long does it take the mission to recover from an attack?
• What is more damaging to the mission: loss of reach-back availability, or degradation of Air & Space Operations Center (AOC) system assets?
• How many targets can be impacted by confidentiality/integrity attacks before the mission is affected?
AMICA CONNECTS KINETIC MISSION TO CYBER ACTIONS
[Diagram: inputs (mission scenario, cyber scenario, attacker capabilities, defender capabilities) feed the AMICA simulation, which produces outputs (mission metrics, event logs, visualization).]
Adapted by permission from the paper by S. Noel et al., "Analyzing Mission Impacts of Cyber Actions," presented at the NATO IST-128 Workshop on Assessment of Mission Impact, Istanbul, Turkey, June 15-17, 2015.
EXTENSIBLE M&S LIBRARIES TO QUICKLY CREATE THE NEEDED ANALYSIS ENVIRONMENT
[Diagram: Library of Mission Models (targeting, BMD, etc.); Library of Infrastructure Models (covering multiple missions); Library of Defender Models (workflows); Library of Attacker Models (attack graphs).]
• Developing parameterized libraries of models
• Each piece of AMICA is designed to be modular and extensible to support future mission areas, cyber dependencies, attack patterns, and defenses
• Well-defined interfaces
MISSION MODEL
Process model capturing workflow, timing, and resources for the DoD kinetic targeting process (from CJCSI 3370.01)
Originally developed for EUCOM as part of Austere Challenge 10 and selected due to pedigree and maturity
– 200+ steps with timing & resources (dependent on target complexity)
– Covers the targeting process from basic targeting development through MAAP/ATO & BDA
Modified for AMICA by breaking it into modules and connecting them to CyCS nodes
ATTACKER MODEL
Modeled as a process simulation that captures the steps the attacker follows:
– Assumes the attacker has some knowledge of the mission and access on the secure network
– Responsive to defense actions
– Adjusts sophistication through the probability of success/detection on attack steps
– Conceptually follows “Cyber:14” threat models
  • The Cyber:14 study (ARCYBER, defense of the Department of Defense Information Network (DODIN)) contains 1000s of nodes (mainly system-steps) of integrated attacker and defender/sensor actions for server-, host-, and email-based attacks
[Flowchart: timed attacker process, e.g. getting spear-phishing targets (between 1 and 3 d), infecting targets and performing network scans (between 30 and 90 m), checking goal-node reachability, launching confidentiality, integrity, or availability attacks gated by time, and periodically checking for detection.]
Initial Foothold: initial access via a spear-phishing campaign; includes time for research to find targets.
Lateral Movement: scan the network for goal-node (e.g., database) reachability; infect laterally until the target node is reachable.
Achieve Goal: realize an effect on confidentiality, integrity, or availability on the goal node; maintain presence and re-infect as necessary.
DEFENDER MODEL
Process simulation of reactive (not proactive) defender actions
Multi-tiered incident response model
– The defender can impact the mission (by alerts, taking down machines)
– Includes defender resource/personnel constraints
Conceptually follows “Cyber:14” defense models
[Flowchart: the defender gets the next alert, branches by alert type (AvailabilityAlert, WipeAlert, InfectedAlert, IntegrityAlert, ConfidentialityAlert, ForensicAlert), and executes timed steps such as Restore Functionality (between 1 and 3 h), Take Offline (5 m), Wipe and Restore (between 1 and 3 h), Put Online (5 m), Trace Attack Source (between 2 and 6 h), Get Signature, and Find Other Infections (between 3 and 9 h), issuing new alerts for any additional infections found.]
Triage: defender response is triggered by an IT alert; IT alerts are prioritized by expected impact.
Reboot, Restore, Rebuild: mitigation based on alert type (crash, infection, corruption); more aggressive responses may impose greater mission impact.
Forensics: for more serious threats; trace the attack to its source, build signatures, and submit new alerts for all compromised machines.
Overcoming the Limitations of Today’s AI
GAPS IN TODAY'S AI
Learning in Complex Data Environments:
• AI & ML with small samples, dirty data, high clutter
• AI & ML with highly heterogeneous data
• Adversarial AI & ML in contested, deceptive environments
Resource-constrained AI Processing at the Point-of-Need:
• Distributed AI & ML with limited communications
• AI & ML computing with extremely low size, weight, power, and time available (SWaPT)
Generalizable & Predictable AI:
• Explainability & programmability for AI & ML
• AI & ML with integrated quantitative models
ADAPTIVE REAL-TIME LEARNING DURING A MISSION
Outcomes:
• Theory of learning under non-stationary distributions with limited a priori training and data
• Working concepts of control and perception tailored to online learning
Research Areas:
• Learning for High-Speed Navigation in Unknown Environments
• Online Unsupervised Percept Modeling
• Stable and Risk-Aware Learning and Adaptation
Payoff:
• Dynamics and perceptual modeling in real time to support maneuver in complex, dynamic environments
• Stable training/learning in the field in the presence of noise/adversaries
HUMANS HELPING MACHINES LEARN
A human-in-the-loop reinforcement learning system to improve decision-making in dynamically changing environments where data availability and computational resources are limited.
Key Technical Demonstration: a reinforcement-learning AI solution using real-time human input to solve the previously unsolved Atari™ Bowling task
• Converged on a state-of-the-art solution for the previously unsolved Atari™ Bowling task in 15 minutes
• Learned AI policy outperforms expert humans
[Chart: human-guided learning vs. traditional approaches.]
“Philosophical similarities with OpenAI/DeepMind research, but with ability to run in real-time and claimed significant improvements in sample efficiency” – Jack Clark, Director of Strategy and Communications at OpenAI
Reconceiving human-technology roles in the future battlefield
Key Publications:
• Warnell et al. “Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces.” AAAI-18.
• Koppel et al. “Policy Evaluation in Infinite MDPs: Efficient Kernel Gradient Temporal Difference.” AAAI-18.
Potential:
• Broader applications of AI through solutions for unstructured environments with ill-defined rewards
• Faster, more optimal AI solutions with less data
• Rapidly adaptable human-AI teams that learn from human understanding of high-level goals
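The core TAMER idea (learn a model of the *human trainer's* feedback rather than the environment's reward, then act greedily on it) can be caricatured in a few lines. This is not the published Deep TAMER algorithm; the actions and the simulated trainer below are invented for illustration.

```python
import random

random.seed(0)

# TAMER-style sketch: the agent learns a per-action estimate of HUMAN
# feedback (not the game score) and acts greedily on it. The "human"
# here is simulated as approving action "b" and disapproving the rest.
actions = ["a", "b", "c"]
H = {a: 0.0 for a in actions}     # learned model of human feedback
n = {a: 0 for a in actions}

def human_feedback(action):
    return 1.0 if action == "b" else -1.0   # simulated trainer

for _ in range(50):
    act = random.choice(actions)            # explore while training
    n[act] += 1
    H[act] += (human_feedback(act) - H[act]) / n[act]   # incremental mean

policy = max(actions, key=H.get)            # greedy w.r.t. predicted feedback
assert policy == "b"
```

Because the human signal is dense and immediate, this converges in far fewer samples than learning from a sparse environment reward, which is the sample-efficiency claim above.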
SEMANTIC CONCEPTS LEARNING
Outcomes:
• Algorithms for rapid learning of semantic understanding of the environment from visual information
• Techniques to train tactical robot behaviors from human demonstrations
Research Areas:
• Online unsupervised scene segmentation algorithms
• Inverse optimal control to enable generalizable learning-from-demonstration of robot behaviors
Payoff:
• Learning behaviors that cue from semantic representations of the environment will yield better adaptation to new environments/scenarios
• Rapid training of new autonomous vehicle behaviors in the field
Robust Human State Detection
Developed a robust Deep Learning (DL) system for detecting human states across large, diverse data collections
Key Technical Demonstration:
• Uncovering task-dependent structure in data automatically from the DL model
• 1st place in an international machine learning challenge to detect human interest from EEG across multiple domains
• State-of-the-art performance across 5 different BCI tasks
[Diagram: task-driven human state identification in closed-loop systems; neurophysiological interpretation of task-based states; discovering task-based human states; using task-derived human state information to improve human-machine collaboration.]
Key Publications:
• Lawhern et al. (2016). “EEGNet: A Compact Convolutional Network for EEG-based Brain-Computer Interfaces.”
• Gordon et al. (2017). “Real-World BCI: Cross-Domain Learning and Practical Applications.”
• Solon et al. (2017). “A Generalized Deep Learning Framework for Cross-Domain Learning in Brain-Computer Interfaces.”
• Solon et al. (2017). “Deep Learning Approaches for P300 Classification in Image Triage: Applications to the NAILS Task.”
Potential:
• Detecting high-level and potentially sub-conscious cognitive constructs and incorporating this information in closed-loop systems for improved human-machine collaboration
ROBUST INTERACTIONS VIA NATURAL LANGUAGE DIALOGUE
Outcomes:
• Dialogue clarification techniques enabling agents to prompt for clarification and ask for help
• Models of human-robot team communication in uncertain environments
Research Areas:
• Explainable Dialogue Interaction in Resource-Constrained Environments
• Language Grounding for Robotics (Symbol Grounding)
• Multimedia Exploitation of Multimodal Information Streams (Speech, Video, Mapping)
[Example dialogue: H: “What do you see in front of you?” R: “I think I see some tables.” The figure contrasts the robot as tool vs. teammate.]
Payoff:
• Naturalistic, bi-directional communication between Soldiers and semi-autonomous agents
• Reduced time for reconnaissance missions with a “heads-up, hands-free” language interface
The Science of Adversarial Interactions
SCIENCE OF CYBER DECEPTION
Goal: create leap-ahead cyber deception methods that improve cyber-security certainty and resource utilization through a joint computational-cognitive model that incorporates an adversary's cognitive state in order to successfully manipulate and mislead them.
Current State:
• Passive honeypots with no adversarial knowledge
• No formal way to create an optimized deception scheme based on learning the target's cognitive state and capability in order to manipulate effectively
New Approaches:
• Establish a model for estimating and tracking adversarial cognitive states and decision processes
• Define metrics quantifying the effectiveness of information in driving cognitive-state change in the deception context
• Build an integrated framework of deception composition and projection methods to manipulate adversaries' cognitive state and decision-making process to our advantage
THEORY OF MOVING TARGET DEFENSE
Goal: establish methods of cyber adaptation that provide both resiliency and survivability by minimizing an attacker's ability to understand and attack our systems, through proactive multi-level obfuscation and migration strategies.
Current State: ad-hoc approaches as point solutions with no adversarial modeling
New Approach:
• Analytical models and performance metrics capturing the dynamics between cyber attack and defense
• Unified framework for analyzing resiliency, agility, and performance trade-offs
• Redefine system resiliency and robustness under an adversarial setting, incorporating attack/response dynamics
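One concrete form of the proactive migration strategy described above is port hopping: defenders and legitimate clients derive the live service port from a shared secret and the current time epoch, so an attacker's reconnaissance goes stale every epoch. This toy sketch uses invented parameters (secret, epoch length, port range) purely for illustration.

```python
import hashlib

# Toy moving-target-defense sketch: the service port is re-derived from a
# shared secret every epoch, so scan results become stale quickly.
SECRET = b"shared-secret"          # assumed pre-shared with legitimate clients
EPOCH_SECONDS = 300                # migration period (invented)
PORT_BASE, PORT_RANGE = 20000, 10000

def current_port(now_seconds):
    """Deterministically map (secret, epoch) to a port in the allowed range."""
    epoch = now_seconds // EPOCH_SECONDS
    digest = hashlib.sha256(SECRET + str(epoch).encode()).digest()
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_RANGE

p_now = current_port(0)       # epoch 0
p_same = current_port(299)    # still epoch 0: same port
p_next = current_port(300)    # epoch 1: port is re-derived
assert p_now == p_same and PORT_BASE <= p_now < PORT_BASE + PORT_RANGE
print(p_now, p_next)
```

Real moving-target defenses migrate far more than ports (addresses, stacks, whole VMs), but the analytical question is the same one the slide raises: how fast must the configuration move, at what cost, to outpace the attacker's reconnaissance?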
References
REFERENCES
Boddy, M. S., Gohde, J., Haigh, T., & Harp, S. A. (2005, June). Course of Action Generation for Cyber Security Using Classical Planning. In ICAPS (pp. 12-21).
Chen, Jessie Y. C., Shan G. Lakhmani, Kimberly Stowers, Anthony R. Selkowitz, Julia L. Wright, and Michael Barnes. "Situation awareness-based agent transparency and human-autonomy teaming effectiveness." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 259-282.
Evans, A. William, Matthew Marge, Ethan Stump, Garrett Warnell, Joseph Conroy, Douglas Summers-Stay, and David Baran. "The future of human robot teams in the army: factors affecting a model of human-system dialogue towards greater team collaboration." In Advances in Human Factors in Robots and Unmanned Systems, pp. 197-209. Springer, Cham, 2017.
De Gaspari, Fabio, Sushil Jajodia, Luigi V. Mancini, and Agostino Panico. "AHEAD: A New Architecture for Active Defense." SafeConfig'16, October 24, 2016, Vienna, Austria.
Kott, Alexander, David S. Alberts, and Cliff Wang. "Will Cybersecurity Dictate the Outcome of Future Wars?" Computer 48, no. 12 (2015): 98-101.
Kott, A., Singh, R., McEneaney, W. M., & Milks, W. (2011). Hypothesis-driven information fusion in adversarial, deceptive environments. Information Fusion, 12(2), 131-144.
Kott, Alexander, Ananthram Swami, and Bruce J. West. "The internet of battle things." Computer 49, no. 12 (2016): 70-75.
Kott, Alexander. "Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments." arXiv preprint arXiv:1803.11256 (2018).
Kott, Alexander, Luigi V. Mancini, Paul Théron, Martin Drašar, Edlira Dushku, Heiko Günther, Markus Kont et al. "Initial Reference Architecture of an Intelligent Autonomous Agent for Cyber Defense." arXiv preprint arXiv:1803.10664 (2018).
Lawhern, Vernon, Amelia Solon, Nicholas Waytowich, Stephen M. Gordon, Chou Hung, and Brent J. Lance. "EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces." Journal of Neural Engineering (2018).
Marathe, Amar R., Jason S. Metcalfe, Brent J. Lance, Jamie R. Lukos, David Jangraw, Kuan-Ting Lai, Jonathan Touryan et al. "The privileged sensing framework: A principled approach to improved human-autonomy integration." Theoretical Issues in Ergonomics Science 19, no. 3 (2018): 283-320.
REFERENCES (CONT.)
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Muttik, I. Good Viruses. Evaluating the Risks. Online at https://www.defcon.org/images/defcon-16/dc16-presentations/defcon-16-muttik.pdf
Rasch, Robert, Alexander Kott, and Kenneth D. Forbus. "AI on the battlefield: An experimental exploration." AAAI/IAAI, 2002.
Sarraute, Carlos, Gerardo Richarte, and Jorge Lucángeli Obes. "An algorithm to find optimal attack paths in nondeterministic scenarios." Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence. ACM, 2011.
Sarraute, Carlos, Olivier Buffet, and Jörg Hoffmann. "POMDPs make better hackers: Accounting for uncertainty in penetration testing." Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.
Stytz, Martin R., Dale E. Lichtblau, and Sheila B. Banks. Toward Using Intelligent Agents to Detect, Assess, and Counter Cyberattacks in a Network-Centric Environment. Institute for Defense Analyses, Alexandria, VA, 2005.
Théron, P., Alexander Kott, Martin Drašar, Krzysztof Rzadca, Benoît LeBlanc, Mauno Pihelgas, Luigi Mancini, and Agostino Panico. "Towards an Active, Autonomous and Intelligent Cyber Defense of Military Systems: the NATO AICA Reference Architecture." In Proceedings of the ICMCIS Conference, Warsaw, Poland, May 2018.
Warnell, Garrett, Nicholas Waytowich, Vernon Lawhern, and Peter Stone. "Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces." arXiv preprint arXiv:1709.10163 (2017).
Wigness, Maggie, and John G. Rogers. "Unsupervised Semantic Scene Labeling for Streaming Data." In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 4612-4621. 2017.
Questions?