Virtual Research Environments

Virtual Research Environments: enabling a step change in geoscience research globally
Lesley Wyborn1 and Helen Glaves2
1National Computational Infrastructure, Australian National University
2British Geological Survey

© National Computational Infrastructure 2016

Virtual Research Environments: enabling a step change in geoscience research globally
Lesley Wyborn1, Helen Glaves2 and ????3
1National Computational Infrastructure, Australian National University
2British Geological Survey
3Someone from the Geosciences in the US?

This presentation has an identity crisis

•  Virtual Research Environments are currently funded:
–  in Europe as Virtual Research Environments
–  in Australia as Virtual Laboratories
–  in the USA as Science Gateways

•  Elsewhere they have been called
–  Co-laboratories
–  Virtual Observatories
–  Collaborative Interactive Environments
–  Analytics Platforms/Engines

•  All enable sharing of resources and common infrastructures over the Internet


GSA, Denver, Colorado, 2016

What is a VRE? Source: http://www.worldatlasbook.com/europe/europe-political-map.html

‘A Virtual Research Environment or VRE is an online tool for researchers to facilitate sharing and collaboration. VREs give you an integrated online environment with access to shared documents and resources needed in the course of a research project’ (Universiteit Leiden 2016)

‘A VRE is a set of online tools to facilitate or enhance the research process. A VRE can aid with collaboration and communication amongst members of a research group, whether they share an office or work on different sides of the world’ (University of Newcastle, 2016)

‘A virtual research environment (VRE) or virtual laboratory is an online system helping researchers collaborate. Features usually include collaboration support (Web forums and wikis), document hosting, and some discipline-specific tools, such as data analysis, visualisation, or simulation management’ (Wikipedia, 2016)

What is a science gateway? Source: http://www.freelargeimages.com/map-of-usa-772/

‘A Science Gateway is a community-developed set of tools, applications, and data that are integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community’


What is a virtual laboratory? Source: http://www.travel-australia-online.com/maps-of-australia.html

National eResearch Collaboration Tools and Resources (NeCTAR)

See: https://nectar.org.au/

‘Virtual Laboratories are rich domain-oriented online environments that draw together research data, models, analysis tools and workflows to support collaborative research across institutional and discipline boundaries’

Consensus view???

A VRE, science gateway or virtual laboratory is an online system supporting collaborative research. It harnesses the power of the Internet to support a more dynamic, online approach to collaborative working.

Key features:
–  Provide access to data resources that are accessible online
–  Enable online use of discipline-specific tools, such as data analysis, visualisation, or simulation management
–  Online access to compute resources
–  Collaboration support (Web forums and wikis)
–  May include publication management and teaching tools

VREs are important in fields where research is primarily carried out in teams spanning multiple institutions and even countries




VREs are the enablers of transdisciplinary research

Who uses them?

•  VREs are very diverse and can range from
–  individual researchers working on distributed data resources,
–  to teams of highly skilled researchers accessing fairly substantial online High Performance Computing environments that facilitate in situ processing of large volumes of data using community codes developed through international cooperative efforts

•  They can be developed by any discipline on almost any infrastructure


In Australia: 13 Virtual Labs from Diverse Research Groups

1.  All Sky Virtual Lab (Astronomy)
2.  Virtual Geophysics Laboratory
3.  Virtual Hazards, Impact & Risk Laboratory
4.  Humanities Network Infrastructure
5.  Endocrine Genomics Virtual Lab
6.  ALVEO
7.  Characterisation Virtual Laboratory
8.  Marine Virtual Laboratory
9.  Industrial Ecology Virtual Laboratory
10. Climate & Weather Science VL
11. Biodiversity & Climate Change VL
12. Microbial Genomics Virtual Laboratory
13. Genomics Virtual Laboratory

Source: http://www.travel-australia-online.com/maps-of-australia.html

Serving a diversity of Use Cases and Competency of Users

[Figure: ‘VREs down under’. Australian virtual laboratories plotted by compute need (‘Doesn’t Need HPC’ to ‘Needs HPC’) and data volume (‘Gigabytes of Data’ to ‘Petabytes of Data’), with users ranging from ‘More, Less CS Skilled’ to ‘Fewer, More CS Skilled’. Points shown: Climate, Astronomy, Genomics, Hazards, Geophysics, Marine, Characterisation, Humanities, Biodiversity, Cities, Species.]

Source: http://www.travel-australia-online.com/maps-of-australia.html

Common Components

Components arranged around a central Virtual Lab:
–  Provenance & reproducibility
–  Software sustainability
–  Impact measures
–  User engagement
–  Knowledge transfer & skills development
–  Software reuse

Slide Courtesy Of Michelle Barker, NeCTAR, Australia

Working together to solve common problems

[Diagram: five Virtual Labs arranged around ‘Nectar common projects’.]

Slide Courtesy Of Michelle Barker, NeCTAR, Australia

Solving Common Problems, Sharing Core Infrastructures

Shared concerns arranged around a central Virtual Lab:
–  Researcher training
–  Movement of data
–  Data storage
–  Security & authentication
–  Provenance & reproducibility
–  Advocacy & coordination
–  Compute access
–  User support
–  Data management
–  Research collaboration platforms
–  Software sustainability
–  Skills development & knowledge transfer

Slide Courtesy Of Michelle Barker, NeCTAR, Australia

Sharing core components and infrastructures across multiple VLs enhances sustainability and is more cost-effective

Introducing the Virtual Geophysics Laboratory


Data Services (1) on a browser

Layers discovered via remote registries
Layers consist of numerous remote data services

Data Services (2) on a browser


Software as a Service on a browser

A variety of different scientific codes are already available in the form of “Toolboxes”. Currently toolboxes correspond to VM images with codes installed – future versions will allow you to ‘build your own’ toolbox from a set of scientific codes and utilities

Software as a Service (2) on a browser

Flexibility in what computing resources to utilise


Monitoring jobs from a browser

Wyborn AGU 2013 IN43B-05


Components of the Virtual Geophysics Laboratory

Data Services: Magnetics, Gravity, DEM
Processing Services: eScript, Underworld
Compute Services: NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’): Service Orchestration, VGL Portal, Scripting Tool, Provenance, Metadata

Dynamic Virtual Geophysics Laboratories

[Diagram: example laboratories assembled through the VGL Portal from Mag., Grav. and DEM data, the eScript and Underworld codes, and NCI Cloud and NCI Petascale compute.]

Repurposing to a Virtual Hazards Laboratory

Data Services: Magnetics, Gravity, DEM, Landsat, Bathymetry
Processing Services: ANUGA, EQRM
Compute Services (unchanged): NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’) (unchanged): Service Orchestration, VGL Portal, Scripting Tool, Provenance, Metadata

Dynamic Virtual Hazards Laboratories

[Diagram: example laboratories assembled through the VGL Portal from Mag., Grav., DEM and Bathy data, the EQRM and ANUGA codes, and NCI Petascale, NCI Cloud and Amazon Cloud compute.]
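The repurposing idea on these slides can be illustrated with a minimal Python sketch. It is not the VGL implementation: the `VirtualLab` class and the `repurpose` function are hypothetical names introduced here, and only the component names are taken from the slides. The sketch assumes that a laboratory is simply a named bundle of the four component types, and that repurposing swaps the data and processing services while reusing the compute services and enablers.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple


@dataclass(frozen=True)
class VirtualLab:
    """Hypothetical description of a lab as four groups of components."""
    name: str
    data_services: Tuple[str, ...]
    processing_services: Tuple[str, ...]
    compute_services: Tuple[str, ...]
    enablers: Tuple[str, ...]


def repurpose(lab: VirtualLab, name: str,
              data_services: Iterable[str],
              processing_services: Iterable[str]) -> VirtualLab:
    """Return a new lab that reuses the compute services and enablers of `lab`."""
    return replace(
        lab,
        name=name,
        data_services=tuple(data_services),
        processing_services=tuple(processing_services),
    )


# Component names below are copied from the slides.
geophysics = VirtualLab(
    name="Virtual Geophysics Laboratory",
    data_services=("Magnetics", "Gravity", "DEM"),
    processing_services=("eScript", "Underworld"),
    compute_services=("NCI Petascale", "NCI Cloud", "NeCTAR Cloud",
                      "Amazon Cloud", "Desktop"),
    enablers=("Service Orchestration", "VGL Portal", "Scripting Tool",
              "Provenance", "Metadata"),
)

hazards = repurpose(
    geophysics,
    name="Virtual Hazards Laboratory",
    data_services=("Magnetics", "Gravity", "DEM", "Landsat", "Bathymetry"),
    processing_services=("ANUGA", "EQRM"),
)

# The shared infrastructure is carried over untouched.
print(hazards.compute_services == geophysics.compute_services)
print(hazards.enablers == geophysics.enablers)
```

Keeping the description immutable means the original laboratory is never modified when a new one is derived from it, which mirrors the slides' point that the shared components stay unchanged.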

Repurposing to a Virtual Environmental Laboratory

Data Services: Climate Records, Species, Land Use, DEM, Landsat, Bathymetry
Processing Services: Wind Modelling, Analytics
Compute Services (unchanged): NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’) (unchanged): Service Orchestration, VGL Portal, Scripting Tool, Provenance, Metadata

Dynamic Virtual Environmental Laboratories

[Diagram: example laboratories assembled through the VGL Portal from Sat., Species, Weather and DEM data, Bug tracking and Tsunami processing, and Amazon Cloud and NCI HPC compute.]

Repurposing to a Virtual Geochemistry Laboratory?

Virtual Laboratory
–  Data Services
–  Processing Services
–  Compute Services
–  Enablers (e.g., OGC ‘Glue’)

Interfaces are critical

Sharing core components across multiple VLs enhances sustainability and is more cost efficient

To advance VREs further

To achieve our vision of virtual environments in which applications can access data from multiple domains anywhere, and then process it at their preferred location using the most appropriate software, there are three key issues in moving forward:

1.  Technical
–  More effort needs to be put into standardisation of the interfaces that enable distributed systems to be loosely coupled and interact in real time

2.  Social
–  More effort is needed in raising awareness of the potential of virtual environments and in working collaboratively to build globally shared infrastructures, particularly around software. Data may be local but software should be global!

3.  Sustainability
–  Is critical and needs to be part of the development plan
–  Globally sharing developments will also enhance sustainability
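The technical point about standardised interfaces and loose coupling can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `DataService` interface, its `fetch` method and the two stand-in providers are hypothetical and do not correspond to any OGC standard or VGL code. The sketch only shows that a client written against an agreed interface works with any provider that implements it.

```python
from typing import List, Protocol, Tuple

# Bounding box as (west, south, east, north); a hypothetical convention.
BBox = Tuple[float, float, float, float]


class DataService(Protocol):
    """Hypothetical shared interface that every data provider implements."""

    def fetch(self, bbox: BBox) -> List[float]:
        ...


class ConstantGrid:
    """Stand-in provider returning the same value for four sample points."""

    def __init__(self, value: float) -> None:
        self.value = value

    def fetch(self, bbox: BBox) -> List[float]:
        return [self.value] * 4


class CornerGrid:
    """Stand-in provider returning the bounding box coordinates as values."""

    def fetch(self, bbox: BBox) -> List[float]:
        return list(bbox)


def mean_over(services: List[DataService], bbox: BBox) -> float:
    """Client code: depends only on the interface, not on any provider."""
    values = [v for service in services for v in service.fetch(bbox)]
    return sum(values) / len(values)


result = mean_over([ConstantGrid(2.0), CornerGrid()], (0.0, 0.0, 4.0, 4.0))
print(result)
```

Neither provider inherits from `DataService`; they only have to match its method signature. A new provider can therefore be added without touching the client, which is the loose coupling the first issue asks for.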
