Slide1
GAAIN Virtual Appliances: Virtual Machine Technology for Scientific Data Analysis
Arihant Patawari
USC Stevens Neuroimaging and Informatics Institute
July 9th, 2015
Slide2
Organization
1) GAAIN Virtual Appliances
 - Expanding the GAAIN application with Docker as well as Virtual Machines
 - Objectives: Support production data analysis in GAAIN
2) Medical Datasets Element Name Matching
 - Integration into larger GEM system
 - Scalability issues
 - Mine features from data
 - Neural Network classifiers
Slide3
The Virtual Machine
 - A (computing) machine purely “made of software”
 - A machine within a machine
 - Why? Sharable and transportable over a network
Slide4
GAAIN Virtual Machines
(Diagram labels: Investigator, Data Partner)
Slide5
Virtual Appliances
Slide6
Objectives
 - How do we provide a scientific investigator with a dedicated analysis development resource?
 - How do we ensure that an analysis resource is sharable?
 - How do we run applications that require a graphical display (such as a UI)?
 - How can we connect client and server applications?
 - How do we ensure automated cloud backups?
 - How do we send an analysis machine over to data partners?
 - How do we access data partner data?
 - How do we get analysis results back into the GAAIN network?
 - …
Slide7
Recap: Docker Framework
 - Designed to provide a framework for (specific) application encapsulation
 - Provides minimal support for the application
 - Not intended as a general-purpose computing machine
 - Other aspects
   - Dockerfile management
   - Docker Hub
   - Security
 - Relatively new and evolving framework
Slide8
Recap: Virtual Machines
 - Intended as a full computing machine
 - Command line control, not scripts
 - Interoperability with Open Virtualization Format (OVF)
 - Also VM environments like VMware vSphere, XenServer and others
Slide9
Many Possibilities!
 - Client on PC, server in VM
 - Client and server in VM
 - Client and server in Docker
 - Client in VM, server in Docker, Docker in VM √
(Diagram: four configurations built from PC, VBox VM, Docker, (Pipeline) and Server boxes)
Slide10
Docker Lifecycle
(Diagram stages: Docker File, Hub, Data Partner’s Machine, Result in Shared Folder)
Slide11
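The lifecycle stages named on the Docker Lifecycle slide (Docker File, Hub, Data Partner’s Machine, Result in Shared Folder) map naturally onto four standard Docker CLI steps: build, push, pull, and run with a mounted folder. The following is a minimal sketch in Python that only assembles the command lines. The image name, build directory, and folder paths are hypothetical placeholders, not values from the GAAIN prototype.

```python
from typing import List


def lifecycle_commands(image: str, build_dir: str,
                       host_results: str, container_results: str) -> List[List[str]]:
    """Return the docker CLI calls for one pass through the lifecycle.

    All arguments are illustrative placeholders, not GAAIN settings.
    """
    return [
        # Investigator side: build the image from the Dockerfile in build_dir.
        ["docker", "build", "-t", image, build_dir],
        # Investigator side: publish the image to the hub.
        ["docker", "push", image],
        # Data partner side: fetch the image from the hub.
        ["docker", "pull", image],
        # Data partner side: run it with a host folder mounted for the results.
        ["docker", "run", "--rm",
         "-v", f"{host_results}:{container_results}", image],
    ]
```

Each inner list can be handed to `subprocess.run` on the machine where that step belongs; the first two run on the investigator's side and the last two on the data partner's machine.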
Docker vs Virtual Machines
Aspects compared (VirtualBox vs Docker):
Virtual Image Type (Formats)
 - VirtualBox: The Open Virtualization Format, as ‘.OVF’ and ‘.OVA’ files
 - Docker: Proprietary Docker image format
Requirements
 - VirtualBox: Any virtualization hypervisor that can run Open Virtualization Format images
 - Docker: Docker Engine
Architecture
 - VirtualBox: Typically the core virtual image contains a complete operating system of choice
 - Docker: A minimal system layer is provided and components are added only as required
‘Typical’ Image Sizes
 - VirtualBox: Encapsulating a simple application (for instance, a single workflow) results in a machine of about 1.5 GB. However, options have recently become available for including only a minimal operating system layer.
 - Docker: Typically only a few hundred MB for the same application
Management and Sharing
 - VirtualBox: No specific capabilities provided
 - Docker: Docker Hub for centralized Docker image storage, tagging, and sharing
Access Control
 - VirtualBox: No specific capabilities provided
 - Docker: Docker Hub provides account management and access control
Network Access
 - VirtualBox: Can provide network access between the VirtualBox VM and external machines/networks
 - Docker: External network access to a Docker image can be provided, but with limitations
Host Folder Mounting
 - VirtualBox: Possible, but requires some additional software installation
 - Docker: Host folder mounting can be done more easily, with a single command
Slide12
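To make the Host Folder Mounting comparison concrete, here is a hedged sketch of the two command lines involved. It assumes the VirtualBox `VBoxManage sharedfolder add` command and the Docker `-v` option; the VM name, share name, image name, and paths are hypothetical. The "additional software" on the VirtualBox side is assumed here to be the Guest Additions inside the guest, which the deck does not name explicitly.

```python
from typing import List


def virtualbox_share(vm_name: str, share_name: str, host_path: str) -> List[str]:
    """Register a host folder with a VirtualBox VM.

    Assumption: the guest still needs extra software (typically the
    VirtualBox Guest Additions) before it can mount the share.
    """
    return ["VBoxManage", "sharedfolder", "add", vm_name,
            "--name", share_name, "--hostpath", host_path]


def docker_share(image: str, host_path: str, container_path: str) -> List[str]:
    """Start a container with a host folder mounted, in a single command."""
    return ["docker", "run", "--rm",
            "-v", f"{host_path}:{container_path}", image]
```

The Docker call both starts the container and mounts the folder, whereas the VirtualBox call only registers the share; mounting it still happens inside the guest.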
Virtual Machines and Docker
Virtual Machines provide
 - More robust platform
 - Interoperability
 - Network access
Docker provides
 - Small application packages
 - Hub management
 - Security and access control
 - “On-demand”
Slide13
Docker File
 - Docker can build images automatically by reading the instructions from a Docker file.
 - The only requirement is to have Docker installed.
 - A Docker file can be created automatically by recording the actions performed, using a single command (the Auto-commit module), which makes it accessible to users unfamiliar with Docker commands.
 - By executing a simple text file, the whole system can be built from scratch.
 - Using a Docker file helps keep the image size in line with the requirements.
Slide14
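As an illustration of the "simple text file" described on the Docker File slide, the sketch below renders a small Dockerfile from Python using only the standard `FROM`, `RUN`, `COPY`, `WORKDIR`, and `CMD` instructions. The base image, package names, and paths are hypothetical, and the `apt-get` line assumes a Debian or Ubuntu base image. This is not the output of the Auto-commit module mentioned in the deck.

```python
import json
from typing import List


def render_dockerfile(base_image: str, packages: List[str],
                      app_dir: str, command: List[str]) -> str:
    """Render a small Dockerfile as text; every value passed in is illustrative."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Install only what the application needs, to keep the image small.
        # Assumes a Debian/Ubuntu base image that provides apt-get.
        lines.append("RUN apt-get update && apt-get install -y "
                     + " ".join(packages))
    # Copy the application from the build context into the image.
    lines.append(f"COPY {app_dir} /app")
    lines.append("WORKDIR /app")
    # Exec-form CMD: a JSON array holding the program and its arguments.
    lines.append("CMD " + json.dumps(command))
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file named `Dockerfile` and running `docker build -t <name> .` in that directory would build the image from scratch.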
Some Challenges with Docker
 - Best supported on Linux, with some challenges on Windows and Mac
 - Graphical displays
   - Could be achieved with X Window and other software on Windows
   - Challenges with Mac OS
 - Network access
   - Restricted due to security issues
   - Port forwarding
 - Frequent updates to framework
Slide15
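For the graphical display and port forwarding challenges listed for Docker, a commonly used approach on a Linux host is to pass the `DISPLAY` variable and the X11 socket directory into the container and to publish a port with `-p`. The sketch below assembles such a `docker run` call. It assumes a Linux host running an X server; the image name and port numbers are hypothetical, and any X server access control changes that may be needed are not shown.

```python
import os
from typing import List, Optional


def gui_run_command(image: str, host_port: int, container_port: int,
                    display: Optional[str] = None) -> List[str]:
    """Build a `docker run` call for a GUI client/server image on a Linux host."""
    # Fall back to the host's DISPLAY value, or ":0" if it is not set.
    display = display or os.environ.get("DISPLAY", ":0")
    return [
        "docker", "run", "--rm",
        # Pass the host's X display name into the container.
        "-e", f"DISPLAY={display}",
        # Share the host's X11 socket directory so GUI windows can open.
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix",
        # Port forwarding: host_port on the host maps to container_port inside.
        "-p", f"{host_port}:{container_port}",
        image,
    ]
```

On Windows and Mac hosts the same idea requires a separately installed X server, which is where the challenges noted on the slide arise.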
Working Prototype
(Diagram labels: Investigator, Virtual Machine Manager, Auto Pipeline pop-up, Image push to Hub, Docker HUB, Web Service, GAAIN Server, Data Partner, Docker Auto-Invocation, Results)
Slide16
Features
 - Flexible, with no interoperability issues.
 - Better control and management of workflow images through Docker Hub.
 - Better accessibility and direct ease of use.
 - Automated invocation of workflows at the data partner’s end using a Java-based web service.
 - Dedicated application just for creating and testing workflows, with an automated script for pushing them to the hub.
 - Minimal size of overall system (1.5-2.0 GB)
Slide17
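The automated invocation feature is described as a Java-based web service; its interface is not given in the deck. Purely as an illustration of the idea, the sketch below shows a Python stand-in using the standard library: a POST request carrying an image name triggers `docker pull` followed by `docker run` with a results folder mounted. The JSON field name `image`, the `/results` mount point, and the injectable `run` callback are all assumptions, and the sketch has no authentication.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Callable, List


def invocation_commands(image: str, host_results: str) -> List[List[str]]:
    """Commands the data partner's machine would run for one workflow image."""
    return [
        ["docker", "pull", image],
        ["docker", "run", "--rm", "-v", f"{host_results}:/results", image],
    ]


def make_handler(host_results: str,
                 run: Callable[[List[str]], object]) -> type:
    """Create a request handler; `run` executes one command (injectable for tests)."""

    class InvokeHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", 0))
            try:
                body = json.loads(self.rfile.read(length) or b"{}")
            except ValueError:
                body = {}
            # Hypothetical request shape: {"image": "<repository>:<tag>"}
            image = body.get("image") if isinstance(body, dict) else None
            if not image:
                self.send_error(400, "missing 'image'")
                return
            for cmd in invocation_commands(image, host_results):
                run(cmd)
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"started\n")

    return InvokeHandler
```

To serve it, the handler class could be passed to `http.server.HTTPServer(("127.0.0.1", 8080), handler)` with `run` set to a function that calls `subprocess.run(cmd, check=True)`.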
Some technical issues we faced
 - Virtual appliance creation with various minimal-install OSs
 - Scripts for automatic invocation of the pipeline
 - Installation of Docker on different operating system versions
 - Compatibility of the pipeline with different OSs
 - Memory bubble and deleting existing images from the system
 - Memory overrun can be solved by deleting images that are not required
 - Web services
Slide18
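The point about solving memory overrun by deleting images that are not required can be sketched with the standard `docker rmi` command. The helper below takes a list of local image names and a set of names to keep, and returns one removal command per unneeded image. The image names in the usage note are hypothetical, and how the prototype actually chose which images to delete is not stated in the deck.

```python
from typing import Iterable, List, Set

# Lists local images as "repository:tag", one per line.
LIST_IMAGES: List[str] = ["docker", "images", "--format",
                          "{{.Repository}}:{{.Tag}}"]


def cleanup_commands(local_images: Iterable[str],
                     keep: Set[str]) -> List[List[str]]:
    """Return one `docker rmi` call per local image that is not in `keep`."""
    return [["docker", "rmi", name]
            for name in local_images if name not in keep]
```

For example, with local images `a:1`, `b:2`, and `c:3` and a keep set containing only `b:2`, the helper returns removal commands for `a:1` and `c:3`.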
Issues Addressed
 - Choice of VM OS
 - Choice of Docker OS
 - How to get a minimal VM working
 - How to work with a minimal Docker image
 - How to enable network access in VMs
 - Limitations of network access
 - How to mount folders from host
 - Differences between Linux, Windows and Mac hosts
 - How to get GUI displays to work in different Docker images and VMs
 - How to enable external (client/server) access to VMs and Docker images
 - How to autostart applications
 - How to manage scripts
Slide19
Miscellaneous
 - All files and code for the system are provided in the Google Drive shared folder
 - Comprehensive “How-To” manual