Steve Tuecke, Computation Institute, University of Chicago and Argonne National Laboratory
Slide1
Delivering a scalable service
Steve Tuecke
Computation Institute
University of Chicago and Argonne National Laboratory
Slide2
What is Globus Online?
Reliable file transfer.
  Easy “fire-and-forget” transfers
  Automatic fault recovery
  High performance
  Across multiple security domains
No IT required.
  Software as a Service (SaaS)
  No client software installation
  New features automatically available
  Consolidated support & troubleshooting
  Works with existing GridFTP servers
  Globus Connect solves “last mile problem”
  Supports Blue Waters, XSEDE, NERSC, ALCF, and many universities
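The “fire-and-forget” and “automatic fault recovery” bullets can be illustrated with a retry loop. This is a minimal sketch, not Globus Online's actual mechanism, which the slides do not describe. The names `TransientTransferError`, `transfer_with_recovery`, and the caller-supplied `attempt_transfer` callable are hypothetical, and exponential backoff is an assumed retry policy.

```python
import time


class TransientTransferError(Exception):
    """Hypothetical error for a recoverable fault, such as a dropped connection."""


def transfer_with_recovery(attempt_transfer, max_attempts=5, base_delay=1.0,
                           sleep=time.sleep):
    """Call attempt_transfer() until it succeeds or max_attempts is exhausted.

    attempt_transfer is a caller-supplied callable (hypothetical) that raises
    TransientTransferError on a recoverable fault. Returns the number of
    attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            attempt_transfer()
            return attempt
        except TransientTransferError:
            if attempt == max_attempts:
                # Out of retries: surface the fault to the caller.
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so that the retry logic can be exercised without real waiting; a caller would normally leave it at the default.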
Slide3
Software as a Service (SaaS) vs Traditional software delivery
SaaS changes assumptions and approach throughout the software lifecycle
Architecture and design
  Designed for specific environment
Software development
  No porting. Focus on functionality.
Operations
  Nobody else will operate
  Focus on availability, automation, monitoring
Support
  Tightly integrated with operations
We are delivering a service, not software
Slide4
Team
Product management
Product development
  Developer-operators (dev-ops)
  User eXperience (UX) manager
  Web design and development
User services
  Help desk / support
  Consulting services
Marketing
Slide5
Agility
“Continuous” service updates
  Globus Online updates almost every week
  And hot fixes for critical issues
Independent updates of component services
  Nexus, Transfer (backend, CLI, REST, relay, history), Web GUI, Storage, sample endpoints, …
Use Agile Scrum
  Backlog
  Time-boxed development (sprints)
  Scrums
  Sprint reviews
Slide6
Production environment
Uses Amazon Web Services (AWS)
  EC2, EBS, S3, ELB, …
Many EC2 instances
  Each service running on 1 or more instances
  Replication across availability zones within region
  All services within an Amazon security group
  Backups to S3 in another region
Operations services
  Chef based automated deployment
  Logging to common server (rsyslog, logstash, etc.)
  Nagios monitoring
  OSSEC host-based intrusion detection
  Access limited to “need to have”
  Zendesk based help desk w/ Globus Online user SSO
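The “replication across availability zones” bullet implies an invariant that an operations check could verify: every service should have instances in more than one zone. The sketch below shows one way to express that check. The inventory, instance IDs, zone names, and the two-zone threshold are all hypothetical; the slides do not say which region is used or how the check is actually performed.

```python
from collections import defaultdict

# Hypothetical inventory: (service name, instance id, availability zone).
INSTANCES = [
    ("transfer-backend", "i-001", "us-east-1a"),
    ("transfer-backend", "i-002", "us-east-1b"),
    ("web-gui", "i-003", "us-east-1a"),
    ("web-gui", "i-004", "us-east-1a"),
]


def zones_by_service(instances):
    """Map each service name to the set of zones it has instances in."""
    zones = defaultdict(set)
    for service, _instance_id, zone in instances:
        zones[service].add(zone)
    return dict(zones)


def under_replicated(instances, min_zones=2):
    """Return, sorted by name, the services present in fewer than min_zones zones."""
    return sorted(
        service
        for service, zones in zones_by_service(instances).items()
        if len(zones) < min_zones
    )
```

With the sample inventory, `under_replicated(INSTANCES)` flags `web-gui`, because both of its instances sit in the same zone.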
Slide7
Development and test environments
Dev, Test, Integration, Staging, Production
  Dev: AWS and laptops
  Test: Multiple (partial) test instances on AWS (branches)
  Integration: Full copy of production with next code to be released on AWS (trunk)
  Staging: Full copy of production on AWS, to test updates
  Production: AWS
GitHub repositories
Jira w/ GreenHopper for Scrum management
Python is primary development language, using
  PostgreSQL and Cassandra databases
  Globus Toolkit C libraries
  Many open source Python libraries
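The five environments can be modeled as an ordered list with a simple promotion gate. This sketch assumes the environments form a linear promotion order, which the slide suggests but does not state; the `next_stage` and `can_promote` helpers are hypothetical and do not describe the team's real tooling.

```python
# Environment order as listed on the slide.
STAGES = ["dev", "test", "integration", "staging", "production"]

# Code line each environment runs, where the slide says so.
SOURCE = {"test": "branches", "integration": "trunk"}


def next_stage(stage):
    """Return the environment that follows stage, or None after production."""
    index = STAGES.index(stage)
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def can_promote(passed_stages, target):
    """True if a build has passed every environment that precedes target."""
    required = STAGES[: STAGES.index(target)]
    return all(stage in passed_stages for stage in required)
```

For example, a build that has passed only `dev` and `test` may move to `integration` but not to `staging`.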
Slide8
Platform as a Service
How can we enable other groups to enhance the Globus Online ecosystem without replicating everything we have done?
Globus Integrate platform
  Globus Nexus: identity, group, and profile service
  REST APIs to services
Don’t constrain your implementation and hosting approaches
  Java, Python, Ruby, etc.
  AWS, Google App Engine, Liferay, etc.
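Because the platform is exposed through REST APIs, a client needs only an HTTP library, whatever its language or hosting. The sketch below builds (but does not send) such a request using Python's standard library. The host `nexus.example.org`, the `/groups/<id>` path, and the bearer-token header are placeholders invented for illustration; they are not the real Globus Nexus API, which the slides do not specify.

```python
import urllib.request

# Hypothetical base URL; not a real Globus endpoint.
BASE_URL = "https://nexus.example.org"


def build_group_request(group_id, token):
    """Build a GET request for a (hypothetical) group resource without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/groups/{group_id}",
        headers={
            "Accept": "application/json",
            # Assumed auth scheme, for illustration only.
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )
```

A caller would pass the returned object to `urllib.request.urlopen` to perform the call; the same request could be written equally well in Java or Ruby, which is the point of the slide.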