Bending Ironic for Big Iron
Doug Szumski, Mark Goddard, Forest Godfrey
Overview
OpenStack is the foundation of Cray's next-generation system management software.

We need to support:
- Booting large numbers of diskless compute nodes (Cinder integration for Ironic)
- Flexible provisioning of diskful nodes (Bareon agent for Ironic)
Cinder integration for Ironic
- Based on the upstream spec by Satoru Moriya: https://review.openstack.org/#/c/200496
- Configured via instance_info fields (see the sketch below)
- No additional database tables or changes to any APIs
- Supports:
  - Booting disklessly from Cinder via iSCSI (no FC)
  - In-band connection to the iSCSI target
  - Attachment of additional volumes at boot time (but not dynamically)
  - Extended support for Dracut-based ramdisks through generation of the PXE config file
- We've shared our implementation here: https://review.openstack.org/#/c/265856
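Because the integration is configured entirely through instance_info, a deployment driver can attach volume and iSCSI target details to a node with an ordinary JSON patch. A minimal sketch using python-ironicclient; the credentials are placeholders, and the instance_info keys shown (boot_volume, block_devices) are illustrative assumptions rather than the exact field names from the spec:

```python
# Sketch: record Cinder volume and iSCSI target details on an Ironic node.
# The instance_info keys below are illustrative placeholders; the real
# names are defined by the boot-from-volume spec linked above.
from ironicclient import client

ironic = client.get_client(
    1,
    os_auth_url='http://keystone:5000/v2.0',  # placeholder endpoint
    os_username='admin',
    os_password='secret',
    os_tenant_name='admin',
)

patch = [
    {'op': 'add', 'path': '/instance_info/boot_volume',
     'value': '<cinder-volume-uuid>'},
    {'op': 'add', 'path': '/instance_info/block_devices',
     'value': [{'volume_id': '<cinder-volume-uuid>',
                'target_iqn': 'iqn.2010-10.org.openstack:volume-0000',
                'target_portal': '10.0.0.5:3260',
                'target_lun': 0}]},
]
ironic.node.update('<node-uuid>', patch)
```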
[Image: Cray XC series hardware: 100s of cabinets; each XC compute blade carries 4 nodes; up to 48 blades per cabinet]
Diskless boot
1. Nova boot from the CLI
2. Request an IP for the instance
3. Get the storage port MAC address (or IP) and IQN from the Ironic node's driver_info
4. Look up the node's IP address on the storage network from the storage port MAC address
5. Prepare Cinder volumes and retrieve iSCSI target info
6. Patch Ironic with the block device info
7. Call Ironic to begin deployment
8. Cache the kernel and ramdisk, build the kernel cmdline using Jinja2 (sketched below)
9. Configure the TFTP server
10. Set up DHCP for PXE boot
11. Set boot device to PXE
12. Reboot the target node
13. Target node broadcasts; the DHCP server responds with an IP and the location of the bootloader
14. PXE boot the kernel and ramdisk
15. Mount iSCSI targets and pivot into the rootfs
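Step 8 renders the PXE configuration, including the kernel command line, from a Jinja2 template. A minimal sketch of that idea for a Dracut-based ramdisk that mounts its root over iSCSI; all of the values in params, the file paths and the exact dracut options are illustrative assumptions, not Cray's actual template:

```python
# Sketch: render a PXE config whose kernel cmdline tells a Dracut ramdisk
# to mount its root filesystem over iSCSI. All values are placeholders.
from jinja2 import Template

PXE_TEMPLATE = Template("""\
default deploy
label deploy
kernel {{ kernel }}
append initrd={{ ramdisk }} ip=dhcp rd.iscsi.initiator={{ initiator_iqn }} \
netroot=iscsi:{{ target_portal }}::::{{ target_iqn }} root={{ root }} ro
""")

params = {
    'kernel': 'vmlinuz',
    'ramdisk': 'initramfs.img',
    'initiator_iqn': 'iqn.2016-01.com.example:node-0001',
    'target_portal': '10.0.0.5',
    'target_iqn': 'iqn.2010-10.org.openstack:volume-0000',
    'root': 'LABEL=rootfs',
}

# PXELINUX looks up per-node configs by MAC address, e.g. 01-aa-bb-cc-dd-ee-ff.
with open('/tftpboot/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff', 'w') as f:
    f.write(PXE_TEMPLATE.render(**params))
```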
Bareon (Fuel) Agent

What is Bareon?
- A "flexible and data driven interface to perform actions which are related to operating system installation" (wiki.openstack.org/wiki/Bareon)
- In particular, Cray uses the Bareon agent with Ironic
- Similar in concept to the Ironic Python Agent (IPA)

Why does Cray use Bareon?
- Deploys bare metal nodes in a flexible, perhaps non-cloud-like way
- Deploys multiple images / multi-boot
- Supports complex partitioning schemes, e.g. creation of shared partitions, LVM groups, consistent identification of block devices (illustrative sketch below)
- Rsync deploy: useful for upgrades / updates
- Runs arbitrary actions during or post deploy

https://github.com/openstack/bareon
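To give a flavour of the data-driven approach, here is an illustrative partitioning spec of the kind the Bareon agent consumes. The key names are invented for illustration and do not follow the exact Bareon schema; see the wiki and repository linked above for the real format:

```python
# Illustrative only: invented key names showing the *shape* of a
# data-driven partitioning spec (disks, LVM groups, shared partitions).
deploy_config = {
    'partitions': [
        {'type': 'disk', 'id': 'sda', 'volumes': [
            {'type': 'partition', 'mount': '/boot', 'size': '512 MiB'},
            {'type': 'pv', 'vg': 'system', 'size': 'remaining'},
        ]},
        {'type': 'vg', 'id': 'system', 'volumes': [
            {'type': 'lv', 'name': 'root', 'mount': '/', 'size': '50 GiB'},
            {'type': 'lv', 'name': 'shared', 'mount': '/shared', 'size': 'remaining'},
        ]},
    ],
}
```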
Bareon agent

1. Nova boot from the CLI
2. Request an IP
3. Nova calls Ironic
4. Configure the TFTP server
5. Cache images (deploy kernel & ramdisk, filesystem, cloud_default_deploy_config, deploy_config and driver_actions) and write the provision script for the Bareon agent
6. Update MAC and PXE config
7. Set boot device to PXE
8. Reboot the target node
9. Target node gets an IP
10. PXE boot the Bareon agent
11. Bareon agent calls back
12. SFTP the provision script across and forward the rsync server port over SSH
13. Trigger provisioning over SSH: provision --data_driver ironic --deploy_driver rsync (sketched below)
14. Partition and clean local storage, mount partitions, rsync the filesystem across, write fstab, configure the bootloader and unmount partitions
15. Run driver actions over SSH, e.g. update the BIOS or SFTP a file across from Swift
16. Set boot device to local disk
17. Reboot the node
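Steps 12 and 13 can be driven from the control host with any SSH client. A minimal sketch using paramiko; the address, key and file paths are placeholders, and the rsync port forward from step 12 is omitted for brevity:

```python
# Sketch of steps 12-13: copy the provision script to the node running the
# Bareon agent, then trigger provisioning over SSH. Host, key and paths
# are placeholders; the rsync port forward is omitted.
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('10.0.0.42', username='root', key_filename='/etc/ironic/bareon_key')

# Step 12: SFTP the provision script across.
sftp = ssh.open_sftp()
sftp.put('/var/lib/ironic/provision.json', '/tmp/provision.json')
sftp.close()

# Step 13: run Bareon's provision command with the Ironic data driver and
# the rsync deploy driver (the command line shown in the list above).
_, stdout, stderr = ssh.exec_command(
    'provision --data_driver ironic --deploy_driver rsync')
print(stdout.read().decode())
ssh.close()
```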
Scaling Ironic

Where are we at?
- Diskless boot tested on a 128-node system
- Read-only Cinder volume with multi-attach and an overlay filesystem (see the sketch below)
- Ironic multi-conductor

Immediate focus points:
- Deploying OpenStack with Kolla
- Supporting scaling of OpenStack services
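The read-only, multi-attach root volume is made writable per node by layering an overlay filesystem on top of it. A minimal sketch of the idea with a tmpfs-backed upper layer; the device and mount paths are illustrative, and on a real node this would run inside the ramdisk before pivoting into the rootfs:

```python
# Sketch: make a shared, read-only iSCSI root writable per node with a
# tmpfs-backed overlay. Device and mount paths are illustrative.
import subprocess

def run(cmd):
    subprocess.run(cmd, check=True)

run(['mount', '-o', 'ro', '/dev/sda', '/mnt/ro'])      # shared read-only Cinder volume
run(['mount', '-t', 'tmpfs', 'tmpfs', '/mnt/rw'])      # per-node writable layer in RAM
run(['mkdir', '-p', '/mnt/rw/upper', '/mnt/rw/work'])
run(['mount', '-t', 'overlay', 'overlay',
     '-o', 'lowerdir=/mnt/ro,upperdir=/mnt/rw/upper,workdir=/mnt/rw/work',
     '/mnt/root'])                                     # merged view becomes the rootfs
```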
Where do we want to go?
- 100,000 (?) nodes by 2018 for Shasta: http://www.cray.com/blog/the-cray-shasta-system/
Thank you for listening