Aleksandrs Gutcaits, Dr. - Tier2 CMS Status (T2_LV_HPCNET) - PowerPoint Presentation


Presentation Transcript

1. Tier2 CMS Status (T2_LV_HPCNET). Aleksandrs Gutcaits, Dr. chem., Senior expert, Researcher. 2022-09-19

2. Current Project Tasks (Aleksandrs Gutcaits). Setting up the Latvian Tier2 data center (ARC Computing Element) for the CERN CMS experiments. Implementation and development of a federation of computing clusters for efficient use of the computing resources of all involved academic institutions; a minimal federation-setup sketch follows below.
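As a sketch of what a "federation of computing clusters" means in SLURM terms, the commands below create a federation and inspect it. The cluster names lucluster and tier2rtu are taken from later slides; the federation name latviafed is a hypothetical placeholder, not the site's actual name.

# Create a federation and add clusters to it (registered via slurmdbd).
# "latviafed" is a hypothetical federation name for illustration only.
sacctmgr add federation latviafed clusters=lucluster,tier2rtu

# Verify which clusters are members and their federation states.
scontrol show federation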

3. ARC-CE for Latvian CERN Tier2 Data Center for CMS Experiments (1) (Aleksandrs Gutcaits). An ARC Computing Element is responsible for (see the configuration sketch after this list):
- Publishing information about itself
- Performing user authorization based on Grid credentials and local policies
- Mapping Grid users to local accounts
- Accepting Grid jobs from authorized users
- Downloading additional input files from Grid storages on behalf of the authorized user
- Caching input files
- Submitting jobs to the local batch systems
- Uploading user-specified job output to Grid storages on behalf of the user
- Collecting accounting information (usage records)
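To make these responsibilities concrete, here is a minimal arc.conf sketch, assuming ARC 6 block syntax and a SLURM back end; the hostname, mapped account, and queue name are hypothetical placeholders, not the site's actual configuration.

# /etc/arc.conf (minimal sketch, ARC 6 block syntax; values are placeholders)
[common]
hostname = arc-ce.example.lv          # hypothetical CE host

[mapping]
map_to_user = nobody:nobody           # map Grid users to a local account

[lrms]
lrms = slurm                          # submit jobs to the local SLURM batch system

[arex]                                # A-REX job management service

[gridftpd]                            # gridftp job submission interface
[gridftpd/jobs]

[infosys]                             # information system (publishes cluster info)
[infosys/ldap]

[queue:grid]                          # hypothetical SLURM partition exposed to the Grid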

4. ARC-CE for Latvian CERN Tier2 Data Center for CMS Experiments (2) (Aleksandrs Gutcaits). Already tested modules:
- ARC-CE (v6) was implemented and tested for job submission using test user certificates; ARC-CE is running on a separate VM
- A-REX (ARC Resource-coupled EXecution service) and the gridftpd service are now ready for job submission and fetching
- The ARC Information System provides a web-based Grid Monitor with a user-friendly overview of all cluster resources and current jobs
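A hedged sketch of what such a submission test typically looks like from the client side, using the standard ARC client tools (arcproxy, arcsub, arcstat, arcget); the endpoint hostname and the trivial job description are hypothetical, not the site's actual test.

# Create a proxy from a (test) user certificate.
arcproxy

# Write a trivial xRSL job description and submit it to the CE
# (arc-ce.example.lv is a hypothetical endpoint).
cat > hello.xrsl <<'EOF'
&(executable="/bin/echo")
 (arguments="hello from T2_LV_HPCNET")
 (stdout="hello.out")
EOF
arcsub -c arc-ce.example.lv hello.xrsl

# Check status of all jobs and fetch output when finished.
arcstat -a
arcget -a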

5. ARC-CE Currently Implemented Services (Aleksandrs Gutcaits). Currently working services (TestCA certificates):
- arc-arex
- arc-datadelivery-service
- arc-gridftpd (currently being re-tested)
- arc-infosys-ldap
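Assuming these run as the systemd units ARC 6 ships under the same names, their state can be enabled and inspected as below; this is a generic sketch, not the site's actual procedure.

# Enable and start the ARC services listed above, then check their state.
systemctl enable --now arc-arex arc-datadelivery-service arc-gridftpd arc-infosys-ldap
systemctl status arc-arex arc-datadelivery-service arc-gridftpd arc-infosys-ldap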

6. T2_LV_HPCNET Site Clusters Configuration (Aleksandrs Gutcaits). SLURM has currently been tested in a federated, multi-cluster configuration:
- Clusters tier2rtu and tier2test, plus MariaDB, are running on VMs (2 nodes x 4 CPUs)
- lucluster at LU (Latvian University) is a real working cluster with the SLURM workload manager; it is currently set to INACTIVE, so the Grid Monitor shows data for only two clusters (a sketch of this state change follows below)
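The INACTIVE state mentioned above corresponds to a per-cluster federation state in SLURM. A minimal sketch of setting it, assuming the cluster name from this slide:

# Mark lucluster as inactive within the federation: it remains a member,
# but no new federated jobs are scheduled to it.
sacctmgr modify cluster lucluster set fedstate=INACTIVE

# Confirm the per-cluster federation states.
scontrol show federation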

7. Grid Monitor for T2_LV_HPCNET Site (Aleksandrs Gutcaits)

8. Federated SLURM Workload Manager Setup for LU-RTU Clusters (Aleksandrs Gutcaits).
- The SLURM workload manager is implemented at the LU HPC cluster (lucluster)
- A federation of the LU HPC and the modeled RTU HPC clusters was created
- Job scheduling with unique job IDs is possible on all federated clusters
- The MariaDB SQL accounting database is configured for use across the federated clusters (a configuration sketch follows below)
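A minimal sketch of how a shared MariaDB accounting database is typically wired into a SLURM federation, using standard slurmdbd and slurm.conf options; the hostname and credentials are hypothetical placeholders, not the site's actual values.

# /etc/slurm/slurmdbd.conf (sketch) -- one slurmdbd shared by the federation
DbdHost=dbd.example.lv                 # hypothetical slurmdbd host
StorageType=accounting_storage/mysql   # MariaDB/MySQL backend
StorageHost=localhost
StorageUser=slurm
StoragePass=CHANGEME                   # placeholder
StorageLoc=slurm_acct_db

# /etc/slurm/slurm.conf (sketch) -- same accounting section on each cluster,
# with a per-cluster ClusterName (lucluster, tier2rtu, ...)
ClusterName=lucluster
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbd.example.lv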

9. SLURM: Federation

10. RTU Tier2 CMS Current Hardware Installation (Aleksandrs Gutcaits)

11. Thank You for Your Attention

12. Archive (Aleksandrs Gutcaits)

13. Concerning SLURM Federation (Aleksandrs Gutcaits). From the Slurm Federated Scheduling Guide (https://slurm.schedmd.com/federation.html):

Slurm version 17.11 includes support for creating a federation of clusters and scheduling jobs in a peer-to-peer fashion between them. Jobs submitted to a federation receive a job ID that is unique among all clusters in the federation. A job is submitted to the local cluster (the cluster defined in slurm.conf) and is then replicated across the clusters in the federation. Each cluster then independently attempts to schedule the job based on its own scheduling policies. The clusters coordinate with the "origin" cluster (the cluster the job was submitted to) to schedule the job. NOTE: This is not intended as a high-throughput environment. If scheduling more than 50,000 jobs a day, consider configuring fewer clusters that the sibling jobs can be submitted to, or directing load to the local cluster only (e.g. the --cluster-constraint= or -M submission options could be used to do this).
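To illustrate the submission options the guide mentions, a short sketch using the cluster names from earlier slides; job.sh and the "highmem" cluster feature are hypothetical placeholders.

# Submit to the federation but restrict sibling jobs to specific clusters.
sbatch -M lucluster,tier2rtu job.sh

# Or restrict by cluster feature instead of by name ("highmem" is hypothetical).
sbatch --cluster-constraint=highmem job.sh

# View jobs across the whole federation.
squeue --federation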

14. SLURM: Partitions

15. SLURM: Federation

16. SLURM: Multicluster

17. High-Level Architecture (diagram; network links: 10 Gb/s and 56 Gb/s)

18. Tier-2 services

19. Network
