Slide2
Update on risk matrices for LHC
How to define? How to use?
T. Cartier-Michaud, Andrea Apollonio, Milan Ashwin Vekaria, Miriam Ruth Blumenschein, Jan Uythoven
Slide3
Risk matrices of LHC
Defining acceptable failure rate w.r.t. severity / recovery time:
[Figure: LHC risk matrix. Recovery axis: ∞, year, month, week, day, hours, minutes (severity S7, S6, S5, S4, S3, S2, S1). Frequency axis: 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 100 years, 1 / 1000 years.]
Risk matrix of LHC
Slide4
Risk matrices of LHC
Defining acceptable failure rate w.r.t. severity / recovery time:
Respecting this matrix implies:
At most 15% of “recovery time”
It does not directly translate into availability
Recovery implies turnaround ~> lost physics?
Type of stop | Total duration [days] | Percentage vs op
Minutes type | 250 days x 1/day x 15 minutes = 2.6 | 2.6/250 ~= 1%
Hours type | 36 weeks x 1/week x 5 hours = 7.5 | 7.5/250 ~= 3%
Day type | 8 months x 1/month x 2 days = 16 | 16/250 ~= 6%
Week type | 1 year x 1/year x 10 days = 10 | 10/250 ~= 4%
Month type | 1 year x 1/10 years x 45 days = 4.5 | 4.5/250 ~= 2%
Year type | 1 year x 1/100 years x 250 days = 2.5 | 2.5/250 ~= 1%
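The table above can be reproduced with a few lines of Python. This is only a sketch: the names `OP_DAYS`, `STOPS` and `downtime_budget` are invented for illustration, and the 250-day operational year and per-stop durations are taken directly from the table rows. Summing the rows as written gives roughly 17%, slightly above the 15% quoted on the slide.

```python
# Sketch of the downtime budget implied by the risk matrix (Slide4 figures).
# Assumption: a 250-day operational year, as used in the table.
OP_DAYS = 250.0

# (type of stop, occurrences per operational year, duration of one stop in days)
STOPS = [
    ("Minutes", 250.0, 15.0 / (60.0 * 24.0)),  # 1/day, 15 minutes each
    ("Hours", 36.0, 5.0 / 24.0),               # 1/week, 5 hours each
    ("Day", 8.0, 2.0),                         # 1/month, 2 days each
    ("Week", 1.0, 10.0),                       # 1/year, 10 days each
    ("Month", 0.1, 45.0),                      # 1/10 years, 45 days each
    ("Year", 0.01, 250.0),                     # 1/100 years, 250 days each
]


def downtime_budget(stops=STOPS, op_days=OP_DAYS):
    """Return per-type rows (name, total_days, fraction_of_op) and the summed fraction."""
    rows = [(name, n * d, n * d / op_days) for name, n, d in stops]
    total_fraction = sum(frac for _, _, frac in rows)
    return rows, total_fraction


if __name__ == "__main__":
    rows, total = downtime_budget()
    for name, days, frac in rows:
        print(f"{name:8s} {days:6.1f} days  {frac:6.1%}")
    print(f"Total    {total:6.1%}")
```

Keeping the occurrence count and the duration per stop as separate fields makes it easy to test whether a different distribution of stops stays within the same overall budget.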
Slide5
Using the AFT (2017 and 2018) distribution function of fault durations
1582 faults for 2097.56 hours: 2100 h / (2 x 6000 h) ~= 17%
If a fault occurs in stable beam: recovery = fault duration + turnaround (~5 hours)
256 faults in stable beam: (512.83 + 1280) + 1584.73 ~= 3400 hours of recovery?
Background of risk matrices
[Figure: two panels for 2017 & 2018: 1326 faults, 1584.73 hours; 256 faults, 512.83 hours]
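The recovery estimate on this slide can be checked with a short calculation. The sketch below uses only the quoted figures; the 1280 h term is 256 stable-beam faults times the ~5 h turnaround. The constant and function names are made up for illustration, and reading the 1326 faults as "faults outside stable beam" is an assumption based on 1582 - 256 = 1326.

```python
# Figures quoted on the slide (AFT, 2017 and 2018).
FAULTS_OTHER, HOURS_OTHER = 1326, 1584.73    # assumed: faults outside stable beam
FAULTS_STABLE, HOURS_STABLE = 256, 512.83    # faults in stable beam
TURNAROUND_H = 5.0                           # approximate turnaround per stable-beam fault
OP_HOURS = 2 * 6000.0                        # two years of about 6000 h each, as on the slide


def fault_hours():
    """Total fault duration, ignoring turnaround."""
    return HOURS_OTHER + HOURS_STABLE


def recovery_hours(turnaround_h=TURNAROUND_H):
    """Fault duration plus one turnaround for every fault that hit stable beam."""
    return (HOURS_STABLE + FAULTS_STABLE * turnaround_h) + HOURS_OTHER


if __name__ == "__main__":
    print(f"faults:         {FAULTS_OTHER + FAULTS_STABLE}")
    print(f"fault hours:    {fault_hours():.2f} ({fault_hours() / OP_HOURS:.1%} of operation)")
    print(f"recovery hours: {recovery_hours():.2f}")
```

Passing a different `turnaround_h` shows how sensitive the recovery total is to the turnaround estimate.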
Slide6
[Figure: two panels for 2017 & 2018: 1326 faults, 1584.73 hours; 256 faults, 512.83 hours]
Slide7
Risk matrices of LHC
Would different distributions be acceptable?
[Figure: LHC risk matrix. Recovery axis: ∞, year, month, week, day, hours, minutes (severity S7, S6, S5, S4, S3, S2, S1). Frequency axis: 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 100 years, 1 / 1000 years.]
Slide8
Risk matrices of LHC
Defining acceptable failure rate w.r.t. severity / recovery time:
Fancy results, WIP!
Need for a better measurement of turnaround
Extension of this study to 2016? 2015? (faults of “week kind”)
Filtering by “filling scheme”? (using past statistics to predict the future!?)
Type of stop (with turnaround of 5 h) | Number | Impact [hours] | Percentage vs op
duration < 30 min | 402 | 51.9 | 0.9%
30 min < duration < 10 h | 366.5 | 1200.5 | 20%
10 h < duration < 48 h | 22 | 411.8 | 6.9%
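The "Percentage vs op" column is consistent with dividing each impact by about 6000 hours of operation. That reference value is an assumption taken from the "2 x 6000 h" figure quoted earlier in the deck, and the names in this sketch are invented for illustration.

```python
# Assumption: percentages are relative to about 6000 h of operation.
OP_HOURS = 6000.0

# (type of stop, number, impact in hours), as listed in the table
ROWS = [
    ("duration < 30 min", 402, 51.9),
    ("30 min < duration < 10 h", 366.5, 1200.5),
    ("10 h < duration < 48 h", 22, 411.8),
]


def percentage_vs_op(impact_hours, op_hours=OP_HOURS):
    """Impact expressed as a percentage of the operational time."""
    return 100.0 * impact_hours / op_hours


if __name__ == "__main__":
    for name, number, impact in ROWS:
        print(f"{name:26s} {number:7.1f} {impact:8.1f} h  {percentage_vs_op(impact):5.1f}%")
```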
Slide9
Measured “background” = how far from the edge are we?
Background of risk matrices
[Figure: LHC risk matrix. Recovery axis: ∞, year, month, week, day, hours, minutes (severity S7, S6, S5, S4, S3, S2, S1). Frequency axis: 1 / hour, 1 / day, 1 / week, 1 / month, 1 / year, 1 / 10 years, 1 / 100 years, 1 / 1000 years. “???” entries appear next to 1 / year, 1 / 100 years and 1 / 1000 years.]
Slide10
Performance & Protection
[Figure labels: Protection, Performance]
Performance range:
Frequency >~ 1/month (frequent failures)
Available statistics
Possible predictions (Isographe, AvailSim)
Protection range:
Frequency <~ 1/year (rare failures)
Not so much statistics yet
Difficult to predict
“Double or nothing”
Slide11
How to use for a new system?
1) Study of schematics / reliability tests / expert estimations
Failure rate / distribution function
2) Probability to fail (or not to fail) in a given period
Cumulative distribution function
Probability that the failure X happens before 2 hours is less than or equal to 10%
3) How to set thresholds?
Fixing x (duration) or T (probability threshold)
10% of failure for a given duration: is unlikely to fail once before the given duration?
90% of failure for a given duration: is likely to fail at least once!
4) Maintenance strategy? (time-dependent failure rates)
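Steps 2 and 3 can be sketched numerically. The slides do not say which distribution is used, so this example assumes a constant failure rate (exponential time to failure), for which P(X <= x) = 1 - exp(-rate * x). Both function names are hypothetical.

```python
import math


def prob_failure_before(x_hours: float, rate_per_hour: float) -> float:
    """P(X <= x) for an exponentially distributed time to failure (assumed model)."""
    return 1.0 - math.exp(-rate_per_hour * x_hours)


def max_rate_for_threshold(x_hours: float, threshold: float) -> float:
    """Largest constant failure rate for which P(X <= x) stays at or below the threshold."""
    return -math.log(1.0 - threshold) / x_hours


if __name__ == "__main__":
    # Example from the slide: failure before 2 hours with probability <= 10%.
    rate = max_rate_for_threshold(2.0, 0.10)
    print(f"maximum failure rate: {rate:.4f} per hour")
    print(f"check: P(X <= 2 h) = {prob_failure_before(2.0, rate):.3f}")
```

Fixing x and solving for the rate, as done here, is one of the two options named in step 3; fixing T and solving for x uses the same formula rearranged. Time-dependent failure rates (step 4) would need a different distribution than the one assumed here.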
AFT should / could automatically provide “background” of risk matrices
Risk matrices could be an output of AWG reports
Need for a tuning of categories ?
How to predict the impact of a new system ?
If the probability that the failure X happens before x is less than or equal to the threshold T:
Depending on the background and on the threshold:
T = 10% (unlikely), x = 1 week: it will happen every month
severity = S2: fine
severity = S3: maybe not fine! (slot already populated)
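The step from "T = 10%, x = 1 week" to "every month" can be illustrated as follows. This is a sketch under two assumptions that are not in the slides: a constant failure rate (exponential model), and choosing the frequency column whose period is closest to the mean time between failures on a logarithmic scale. All names are hypothetical.

```python
import math

# Frequency columns of the risk matrix, expressed as a period in hours.
# Assumption: 1 year = 8760 hours and 1 month = 8760 / 12 hours.
COLUMNS = [
    ("1 / hour", 1.0),
    ("1 / day", 24.0),
    ("1 / week", 168.0),
    ("1 / month", 8760.0 / 12.0),
    ("1 / year", 8760.0),
    ("1 / 10 years", 87600.0),
    ("1 / 100 years", 876000.0),
    ("1 / 1000 years", 8760000.0),
]


def mean_time_between_failures(threshold: float, x_hours: float) -> float:
    """MTBF in hours when P(X <= x) equals the threshold (exponential model assumed)."""
    rate = -math.log(1.0 - threshold) / x_hours
    return 1.0 / rate


def nearest_frequency_column(threshold: float, x_hours: float) -> str:
    """Name of the frequency column whose period is closest to the MTBF on a log scale."""
    mtbf = mean_time_between_failures(threshold, x_hours)
    return min(COLUMNS, key=lambda col: abs(math.log(col[1] / mtbf)))[0]


if __name__ == "__main__":
    # Example from the slide: T = 10%, x = 1 week (168 hours).
    print(f"MTBF: {mean_time_between_failures(0.10, 168.0):.0f} hours")
    print(f"column: {nearest_frequency_column(0.10, 168.0)}")
```

Under these assumptions the example lands in the 1 / month column; whether that slot is acceptable then depends on the severity row and on how populated the measured background already is.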
Conclusion (WIP)