Reliability data consists essentially of equipment failure rates and repair rates. The failure rate is the number of failures of an item per unit time; it may be an observed failure rate (e.g. from 'returns' data), an assessed failure rate (from tests) or an extrapolated failure rate (from tests at higher stress levels). The repair rate is the number of corrective maintenance actions that can be carried out in unit time. Corrective action should also include logistic allowance, diagnosis time, and the time needed to bring the equipment fully back on line, unless some of these activities are considered elsewhere in the calculations.
The same information can also be expressed as mean time between failures (MTBF) and mean time to restore service (MTTRS). The MTBF is the total time divided by the total number of failures; when the failure rate is constant with respect to time, MTBF is the reciprocal of the failure rate. The MTTRS is the total repair time divided by the total number of repairs (failures), where the repair time covers the same elements as the corrective action above. The repair rate is usually constant with respect to time, so that MTTRS is the reciprocal of the repair rate.
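As a minimal sketch of these relationships, assuming constant failure and repair rates, the Python below converts observed totals into mean times and rates. The function names and the sample figures are illustrative assumptions, not data from any real equipment.

```python
def mtbf(total_time_hours, failures):
    """Mean time between failures: total time divided by number of failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return total_time_hours / failures


def mttrs(total_repair_hours, repairs):
    """Mean time to restore service: total repair time divided by number of repairs."""
    if repairs <= 0:
        raise ValueError("at least one repair is needed for a point estimate")
    return total_repair_hours / repairs


def rate_from_mean_time(mean_time_hours):
    """Reciprocal relationship; valid only when the rate is constant in time."""
    return 1.0 / mean_time_hours


# Assumed figures: 50 units observed for 8760 h each, 4 failures,
# 20 h of total repair time (logistics and diagnosis included).
observed_mtbf = mtbf(50 * 8760, 4)
observed_mttrs = mttrs(20, 4)
failure_rate = rate_from_mean_time(observed_mtbf)   # failures per hour
repair_rate = rate_from_mean_time(observed_mttrs)   # repairs per hour
```

With these assumed inputs the MTBF is 109,500 hours and the MTTRS is 5 hours.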
Failure rate data is basically empirical in origin, but there is a problem: we are normally interested in the failure rates of inherently reliable equipment, so direct measurement of failure rates requires a large number of tests, each of long duration. This is usually impractical as part of the design process, and alternative methods must be used. Failure rates determined from equipment returned from already installed systems are a very useful source of direct data, so long as the returns are effectively managed and their failure modes diagnosed. But this is no use for the first installations of a type of equipment. Determination of the failure rate from either tests or returns is based on the worst case of the next failure being just about to occur, using a Chi-square distribution at 5% significance. The Chi-square distribution is the distribution of the (scaled) variance estimated from a normal population; it arises in the calculation of the rate of occurrence from truncated life tests.
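The worst-case estimate described above can be sketched as follows. This is one common reading of the method, not necessarily the exact procedure intended here: it assumes a constant failure rate, treats "5% significance" as a one-sided 95% upper bound, and uses 2 x failures + 2 degrees of freedom to represent the next failure being just about to occur. To stay within the standard library it uses the fact that a Chi-square quantile with an even number of degrees of freedom can be found from the Poisson cumulative distribution; the function names are hypothetical.

```python
import math


def poisson_cdf(r, m):
    """P(N <= r) for a Poisson count N with mean m."""
    term = math.exp(-m)
    total = term
    for i in range(1, r + 1):
        term *= m / i
        total += term
    return total


def upper_failure_rate(failures, total_time, significance=0.05):
    """One-sided upper bound on a constant failure rate.

    Equivalent to chi-square(1 - significance, 2*failures + 2) / (2*total_time).
    Solves poisson_cdf(failures, m) = significance for the expected count m
    by bisection, then divides by the accumulated time.
    """
    lo, hi = 0.0, 1.0
    while poisson_cdf(failures, hi) > significance:
        hi *= 2.0                      # expand until the root is bracketed
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(failures, mid) > significance:
            lo = mid                   # expected count still too small
        else:
            hi = mid
    return (lo + hi) / 2.0 / total_time


# Assumed example: no failures seen in 1,000,000 accumulated device-hours.
bound = upper_failure_rate(0, 1_000_000)
```

With no failures observed the bound reduces to -ln(0.05) divided by the accumulated time, roughly 3 failures per million hours in this assumed example. The sketch is intended for small failure counts; for very large counts the direct exponential would underflow.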
Equipment failures actually originate from individual component failures, so the failure rate of an equipment can be estimated by accumulating the failure rates of all the components in the equipment. Various authorities provide estimates of component failure rates:
US MIL HANDBOOK 217E.
CNET (French PTT) Data.
HRD5 (British Telecom).
RADC Non-Electronic Parts Handbook NPRD.
OREDA (Offshore data).
To estimate the equipment failure rate, all the components in the equipment are listed, along with the number of items (and any redundancy between them), the failure mode of the item and the individual failure rate of the component. A failure mode is the functional consequence of a specific failure; failure of an item may not result in the loss of all functions performed by the item. Summing the failure rates for a failure mode estimates the equipment failure rate for that failure mode. These calculations can very conveniently be carried out on a spreadsheet using standard redundancy formulae (see Reliability Equations).
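The summation can be sketched in Python as an alternative to a spreadsheet. The parts list and its failure rates below are invented purely for illustration, and the sketch assumes no redundancy, so every component failure in a given mode counts towards the equipment failure rate for that mode; redundant arrangements would need the redundancy formulae referred to above, which are not reproduced here.

```python
from collections import defaultdict

# Hypothetical parts list:
# (component, quantity, failure mode, failure rate per million hours)
PARTS = [
    ("power supply", 1, "loss of output", 5.0),
    ("processor card", 2, "loss of output", 2.0),
    ("relay", 4, "spurious output", 0.5),
    ("relay", 4, "loss of output", 0.25),
]


def equipment_failure_rates(parts):
    """Sum quantity x component failure rate for each failure mode."""
    totals = defaultdict(float)
    for _name, quantity, mode, rate in parts:
        totals[mode] += quantity * rate
    return dict(totals)


rates = equipment_failure_rates(PARTS)
for mode, rate in rates.items():
    print(f"{mode}: {rate} failures per million hours")
```

Keeping one row per component and failure mode mirrors the spreadsheet layout, so the same item (the relay here) can contribute to more than one failure mode.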
The methods above strictly apply only if all the components have a constant failure rate. In reality, most components have some type of bathtub failure rate curve, with elevated failure rates at the start and end of the component life. However, except for major components in which this effect is pronounced, it is usually ignored: early failures are dealt with as part of warranty, and equipment is replaced before it reaches the end of its life.
Corrective maintenance data is usually much easier to deal with. Modern site maintenance is usually based at module level, with modules being replaced on site and failed modules either scrapped or returned to the supplier or a third-level maintainer for refurbishment. A module is a collection of hardware components that are more or less permanently connected; if a component in the module fails, the complete module is replaced. The time for diagnosis, replacement and restarting can be estimated by observation, experiment or expert opinion. The critical aspect is the time to restore service, so logistic allowances need to be included for the arrival of maintenance staff and spare parts.
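A minimal sketch of building up a time to restore service from its elements is given below. The element names and durations are assumed for illustration only, and the sketch treats the elements as strictly sequential; if staff travel and spare part delivery overlap in practice, the longer of the two would be used instead of their sum.

```python
# Assumed, illustrative time elements in hours.
RESTORE_ELEMENTS = {
    "staff travel to site": 2.0,
    "spare part delivery": 1.0,
    "diagnosis": 0.5,
    "module replacement": 0.25,
    "restart and return to service": 0.25,
}


def time_to_restore(elements):
    """Total time to restore service, assuming the elements run in sequence."""
    return sum(elements.values())


mttrs_hours = time_to_restore(RESTORE_ELEMENTS)
repair_rate = 1.0 / mttrs_hours  # repairs per hour, constant-rate assumption
```

In this assumed case the logistic allowances account for three of the four hours, which is why they cannot be left out of the estimate.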