Benefits and Challenges of System Prognostics

IEEE TRANSACTIONS ON RELIABILITY, VOL. 61, NO. 2, JUNE 2012


Bo Sun, Member, IEEE, Shengkui Zeng, Rui Kang, Member, IEEE, and Michael G. Pecht, Fellow, IEEE

Abstract—Prognostics is an engineering discipline utilizing in-situ monitoring and analysis to assess system degradation trends, and determine remaining useful life. This paper discusses the benefits of prognostics in terms of system life-cycle processes, such as design and development, production, operations, logistics support, and maintenance. Challenges for prognostics technologies from the viewpoint of both system designers and users will be addressed. These challenges include implementing optimum sensor systems and settings, selecting applicable prognostics methods, addressing prognostic uncertainties, and estimating the cost-benefit implications of prognostics implementation. The research opportunities are summarized as well.

ACRONYMS

ACMS    Aircraft Condition Monitoring System
ALS     Autonomic Logistics System
ANNs    Artificial Neural Networks
AOG     Aircraft on Ground
ARL     Applied Research Laboratory
ATSV    ARL Trade Space Visualizer
CALCE   Center for Advanced Life Cycle Engineering
CBA     Cost-benefit Analysis
CBM     Condition-based Maintenance
CND     Cannot Duplicate
COTS    Commercial off the Shelf
DoD     Department of Defense
EEEU    End Effector Electronics Unit
EMS     Engine Monitoring System
FMEA    Failure Modes and Effects Analysis
HUMS    Health and Usage Monitoring System
ICAS    Integrated Condition Assessment System
IDPS    Integrated Diagnostics and Prognostics System
IRR     Internal Rate of Return
IVHM    Integrated Vehicle Health Management
JDIS    Joint Distributed Information System
JSF     Joint Strike Fighter
LAV     Light Armored Vehicle
LCC     Life-cycle Costs
LRU     Line Replaceable Unit
MTTR    Mean Time to Repair
NASA    National Aeronautics and Space Administration
NFF     No Fault Found
NPV     Net Present Value
NTF     No Trouble Found
OEMs    Original Equipment Manufacturers
PAR     Precision Approach Radar
PCB     Printed Circuit Board
PDF     Probability Density Function
PHM     Prognostics and Health Management
PNNL    Pacific Northwest National Laboratory
PoF     Physics of Failure
RCA     Root Cause Analysis
RF      Radio Frequency
ROI     Return on Investment
RTOK    Re-test Ok
RUL     Remaining Useful Life
SVMs    Support Vector Machines
TMS     Transmitter Management Subsystem
TNI     Trouble Not Identified
TWT     Traveling Wave Tube

Index Terms—Cost-benefit analysis, prognostics and health management, prognostics.

Manuscript received November 01, 2011; revised January 05, 2012; accepted January 27, 2012. Date of publication April 27, 2012; date of current version May 28, 2012. This work was supported in part by a Grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU8/CRF/09). Associate Editor: Q. Miao. B. Sun, S. Zeng, and R. Kang are with the School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China (e-mail: sunbo@buaa.edu.cn; [email protected]; [email protected]). M. G. Pecht is with the Prognostics and Health Management Center, City University of Hong Kong, Kowloon, Hong Kong, and is also with the Center for Advanced Life Cycle Engineering, The University of Maryland, MD 20742 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TR.2012.2194173

I. INTRODUCTION

Over the past decade, prognostics and health management (PHM) has emerged as one of the key enablers for achieving system reliability, safety, maintainability, availability, supportability, and economic affordability [1], [2]. PHM is employed to analyze system performance and environmental data. Different methods are used to assess the degradation of a product or the deviation of a product from normal operating conditions. The main functionalities of PHM include fault detection, diagnostics, prognostics, and health management [2]. Prognostics permits the reliability of a system to be evaluated in its actual life-cycle conditions. Moreover, prognostics predicts when and where failures will occur, thus giving users the opportunity to mitigate system-level risks [1], [2].

The importance of PHM implementation is explicitly stated in the U.S. Department of Defense (DoD) 5000.02 policy document on defense acquisition [3]. The policy states that “program managers shall optimize operational readiness via diagnostics, prognostics, and health management techniques in embedded and off-equipment applications.” Thus, PHM has become a requirement for systems sold to the DoD.

Many names have been used to describe prognostics. For example, the prognostics technology used in U.S. Army rotorcraft was called the Health and Usage Monitoring System (HUMS) [4]. In aerospace, Integrated Vehicle Health Management (IVHM) was the term NASA used for prognostics of reusable rockets, and later for various space applications [5]. In other fields of the military, several prognostics labels have been defined, including the Aircraft Condition Monitoring System (ACMS), the Engine Monitoring System (EMS) [6], [7], the Integrated Diagnostics and Prognostics System (IDPS) [8], and the Integrated Condition Assessment System (ICAS) [8]. In the Joint Strike Fighter (JSF) program, the name Prognostics and Health Management (PHM) was adopted [9], [10]. Since then, prognostics technology has become an area of flourishing international research.
Prognostics has been applied in many engineering domains, such as the defense and military industry [4]–[11], the aerospace industry [12]–[14], wind power systems [15], civil infrastructure [2], batteries [16], mechanical manufacturing [17]–[23], consumer electronics, and computers [24]–[31]. For aerospace systems, Pratt & Whitney implements advanced prognostics and health management systems in its F135 fighter engine [12]. Similar diagnostic and health monitoring systems are included in the Airbus A380 and Boeing 787 as well. For military applications, the U.S. Army includes prognostics technology in its weapons platforms, support vehicles, and even munitions [13].

For mechanical systems, vibration data-based algorithms and techniques are implemented in the compressors of propulsion and power-drive systems in ships [17]. Prognostics of gear tooth fatigue is applied in a helicopter gearbox through advanced physics-of-failure (PoF) modeling and intelligent utilization of relevant diagnostic information [18]. Li and Nikitsaranont [19] used a combination of regression techniques, including both linear and quadratic models, to predict the remaining useful life (RUL) of gas turbine engines. Wang [20] developed a stochastic filtering approach to model the remaining life of ball bearings with a cost function to be minimized. Christer et al. [21] applied a state space model to an induction furnace, and analyzed the benefits of using the model for determining the optimal refurbishment time. Wang et al. [22] predicted the failure of three water pumps using vibration data. For wind power systems, a state-of-the-art PHM technology for rotor and gearbox systems similar to HUMS has been implemented [15].

For electronics systems, PoF-based methods have been shown to be effective for prognostics. The PoF approach uses a product’s actual environmental and operational loads, together with PoF models, to calculate the accumulated damage and predict the RUL of the product [24], [25]. The PoF-based approach has been successfully applied to notebook computers [26], the electronics in the NASA space shuttle solid rocket booster [27], commercial off-the-shelf (COTS) devices [28], and flash memory [29].

Data-driven methods are also widely used in electronics prognostics. They have been implemented using a variety of techniques, most notably Markov chains, stochastic processes, and time series analysis. Applications in electronics where data-driven approaches have been used for RUL estimation include computer servers [32], global positioning systems [33], avionics [34], and power electronics devices (IGBTs) used in avionics [35]. A fusion prognostics approach, which combines data-driven and PoF-based methods, has been developed to estimate the RUL of electronic components and systems [36], [37].

These applications show that implementing prognostics generates many benefits. In this paper, the key benefits of prognostics are presented in Section II. The current barriers and technological challenges involved in implementing prognostics are discussed in Section III, and research opportunities are discussed in Section IV.

II. BENEFITS OF PROGNOSTICS

An effective prognostics capability enables customers, product manufacturers, and OEMs to monitor system health, estimate the RUL of systems, and take corrective actions.
The benefits of prognostics include improved system safety, increased system operational reliability and mission availability, fewer unnecessary maintenance actions, and reduced system life-cycle costs (LCC). Prognostics can bring benefits in all stages of the system life-cycle process, including design and development, production and construction, operations, logistics support and maintenance, phase-out, and disposal. An overview of the prospective benefits of prognostics in the system life-cycle process is shown in Fig. 1.

A. Benefits for System Design and Development

1) Optimize System Design: In system design and development, design knowledge and historical experience are transferred to the next-generation system. An effective prognostics system is capable of collecting and storing useful historical information, such as system usage patterns, operating conditions, environmental conditions, known failure modes, and possible deficiencies. With this information, system designers can improve and optimize the design or re-design of critical components and sub-systems, thereby working toward fault-free systems.


Fig. 1. Potential benefits of prognostics in system life cycle process.

Product data (e.g., structural strength, environmental resistance, and failure data) are usually obtained from a large number of performance tests, environmental tests, reliability tests, and simulation tests. These tests are usually time consuming and costly. Furthermore, accelerated testing is often used, and this kind of test cannot represent the actual operating conditions of the system. Prognostics can provide actual operating and environmental condition data over the system life cycle, which in turn can be used to optimize design and test. The real life-cycle environmental and operational conditions of a product can be obtained by sensor systems. These data are fundamental to system design and test.

2) Improve Reliability Prediction Accuracy: Reliability prediction is important in system design and development. Traditional reliability prediction methods based on the use of handbooks, such as MIL-HDBK-217 and its progeny (e.g., 217PLUS, Telcordia, FIDES, PRISM, and GJB-299), have been shown to be misleading, as they provide inaccurate life predictions for electronic products and systems [38]–[40]. In contrast, prognostics permits the in-situ monitoring and collection of the environmental and usage loads of a system by using sensors in actual application conditions. The data collected in this way reflect the actual life-cycle conditions of the system. This individualized system evaluation facilitates more accurate damage and RUL assessment. With more accurate reliability prediction, a better reliability plan can be made, and a more cost-effective and leaner reliability design can then be implemented.
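As a hedged illustration of how monitored loads can feed a damage and RUL assessment, the sketch below applies linear damage accumulation (Miner's rule), one common choice in PoF-style assessments; the paper does not prescribe this model. The `LoadBin` class, the function names, and all numbers are hypothetical, and the cycles-to-failure values are assumed to come from a separate PoF model. The sketch also assumes that future usage resembles the monitored usage.

```python
from dataclasses import dataclass
from math import inf


@dataclass
class LoadBin:
    """One monitored load level (hypothetical structure for illustration)."""
    cycles_per_day: float      # cycles counted per day by in-situ sensors
    cycles_to_failure: float   # life at this load level, assumed from a PoF model


def damage_rate_per_day(bins):
    """Miner's rule: the damage fraction consumed per day is sum(n_i / N_i)."""
    return sum(b.cycles_per_day / b.cycles_to_failure for b in bins)


def remaining_useful_life_days(accumulated_damage, bins):
    """Days until accumulated damage reaches 1.0, assuming usage stays the same."""
    rate = damage_rate_per_day(bins)
    if rate <= 0:
        return inf  # no damaging loads observed
    return max(0.0, (1.0 - accumulated_damage) / rate)


# Illustrative numbers only: two load levels seen in the field.
monitored = [LoadBin(10, 100_000), LoadBin(2, 10_000)]
damage_so_far = damage_rate_per_day(monitored) * 1000   # after 1000 days in service
rul_days = remaining_useful_life_days(damage_so_far, monitored)
print(f"damage = {damage_so_far:.2f}, RUL = {rul_days:.0f} days")
```

Because the damage rate is computed from the loads actually recorded on one unit, two units of the same design used in different environments would receive different RUL estimates, which is the individualized evaluation the text describes.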

3) Assist Design of Logistics Support System: During the preliminary design phase, system design considerations and decisions have a great impact on the logistics support system. Prognostics can assist in constructing a logistics support system. For electronic systems, the maintenance sector has been centered on the belief that electronics failures cannot be anticipated, but can only be dealt with as they occur [1]. This situation has led to large-scale support infrastructures made up of scattered spare parts depots, complex inspection and maintenance equipment, and other support resources. For this support system to operate dynamically, the corresponding transportation and supply chains must be effectively managed. Because failure times and the number of failing electronic units can be predicted in advance, the use of prognostics enables a significant reduction in the logistics support system infrastructure, and more effective use of various support resources. With advance failure warning and information about RUL, the logistics support system can dynamically tailor and implement logistics actions.

In the JSF program, for example, the concept of the autonomic logistics system (ALS) has been proposed to enable the JSF to be better utilized throughout the life of its platform, and at a lower cost than other aircraft incur. In ALS, information about component life and system health is taken from the onboard PHM system. System reasoning and air-vehicle reasoning are then passed along to the joint distributed information system (JDIS) to inform the supply chain about what it has to do to keep the airplane operating effectively [41], [42].

B. Benefits in Production

1) Strengthen Process Quality Control: It is generally believed that quality and reliability must be designed into the system production processes. Furthermore, optimal values for system and process parameters must be established and controlled throughout the production and manufacturing phases of the life cycle. However, the quality control approaches now in use, which emphasize conformance to design specifications, are necessary but not sufficient [43]. Quality losses accumulate whenever a parameter of a piece of machining equipment deviates from its nominal or optimal value; for example, the performance of a cutting tool degrades gradually. The monitoring and prognostics of manufacturing equipment status (e.g., vibration, strength, power, and operating mode), as well as wear or breakage status, can provide more information about the equipment itself than traditional quality control, thus supporting the quality control process and quality assurance.

In addition to providing feedback about design defects, prognostics can also monitor and record process defects such as poor component or assembly quality, human operation-induced damage, and overstress events, and then transfer them to next-generation system production. System manufacturers can use data on process defects to take effective corrective action to improve system quality.

2) Maintenance Development of Sub-Systems Provided by OEMs: Most of the information used for high-level system prognostics comes from sub-systems and assemblies. It is necessary for system designers to work together with the original equipment manufacturers (OEMs) to define the variables to be monitored, the algorithms to be developed, and the tests to be accomplished for efficient condition-based maintenance (CBM). The OEMs may provide a component-level or sub-system-level prognostics solution for system manufacturers and operators. With this real-time monitoring and prognostics information, the system manufacturer can conduct condition-based maintenance, and use historical information during the development process.
One example is the Primus Epic Central Maintenance Computer, which is included in the avionics systems of several models of business jets, regional aircraft, and rotorcraft, and is currently used in Gulfstream business jets, Embraer regional jets, the Raytheon Hawker 4000 business jet, and the Agusta AB-139 helicopter [44]. A key lesson learned from this research is that “one major error made in several of the Primus Epic programs was not having the maintenance function represented by a maintainability specialist at design reviews and integration testing. Several suppliers and OEMs did not remain focused on maintenance-related issues as the system and software designs evolved.” The prognostics capability required by a system designer can motivate OEMs to make effective efforts.

C. Benefits for System Operations

1) Increase System Safety: Compared to traditional fault diagnostics, the major advantage of prognostics is the ability to predict failures. The goal of fault diagnostics is to warn system users, maintenance personnel, and support and logistics personnel that a certain portion of a system no longer operates normally. Prognostics provides the ability to anticipate incipient faults (at some predictable time or mission period) in components or subsystems prior to their progressing to catastrophic failure of the system. These capabilities enable more accurate management of the health of systems. Fig. 2 shows an example of degrading parameters and damage accumulation prognostics. The prognostic “distance,” which is the time interval between the advance failure warning time and the estimated failure time, is shown in Fig. 2.

The required advance time before failure can vary from seconds to hours to days, or even weeks or years. For example, the U.S. space shuttle has a 4-second window to eject the crew upon takeoff [5]. An advance time of even a few minutes before failure could be very significant, and could enhance system safety, especially for mission-essential systems whose failure might cause a disastrous accident. Another example is the prognostic warning time for an impending failure in an aircraft, which must permit safe landing as a minimum criterion. A prognostic warning indicating the need for unit replacement may ensure a lead time of hours, while a prognostic warning indicating the need for corrosion maintenance might provide a lead time of months [2].

Fig. 2. Advance failure warning capability of prognostics [2].

2) Improve Operational Reliability: From the perspective of a user or operator, any event that causes a system to stop performing its intended function is a failure event. These events include all design-related failures that affect the system’s function. Also included are maintenance-induced failures, no fault found (NFF) events, and other anomalies that may have been outside the designer’s contractual responsibility or technical control. With proper design and effective production process control, the inherent reliability of a system can be defined. However, under actual operating conditions, the environmental and operational loads may be quite different from what the system was designed for, and will affect the life consumption and operational reliability of the system.
In such cases, a system with high inherent reliability under “improper usage” (such as a wrong operation mode, or inappropriate operating stress) could exhibit extremely low operational reliability. The monitoring capability of PHM makes it possible to take active control actions regarding environmental and operational conditions, and thus increase service lifetime. For example, ground combat vehicles have the operational capability requirement to operate in a “silent watch” mode, where the vehicle’s critical systems (e.g., communications) must consume battery power for many hours (without the engine running for recharging). Without accurate information about the charge level of the batteries, the batteries can be drained to the point where they are unable even to restart the vehicle, which would affect the operational reliability of the system [45]. Such factors must be taken into account when designing a prognostics system for the batteries.

Another example is the sudden unintended acceleration in cars that may be caused in part by electronic interference [47]. Many experts still strongly suspect that there is a problem with the electronic engine control technology used in some cars. In theory, a cell phone, satellite radio, or even a restaurant’s large microwave could cause an electronically controlled car’s accelerator to surge out of control. These factors affect the operational reliability of the system, and may not necessarily be accounted for in the design phase.

Over the past 10 years, several major electronic component manufacturers have ceased production of military-grade components that were once considered immune to obsolescence [48], [49]. Many industries, such as aerospace, have also encountered this problem. When military-grade components become unavailable, industries have to turn to commercial components and up-rating technology [50]. The reliable operating time for uprated components is usually 5 to 7 years, while the anticipated life of an aircraft system or subsystem is generally longer than that [48], [49]. With the help of prognostics, a component’s usage conditions can be monitored and recorded. The RUL of the component can then be predicted with PoF-based or other methods. Further, sub-systems can be replaced before the end of their useful life based on the life information of their components, taking both economy and risk into account. Thus, operational reliability and mission success can be achieved.
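The replacement decision described above can be sketched in a few lines. The example below uses a simple data-driven stand-in for the "other methods": a least-squares line fitted to a monitored degradation parameter is extrapolated to a failure threshold to give an RUL estimate, which is then compared with the time until the maintenance visit after next. The function names, the threshold, the margin, and the data are all hypothetical, and a real application would also have to account for prediction uncertainty.

```python
def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept).

    Needs at least two distinct time stamps.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def estimate_rul(times, values, failure_threshold):
    """Time from the last measurement until the fitted trend crosses the threshold.

    Returns None when the parameter shows no increasing trend.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None
    time_at_threshold = (failure_threshold - intercept) / slope
    return max(0.0, time_at_threshold - times[-1])


def replace_at_next_visit(rul, hours_until_visit_after_next, margin_hours):
    """True if the part is not expected to last, with margin, until the visit after next."""
    if rul is None:
        return False
    return rul - margin_hours < hours_until_visit_after_next


# Illustrative data only: a monitored parameter drifting upward with operating hours.
hours = [0, 100, 200, 300]
leakage_current = [1.0, 1.5, 2.0, 2.5]
rul = estimate_rul(hours, leakage_current, failure_threshold=5.0)
print(rul, replace_at_next_visit(rul, hours_until_visit_after_next=600, margin_hours=50))
```

The margin is where the economy-versus-risk trade-off enters this sketch: a larger margin replaces parts earlier and wastes more useful life, while a smaller margin accepts more risk of failure between visits.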
3) Extend Service Life of the System: Aging and obsolescence have been major problems troubling system operations for many years. This is especially true for systems with long service lives, such as airplanes, trains, nuclear power plants, and communication base stations [9]. The components used in these systems all face aging issues. Using PHM, operators can determine the remaining life and extended life, and develop replacement plans for systems and their sub-systems.

As an example of extended life analysis using PHM, the space shuttle’s remote manipulator system end effector electronics units (EEEUs) were designed in the 1970s with a target application life of 20 years [53]. When these systems had performed without any failures for nearly 20 years, NASA required an analysis to determine their RUL. In 2001–2002, the manufacturer of the shuttle remote manipulator system, in collaboration with the Center for Advanced Life Cycle Engineering (CALCE), performed an RUL analysis, and determined that the life of the EEEU could be extended until 2020 [53]. This project was so successful that NASA subsequently had CALCE conduct this analysis for the electronics in the space shuttle booster rocket [27].

Another example of extended life applications using PHM involved the U.S. Army AN/GPN-22 and AN/TPN-25 Precision Approach Radar (PAR) systems. The PAR is currently used at bases worldwide, but was initially deployed in the late 1970s.


Obsolescence issues have been plaguing many PAR transmitter components. These systems now use a microprocessor-based transmitter management subsystem (TMS) that contains a prognostics engine for empirical prognostic analysis, and provides prognostics capability to evaluate and help extend system life by replacing the tube before its failure [54]. A transmitter Traveling Wave Tube (TWT) Prognostic Engine autonomously performs an analysis to determine optimal tube filament parameters to extend tube life, and computes tube life estimates. Except for catastrophic failures, this approach allows for tube replacement during scheduled maintenance, minimizing system downtime.

4) Reduce the Occurrence of No Fault Found (NFF): With the increasing complexity of modern systems, there have been more reports of intermittent malfunctions that occur but cannot be verified, replicated at will, or attributed to a specific failure site, mode, or mechanism [55]. These conditions are often labeled no fault found (NFF), no trouble found (NTF), trouble not identified (TNI), cannot duplicate (CND), or “re-test ok” (RTOK). In many instances, these conditions are not considered to be failures by industry. However, the downtime of systems due to these conditions can lead to significant economic losses and safety concerns. In a commercial field failure study, NFF observations reported by commercial airline repair depots ranged from 50% to 60% [7]. Sorensen [51] found that the occurrence of intermittent and NFF failures on military aircraft can be as high as 50%. Data mining of over 12,000 depot-level aerospace maintenance records from Honeywell since 2008 found that over 40% of these records showed NFF, and that about ten times more NFFs were reported for electronics systems than for mechanical systems.
Furthermore, electronics NFFs required an average of 47% more time in depot-level repair than non-NFFs (i.e., it takes a lot more effort to prove something is non-defective than it does to find and diagnose a defect) [55]. The result is increased maintenance costs, decreased equipment availability, increased customer inconvenience, reduced customer confidence, damaged company reputation, and in some cases potential safety hazards [55].

Intermittent failures are impossible to assess using traditional prediction methods. They can occur under certain environmental conditions, and cure themselves under other conditions. As a result, the supply and maintenance chains suffer from NFF problems [53], despite advances in built-in tests and automatic test equipment. Prognostics offers a potential solution to mitigate NFF risks. The continuous in-situ monitoring and collection of real-time data provides an overall usage trend, from which an anomaly can be identified, critical parameters related to a specific fault model can be isolated, and the root cause can be traced back. Prognostics can help detect intermittent failures, isolate faults, and reduce false alarm rates, which in turn decreases system downtime and increases system availability.

5) Improve Warranty Service: For many commercial products, no maintenance is performed. Product designers and users are not concerned with condition-based maintenance (CBM) for such products, but designers are nevertheless concerned with the customer experience and customer satisfaction. In such cases, product warranties and warranty costs are measures of success.


Suppliers can improve their product warranties and enhance service with a system prognostics capability. With prognostics, system reliability can be estimated in-situ, and problematic components or structures can be located and isolated from the system based on the monitoring data and usage profile. Warranty engineers can make servicing decisions based on prognostics data. They can determine the level of RUL at which to service a product under warranty, and whether to recall a product. PHM is also useful for determining the root cause of a failure and taking corrective actions quickly. All these advantages will reduce customer complaints, increase market share, and reduce warranty costs [56].

Prognostics can also assist in reducing the liability risk of suppliers and insurance companies for failures. Some failures or unwanted events are caused by misuse, incorrect user operation, or environmental factors, not by system design or manufacturing problems. Prognostics can provide insight into the operating history of a system, and help identify handling and customer-induced failures that could be excluded from warranty coverage. This benefit will reduce the liability of the system supplier. In addition, because the operator has a better awareness of the system’s health status, preventive measures can be stated in user documentation and warranty notices.

D. Benefits in Logistics Support and Maintenance

1) Enable CBM and ALS: Maintenance is traditionally performed either at fixed time-based intervals (i.e., scheduled or preventive maintenance), or as corrective action (reactive maintenance) [54], [57]. Corrective maintenance is passive in nature, and takes place after a failure has already occurred. Scheduled maintenance is active in nature, and occurs at predetermined intervals even if a failure has not yet occurred; this approach can induce defects due to unnecessary handling. Prognostics, however, enables predictive maintenance, often called condition-based maintenance (CBM). Prognostics provides a foundation for CBM that results in minimized unscheduled maintenance, eliminated redundant inspections, reduced scheduled regular maintenance, extended maintenance cycles, improved maintenance effectiveness, decreased ground test equipment requirements, and reduced maintenance costs [9], [10], [61]. As an example, the prognostics system in the ALS concept for the JSF [61] is expected to achieve a 20–40% reduction in maintenance manpower, and a 50% reduction in the logistics footprint of machinery required to support the plane.

2) Optimize Logistics Supply Chain: Prognostics can integrate real-time system health information into the decision making of the logistics model. Predictive logistics is expected to optimize the performance measures of a system, and improve the planning, scheduling, and control of activities in the supply chain. A prognostics solution provides benefits such as improvements in material and human resources logistics, and stock optimization. The main benefit comes from the fact that the operator can use information about the components’ RUL to buy spare parts only shortly before the components are expected to fail. Khalak and Tiemo [62] presented a methodology that can be applied to estimate the supply chain benefits of prognostics applications. This methodology yields an optimal stock level for each node in the supply chain. The stock level is a function of the lead time provided by the prognostics application, taking into account some restrictions and some prognostics design constraints.

Accurate prognostics enables the replacement of parts to be scheduled based on the actual health of the parts. In such a case, fewer spare parts are needed because the lead time before failure is provided by the prognostics capability. Advance knowledge of the required number of spare parts allows them to be delivered “just in time”; thus, in-stock inventory levels are reduced. This result will lead to a substantial simplification of the supply chain.

3) Increase Maintenance Effectiveness: Prognostics can improve troubleshooting, enhance root cause analysis, and help to prepare for maintenance in advance. Through the improvements made possible by prognostics, troubleshooting can accurately identify failure sites and failed components (better fault detection and fault isolation) so that they can be quickly replaced. Accurate fault location can help users to avoid unnecessary activities, and reduce the time required for maintenance tasks and activities. Enhanced root cause evaluation can assist maintenance personnel in taking correct, effective maintenance actions. Finally, system failure prognostics and health status reports can be transmitted to the maintenance staff in advance of potential failure events. This approach will provide lead time for maintenance planning, parts procurement, and equipment and manpower preparation. For example, the transmission of failure reports to the ground during aircraft flight allows maintenance tasks to be prepared while the aircraft is still in flight.
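To illustrate how a stock level can depend on prognostic lead time, as discussed under 2) Optimize Logistics Supply Chain, the sketch below uses a deliberately simple base-stock model. It is not the methodology of [62]: it assumes Poisson-distributed part failures, a fixed replenishment time, and that a failure predicted at least one replenishment time ahead needs no stocked spare. All names and numbers are hypothetical.

```python
from math import exp


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson random variable with the given mean."""
    term = exp(-mean)      # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i   # P(X = i) obtained from P(X = i - 1)
        total += term
    return total


def base_stock(failures_per_day, replenish_days, prognostic_lead_days,
               service_level=0.95):
    """Smallest stock that covers demand during the unprotected part of the
    replenishment time with the requested probability."""
    exposed_days = max(0.0, replenish_days - prognostic_lead_days)
    mean_demand = failures_per_day * exposed_days
    stock = 0
    while poisson_cdf(stock, mean_demand) < service_level:
        stock += 1
    return stock


# Illustrative numbers only: 0.5 failures per day, 10-day replenishment time.
for lead in (0, 6, 10):
    print(lead, base_stock(0.5, replenish_days=10, prognostic_lead_days=lead))
```

Under these assumptions the required stock shrinks as the prognostic lead time grows, and reaches zero once the lead time covers the whole replenishment time, which is the "just in time" situation described in the text.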
Wang and Pecht [63] analyzed the economic benefits of applying canaries to electronic systems with respect to the optimal replacement time of canary-equipped systems. The model developed can inform the user in terms of the expected total cost as to whether the canary should be used or not; and, if a canary is used, the model can help determine the optimal long-term replacement policy of the line replaceable unit (LRU) under a variety of different cost and distribution parameters.

4) Reduce Maintenance and Inspection and Repair-Induced Failures: When a mechanic works on a system to fix or replace a component, he may accidentally cause damage to another component. This result is called a maintenance-induced failure, or collateral damage during repair. Fixing such failures may require additional manpower and spare parts. In addition, if the damage is unnoticed during the maintenance task, additional system downtime or even potential reliability and safety issues can result. Prognostics reduces the need for maintenance activities in the system, thus reducing the occurrence of human-related issues.

5) Avoid Costs in Direct and Indirect Maintenance Manpower: With fewer parts removed, less time will be spent in ground inspections, failure detection, failure isolation, and repairs. Also, there will be less need for a maintenance crew to set up and use ground test equipment. A reduction in the amount of ground support equipment and manpower requirements will directly reduce the support costs of system operation and maintenance. With significant reductions in direct manpower, the

number of indirect support staff can also be reduced. Less direct maintenance manpower means that fewer people have to be trained, which would reduce training costs as well.

E. Benefits in Phase-Out and Disposal

New systems are introduced to the market almost every day. Take computers as an example. The average upgrading life for a computer has decreased from 4–5 years at the beginning of the 1990s to 1–2 years currently [64]. This rate means that computers will enter the waste stream much faster than before. During the last decade, product recovery has become an obligation for the sake of the environment and society itself. Recovery is promoted primarily by governmental regulations and customer perspectives on environmental issues [65], [66]. There are two main end-of-life options for products: reuse or remanufacturing, and recycling. For both options, all take-back units are treated equally because no information about the conditions of a system during its useful life is available. For example, all retired computers are treated equally, no distinction being made concerning which units still have healthy hard disks [65], [66].

System recovery aims at minimizing the amount of waste sent to landfills by recovering parts and materials from old or outdated systems by means of disassembly, remanufacturing (including reuse of parts and products), and recycling. With PHM, a system’s full-life-cycle data, including installation, operation, and maintenance, can be managed and used to optimize end-of-life processing operations. Parts removed can be classified for treatment according to their life history and RUL. Zeid et al. [67] presented a sensor-based monitoring and prognostics approach for smart and selective disassembling and refurbishing of systems with known remaining lives. It is believed that a system with prognostics capability can meet the requirements of modern society: energy saving, emission reduction, and a green environment.

F. Benefits in Reducing Life-Cycle Costs

A system’s total life-cycle costs are often unknown, especially those associated with system operation, maintenance, support, and logistics. Costs associated with research, design, testing, production, and acquisition account for only a small proportion of the total life cycle cost [64]. Prognostics systems offer a great reward in terms of reducing the overall life cycle cost [68], [69], especially the total ownership cost of the operations, maintenance, and logistical support for the critical system. Some aspects of these cost avoidance opportunities are discussed below.

1) Reduce Regular Inspection Costs: Some maintenance tasks, mainly inspections and function checks, can benefit from the monitoring of system variables. This monitoring can be a substitute for traditional maintenance procedures, which in some cases may include the removal of many access panels and the use of ground support equipment. All of these activities will cost money. Scheduled inspections are undertaken at a rate high enough to identify out-of-specification systems prior to an operating period. But with the warning period provided by prognostics, the rate at which inspections are needed can be reduced, resulting in decreased costs for maintenance crews and inspection equipment. For example, many automobile manufacturers want to move away from the current model of diagnostics to a model of prognostics, which would allow them to monitor their products to avert sudden and unexpected failures, and perform maintenance activities only when they are required rather than as a matter of routine scheduled maintenance [67]. In this way, the costs for regularly scheduled maintenance will be greatly reduced for the owner of a car.

2) Avoid Costs From Unnecessary Replacement of Components With Remaining Life: Scheduled maintenance actions are often performed on systems that are still in good working order. The reason is that scheduled maintenance must, by its nature, be conservative, and anticipate the worst set of conditions. However, time and money spent on a system that has not failed is unproductive. Prognostics can help reduce the amount of RUL that is wasted, and thus can reduce the cost of part replacement [64]. For example, the total number of batteries replaced for the U.S. Army’s Light Armored Vehicle (LAV) fleet per year is 701, of which 11%, or 77 batteries per year, are misdiagnosed and unnecessarily thrown away [45]. The benefit of prognostics for the LAV fleet would be approximately US$5.8 K per year for avoiding the needless replacement of healthy batteries, plus an additional savings of US$1.5 K for battery disposal cost avoidance per year. Over the design life of the vehicle (13 years), this equates to a savings of US$94.9 K for the avoided replacement and disposal of misdiagnosed batteries [45].

G. Reduce Replacement Cost

The replacement costs when a failure takes place are very high. If all or some fraction of the field failures can be avoided, then cost avoidance may be realized by minimizing the frequency of unscheduled maintenance. For example, tactical Stirling cryocoolers are life-limited components used in many infrared sensing systems [70].
Raytheon Company has thousands of systems fielded that include cryocoolers. For a 1,000 unit fleet, the cost to replace the components when impending failure was predicted was US$3.6 M; the cost to replace at failure was US$10 M [70]. The cost benefit analysis indicates that investments in this area are likely to have large payoffs.

III. CHALLENGES FOR THE IMPLEMENTATION OF PROGNOSTICS

The implementation of a prognostics system generally includes several key processes and technologies, such as data acquisition, data processing, fault diagnostics, prognostics, and decision reasoning (see Fig. 3). Prognostics system implementation has its own life cycle process, including design and development, test and evaluation, verification and validation, production, and application [71]. Although the list of the major benefits of prognostics is impressive, prognostics technologies are still not mature enough. The following subsections discuss challenges facing the implementation of prognostics.

A. Implementing Optimum Sensor Selection and Localization

Data collection is an essential part of prognostics. It often requires the use of sensor systems to measure the environmental,

Fig. 3. Major components of prognostics implementation.

operational, and performance parameters of a system [1]. Inaccurate measurements resulting from improper sensor selection and localization (e.g., making sure the sensor is in the right position to obtain the right data) or inadequate measurements can degrade the prognostic performance. An approach is also needed [72] to conduct trade-offs for various types of sensors (pressure, speed, thermal, humidity, optical, magnetic, electric, sound, gas) in terms of precision, sensitivity, stability, power consumption, reliability, and sensor networking. Knowledge of the appropriate performance parameters or precursors to be monitored will help in the selection of the right sensor for the product. The sensors that are selected should be able to accurately measure the change in the parameters linked to the critical failure mechanisms.

The successful implementation of PHM relies on data from the system. When sensors are used to collect these data, the possibility of sensor failures must also be taken into account. Some strategies to improve the reliability of sensors have been presented, such as using multiple sensors to monitor the same system (i.e., redundancy), and implementing sensor validation to assess the integrity of a sensor system and adjust or correct it as necessary [73].

B. Selecting Applicable Prognostics Methods

When data are transmitted, it is necessary to consider noise and interference in data obtained from sensors, which can influence prognostics performance and accuracy. Therefore, the data used for prognostics need to be preprocessed (through processes such as data filtering and reduction). There are many filters, but a priority selection process has not yet been developed.

Generally, methods for prognostics can be grouped into data-driven methods, PoF-based methods, and fusion methods [1], [75], [76]. Data-driven methods are based on machine-learning techniques, and statistical pattern recognition.
PoF-based methods utilize knowledge of a product’s life cycle loading and failure mechanisms to assess its reliability. The fusion prognostics approaches combine PoF-based and data-driven approaches [32]–[37]. Prognostics methods can vary widely for different types of products and failure modes. Proper selection of prognostics methods for a particular domain is a key factor that determines whether a prognostics system will be effective. Data-driven methods are based on machine-learning and statistical techniques. These algorithms can be implemented at the system, subsystem, or component levels [77]. In general, machine-learning techniques can be classified into three categories: supervised, unsupervised, and semi-supervised

learning approaches. The training data used by supervised and semi-supervised learning need to be classified correctly, which might affect the confidence level of the algorithm. Additionally, optimization and search methods are often employed in these data-driven methods, and their computational complexity and tractability are critical for efficient and effective algorithms. On the other hand, these data-driven methods often address only anticipated faults, in which a fault “model” is a construct or a collection of constructs, such as artificial neural networks (ANNs), support vector machines (SVMs), expert systems, etc., that must be trained first with known prototype fault patterns (data) [2]. For unsupervised learning approaches, the given data have no predefined classes, and do not include any labeled data. The algorithm using unsupervised learning finds clusters by itself from its unlabeled data. There are different ways of dividing the data into clusters, and many different ways to prescribe clusters. The same data might be differently clustered according to its clustering algorithm. Acquisition of labeled input data is costly because an expert needs to distinguish the class of data. Statistical techniques are divided into parametric and non-parametric methods based on whether the information regarding the distribution of the data is assumed or not. These methods are very mature [78]. However, a large amount of failure data is needed to implement these approaches to allow for analysis, and this can be more problematic if the monitored systems exhibit intermittent faults. Most data-driven approaches depend on historical (i.e., training) system data to determine correlations, establish patterns, and evaluate data trends leading to failure. In many cases, there is an insufficient amount of historical or operational data to obtain health estimates, and determine trend thresholds for failure prognostics. 
This condition is true, for example, for stored, standby, and non-operating systems, which are nevertheless subject to environmental stress conditions. There are also challenges in systems where failures are infrequent [2].

PoF-based prognostic methods utilize knowledge of a product’s life cycle loading and failure mechanism models, control models, or some other phenomenologically descriptive models of the system to assess product reliability, and estimate the system’s remaining life [1], [75]. The advantage of a PoF-based method is often its ability to isolate the root cause(s) that contribute to system failure [77]. However, sufficient information about the product is needed. For example, in PoF models, the materials, geometry, and operational and environmental conditions are required. In complex systems, these parameters may be difficult to obtain. The development of models requires some knowledge of the underlying physical processes that lead to system failure, but in complex systems it is difficult to create dynamic models representing the multiple physical processes occurring in the system [1]. This is one of the limitations of PoF-based approaches. As noted, a requirement of the PoF-based approach is that system-specific knowledge, such as geometry and material composition, is necessary but may not always be available. Further, failure models, or graph-based models are not suitable for the detection of intermittent system behavior because they are modeled for specific degradation mechanisms, or for the diagnosis of specific faults, respectively. In addition, PoF-based methods

Fig. 4. Uncertainties in PoF-based electronic prognostics.
Fig. 5. Schematic illustration of prognostic accuracy concept [2].

cannot be used on every component in a complex system due to technical and economic considerations.

Pecht et al. [32]–[37], [79] developed a fusion prognostics approach that combines the PoF and the data-driven approaches to estimate the RUL of a product in its actual life cycle conditions. The combination of the PoF approach and the data-driven approach provides a means to correlate data trends and precursor events with failure mechanisms, and also to isolate the root cause of failure. The data-driven approach is used to carry out product diagnostics by anomaly detection, while information from the PoF models, product standards, and specifications is incorporated into the data-driven techniques for estimation of RUL. The parameters causing anomalous behavior are isolated using data-driven techniques or knowledge of the PoF. These parameters are used to identify the failure mechanisms and relevant PoF models, and for RUL estimation. This fusion method therefore combines the strengths of the individual approaches to provide more accurate diagnostics and estimation of RUL, as well as information regarding the parameters that indicate product failure, thereby helping with root-cause analysis.

C. Addressing Prognostic Uncertainties, and Assessing Prognostic Accuracy

Another major challenge for the use of prognostics is the need to develop methods that are capable of handling real world uncertainties that lead to inaccurate predictions. For example, Gu et al. [74] studied various sources of prognostic uncertainty. In their study, they found that measurement inaccuracies of the sensors were one of the main sources leading to uncertainty in their prognostics application. Prognostics errors can lead to unnecessary preventive maintenance due to underestimation of system remaining life (false alarms) on the one hand, and unnecessary system failures and even catastrophic events due to overestimation of RUL on the other hand.
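The two kinds of prognostic error described above can be made concrete with a small sketch. It assumes that paired predicted and actual RUL values are available for a set of past units, and it sorts each prediction into "early" (RUL underestimated, leading to unnecessary maintenance), "late" (RUL overestimated, risking in-service failure), or "on_time". The function name, the tolerance band, and the sample data are illustrative assumptions, not definitions from the cited references.

```python
def classify_predictions(predicted_rul, actual_rul, tolerance=0.0):
    """Count early, late, and on-time RUL predictions.

    early   : predicted RUL is below actual RUL by more than `tolerance`
              (underestimation, i.e. a false alarm in the sense used above)
    late    : predicted RUL is above actual RUL by more than `tolerance`
              (overestimation, the unit may fail before maintenance)
    on_time : prediction lies within +/- `tolerance` of the actual RUL
    """
    counts = {"early": 0, "late": 0, "on_time": 0}
    for predicted, actual in zip(predicted_rul, actual_rul):
        error = predicted - actual
        if error < -tolerance:
            counts["early"] += 1
        elif error > tolerance:
            counts["late"] += 1
        else:
            counts["on_time"] += 1
    return counts


# Hypothetical data in operating hours.
predicted = [90.0, 120.0, 100.0, 60.0]
actual = [100.0, 100.0, 100.0, 100.0]

print(classify_predictions(predicted, actual, tolerance=5.0))
# {'early': 2, 'late': 1, 'on_time': 1}
```

Because late predictions usually carry a much higher consequence than early ones, a practical evaluation would likely weight the two counts differently rather than treat them symmetrically.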
Although some methods for uncertainty analysis and assessment using PoF models have been developed, there are several challenges to the implementation of uncertainty into prognostics [80], [81]. Fig. 4 shows some sources of uncertainty for PoF-based electronic prognostics [2]. These uncertainties are generally grouped into three different categories: 1) model (PoF and accumulative damage) uncertainty caused by model simplification and model parameters, 2)

measurement and forecast uncertainty induced by life cycle environmental and operational loads, and 3) uncertainties with the characteristic parameters (geometry and materials) of products mainly caused by the production process. These uncertainties can lead to the significant deviation of prognostics results from the actual situation. For data-driven methods, long prognostic distance (as shown in Fig. 2.) prediction of RUL or time to failure increases the uncertainty bounds due to various sources, such as measurement or sensor errors, future load and usage uncertainty, model assumptions and inaccuracies, loss of information due to data reduction, prediction under conditions that are different from the training data, and so on [1]. Hence, the development of methods that can be used to describe the uncertainty bounds and confidence levels for values falling within the confidence bounds is required. Another research area is uncertainty management, where methods are being investigated to reduce the uncertainty bounds by using system data as more data become available [82], [83]. Prognostic accuracy assessment technologies are necessary for building and quantifying the confidence level of a prognostics system. Methods to impartially evaluate the effectiveness and accuracy of prognostics are required. As mentioned above, uncertainties can affect the prediction of actual failure time for field products, which are characterized by an actual life distribution interval rather than an actual life single-point value [84], [85]. On the other hand, with the consideration that the geometry and material parameters of field products cannot be measured one by one, and that future loads are inherently uncertain, the prognostic life of PoF models should be expressed as a distribution (i.e., prognostic life distribution). Fig. 5 illustrates the difference between an actual life distribution and a prognostic life distribution. 
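One common way to express a prognostic life as a distribution rather than a single value is Monte Carlo sampling over the uncertain inputs. The sketch below assumes a hypothetical inverse power-law life model (life = coefficient × load^(−exponent)); this model, the lognormal scatter on the coefficient (standing in for geometry and material variation), and the normal scatter on the load (standing in for future load uncertainty) are illustrative choices, not the PoF models used in the cited work.

```python
import math
import random
import statistics


def simulate_life_distribution(n_samples=10_000, seed=0):
    """Sample a prognostic life distribution by Monte Carlo.

    Hypothetical model: life = coeff * load ** (-exponent).
    All distribution parameters below are assumed for illustration.
    """
    rng = random.Random(seed)
    exponent = 2.0
    lives = []
    for _ in range(n_samples):
        # Part-to-part scatter in geometry and materials (assumed lognormal).
        coeff = rng.lognormvariate(math.log(1.0e7), 0.2)
        # Uncertain future load amplitude (assumed normal, floored at 1.0).
        load = max(rng.gauss(40.0, 5.0), 1.0)
        lives.append(coeff * load ** (-exponent))
    return lives


def summarize(lives):
    """Return the 5th percentile, median, and 95th percentile of the lives."""
    cuts = statistics.quantiles(lives, n=20)  # 19 cut points at 5% steps
    return {"p05": cuts[0], "median": statistics.median(lives), "p95": cuts[-1]}


lives = simulate_life_distribution()
summary = summarize(lives)
print({key: round(value) for key, value in summary.items()})
```

The width of the interval between `p05` and `p95` is one simple indication of how much the input uncertainties spread the predicted life; comparing such a sampled distribution with observed field lives is one way to approach the accuracy question discussed here.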
An ideal prognostic accuracy and effectiveness case is one where the prognostic life probability density function (PDF) is narrow (i.e., has a small distribution span) and is fit to the actual life PDF (i.e., a prognostic distribution curve consistent with the actual distribution curve). There is no general agreement as to an appropriate and acceptable set of metrics that can be employed effectively to assess the technical performance of prognostic systems [84]. Leão [87] proposed a set of metrics to evaluate the performance of prognostics algorithms, including prognostics hit score, false alarm rate, missed estimation rate, correct rejection rate, prognostics effectivity, etc. Saxena [88] also suggested a list of metrics to assess critical aspects of RUL predictions, such as

prognostic horizon, prediction spread, relative accuracy, convergence, horizon/precision ratio, etc. Although efforts have been made to cover most PHM requirements, further refinements in concepts and definitions are expected as prognostics matures.

D. Analyzing Cost-Benefit of Prognostics Applications

The benefits of prognostics are many, but prognostics also costs money in terms of acquisition and installation costs, implementation costs, and changes in business practices. Apart from those PHM costs, the cost of re-design of the host product can be a big investment. For example, to deploy a sensor and microprocessor on a ball bearing or gearbox, the original cables need to be re-wired to supply power to the sensor. The casting must also be re-designed to take in the sensor and protect it from the environment. These implementation costs need to be accounted for. If there is no economic benefit (or too high a perceived risk, particularly regarding consequential damages), system vendors may not wish to implement PHM. Cost-benefit analysis (CBA) and quantitative assessment are therefore essential for assessing the effectiveness of prognostics [89]–[92].

There are many financial metrics that can be used in a cost benefit quantitative analysis, including net cash flow, cumulative cash flow, payback, return on investment (ROI), net present value (NPV), and internal rate of return (IRR) [45], [93]. Among all these metrics, ROI is one of the most selective metrics. ROI tells us the rate of return on the investment in prognostics, which enables the investment in prognostics to be compared with other competing investments [63], [88]. The benefits as mentioned above help the prospective user of prognostics understand the practical drivers of this technology, but the user still needs more information to justify their investment in the technology.
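Several of the financial metrics listed above follow directly from a yearly cash-flow series. The sketch below uses the textbook definitions of ROI, NPV, and payback, and feeds them the LAV battery cost-avoidance figures quoted earlier in the paper (US$5.8 K plus US$1.5 K per year over a 13-year design life [45]). The up-front investment of US$30 K and the 10% discount rate are purely hypothetical values chosen for illustration; the source does not state them.

```python
def roi(total_cost_avoidance, investment):
    """Return on investment: (return - investment) / investment."""
    return (total_cost_avoidance - investment) / investment


def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now (year 0)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def payback_year(cash_flows):
    """First year in which the cumulative cash flow is non-negative, else None."""
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0.0:
            return year
    return None


# Figures from the LAV battery example [45], in thousands of US dollars.
annual_avoidance = 5.8 + 1.5   # replacement + disposal cost avoidance per year
design_life_years = 13
total_avoidance = annual_avoidance * design_life_years  # about 94.9

# Hypothetical values, NOT from the source.
investment = 30.0
discount_rate = 0.10

flows = [-investment] + [annual_avoidance] * design_life_years

print(f"total avoidance: {total_avoidance:.1f} K")
print(f"ROI: {roi(total_avoidance, investment):.2f}")
print(f"payback year: {payback_year(flows)}")
print(f"NPV at {discount_rate:.0%}: {npv(discount_rate, flows):.1f} K")
```

The total of about US$94.9 K reproduces the 13-year savings figure quoted in the text. Discounting lowers the NPV relative to the undiscounted sum, which is why ROI and NPV can rank competing investments differently.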
The information that is most useful to the user is a calculated ROI for their particular system that provides a financial assessment of the benefit of the investment. ROI involves an analysis of the cost avoidances made possible by using prognostics technology against the costs associated with the development, manufacture, installation, and implementation of prognostics technology in selected systems [89]. The determination of ROI allows system managers to include quantitative, readily interpretable results in their decision-making. ROI analysis may be used to select between different types of prognostics technology, optimize the use of a particular prognostics approach, or determine whether to adopt prognostics versus more traditional maintenance approaches. For instance, the Pacific Northwest National Laboratory (PNNL) conducted an initial CBA assessment to aid in decisions about whether or not to develop a prototype prognostics system for the AGT1500 gas turbine engine on the M1 Abrams Tank [13]. The results of the analysis indicated that the development and deployment of an engine prognostics system with approximately a dozen auxiliary sensors (thermodynamic and vibration sensors installed via a wiring harness) would result in a benefit-to-cost ratio of about 11:1.

One of the most significant challenges when conducting a cost-benefit analysis is the iterative process of selecting the appropriate prognostics technology based on assumptions of estimated benefits. For example, a given prognostic technology

may be effective for assessing five of the eight possible failure modes for batteries; this technology would account for 62.5% of the failures, but would cost $200 per battery to implement. Another technology may be effective for assessing the other three failure modes, accounting for 35% of the failures, but costing $150 per battery to implement [93]. This technology allows for the evaluation of technologies as a function of cost versus the effectiveness of the technology, because not all technologies have an equivalent capability. For example, Feldman et al. [94] analyzed the ROI of a precursor-to-failure prognostics approach relative to unscheduled maintenance, but it may not be consistent with the ROI of using life consumption monitoring methods (LRU independent methods), and is not specific to a particular precursor to failure device. Another challenge when conducting CBA is determining what values to select for all of the variables in the CBA model. The Applied Research Laboratory (ARL) Trade Space Visualizer (ATSV) provides the ability to iteratively solve for all of the statistically dependent and independent variables in the CBA model, and visually present all of the data for assessment. This tool is used to analyze the benefits of implementing battery prognostics on military ground combat vehicles [93]. The third challenge for determining ROI in prognostics is that it is difficult to quantify the benefits of prognostics results. Standard measures of performance need to be well defined in order to assess and justify the anticipated ROI [59], [94]–[96]. The cost benefit is related not only to prognostics opportunity, the warning lead-time interval, and user requirements, but also to affordable cost and acceptable safety risk. Additionally, the overall logistics support system, spare parts supply, supply chain management, and other related resources are also related to prognostics cost benefit. 
This system introduces a challenging multi-objective and multi-attribute tradeoff, and a complex decision-making problem that must be dealt with.

IV. RESEARCH OPPORTUNITIES

In this paper, we have provided explanations of some of the key benefits of prognostics in terms of system life-cycle processes. It is important to highlight the advancements in the field of PHM that will enhance the practical engineering applications of prognostics technologies. However, there are still methodological and technical issues that must be dealt with to provide more effective prognostics systems. Based on our research, suggestions for future efforts include the following.
— Establish field prognostics system design and development guidelines, including hardware-related sensor selection (e.g., sensor types, sensor performance, sensitivity, stability, power consumption, reliability, and sensor networking), wireless or wired data transmission, software-related diagnostics, prognostics, and decision reasoning algorithms and programs.
— Investigate methods and procedures to cost-effectively integrate prognostics into existing systems.
— Determine how to integrate prognostics system design with the host system design process.
— Develop metrics and methods to impartially measure and evaluate the performance of a prognostics system.

— Conduct more studies on the life cycle return on investment attached to the implementation of prognostics technologies.

ACKNOWLEDGMENT

The authors would like to thank the more than 150 sponsors of CALCE and the team at the CityU Center for Prognostics and System Health Management for their valuable support.

REFERENCES

[1] M. G. Pecht, Prognostics and Health Management of Electronics. New York, NY, USA: Wiley-Interscience, 2008.
[2] M. G. Pecht and R. Kang, Diagnostics, Prognostics and Systems Health Management (in Chinese). Hong Kong, HK, China: City University of Hong Kong Press, 2010.
[3] DoD 5000.02 Policy Document, Operation of the Defense Acquisition System, Section 8—Operations and Support Phase Department of Defense, December 8, 2008.
[4] B. Dickson, J. Cronkhite, and S. Bielefeld, Feasibility Study of a Rotorcraft Health and Usage Monitoring System (HUMS) Usage and Structural Life Monitoring Evaluation Army Research Laboratory, ARL-CR-290, February 1996.
[5] C. D. Pettit, S. Barkhoudarian, and A. G. Daumann, “Reusable rocket engine Advanced Health Management System Architecture and technology evaluation—Summary,” in Proceedings of the 35th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit, Los Angeles, California, June 20–24, 1999, pp. 1–8.
[6] Tumer and A. Bajwa, “A survey of aircraft engine health monitoring systems,” in Proceedings of the 35th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit, Los Angeles, California, June 20–24, 1999, pp. 1–12.
[7] S. L. Andrew and J. Green, Future Direction and Development of Engine Health Monitoring (EHM) Within the United States Air Force Air Force Research Laboratory, ADA347976F, April 24, 1998.
[8] G. F. Zhang, S. Lee, and N. Propes et al., “A novel architecture for an integrated fault diagnostic/prognostic system,” in Proceedings of 2002 AAAI Spring Symposium, Palo Alto, California, March 25–27, pp. 1–9.
[9] A. Hess and L.
Fila, “The Joint Strike Fighter (JSF) PHM concept: Potential impact on aging aircraft problems,” in Proceedings of 2002 IEEE Aerospace Conference, Big Sky, Montana, March 9–16, 2002, pp. 3021–3026. [10] A. Hess and L. Fila, “Prognostics, from the need to reality-from the fleet users and PHM system designer/developers perspectives,” in Proceedings of 2002 IEEE Aerospace Conference, Big Sky, Montana, Mar. 9–16, 2002, pp. 2791–2797. [11] W. Wang and B. Hussin, “Plant residual time modelling based observed variables in oil samples,” Journal of the Operational Research Society, vol. 60, no. 6, pp. 789–796, 2009. [12] J. VerWey, “Airplane product strategy during a time of market transition,” Boeing: e-Newsletter June 2009 [Online]. Available: http:// www.boeingcapital.com/p2p/archive/06.2009 [13] Greitzer, L. Frank, and Hostick et al., “Determining how to do prognostics, and then determining what to do with it,” in Proceedings of AUTOTESTCON, Valley Forge, PA, USA, August 20–23, 2001, pp. 780–792. [14] W. Wang and W. Zhang, “A model to predict the residual life of aircraft engines based on oil analysis data,” Naval Logistic Research, vol. 52, no. 3, pp. 276–284, 2005. [15] “DTI,” The Development of Prognostic Health Management (PHM) Technology for Wind Turbines PROJECT PROFILE NO PP218, Avenca Limited, June 2005. [16] J. D. Kozlowski, “Electrochemical cell prognostics using online impedance measurements and model-based data fusion techniques,” in Proceedings of 2003 IEEE Aerospace Conference, Big Sky, Montana, March 8–15, 2003, pp. 3257–3270. [17] W. Hardman, “Mechanical and propulsion systems prognostics: U.S. navy strategy and demonstration,” JOM, vol. 56, no. 3, pp. 21–27, 2004. [18] G. J. Kacprzynski, A. Sarlashkar, and M. J. Roemer et al., “Predicting remaining life by fusing the physics of failure modeling with diagnostics,” JOM, vol. 56, no. 3, pp. 29–35, 2004.

[19] Y. G. Li and P. Nikitsaranont, “Gas path prognostic analysis for an industrial gas turbine,” Insight: Non-Destructive Testing and Condition Monitoring, vol. 50, no. 8, pp. 428–435, 2008.
[20] W. Wang, “A model to predict the residual life of rolling element bearings given monitored condition monitoring information to date,” IMA Journal of Management Mathematics, vol. 13, no. 1, pp. 3–16, 2002.
[21] A. H. Christer, W. Wang, and J. Sharp, “A state space condition monitoring model for furnace erosion prediction and replacement,” European Journal of Operational Research, vol. 101, no. 1, pp. 1–14, 1997.
[22] W. Wang, P. A. Scarf, and M. A. J. Smith, “On the application of a model of condition based maintenance,” Journal of the Operational Research Society, vol. 51, no. 11, pp. 1218–1227, 2000.
[23] W. Wang and W. Zhang, “An asset residual life prediction model based on expert judgments,” European Journal of Operational Research, vol. 188, no. 2, pp. 496–505, 2008.
[24] J. Gu, D. Barker, and M. Pecht, “Prognostics implementation of electronics under vibration loading,” Microelectronics Reliability, vol. 47, no. 12, pp. 1849–1856, 2007.
[25] A. Ramakrishnan and M. Pecht, “A life consumption monitoring methodology for electronic systems,” IEEE Trans. Components and Packaging Technologies, vol. 26, no. 3, pp. 625–634, 2003.
[26] N. Vichare, P. Rodgers, V. Eveloy, and M. G. Pecht, “In-situ temperature measurement of a notebook computer—A case study in health and usage monitoring of electronics,” IEEE Trans. Device and Materials Reliability, vol. 4, no. 4, pp. 658–663, 2004.
[27] S. Mathew, D. Das, and M. Osterman et al., “Virtual remaining life assessment of electronic hardware subjected to shock and random vibration life cycle loads,” Journal of the Institute of Environmental Sciences and Technology, vol. 50, no. 1, pp. 86–97, 2007.
[28] V. Rouet, F. Minault, G. Diancourt, and B. Foucher, “Concept of smart integrated life consumption monitoring system for electronics,” Microelectronics Reliability, vol. 47, no. 12, pp. 1921–1927, 2007.
[29] B. Sun, Y. Zhao, and W. Huang et al., “Case study of prognostic and health management methodology for electronics,” (in Chinese) Systems Engineering and Electronics, vol. 29, no. 6, pp. 1012–1016, 2007.
[30] C. Wu, C. Yang, S. Lo, N. Vichare, E. Rhem, and M. Pecht, “Automatic data mining for telemetry database of computer systems,” Microelectronics Reliability, vol. 51, no. 2, pp. 263–269, 2011.
[31] G. Niu, S. Singh, S. W. Holland, and M. Pecht, “Health monitoring of electronic products based on Mahalanobis distance and Weibull decision metrics,” Microelectronics Reliability, vol. 51, no. 2, pp. 279–284, 2011.
[32] L. Lopez, “Advanced electronic prognostics through system telemetry and pattern recognition methods,” Microelectronics Reliability, vol. 47, no. 12, pp. 1865–1873, 2007.
[33] D. W. Brown, P. W. Kalgren, C. S. Byington, and J. R. Roemer, “Electronic prognostics—A case study using global positioning system (GPS),” Microelectronics Reliability, vol. 47, no. 12, pp. 1874–1881, 2007.
[34] C. S. Byington, P. W. Kalgren, B. K. Dunkin, and B. P. Donovan, “Advanced diagnostic/prognostic reasoning and evidence transformation techniques for improved avionics maintenance,” in Proceedings of 2004 IEEE Aerospace Conference, Big Sky, Montana, March 6–13, 2004, pp. 3424–3434.
[35] B. Saha, J. Celaya, P. Wysocki, and K. Goebel, “Towards prognostics for electronics components,” in Proceedings of the 2009 IEEE Aerospace Conference, Big Sky, Montana, March 7–14, 2009, pp. 1–7.
[36] M. Pecht and J. Gu, “Prognostics-based product qualification,” in Proceedings of 2009 IEEE Aerospace Conference, Big Sky, Montana, March 7–14, 2009, pp. 1–11.
[37] R. Jaai and M. Pecht, “Fusion prognostics,” in Proceedings of Sixth DSTO International Conference on Health & Usage Monitoring, Melbourne, Australia, March 9–12, 2009, pp. 1–12.
[38] W. Denson, “History of reliability prediction,” IEEE Trans. Reliability, vol. 47, no. 3-SP, pp. 321–328, 1998.
[39] P. Lall, M. Pecht, and E. Hakim, The Influence of Temperature on Microelectronic Device Reliability. Boca Raton, FL: CRC Press, 1997.
[40] M. J. Cushing, D. E. Mortin, T. J. Stadterman, and A. Malhotra, “Comparison of electronics-reliability assessment approaches,” IEEE Trans. Reliability, vol. 42, no. 4, pp. 542–546, 1993.
[41] S. L. Dreyer, “Autonomic logistics—Developing an implementation approach for an existing military weapon system,” IEEE Instrumentation and Measurement Magazine, vol. 9, no. 4, pp. 16–21, 2006.
[42] S. Trimble, “JSF industry team seeks new business model for logistics,” Jane’s Defence Weekly April 21, 2006 [Online]. Available: http://articles.janes.com/articles/Janes-Defence-Weekly-2006/JSFteam-seeks-new-logistics-business-model.html

[43] S. O. John, Statistical Process Control, Sixth ed. Oxford: Butterworth-Heinemann, 2008.
[44] G. Bird, M. Christensen, and D. Lutz et al., “Use of integrated vehicle health management in the field of commercial aviation,” in Proceedings of NASA ISHEM Forum, Napa, California, November 7–10, 2005, pp. 1–12.
[45] J. Banks, K. Reichard, and E. Crow et al., “How engineers can conduct cost-benefit analysis for PHM systems,” IEEE Aerospace and Electronic Systems Magazine, vol. 24, no. 3, pp. 22–30, 2009.
[46] Q. Miao, L. Liu, Y. Feng, and M. Pecht, “Complex system maintainability verification with limited samples,” Microelectronics Reliability, vol. 51, no. 2, pp. 294–299, 2011.
[47] C. Robert, Toyota’s Sudden Unintended Acceleration Caused in Part by Electronic Interference? February 2, 2010 [Online]. Available: http://spectrum.ieee.org/riskfactor
[48] P. Sandborn, “Trapped on technology’s trailing edge,” IEEE Spectrum, vol. 45, no. 4, pp. 42–45, 54, 56–58, 2008.
[49] L. Condra, R. Hoad, and D. Humphrey et al., “Terminology for use of parts outside manufacturer-specified temperature ranges,” IEEE Trans. Components, Hybrids, and Manufacturing Technology, vol. 22, no. 3, pp. 355–356, 1999.
[50] D. Das, M. Pecht, and N. Pendse, Rating and Uprating of Electronic Products. Maryland: CALCE EPSC Press, University of Maryland, College Park, MD, 2004.
[51] B. Sorensen, “Digital averaging—The smoking gun behind ‘No-Fault-Found’,” Air Safety Week February 24, 2003 [Online]. Available: http://www.aviationtoday.com/regions/sa/Digital-Averaging-The-Smoking-Gun-Behind-No-Fault-Found_2120.html
[52] B. P. Leao, K. T. Fitzgibbon, and L. C. Puttini et al., “Cost-benefit analysis methodology for PHM applied to legacy commercial aircraft,” in Proceedings of 2008 IEEE Aerospace Conference, Big Sky, MT, March 1–8, 2008, pp. 1–13.
[53] V. Shetty, K. Rogers, and D. Das, “Remaining life assessment of shuttle remote manipulator system end effector electronics unit,” in Proceedings of the 22nd Space Simulation Conference, Ellicott City, MD, October 21–23, 2002, pp. 1–13.
[54] Lawton and F. George, “Health monitor analysis system successful instrumented design and unexpected benefits,” in Proceedings of IEEE Systems Readiness Technology Conference, Anaheim, CA, September 18–21, 2007, pp. 677–682.
[55] H. Qi, S. Ganesan, and M. Pecht, “No-fault-found and intermittent failures in electronic products,” Microelectronics Reliability, vol. 48, no. 5, pp. 663–674, 2008.
[56] Y. Ning, P. Rundle, and M. Pecht, “Prognostics and health management’s potential benefits to warranty,” in Proceedings of the Seventh Annual Warranty Chain Management Conference, San Diego, CA, March 15–17, 2011, pp. 1–9.
[57] R. Williams, J. Banner, I. Knowles, M. Dube, M. Natishan, and M. Pecht, “An investigation of ‘cannot duplicate’ failures,” Quality and Reliability Engineering International, vol. 14, no. 5, pp. 331–337, 1998.
[58] B. Sun, R. Kang, and J. S. Xie, “Research and application of prognostic and health management system,” (in Chinese) Systems Engineering and Electronics, vol. 29, no. 10, pp. 1762–1767, 2007.
[59] E. Scanff, K. L. Feldman, S. Ghelam, P. Sandborn, M. Glade, and B. Foucher, “Life cycle cost impact of using prognostic health management (PHM) for helicopter avionics,” Microelectronics Reliability, vol. 47, no. 12, pp. 1857–1864, 2007.
[60] Z. M. Yang, D. Djurdjanovic, and J. Ni, “Maintenance scheduling in manufacturing systems based on predicted machine degradation,” Journal of Intelligent Manufacturing, vol. 19, no. 1, pp. 87–98, 2008.
[61] W. J. Scheuren, K. A. Caldwell, G. A. Goodman, and A. K. Wegman, “Joint strike fighter prognostics and health management,” in Proceedings of 1998 AIAA Joint Propulsion Conference, Cleveland, OH, July 12–15, 1998, pp. 1–8.
[62] A. Khalak and J. Tiemo, “Influence of prognostic health management on logistic supply chain,” in Proceedings of 2006 American Control Conference, Minneapolis, Minnesota, USA, June 14–16, 2006, pp. 3737–3742.
[63] W. Wang and M. Pecht, “Economic analysis of canary-based prognostics and health management,” IEEE Trans. Industrial Electronics, vol. 58, no. 7, pp. 3077–3089, 2010.
[64] B. S. Blanchard and W. J. Fabrycky, System Engineering and Analysis, Fourth ed. New Jersey, USA: Prentice-Hall, Inc., 2005.
[65] L. Moyer and S. M. Gupta, “Environmental concerns and recycling/disassembly efforts in the electronics industry,” Journal of Electronics Manufacturing, vol. 7, no. 1, pp. 1–22, 1997.

[66] A. Gungor and S. M. Gupta, “Issues in environmentally conscious manufacturing and product recovery: A survey,” Computers and Industrial Engineering, vol. 36, no. 4, pp. 811–853, 1999.
[67] A. Zeid, S. Kamarthi, and S. M. Gupta, “Product take back: Sensors-based approach,” in Proceedings of SPIE—The International Society for Optical Engineering, 2004, vol. 5583, pp. 200–206.
[68] W. Wang and A. H. Christer, “Towards a general condition based maintenance model for a stochastic dynamic system,” Journal of the Operational Research Society, vol. 51, no. 2, pp. 145–155, 2000.
[69] W. Wang, “A model to determine the critical level and monitoring intervals in condition based maintenance,” International Journal of Production Research, vol. 38, no. 6, pp. 1425–1436, 2000.
[70] P. H. Barton and R. Ogden, “Stirling cryocooler prognostics and health management (PHM),” in Proceedings of IEEE 2009 AUTOTESTCON, Anaheim, CA, September 14–17, 2009, pp. 78–81.
[71] R. C. Millar, “A systems engineering approach to PHM for military aircraft propulsion systems,” in Proceedings of 2007 IEEE Aerospace Conference, Big Sky, MT, March 3–10, 2007, pp. 1–9.
[72] G. Zhang, “Optimum Sensor Localization/Selection in a Diagnostic/Prognostic Architecture,” PhD Dissertation, Georgia Institute of Technology, 2005.
[73] S. Cheng, M. H. Azarian, and M. G. Pecht, “Sensor systems for prognostics and health management,” Sensors, vol. 10, no. 6, pp. 5774–5797, 2010.
[74] J. Gu, D. Barker, and M. G. Pecht, “Uncertainty assessment of prognostics of electronics subject to random vibration,” in Proceedings of AAAI Fall Symposium on Artificial Intelligence for Prognostics, Arlington, VA, USA, November 2007, pp. 50–57.
[75] N. M. Vichare and M. G. Pecht, “Prognostics and health management of electronics,” IEEE Trans. Components and Packaging Technologies, vol. 29, no. 1, pp. 222–229, 2006.
[76] R. Kothamasu, S. H. Huang, and W. H. Verduin, “System health monitoring and prognostics—A review of current paradigms and practices,” International Journal of Advanced Manufacturing Technology, vol. 28, no. 9, pp. 1012–1024, 2006.
[77] J. Gu, N. Vichare, T. Tracy, and M. Pecht, “Prognostics implementation methods for electronics,” in Proceedings of Reliability and Maintainability Symposium (RAMS), Orlando, Florida, January 22–25, 2007, pp. 101–106.
[78] X.-S. Si, W. Wang, C.-H. Hu, and D.-H. Zhou, “Remaining useful life estimation—A review on the statistical data driven approaches,” European Journal of Operational Research, vol. 213, no. 1, pp. 1–14, 2010.
[79] S. Kumar, M. Torres, Y. C. Chan, and M. Pecht, “A hybrid prognostics methodology for electronic products,” in Proceedings of 2008 International Joint Conference on Neural Networks (IJCNN 2008), Hong Kong, China, June 1–8, 2008, pp. 3479–3485.
[80] N. M. Vichare, “Prognostics and Health Management of Electronics by Utilizing Environmental and Usage Loads,” Doctor of Philosophy Dissertation, University of Maryland, 2006.
[81] B. A. Tuchband, “Implementation of Prognostics and Health Management for Electronic Systems,” Master’s Thesis, University of Maryland, 2007.
[82] M. J. Carr and W. Wang, “An approximate algorithm for condition-based maintenance applications,” European Journal of Operational Research, vol. 211, no. 1, pp. 90–96, 2011.
[83] W. Wang, M. Carr, W. Xu, and K. A. H. Kobbacy, “A model for residual life prediction based on Brownian motion with an adaptive drift,” Microelectronics Reliability, vol. 51, no. 2, pp. 285–293, 2011.
[84] W. Wang, “An overview of a semi-stochastic filtering approach for residual life estimation with applications in condition based maintenance,” in Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, 2011, vol. 225, pp. 185–197.
[85] M. J. Carr and W. Wang, “Modeling condition based maintenance failure modes using stochastic filtering theory,” IEEE Trans. Reliability, vol. 59, no. 2, pp. 346–355, 2009.
[86] G. Vachtsevanos, F. L. Lewis, and M. Roemer et al., Intelligent Fault Diagnosis and Prognosis for Engineering Systems: Methods and Case Studies. Hoboken: John Wiley & Sons, 2006.
[87] B. P. Leão, T. Yoneyama, and G. C. Rocha, “Prognostics performance metrics and their relation to requirements, design, verification and cost-benefit,” in Proceedings of 2008 International Conference on Prognostics and Health Management (PHM 2008), Denver, CO, USA, October 6–9, 2008, pp. 1–8.
[88] A. Saxena, J. Celaya, and E. Balaban et al., “Metrics for evaluating performance of prognostic techniques,” in Proceedings of 2008 International Conference on Prognostics and Health Management (PHM 2008), Denver, CO, USA, October 6–9, 2008, pp. 1–17.

[89] W. Wang, “A two-stage prognosis model in condition based maintenance,” European Journal of Operational Research, vol. 182, no. 3, pp. 1177–1187, 2007.
[90] W. Wang, “A prognosis model for wear prediction based on oil based monitoring,” Journal of the Operational Research Society, vol. 58, no. 7, pp. 887–893, 2007.
[91] W. Wang, “Modeling the probability assessment of the system state using available condition information,” IMA Journal of Management Mathematics, vol. 17, no. 3, pp. 225–234, 2006.
[92] W. Wang, “Modeling condition monitoring intervals: A hybrid of simulation and analytical approaches,” Journal of the Operational Research Society, vol. 54, no. 3, pp. 273–282, 2003.
[93] J. Banks and J. Merenich, “Cost benefit analysis for asset health management technology,” in Proceedings of Reliability and Maintainability Symposium (RAMS), Orlando, Florida, January 22–25, 2007, pp. 95–100.
[94] K. Feldman, T. Jazouli, and P. Sandborn, “A methodology for determining the return on investment associated with prognostics and health management,” IEEE Trans. Reliability, vol. 58, no. 2, pp. 305–316, 2009.
[95] P. Sandborn and M. Pecht, “Introduction to special section on electronic systems prognostics and health management,” Microelectronics Reliability, vol. 47, no. 12, pp. 1847–1848, 2007.
[96] P. Sandborn and C. Wilkinson, “A maintenance planning and business case development model for the application of prognostics and health management (PHM) to electronic systems,” Microelectronics Reliability, vol. 47, no. 12, pp. 1889–1901, 2007.

Bo Sun (M’10) is a Reliability Engineer, and a member of the faculty of systems engineering at the School of Reliability and Systems Engineering at Beihang University in Beijing, China. He received his Ph.D. degree in reliability engineering and systems engineering from Beihang University, and a B.S. degree in mechanical engineering from the Beijing Institute of Mechanical Industry. His current research interests include prognostics and health management, physics of failure, reliability of electronics, failure analysis of electronics, reliability engineering, and integrated design of product reliability and performance.

Shengkui Zeng is currently the Vice Dean of the School of Reliability and Systems Engineering at Beihang University in Beijing, China. He has over 15 years of research and teaching experience in reliability engineering and systems engineering. He was a visiting researcher with the Center for Advanced Life Cycle Engineering Electronic Products and Systems Consortium at the University of Maryland in 2005. He is the team leader of the KW-ARMS reliability engineering software platform, the co-author of three books, and a recipient of three Chinese ministry-level professional awards. His recent research interests include prognostics and health management, integrated design of reliability and performance, and reliability-based multidisciplinary design optimization.

Rui Kang (M’10) is currently the Chief Professor in the School of Reliability and Systems Engineering at Beihang University, Beijing, China. He has over 20 years of research and teaching experience in reliability engineering and systems engineering. He is also the Chairman of the Academic Council in the School of Reliability and Systems Engineering. His recent research interests include failurology, systems engineering, prognostics and health management, integrated logistics support, system testability, and diagnosis technologies.

Michael G. Pecht (S’78–M’83–SM’90–F’92) is currently a Visiting Professor in Electronics Engineering at City University of Hong Kong. He has an MS in Electrical Engineering, and an MS and PhD in Engineering Mechanics from the University of Wisconsin at Madison. He is a Professional Engineer, an IEEE Fellow, an ASME Fellow, an SAE Fellow, and an IMAPS Fellow. He was awarded the highest reliability honor, the IEEE Reliability Society’s Lifetime Achievement Award, in 2008. He served as chief editor of the IEEE TRANSACTIONS ON RELIABILITY for eight years, and is now chief editor for Microelectronics Reliability. He is the founder of CALCE (Center for Advanced Life Cycle Engineering) at the University of Maryland, College Park, where he is also the George Dieter Chair Professor in Mechanical Engineering, and a Professor in Applied Mathematics. He has written more than twenty books on electronic products development, use, and supply chain management; and over 400 technical articles. He has been leading a research team in the area of prognostics for the past ten years. He has consulted for over 100 major international electronics companies, providing expertise in strategic planning, design, test, prognostics, intellectual property and risk assessment of electronic products and systems.