What exactly is condition monitoring? How does it differ from other maintenance philosophies? And in what way does it impact the discipline of control engineering in the typical plant? Steve Sabin provides some answers.
For years, conventional wisdom held that the longer a piece of equipment was in service, the more likely it was to fail. Thus, it was believed that as long as maintenance was performed at the correct interval, asset failure could be avoided. While it is true that one of the primary goals of maintenance is to perform maintenance activities at the right time, the historical approach to this challenge was to rely solely on calendar intervals – whether running hours or simply total hours.
As it turns out, the calendar is a very poor basis for scheduling maintenance activities on the majority of assets. Indeed, data compiled across several decades, dozens of industries, and hundreds of thousands of assets suggest that a time-based approach to maintenance is appropriate for only 11 percent of the assets in a typical plant.
Surprising? To many, yes. But to people who have actually worked in a reliability or maintenance discipline, probably not. They understand that for many items, the most likely time for failure to occur is immediately after maintenance has been performed – either because the parts have hidden defects or because the work was performed improperly. They understand that certain items, such as brake shoes, chains, and sprockets, wear at a rate that is strongly correlated to usage hours, but other items – most notably electronic instruments – do not behave that way at all.
Depicted below are various categories of wear, along with the types of assets typical of each category and the approximate percentage of such assets in a typical plant. Arguably, the single most valuable message conveyed by this table is that for nearly 90 percent of the assets in a typical plant, a calendar-based approach to maintenance is exactly the wrong approach.
Choosing technologies
Certain assets lend themselves to certain condition monitoring technologies. Ideally, direct measurement is preferred, but this is often not possible. For example, when looking for broken or shorted turns in a motor or generator, physical examination of the windings requires the unit to be taken out of service and visually inspected. Consequently, indirect methods are used, such as thermal imaging cameras that examine the rotor in-situ, looking for temperature variations indicative of increased resistance (hot spots) or no conduction (cool spots). Thermal imaging can also be used on static assets, such as switchgear and transformers, which do not have moving parts. 
Another technology used on some large motors and generators measures partial discharge as insulation fails and allows arcing to occur. The key to choosing the correct technology lies in Failure Modes and Effects Analysis (FMEA) studies, where the types of failures are categorized and the failure mode is ascertained. For example, unbalance on a rotating machine is a common problem, as is misalignment. Both manifest as changes in vibration. Other technologies could also detect these problems, but often only after the failure has progressed significantly. For example, severe misalignment will cause premature bearing wear on machines with rolling element bearings, or gear tooth wear on machines that incorporate contacting gear teeth. As the bearings or gear teeth wear, they will typically deposit metallic debris in the lubricating oil. As such, lube oil analysis – although an important condition monitoring technology in its own right – would be considered a much less sensitive indicator of unbalance or misalignment than would vibration because it relies on collateral damage (bearing or gear wear) rather than direct measurement of the forces acting on the machine (vibration).
Another outcome of the FMEA study is that it quantifies the effects of failure. While a failure mode helps isolate the particular technology that might be useful, the effects of failure help determine whether it is appropriate to let the machine simply run to failure or whether the ramifications of failure are so significant – such as a total loss of production, an environmental leak, or an explosion – that the most sophisticated and comprehensive suite of monitoring technologies is warranted.
It is beyond the scope of this article to provide an in-depth discussion regarding the specific technology to use on a specific type of equipment. 
However, listed and described below are the most common technologies:
Vibration monitoring
Used primarily on rotating machinery because it is an excellent indicator of the primary forces acting on the machine. Many failure types such as unbalance, misalignment, rubs, resonances, instabilities, bearing wear, and loose or broken parts can be detected and differentiated from one another using vibration analysis.
It is also a particularly effective technology for machinery protection, allowing operators to automatically shut down the unit when vibration becomes excessive. Because vibration monitoring is often present on machinery for protection purposes, extending its use to condition monitoring purposes, rather than strictly protection, is straightforward and makes it the most prevalent of all condition monitoring technologies.
Lubrication analysis
Another very common condition monitoring technology. By examining the particulates present in machine lubrication oil, including their metallurgy and shape, certain problems can be ascertained and isolated to the affected parts. By looking at the chemistry of the lubricant, contaminants such as water can be found, which can indicate a different class of problems unrelated to wear. This is a very sophisticated technology and hundreds of commercial lubricant analysis labs exist. It is very commonly used with assets such as heavy construction equipment that do not easily lend themselves to permanent monitoring. Instead, oil is manually sampled at regular intervals and sent to a lab for processing. Oil analysis can also be used to assess the condition of oil-filled electrical transformers.
Thermography
This technology uses an infrared camera to record images, looking for temperature anomalies in items such as motor, generator, or transformer windings. 
Current analysis
By looking at the waveforms of electrical current, voltage, and power in motors and generators, a number of problems can be ascertained where technologies such as vibration monitoring and oil analysis are less effective.
Non-destructive testing
Eddy current transducers can be used to inspect the surface of turbine blades, pressure vessels, and other structures where hairline cracks, invisible to the naked eye, may occur.
Ultrasonic detection
High-frequency acoustic energy can be useful for detecting some types of failures, such as leaking valves.
The foregoing is only a partial list. Other technologies such as pressure measurements in compressor cylinders, partial discharge measurements on motors/generators, temperature spreads on gas turbine exhaust nozzles, and many others exist. The key point is that the condition monitoring technology is chosen as a function of the equipment failure mode(s).
Continuous categories
The ways in which condition monitoring technologies are applied fall into one of three basic categories:
• Continuous
The measurement is made continuously on the machine in an automated fashion, without the need for human intervention.
• Non-continuous
The measurement is performed manually at specific intervals or when another measurement suggests the need for independent verification. Oil sampling is one example of a non-continuous technology. Thermography is another. Vibration, temperature, and pressure measurements can fall into any of the three categories, and the choice is usually a function of the criticality of the machine.
• Quasi-continuous
These systems are online and do not require manual intervention; however, usually for cost reasons, the instrumentation architecture will employ a so-called “sensor bus” where a centralized processor will intermittently sample readings from hundreds or thousands of connected sensors. 
These systems are generally used for assets that warrant more than periodic manual data collection, but cannot justify the expense of a continuous monitoring system.
The nature of the architecture is that processing of signals and samples is done in serial rather than parallel, and wiring is shared rather than dedicated to each individual sensor. As a result, overall system cost is lower, but performance capabilities are more limited than those of a continuous system. While continuous and non-continuous systems have been in use for many years, there is increasing interest in quasi-continuous systems, as they enable assets traditionally addressed by labor-intensive manual data collection schemes to be addressed with online systems instead. Wireless technology is playing a significant role in reducing the installation costs of quasi-continuous systems.
Assessing criticality
The choice of condition monitoring is not simply a matter of choosing the correct technology. It also involves determining whether the asset warrants any condition monitoring at all. This, too, is a function of a properly performed FMEA study. The failure modes of some assets simply don’t lend themselves to a condition-based approach, as they fall primarily in categories A-C in the diagram above. For these, a maintenance strategy that relies upon calendar intervals may well be appropriate. Other assets simply do not represent sufficient safety or financial impact to warrant much more than a run-to-failure approach. As such, there will always be a certain population of assets for which “change the oil every quarter and tighten bolts” will be adequate. 
The economics for such assets dictate little more than performing minimal maintenance and allowing the machine to run until it fails.
Once it has been determined that an asset is an appropriate candidate for condition monitoring, not only must the technologies be chosen, but a decision must also be made as to whether the asset warrants continuous, quasi-continuous, or non-continuous monitoring. This is a function of the asset’s criticality, which falls into one of three broad categories: critical, non-essential, and essential.
Critical assets have one or more (but often all) of the following attributes:
- substantial or total process interruption if they fail
- significant safety risk if they fail, such as a fire, toxic leak, or explosion
- significant repair costs and/or long lead times for parts
Clearly, the costs and repercussions of failures on such assets dictate continuous monitoring and multiple condition monitoring technologies. The main air blower for a catalytic cracker unit in a refinery is one such example. Another would be the main turbine-generator trains in a large power plant. For such assets, it is the cost of failure that is of primary concern.
Non-essential assets are at the opposite end of the spectrum from critical assets. The primary driving factor for monitoring these assets (if they are monitored at all) is not the impact on safety or process downtime. Instead, it is simply the ability to reduce maintenance costs by planning maintenance proactively and eliminating root causes rather than just replacing parts at regular intervals. Whereas a typical industrial plant may have – at most – a few dozen critical machines, it will often have thousands of non-essential machines. Combined, the maintenance investment required to fix many small problems when these machines fail unexpectedly can be quite significant. Thus, for such assets, it is the cost of maintenance that is of primary concern.
Essential assets are somewhere along the continuum between critical and non-essential. 
A good example would be the numerous small process pumps in a typical petrochemical facility. Any one pump will not have a significant impact on production – indeed, the pumps may even be spared. However, a seal or bearing failure could result in a leak, overheating, and subsequent fire. Manual data collection has proven to be inadequate, while fully continuous monitoring would be considered overkill. A quasi-continuous system is often employed instead, allowing condition to be checked several times an hour, which is generally adequate as the failure mode develops over tens of minutes, not fractions of a second. For these assets, cost of safety is typically among the primary concerns. The asset failure itself may represent a safety hazard, or collecting data manually may itself be unsafe, such as on the hot end of a large paper machine.
The process link
There are several reasons why it is becoming increasingly important for control engineers to understand the elements of condition monitoring.
Firstly, asset wear is often a function of process conditions, which means that an asset can age more in the span of a few minutes when run at off-spec conditions than during years of normal conditions. One example is process pumps that might inadvertently run in cavitation conditions. Another example occurs in the hydro industry where units that were once run under base load conditions are suddenly used for peaking and run up and down numerous times each day, subjecting the asset to stresses and transient process conditions for which it was not originally designed.
The relationship between machine conditions (such as bearing temperatures and vibration) and process conditions is vital, and most modern condition monitoring software using continuous technologies will have a facility for the integration and correlation of process conditions with mechanical conditions. Operators are often the first line of defense when upset conditions occur in process or machinery. 
Consequently, the process control system typically becomes the single “dashboard” in which condition monitoring alarms are displayed. It is increasingly important for operators and control engineers to understand the adjustments that can be made to the process to mitigate asset malfunctions and to prolong asset life. In addition, control and instrumentation engineers are often called upon to perform system integration between the condition monitoring system, the process control system, and the process historian system. However, it should be noted that the unique data capture requirements of certain condition monitoring technologies warrant that they remain separate and distinct from the basic process control system, while still allowing integration between the two. For example, for machinery engineers to properly diagnose certain malfunctions, a continuous stream of high-resolution vibration data must be collected. The bandwidth and waveform sampling capabilities required are very similar to those employed in the world of consumer audio, and they exceed the capabilities of process control systems and historians. By understanding the unique capabilities of various condition monitoring technologies, the knowledgeable control engineer will be less likely to use a process control or automation system for an application to which it is not well suited.
Increasing emphasis
Condition monitoring continues to grow in emphasis for many users – often in direct proportion to the competitive environment within which their industry operates. This article has provided an introduction to basic condition monitoring, showing why it is applied, where it is applied, and how it is applied. It is instructive to reiterate the key points. An FMEA study is typically performed in order to categorize assets according to criticality, to identify the appropriate technologies, and to apply the correct maintenance strategy to the correct assets based on their criticality. 
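The pairing of criticality with monitoring approach described in this article can be summarized as a simple lookup. The sketch below is illustrative only: the names `Criticality`, `Strategy`, and `recommend` are hypothetical, and the entry for non-essential assets is an assumption, since the article says only that such assets are monitored "if at all."

```python
from enum import Enum
from typing import NamedTuple


class Criticality(Enum):
    CRITICAL = "critical"
    ESSENTIAL = "essential"
    NON_ESSENTIAL = "non-essential"


class Strategy(NamedTuple):
    monitoring: str       # how the measurements are collected
    primary_concern: str  # the cost that drives the decision


# Pairings as described in the article. The non-essential monitoring
# entry is an assumption; the article leaves it open.
STRATEGY = {
    Criticality.CRITICAL: Strategy("continuous", "cost of failure"),
    Criticality.ESSENTIAL: Strategy("quasi-continuous", "cost of safety"),
    Criticality.NON_ESSENTIAL: Strategy(
        "non-continuous or run-to-failure", "cost of maintenance"
    ),
}


def recommend(criticality: Criticality) -> Strategy:
    """Return the typical monitoring strategy for a criticality class."""
    return STRATEGY[criticality]


if __name__ == "__main__":
    for level in Criticality:
        s = recommend(level)
        print(f"{level.value}: {s.monitoring} ({s.primary_concern})")
```

In practice the classification itself comes out of the FMEA study; a table like this only records the outcome.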
Condition monitoring is not warranted for all assets; however, as technological capabilities increase and installation costs decrease, the population of assets for which condition monitoring can be justified continues to increase.
Finally, process conditions and asset conditions are intimately intertwined, requiring that the control engineer understand how condition monitoring systems and process control and trending systems must work together to optimize the overall performance of the plant.
-----------------
Steve Sabin is with the Optimization & Control division of GE Energy (www.ge-energy.com)
Effective Early Warning
Thanks to an online condition-based monitoring system, annual maintenance expenses decreased significantly for this South China Sea oil producer.
The CACT Operators Group, an international consortium comprising China National Offshore Oil Corporation (CNOOC), Agip (Italy), Chevron (USA) and Texaco (USA), was formed to develop hydrocarbon resources in the Pearl River Basin of the South China Sea. The group’s first exploratory well was drilled in 1984 and today CACT produces more than 100,000 barrels of crude oil each day, destined for refineries in China. Ensuring continuous, safe operation of equipment is a particularly significant challenge for offshore oil producers like CACT. Not surprisingly then, maintenance is a top priority on offshore platforms where costly equipment is typically located hundreds of kilometers from shore. Efficient, non-stop operation of pipeline pumps is key to cost-effective oil production, and initially, preventive maintenance was CACT’s standard approach to keeping the pumps online. However, the effectiveness of CACT’s maintenance system was limited. CACT was forced to react to equipment problems that went undetected during routine inspections. 
First steps
Looking to take a more proactive approach, CACT first began using condition monitoring products in the mid-1990s for the pumps on three of its offshore platforms, basing the system on portable data collection equipment, namely Datapac and VISTeC.
Datapac collects field data, including process variables and vibration information; VISTeC measures vibration in units of velocity and acceleration and can also take Spike Energy measurements, which can be used for early detection of surface flaws in rolling element bearings. While CACT was pleased with the performance of this offline system, there were limitations. For example, the Datapac portable could detect vibration changes but could not track and analyze the pump’s condition in real time. Also, the data acquisition process itself was very costly, with CACT having to send maintenance personnel by helicopter to the platforms at a cost of US$2,000 per visit. The offline system was also not able to provide the real-time pump protection outlined by the American Petroleum Institute (API) Standard 670, the global standard for machinery protection.
Going online
Rockwell Automation, working with CACT to develop a solution that would meet the oil producer’s needs but still preserve the initial investment, proposed a remotely accessible condition monitoring solution that would incorporate the existing portable equipment with an Enwatch system and XM modules.
Via an onboard Ethernet network, the Enwatch system provides scheduled monitoring of all the pumps on the platform, with measurement parameters including vibration and process variables. On the same network, XM intelligent modules process critical parameters used to assess the current health, and predict the future health, of the pumps in real time. The XM Series comprises DIN rail-mounted measurement, relay, and process modules. 
Ideal for critical machinery, the XM system includes protection capabilities, which can be used to safely shut down a machine before significant damage occurs. For example, upon detecting vibration outside of set parameters, it will send a signal to the motor control center (MCC) to turn off the relevant motors and protect the pump. Appropriately configured, XM meets the API 670 standard.
CACT operators onshore can remotely configure the XM modules via a DeviceNet network and view the equipment status through PlantLink, which provides a graphical representation of the health of all the machinery being monitored online. Information from the condition monitoring system is also integrated with Rockwell Software Maintenance Automation Control Center (RSMACC), where appropriate maintenance is scheduled in accordance with equipment requirements.
Downtime down
With the online condition monitoring system in place, CACT has eliminated the need for manual data acquisition – and the associated costs. Onshore operators 200 km from the platform collect, configure and analyze data just as they would on a local server. “The Rockwell Automation solution suits CACT. It saves a great deal of manpower and expense,” said Guo Jinwen, Maintenance Supervisor, CACT. “We can monitor equipment hundreds of kilometers away from our office. The solution preserved our initial investment, while meeting our expectations.” Since applying the Enwatch and XM systems, CACT has reduced its unscheduled downtime from 2.43 to 0.67 percent – a significant 72 percent decrease. In fact, during a five-year period, the system has prevented catastrophic machine failures more than 20 times. Annual maintenance expenses have also decreased dramatically, with the drop-off in service time allowing CACT to save US$100,000 in annual third-party maintenance costs.
Based on information provided by Rockwell Automation (www.rockwellautomation.com/services/conditionmonitoring)
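The 72 percent figure quoted in the case study above is a relative decrease, not a difference in percentage points. A quick check, using only the two downtime values reported (the function name `percent_decrease` is illustrative):

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Unscheduled downtime figures reported in the case study (percent of time).
downtime_before = 2.43
downtime_after = 0.67

reduction = percent_decrease(downtime_before, downtime_after)
print(f"Relative decrease: {reduction:.1f}%")  # prints "Relative decrease: 72.4%"
```

The absolute change is 1.76 percentage points; dividing by the starting value of 2.43 gives the 72 percent relative decrease cited.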
Eye on the Future
The Center for Intelligent Maintenance Systems (IMS) is working on cutting-edge technologies that will ultimately transform maintenance practices from fail-and-fix to predict-and-prevent. By Jay Lee, Masoud Ghaffari & Haixia Wang.
The vision of the National Science Foundation Industry/University Cooperative Research Center (I/UCRC) on Intelligent Maintenance Systems (IMS) is to develop a systematic approach in advanced prognostics to enable products and systems to achieve near-zero breakdown reliability and performance.
The Center, a partnership among the Universities of Cincinnati (lead institution), Michigan, and Missouri-Rolla, is supported by over 30 global companies including Toyota, GE, GM, Siemens, Boeing, Honeywell, P&G, National Instruments, Advantech, and Omron, to name a few. The Center’s mission is to serve as a center of excellence for the creation and dissemination of a systematic body of knowledge in intelligent maintenance systems and ultimately to impact next-generation product and service systems with six-sigma quality. It provides timely, high-quality, and cost-effective collaborative research projects and validates them through test beds.
What is prognostics?
Most machine maintenance today is either purely reactive – fixing or replacing equipment after it fails – or blindly proactive – assuming a certain level of performance degradation, with no input from the machinery itself, and servicing equipment on a routine schedule whether service is actually needed or not. Both scenarios are extremely wasteful. It often seems that machines fail suddenly, but in fact they usually go through a measurable process of degradation before they fail. Today, that degradation is largely invisible to human users, even though a great deal of technology has been developed that could make such information visible. It may come as a surprise to many people that most state-of-the-art manufacturing, mining, farming, and service machines (e.g. 
elevators) are actually quite “smart” in themselves. Many sophisticated sensors and computerized components are capable of delivering data about a machine’s status and performance. The problem is that little or no practical use is made of most of this data. We have the devices, but no continuous and seamless flow of information throughout entire processes. Sometimes this is because the available data is not rendered in a usable, or instantly understandable, form. More often, no infrastructure exists for delivering the data over a network, or for managing and analyzing the data, even if the devices were networked. When smart products and machines are networked and remotely monitored, and when their data is modeled and continuously analyzed with sophisticated systems, it is possible to go beyond mere “predictive maintenance” to intelligent “prognostics” – the process of pinpointing exactly which components of a machine are likely to fail, and when, and autonomously triggering service and the ordering of spare parts. Figure 1 illustrates the focus on product performance degradation assessment and prediction, as well as the associated key elements of “Intelligent Maintenance Systems (IMS).”
The digital doctor
In most products or systems, different sensors measure different aspects of the same physical phenomena. In much the way that human “stereo” vision gives us depth perception, or multiple 2D perspectives can be combined into a 3D view, IMS is working on software to “fuse” available data into a more usable, holistic “image” of the actual state of machine performance behavior. A “digital doctor” inspired by biological perceptual systems and machine psychology theory, the Watchdog Agent comprises embedded computational prognostic algorithms and a software toolbox for predicting degradation of devices and systems. It is being built to be extensible and adaptable to most real-world machine situations.
The Watchdog Agent consists of different modules. 
Each is realized in several different ways to facilitate the use of the Watchdog in a wide variety of products and applications, with various requirements and limitations with respect to the character of signals, available processing power, memory and storage capabilities, limited physical space, power consumption, etc.
Sensory processing module – transforms sensor signals into domains that reveal the product’s performance. Time-series analysis or frequency domain analysis could be used to process stationary signals (signals with time-invariant frequency content), while wavelet or joint time-frequency domains could be used to describe non-stationary signals (signals with time-varying frequency content). Most real-life signals, such as speech, music, machine tool vibration, and acoustic emission, are non-stationary signals.
Feature extraction module – extracts the features most relevant to describing the product’s performance. Those features are extracted from the domain into which the sensory processing module transforms sensory signals, using expert knowledge about the application, or automatic feature selection methods such as roots of the autoregressive time-series model, or time-frequency moments and Singular Value Decomposition.
Decision-level sensor fusion – is based on separately assessing and predicting process performance from individual sensor readings, and then merging these individual sensor inferences into a multi-sensor assessment and prediction through an averaging technique.
Performance evaluation module – evaluates the overlap between the most recently observed signatures and those observed during normal product operation. This overlap is expressed through the so-called Confidence Value (CV), ranging between zero and one, with higher CVs signifying a high overlap, and hence performance closer to normal. 
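The article does not say how the Watchdog Agent computes its Confidence Value, so the following is only a sketch of the general idea under stated assumptions: overlap is measured here by histogram intersection of a single scalar feature, and decision-level fusion is a plain average of per-sensor CVs. The function names and sample readings are hypothetical.

```python
from statistics import mean


def histogram(values, lo, hi, bins):
    """Normalized histogram of `values` over [lo, hi]; outliers go to edge bins."""
    if not values:
        raise ValueError("values must not be empty")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range readings
        counts[idx] += 1
    return [c / len(values) for c in counts]


def confidence_value(baseline, recent, lo, hi, bins=10):
    """Overlap (0..1) between normal-operation and recent feature distributions.

    Histogram intersection is an assumed stand-in for the overlap measure.
    """
    hb = histogram(baseline, lo, hi, bins)
    hr = histogram(recent, lo, hi, bins)
    return sum(min(b, r) for b, r in zip(hb, hr))


def fuse(per_sensor_cvs):
    """Decision-level fusion: average the CVs inferred from individual sensors."""
    return mean(per_sensor_cvs)


if __name__ == "__main__":
    # Hypothetical vibration feature readings (arbitrary units).
    normal = [1.0, 1.1, 0.9, 1.0, 1.2]
    healthy_now = [1.0, 1.1, 0.9, 1.0, 1.2]
    degraded_now = [3.0, 3.2, 3.1, 2.9, 3.3]

    cv_a = confidence_value(normal, healthy_now, lo=0.0, hi=5.0)
    cv_b = confidence_value(normal, degraded_now, lo=0.0, hi=5.0)
    print(cv_a, cv_b, fuse([cv_a, cv_b]))
```

A CV near one means recent behavior looks like normal operation; a CV near zero means the recent signature no longer overlaps the baseline at all.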
If case data associated with some failure modes exists, most recent performance
Watch & learn
Over time, as new failure modes occur, performance signatures related to each specific failure can be collected and used to teach the Watchdog Agent to recognize and diagnose that failure mode in the future. Thus, the Watchdog Agent is envisioned as an intelligent device that utilizes its experience and human supervisory inputs over time to build its own expandable and adjustable world model. Even though the performance CV already bears significant prognostic information about the product’s remaining useful life, additional prognostic information can be extracted by capturing the dynamics of the product’s behavior and utilizing it to extrapolate and predict the product’s behavior over time.
At Toyota, the Watchdog Agent was able to predict compressor surge and prevent unexpected downtime and damage to the compressor. At Harley-Davidson, the Watchdog Agent, implemented on a Grob aluminum-cutting machine, was able to automatically convert sensor data to health information, assess degradation, and recommend maintenance action. And the Watchdog Agent was used to monitor the chiller in the control tower of Hong Kong Airport. A radar chart was generated to show the health condition of all the components of the chiller, including the shaft, four bearings, evaporator, condenser, compressor oil, and refrigerant circuit.
Dr Jay Lee is Director of the Center for Intelligent Maintenance Systems (www.imscenter.net); Dr Masoud Ghaffari & Dr Haixia Wang both hold the position of Senior Scientist & Assistant Director.
Beyond the traditional view: asset wear can actually take a variety of forms
The flat spot in the orbit of this vibration plot indicates a machine with a serious malfunction.
Portable data collectors
Continuous and quasi-continuous condition monitoring systems
Venkat Kannan
XM system
CACT
Figure 1
IMS Center’s 13th Industry Advisory Board

















