Data Visualization

Predictive models are extremely useful in monitoring and optimizing manufacturing processes. Predictive modeling in manufacturing, when combined with an alarm system, can be used to alert teams to changes in process or equipment performance and prevent downtime or quality issues before they occur.

A process engineer or operator might keep an eye on real-time dashboards or trends throughout the day to monitor the health of processes. Predictive modeling, when combined with a proper alarm system, is an incredibly effective method for proactively notifying teams of impending system issues that could lead to waste or unplanned downtime.

In this article, we’re going to review two examples of predictive modeling in manufacturing. First we’ll describe and build an example of a PLS model, and then we’ll describe and build an example of a PCA model. PLS vs. PCA: why choose one or the other? We’ll cover that as well.


What is PLS in Manufacturing

PLS stands for “Partial Least Squares”. It’s a linear model commonly used in predictive analytics.

PLS models are developed by modeling or simulating one unknown system parameter (y) from another set of known system parameters (x’s).

In manufacturing, for example, if you have an instrument that is sometimes unreliable, but you have a span of time in which it was very reliable, it is possible to simulate, or model, that parameter from other system parameters. So, when it moves into an unreliable state, you have a model that will approximate, or simulate, what that instrument should be reading, were it functioning normally.

PLS Model Formula

We promise we’re not going to get too deep into the math here, but this is a PLS model formula:

$$y = m_1 x_1 + m_2 x_2 + \dots + m_n x_n + b$$

In this formula, the single (y) is approximated from the (x’s) by multiplying each by a coefficient and adding an intercept at the end.
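To make the arithmetic concrete, here is a minimal Python sketch of that formula. The function name `predict` and all of the numbers are hypothetical, chosen only so the sum is easy to check by hand; they don't come from any real model.

```python
def predict(coefficients, intercept, xs):
    """Return y = m1*x1 + m2*x2 + ... + mn*xn + b."""
    if len(coefficients) != len(xs):
        raise ValueError("need exactly one coefficient per x")
    return sum(m * x for m, x in zip(coefficients, xs)) + intercept


# Hypothetical values, not taken from a real process.
coefficients = [2.0, 0.5]  # m1, m2
intercept = 1.0            # b
xs = [10.0, 4.0]           # x1, x2

print(predict(coefficients, intercept, xs))  # 2.0*10.0 + 0.5*4.0 + 1.0 = 23.0
```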

PLS Analysis Use Cases

Some potential uses for PLS models include:

Simulating flow from valve position, power, or delta pressure (dP)

An example provided by one of our customers involved modeling flow from pump amps.

In this particular case, they had a condensate tank in which the flow kept reading zero on their real-time production trend, even though they knew their pump was pumping condensate.

Using dataPARC’s predictive modeling tools, they looked at the historical data and found periods of time when there was a flow reading, and they modeled the flow based on the pump amps during those same periods.

So, when the flow itself got so low that the flow meter wouldn’t register it, they still had a model of flow based on pump amps, because the pump was still pumping and registering pump amps.

Producing discrete test results modeled from a set of continuous process measurements

For example, there may be something you only test every four to six hours. But, you’d like to know, between those tests, if you’re still approximately on-line, or still approximately the same.

If you have continuous measurements that can be used to approximate that value that you’re going to test in four to six hours, you can build a model of those discrete test results based on what those readings were when the previous test was conducted.

Those are just a couple of examples of how you can use PLS for predictive modeling in manufacturing.
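The pump amps example can be sketched in a few lines of Python. This is only an illustration: it uses an ordinary least-squares line fit as a stand-in for the PLS fit (reasonable here because there is a single x), and the readings and the names `fit_line` and `estimate_flow` are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up history from periods when the flow meter WAS registering.
amps = [10.0, 12.0, 14.0, 16.0]
flow = [50.0, 60.0, 70.0, 80.0]
slope, intercept = fit_line(amps, flow)


def estimate_flow(pump_amps):
    """Estimate flow from pump amps when the meter itself reads zero."""
    return slope * pump_amps + intercept


print(estimate_flow(11.0))  # 55.0 with the made-up data above
```

The key point is the one from the customer example: the model is fit only on periods when both signals were trustworthy, and then used when one of them is not.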


How to build a PLS Model

So, now let’s look at building a PLS model. We’ll use the example we discussed where we simulate flow using delta pressure data. First we need to identify the tags or variables we’ll be working with.

Identify Variables

Using dataPARC, we build these models from trends. Here we have a trend showing Flow (blue), Square Root of dP (green), and Specific Gravity (pink). This data is being pulled from our data historian software.

A basic trend with the three variables we’ll be using for our PLS model


Flow is the variable we want to model, or predict. It’s the “y” in the formula we described above.

Square Root of dP

Flow is related linearly to the square root of dP, not to dP itself. So, since the PLS model is a linear model, we’ll create a calculated tag in dataPARC by subtracting the downstream pressure from the upstream pressure and taking the square root of that difference. This will be our Square Root of dP variable that we can use in this linear model.
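Outside of dataPARC, the same calculated value could be sketched like this in Python. The function name and the choice to clamp a negative difference to zero are assumptions made for this illustration, not a description of how the calculated tag behaves.

```python
import math


def sqrt_dp(upstream_pressure, downstream_pressure):
    """Square root of the pressure drop, used as a linear predictor of flow."""
    dp = upstream_pressure - downstream_pressure
    # Assumption: a negative dP (sensor noise, reverse flow) is treated as zero
    # so the square root is always defined.
    return math.sqrt(max(dp, 0.0))


print(sqrt_dp(125.0, 100.0))  # 5.0
```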

Specific Gravity (SG)

We’ll use Specific Gravity as our second x. In this example we’re not sure if Specific Gravity is necessary for this model, but it’s really easy to add tags to this equation, determine their importance, and remove them if they’re not needed. We’ll include it for now.

Establish Time Periods for Evaluation

So, on the left we’ll select data from Jan 24 – Feb 3. This is the data we’ll use to build our model. On the right we’ll select data from Feb 8 – Feb 17. This is the data we’ll use to run our model against to evaluate its viability.

On the right side of this split graph, we have the same flow tag, but for a different period of time. This is how we’ll evaluate the accuracy of the model we’ve built.

It is very important to evaluate a model against a time period that was not included in the data used to build it. This is how you determine whether the model is valid going forward, because as time goes on it will be working with data that it never saw.
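If you were doing this split by hand, it might look something like the sketch below. The year is an assumption (the trend dates above don't include one), the sample values are invented, and `split_by_period` is a hypothetical helper name.

```python
from datetime import datetime

YEAR = 2021  # assumption: only used to build valid dates

# End bounds are exclusive, so Feb 4 keeps all of Feb 3 in the build set.
BUILD_START, BUILD_END = datetime(YEAR, 1, 24), datetime(YEAR, 2, 4)
EVAL_START, EVAL_END = datetime(YEAR, 2, 8), datetime(YEAR, 2, 18)


def split_by_period(samples):
    """Split (timestamp, value) samples into build and evaluation sets."""
    build = [s for s in samples if BUILD_START <= s[0] < BUILD_END]
    evaluate = [s for s in samples if EVAL_START <= s[0] < EVAL_END]
    return build, evaluate


samples = [
    (datetime(YEAR, 1, 25, 8), 41.2),
    (datetime(YEAR, 2, 2, 8), 39.8),
    (datetime(YEAR, 2, 5, 8), 40.5),   # between the two windows, so unused
    (datetime(YEAR, 2, 10, 8), 42.1),
]
build, evaluate = split_by_period(samples)
print(len(build), len(evaluate))  # 2 1
```

Because the two windows don't overlap, no sample can end up in both sets.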

Generate the Modeled Data

With dataPARC’s predictive modeling tools, building the model is as simple as adjusting some configuration settings and clicking “Create New PLS Model”. The model will be generated using the data from the tags in the trend we looked at previously. Of course, with more effort, this data can also be produced and managed in Excel.

Creating a PLS model with dataPARC

Evaluate the PLS Model

The first thing you want to do when you build a PLS model is clean up the data, or at least look for opportunities to clean up the data.

T1 vs. T2

First let’s look at the T1 vs. T2 graph. Again, we don’t want to get too deep into the math, but what we’re looking for here is a single grouping of data points within the circles on the graph. A single “clump” of data points indicates we’re looking at a single parameter, or operating regime. If it appeared we had two or more clusters of data points, it’d be a good indication we have multiple operating regimes represented in our model. In that case we’d want to go back and build distinct models to represent each regime.

Everything looks good here, though, so let’s proceed.

Looking pretty good so far.

If a lot of your data is outside these circles, it’s an indication that your model isn’t going to be very good. Maybe there are some additional tags that you need to include in the model, or maybe the time period you selected is not good.

Y to Y

Using a common Y to Y plot, we can view the original y and the predicted y plotted against each other. In this example they’re very close together and you can see that the R-squared value is ridiculously high, which we’d expect when we’re modeling Flow from the Square Root of dP.

Check out that R-squared. 0.994.
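For reference, one common way to compute R-squared from actual and predicted values is sketched below. This is a generic definition, not necessarily the exact calculation behind the plot, and the numbers are invented for the example.

```python
def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented values: every prediction is off by exactly 1.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]
print(round(r_squared(actual, predicted), 3))  # 0.992
```

A value close to 1 means the predicted y explains nearly all of the variation in the original y.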

So, you’d think with that kind of R-squared value we’d be ready to call it a day, but using dataPARC’s predictive modeling software, we like to take a look at one more thing.

Variable Importance

As expected, our Square Root of dP variable is extremely important, with a value of .942 – roughly 94% important to our model.

Looks like we did good with our Square Root of dP calculation.

If you recall, we added another tag, or variable, into the mix at the beginning – Specific Gravity (SG). Now, this was primarily to illustrate this Variable Importance feature.

As you can see, less than 6% of the model is dependent on Specific Gravity. We expected this. Specific Gravity isn’t really useful in this model, and this Variable Importance feature backs that up. To simplify our model and perhaps enable it to run faster, we’d want to eliminate Specific Gravity and any other variables that aren’t highly important.
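We haven’t described how the Variable Importance numbers are calculated, so treat the following as a rough illustration of the idea only. It assumes one simple approach: scale each coefficient by the standard deviation of its x, take the absolute value, and express each as a share of the total. The function name and the numbers are hypothetical.

```python
def importance_shares(coefficients, x_std_devs):
    """Share of each x in the total of |coefficient * standard deviation|."""
    # Scaling by standard deviation keeps a variable from looking important
    # just because of the units it happens to be measured in.
    weights = [abs(m * s) for m, s in zip(coefficients, x_std_devs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-variable model: weights are 9.0 and 1.0.
shares = importance_shares([3.0, 1.0], [3.0, 1.0])
print(shares)  # [0.9, 0.1]
```

However the shares are computed, the use is the same: variables with a small share are candidates for removal.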

Save Your PLS Model

Now that our model is complete, we’ll want to save it so we can apply it later. In dataPARC’s PARCmodel predictive modeling software, you get this little dialog here where you can put in a project name and model name.

Saving our model in PARCmodel

How to Apply a PLS Model

So, now that we’ve built our model and saved it, we’re going to want to apply it and see if it works.

Remember earlier, when we chose two time periods for evaluation? Well, now, going back to our trending application, we can import the model we built from our source data and lay that over real data from that second time period to see how accurately it would have predicted the flow for that period of time.

Our predicted data, on the right, in red, falls right in line with real historical production data.

Well, well, well. It appears we have a valid test.

We used an 11-day period in late January (the trend on the left) to create a model, and now, the predicted values of the Flow (the red line on the trend on the right) over our Feb 8 – Feb 17 evaluation period are nearly identical to the actual values from that time period. Perfect!

What is PCA in Manufacturing

PCA is one of the more common forms of predictive modeling in manufacturing. PCA stands for Principal Component Analysis. A PCA model is a way to characterize a system or piece of equipment.

A PCA model differs from a PLS model in that, with a PCA model, there is no “y” variable that you’re trying to predict. A PCA model doesn’t attempt to simulate a single variable by looking at the values of a number of other variables (x’s).

Instead, each “x” is modeled from all other x’s. A PCA model is a way of showing the relationship between all the x’s, creating a “fingerprint” of what the system looks like when it’s running.

With a PCA model, you’re trying to say “I have a system or a piece of equipment, and I want to know if it has shifted, or moved into a different operating regime.” You want to know if it is operating differently today than it was during a different period of time.

PCA Analysis Use Cases

Some potential uses for PCA models include:

Diagnosing instrument or equipment drift

For example, you may have an instrument in the field that you know scales up over time, or something that is subject to drift, like a pH meter that you have to calibrate all of the time. When reviewing the values from that instrument, it can sometimes be difficult to know if changes in values are due to drift or if they’re a symptom of more significant equipment or process issues.

If you have a period of time during which you know all of your instruments were good and your process was running optimally, you can use that as your “fingerprint”. This is what you build your PCA model from, and then the PCA statistics that you trend into the future can tell you if something is shifting.

Flagging significant process alterations

A common example here is when a manual valve that is always open, or should always be open, somehow gets closed. Since there’s no indication in a DCS or PLC that a manual valve has been closed, all the operator sees is that something is different. They don’t know what it is, but they recognize that something is different.

A PCA model can help here by automatically triggering an alarm or flagging significant changes in a process. The model can’t specifically see that the valve has been closed, but what it does see, for example, is that a pressure reading related to the flow is now different. Or, the control valve used to have x impact on flow or x impact on temperature, and it’s no longer affecting those variables.

PCA can tell you that something in the relationship between components or parts of a process is off, and it can help you get to the root cause of the issue.

How to Build a PCA Model

So, let’s take a look at building a PCA model for a pump. It’s a small system, and we’re going to set up a model to see when it deviates from its normal operating regime.

These steps will be nearly identical to those we covered in how to build a PLS model above. The one major difference is that we don’t have a y value that we’re trying to predict, so we’ll just need to select as many x variables as we need to represent this particular system.

Identify Variables

We’re going to be using the following tags (x’s) to build our pump model:

  • Amps
  • Flow
  • Speed
  • Specific Gravity (SG)
  • Vibration (Vib)
  • Total Dynamic Head (TDH)
  • Temp

Establish Time Periods for Evaluation

Again, as we did with our PLS model, we’ll have our split trend that shows the data on the left that we’ll use to build our model, and the data on the right that we’ll use to evaluate the accuracy of the model.

Source data for our model on the left, and the data we’ll check it against on the right.

Generate the Modeled Data

A couple clicks here and bam. We have our PCA model.

PARCmodel makes predictive modeling in manufacturing easy.

Evaluate the PCA Model

So, how’s our model shaping up?

T1 vs. T2

Looking at T1 vs. T2 we appear to be off to a good start. All of our data seems to be grouped pretty tightly together, so that’s a good indication we’re looking at a single operating regime here.


DModX

Now let’s look at our DModX trend. This is particular to our PCA model.

DModX represents the distance from an observation to the model in “x” space, where each x adds one dimension. In this case we have seven x’s, or seven “dimensions.” There are thousands of “observations” that make up this DModX trend.

In our DModX trend, we can see that there are a few observations that are higher than the red line, which we can think of as the point of statistical significance. When we start getting a lot of observations above this line, it’s an indication that our model isn’t very good.

In this case, we have a few points bouncing around the red line, and on occasion going above it, but this is acceptable. This is what an accurate model generally looks like in DModX.

Hotelling’s T-Squared Normalized (HT2N)

Unlike DModX, HT2N isn’t showing us how the model is performing, or how the observations fit within the model. Instead it’s showing us how the observations fit within the range of all the other x’s. HT2N is also particular to our PCA model.

For example, it looks like there was a period of time here where something in the system – maybe multiple x’s – was significantly different in range from all of the other periods of time before and after.

However, if we see a high HT2N it isn’t necessarily an indication that our model is bad. For instance, even though there were some parameters that had an unusual range, this spike in HT2N clearly falls within acceptable parameters of the corresponding DModX trend. As we see below, they fit within the model just fine.

So, sometimes it’s ok to leave a high HT2N set of data in there because you’re leaving the range of your data expanded. And at times there’s a reason you’ll want to do that.

Let’s say one of your x’s is a production rate. The “model set” models production between 500 and 800. And one day, your production rate ran above 750, near the top of that range. That might result in a spike like we see in the trend above.
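To see the difference between the two statistics, here is a deliberately tiny Python sketch: a one-component PCA on just two x’s, fit to a made-up “known good” period. The distance calculation is a simplified, unnormalized stand-in for DModX, the T-squared is not normalized the way HT2N is, and the limit (red line) calculation isn’t covered at all. All names and data are hypothetical.

```python
import math


def fit_model(rows):
    """Fit a one-component PCA to (x1, x2) rows from a known-good period.

    Assumes the two x's are already on comparable scales.
    """
    n = len(rows)
    mean1 = sum(r[0] for r in rows) / n
    mean2 = sum(r[1] for r in rows) / n
    s11 = sum((r[0] - mean1) ** 2 for r in rows) / (n - 1)
    s22 = sum((r[1] - mean2) ** 2 for r in rows) / (n - 1)
    s12 = sum((r[0] - mean1) * (r[1] - mean2) for r in rows) / (n - 1)
    # Angle of the main axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    axis = (math.cos(theta), math.sin(theta))
    scores = [(r[0] - mean1) * axis[0] + (r[1] - mean2) * axis[1] for r in rows]
    score_var = sum(t * t for t in scores) / (n - 1)
    return {"mean": (mean1, mean2), "axis": axis, "score_var": score_var}


def statistics_for(model, row):
    """Return (distance to model, T-squared) for one observation."""
    d1 = row[0] - model["mean"][0]
    d2 = row[1] - model["mean"][1]
    a1, a2 = model["axis"]
    score = d1 * a1 + d2 * a2               # position along the model axis
    distance = abs(d2 * a1 - d1 * a2)       # how far off the axis we are
    t_squared = score * score / model["score_var"]
    return distance, t_squared


# Made-up "good" period where x2 tracks x1 exactly.
model = fit_model([(-2.0, -2.0), (-1.0, -1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])

for row in [(4.0, 4.0), (1.0, -1.0)]:
    distance, t_squared = statistics_for(model, row)
    print(row, round(distance, 3), round(t_squared, 2))
```

With this data, the observation (4.0, 4.0) has an unusual range (high T-squared, about 6.4) but still sits on the model (distance about 0), which mirrors the HT2N spike described above. The observation (1.0, -1.0) is in a normal range (T-squared about 0) but breaks the relationship between the x’s (distance about 1.414), which is the kind of thing DModX picks up.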

How to Apply a PCA Model

Ok. So, we’ve created our pump model and, in our case, saved it using dataPARC’s predictive modeling software. Now we’re going to go back out to our split trend and apply the model to the timeframes we identified earlier.

We’ll use a 4-up view in our PARCview trending application, and isolate the DModX and HT2N tags in the bottom two trends.

PARCmodel automatically adds “limits” to a PCA model when it’s created, so if we turn visibility for limits on in our trending application, we can easily see where our data is going outside of our model.

With the limit data now identified, we can dig in using our favorite analytics toolkit and perform root cause analysis to determine if there’s an issue with this pump assembly.

Predictive Modeling in Manufacturing

So, there you have it. If you’re looking for good examples of applied predictive modeling in manufacturing, PLS and PCA are two common models useful in monitoring and optimizing manufacturing processes.

An engineer or operator might keep an eye on real-time dashboards or trends throughout the day, but it can be difficult to spot potential process issues in time to avoid production loss. Predictive modeling software, when combined with an alarm system, provides process manufacturers with an incredibly effective and reliable method for identifying issues before they occur – preventing unplanned downtime, reducing waste, and optimizing their manufacturing processes.


Dashboards & Displays, Data Visualization, Process Manufacturing

Most modern manufacturing processes are controlled and monitored by computer-based control and data acquisition systems. This means that one of the primary ways that an operator interacts with a process is through computer display screens. These screens may simply passively display information, or they may be interactive, allowing an operator to select an object and make a change which will then be relayed to the actual process. This interface where a person interacts with a display, and consequently the process, is called a Human-Machine Interface, or HMI.


Process Manufacturing

Overall Equipment Effectiveness, or OEE, has several benefits over simple one-dimensional metrics like machine efficiency. If you are not meeting demand and have a low OEE (equipment is underperforming) then you know you have an equipment effectiveness problem. If equipment is operating at a high OEE but not meeting customer demand, you know you have a capacity problem. Also, OEE lets you understand if you have spare capacity to keep up with changes in demand.


Dashboards & Displays, Data Visualization, Process Manufacturing, Training

New training dates have been added, so now is the time to register for your dataPARC training held in Vancouver, Washington, just across the river from beautiful Portland, Oregon. Whether you need to escape the heat of summer, the cold of winter, or just need to get away from the plant, our hands-on training is your ticket to a welcome escape. Oh, did we mention the training?


Process Manufacturing

All forms of commerce require energy. Industrial processing and manufacturing facilities tend to be the largest consumers, but even service industries such as insurance and banking require large buildings which must be heated, cooled and lit. The newest large energy consuming enterprises are data centers, which are large clusters of computers which store and serve up the data which flows through the internet. Regardless of the end use or the industry, companies strive to minimize production costs by minimizing energy consumption.


Process Manufacturing

Most people are familiar with compressing data files so that they require less memory and they are easier to send electronically. Similar concepts are popular with process data historians. With process data, compression means reducing the number of data points that are stored, while trying to not affect the quality of the data. Compression can be accomplished using one of several algorithms (swinging door, Box Car Back Slope). Each algorithm uses some criteria to eliminate data between points where there is constant change (slope), within some tolerance.


Data Visualization

The purpose of process control alarms is to use automation to assist human operators as they monitor and control processes, and alert them to abnormal situations. Proper process alarm management requires careful planning and has a significant impact on the overall effectiveness of a control system.

Incoming process signals are continuously monitored, and if the value of a given signal moves into an abnormal range, a visual and/or audio alarm notifies the operator of that condition. This seems like a simple concept, almost not worthy of a second thought, and unfortunately, sometimes the configuration of alarms in a control system doesn’t get the attention it deserves.

In this post we’ll talk about the history of process alarms in manufacturing, and discuss best practices for configuring alarms for effective process control.

Early Process Alarm Management

Before digital process control, each alarm indicator required a dedicated lamp and some physical wiring. This meant that:

  1. Due to the effort required, the need for a given alarm was carefully scrutinized, somewhat limiting the total number of alarms
  2. Once the alarm was in place, it had a permanent “home” where an operator could become comfortable with its location and meaning


The Introduction of Digital Alarms

As control systems became digital, the creation and presentation of alarms changed significantly. First, where a “traditional” control panel was many square feet in size, digital control system human machine interfaces (HMIs) consisted of a few computer monitors which displayed a representation of the process in an area more appropriately measured in square inches than square feet.

Second, creating an alarm event was a simple matter of reconfiguring some software. Multiple levels of alarms (hi & hi-hi, lo & lo-lo) could easily be assigned to a single process value. This led to an increase in the number of possible process alarm notifications.
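As a rough illustration of how little it takes to attach four alarm levels to one process value in software, consider the Python sketch below. The limit values, the level names as strings, and the function name are all hypothetical and not tied to any particular control system.

```python
def alarm_level(value, lo_lo, lo, hi, hi_hi):
    """Return the most severe alarm level reached by value, or None."""
    if value <= lo_lo:
        return "LO-LO"
    if value <= lo:
        return "LO"
    if value >= hi_hi:
        return "HI-HI"
    if value >= hi:
        return "HI"
    return None


# Hypothetical limits for a single process value.
LIMITS = {"lo_lo": 10.0, "lo": 20.0, "hi": 80.0, "hi_hi": 90.0}

for reading in (50.0, 85.0, 95.0):
    print(reading, alarm_level(reading, **LIMITS))
```

With these limits, 50.0 raises no alarm, 85.0 raises HI, and 95.0 raises HI-HI. Multiply that by every process value in a plant and it is easy to see how the number of possible alarms grew.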

Finally, when an alarm was activated, it was presented as an icon, or as flashing text on a process schematic screen, and then logged in a dedicated alarm list somewhere within the large collection of display screens. Regardless of how the alarm was presented, it lacked the consistency of location and intuitive meaning that the traditional physical lamp had.

The Dilemma With Digital Alarms

The digital alarm systems worked acceptably well for single alarms and minor upsets. But for major upsets the limited visual real estate and the need to read and mentally place each alarm created bottlenecks to acknowledging and properly responding to large numbers of alarms in a short interval of time.

If a critical component in a process fails, for example a lubrication pump on a large induced draft (ID) fan, the result can be a “flood” of alarms occurring over a short time period. The first wave of alarms is associated with the immediate failure: low lube oil pressure, low lube oil flow, and high bearing temperatures.

The second wave is associated with interlocks shutting down the fan: high inlet pressure, low air flow, and low downstream pressure. With no ID fan, the upstream boiler will soon start to shut down and generate numerous alarms, followed most likely by problems from the process or processes which are served by the boiler.

The ASM Consortium

Analyses of a number of serious industrial accidents have shown that a major contributor to the severity of the accidents was an overwhelming number of alarms that operators were not capable of understanding and properly responding to in a timely manner. As a result of these findings, in 1992 a consortium of companies including Honeywell and several petroleum and chemical manufacturers was established to study the issue of alarm management, or more generally, abnormal situation management.

The ASM Consortium, with funding from the National Institute of Standards and Technology, researched and developed a series of documents on operator situation awareness, operator effectiveness and alarm management. Since then a number of other industry groups and professional organizations, such as the Engineering Equipment and Materials Users Association in the UK and the Instrument Society of America, have also examined the issue of alarm management and issued best practices papers.


Process Alarm Management: Best Practices

The central message of these process alarm management best practices documents is that the alarm portion of a digital control system should be put together with as much care and design as the rest of the control system. It is not adequate to simply assign a high and low limit to each incoming process variable and call it good. There are a number of practices which can improve the usability and effectiveness of an alarm system. Some techniques are rather simple to implement, others are more complex and require more effort.

1. Planning

When designing a new system or evaluating an existing one, start by looking at each alarm. Evaluate whether it is really needed and whether it is set correctly. For example, a pump motor may have an alarm which sounds if the motor trips out. However, if there is also a flow sensor downstream of the pump which has an alarm on it, two alarms will register if the pump stops. Since the real effect on the process is a loss of flow, it makes sense to keep that alarm and eliminate the motor-trip alarm.

2. Prioritization

Alarms should be prioritized. Some alarms are safety related and should be presented to the operator in a manner that emphasizes their importance. High priority alarms should be presented in a fixed location on a dedicated alarm display. This allows operators to immediately recognize them and react in critical situations. It is very difficult to read, understand and quickly react to an alarm which is presented only in a scrolling list of alarms which will be continuously growing during a process upset.


3. Grouping & Suppression

Correctly identifying the required alarms and prioritizing them is a help, but these techniques alone will not stop a surge of alarms during a crisis. In order to significantly reduce the number of presented crisis alarms, methods like alarm grouping and alarm suppression are needed. As mentioned in the ID fan example above, a single point of failure can lead to several abnormal process conditions and thus several alarms.

It is possible to anticipate these patterns and create control logic which handles the situation more elegantly. In the case of the ID fan, if the inlet pressure to the fan goes high and the outlet flow drops, it makes sense to present the operator with a virtual alarm of “Fan down” rather than a dozen individual alarms, all presented within seconds of each other, that he or she has to deal with. While the operator is trying to comprehend a cluster of individual alarms to deduce that the fan is down, the upstream boiler may trip out.


Hopefully, with a single concise alarm of a lost fan, the operator can take action at the boiler and perhaps keep that unit running at reduced rate until the fan can be restored. All alarms are still registered by the system for diagnosis and troubleshooting, but only condensed, pertinent information is presented to the operator. This type of grouping and suppression can be done manually as well. If there is a process unit that is sometimes taken offline or bypassed, it makes sense to group and suppress all of the alarms associated with that unit’s operation. An operator shouldn’t have to continuously acknowledge a low flow alarm on a line that he knows has no flow in it.
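The grouping and suppression ideas above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular control system implements it: the alarm names, the unit names, the boiler alarm, and the `present_alarms` function are all hypothetical.

```python
def present_alarms(active, offline_units=frozenset()):
    """Return the alarms to show the operator.

    `active` maps alarm name -> unit. It is never modified, so every
    alarm stays registered for diagnosis and troubleshooting.
    """
    # Manual suppression: hide alarms from units marked offline or bypassed.
    shown = {name: unit for name, unit in active.items()
             if unit not in offline_units}

    # Grouping: the ID fan pattern collapses into one virtual alarm.
    fan_pattern = {"Fan inlet pressure high", "Fan outlet flow low"}
    if fan_pattern.issubset(shown):
        others = sorted(name for name, unit in shown.items() if unit != "ID fan")
        return ["Fan down"] + others
    return sorted(shown)


active = {
    "Lube oil pressure low": "ID fan",
    "Lube oil flow low": "ID fan",
    "Bearing temperature high": "ID fan",
    "Fan inlet pressure high": "ID fan",
    "Fan outlet flow low": "ID fan",
    "Boiler drum level low": "Boiler",  # hypothetical downstream alarm
}
print(present_alarms(active))
# ['Fan down', 'Boiler drum level low']

print(present_alarms({"Line 2 flow low": "Line 2"}, offline_units={"Line 2"}))
# []
```

The operator sees two lines instead of six, while the full `active` set is still available for troubleshooting afterward.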

4. Human Administration

Perhaps the most important part of alarm management is the actual human administration of the system. However a system is designed, its intent and use need to be clearly communicated to the operators who use the system. Training operators on how to use and respond to alarms is as important as good original system design. Process alarm management is a dynamic endeavor, and as operators use the system they will have feedback which will lead to design improvements. The system should be periodically audited to look for points of failure and areas of improvement. As processes change, the alarm configuration will also need to be changed. This ongoing attention to the alarm system will make it more robust and yield a system which will avert serious process related incidents.

Looking Ahead

Configuring and maintaining process alarms properly requires careful planning and has a significant impact on the overall effectiveness of a control system. Process alarm management best practices dictate that the alarm portion of a digital control system should be put together with as much care and design as the rest of the control system.


Data Visualization

One of the challenges facing industrial process manufacturing is the growing number of data sources.

Examples of these data sources could be shift reports, process data historians, laboratory information management systems, or manufacturing execution systems. Being able to easily connect disparate data sources for decision-making is a key challenge in the age of IIoT. The lack of connection between these data sources and the creation of data silos is one version of the “big data” problem we hear about.


Data Visualization

There are many different systems available for storing and analyzing manufacturing process data. A historian is a database application that provides a means of storing the data. Nowadays, historians are a commodity and sites routinely have multiple sources of data – everything from embedded historians in control systems to custom databases for specific purposes. The true value is in how the data is used, not where it is stored.