Data Visualization, Historian Data, Process Manufacturing, Troubleshooting & Analysis

In the process industries, optimization is the key to efficiency. And efficiency is what leads to profit – allowing manufacturers to produce more and waste less. To optimize their processes, many manufacturers use a combination of time series data historian and data visualization software. dataPARC and PI are two of the leaders in this space, and in this article we’ll compare dataPARC vs PI and highlight some of the advantages dataPARC has over PI as a process information management system.

Check out dataPARC’s real-time process data analytics tools & see how better data can lead to better decisions.

Check out PARCview

dataPARC vs PI: Similarities

dataPARC and PI have existed for decades and have large installation bases represented by major manufacturers around the world.

Both dataPARC and PI:

  • Offer a real-time data historian
  • Use a binary, cluster-index, flat file to store history
  • Have an asset structure to address the complexities of large disparate data sources
  • Offer many of the expected analytics & visualization tools: trending, graphics, reports
  • Can connect to various control systems for collecting time-series data in real-time
  • Use a store & forward function in case data connectivity is lost
  • Can work with very large tag-count systems
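The store & forward behavior listed above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how either product implements it: the `StoreAndForwardBuffer` class, the `send` callback, and the buffer size are all hypothetical.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Hold samples locally while the historian is unreachable (illustrative only)."""

    def __init__(self, send, max_samples=10_000):
        # `send` is a hypothetical callable that raises ConnectionError on failure.
        self._send = send
        # Bounded queue: once full, the oldest samples are dropped.
        self._pending = deque(maxlen=max_samples)

    def write(self, tag, timestamp, value):
        """Queue a sample, then try to deliver everything that is waiting."""
        self._pending.append((tag, timestamp, value))
        self.flush()

    def flush(self):
        """Forward queued samples in order; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False  # keep the sample and retry on the next write
            self._pending.popleft()
        return True

    def pending(self):
        """Number of samples still waiting to be forwarded."""
        return len(self._pending)
```

Samples are removed from the queue only after a successful send, so an outage delays delivery rather than losing data (up to the buffer size).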

Now, let’s dive more into their differences and see how dataPARC sets itself apart.

dataPARC vs PI: Differences


We might as well start with cost, which will be one of the key considerations when evaluating these two data historian and process data visualization toolkits.

Long story short, dataPARC’s total cost of ownership is lower when compared to other “like” industry solutions. Both the initial cost and ongoing costs are considerably lower than the PI System.

Unlimited Users

A key reason for this is dataPARC’s unlimited license model, which makes it a great fit for organizations wishing to get production data in front of decision-makers at every level of the plant without worrying about having to purchase additional licenses.

PI uses a per-user pricing model. This tends to work for small organizations with only a few people needing to access the platform, but for larger organizations or enterprise implementations the cost adds up quickly.

With dataPARC, everyone who needs access to the data can have access at no additional cost – putting the power to make data-driven decisions in the hands of every employee.

Looking for an alternative to PI’s Data Historian? Get an enterprise plant data historian at a fraction of the cost. Check out dataPARC’s PARCserver historian.

User Experience

When customers are asked about dataPARC’s top 3 to 5 benefits, ease-of-use is always near the top of the list. The reduced complexity of the dataPARC system allows even the least “computer-savvy” person to begin building content and gaining value, and results in wide adoption of the tools within an organization. 

Though there are many features to dataPARC, a new user can learn how to search tags, trend, and navigate within minutes. From there, users quickly learn they can view trend statistics, manage alarm events, export data, create displays such as X/Y Plot, Histogram, or Pareto, and much more, all from the right-click menu.

dataPARC’s trending tools have long been recognized by customers as the number 1 trend solution in the industry. dataPARC’s trend capabilities are faster and far superior to others and fit better in the practical function realm.

No other package allows for a quicker build of a trend matrix, with quick drag & drop from both the tag browser and displays.

dataPARC makes finding and trending tag data super easy.

Many organizations that were set up with the PI Historian and ProcessBook have since chosen to get dataPARC to “sit on top” of their PI historian simply for PARCview; the visualization tools and ease of use speak for themselves.

Diagnostic Analytics

As mentioned earlier, dataPARC’s trending application is considered the best in industry. Not only for its ease of use and quick access to analysis tools but for its speed as well.


dataPARC uses a deliberate data speed strategy with multiple components including an embedded Performance Data Engine (PARCpde) to speed data to the user.  The goal is to meet and exceed the user’s “speed of thought.”  PARCpde is a foundational part of the entire dataPARC system. 

Speed tests comparing dataPARC vs PI and other contemporary historians have shown dataPARC to be anywhere from 10X to 50X faster in delivering large or long-term datasets back to the user. 

Several companies have switched to dataPARC in part because of the data speed.  dataPARC also utilizes an aggregate archive and rollup archive in its architecture which greatly reduces the amount of time wasted when solving problems or investigating opportunities. 
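The general idea behind a rollup archive, pre-summarizing raw samples so long time ranges can be served from far fewer points, can be sketched as follows. This is a simplified illustration under assumed inputs (epoch-second timestamps, hourly buckets); it does not describe dataPARC's actual archive format, and the `rollup` function is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds=3600):
    """Summarize (epoch_seconds, value) samples into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - (ts % bucket_seconds)  # floor to the bucket boundary
        buckets[bucket_start].append(value)
    return {
        start: {
            "avg": mean(vals),
            "min": min(vals),
            "max": max(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    }
```

A trend spanning a year can then read a few thousand hourly summaries instead of millions of raw samples.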

From the trend, users can launch a quick statistics grid or generate a new X/Y Chart or Histogram display. Each chart will pull in the tags from the trend, so users don’t have to search for them in Tag Browser again.

The X/Y plot sets two tags up for comparison and a best fit line can be generated – linear, polynomial, etc. The formula generated from the fit can be pulled into a trend or other display. PI can also generate X/Y plots, but they are created from scratch and no best fit line is generated.
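For readers unfamiliar with best fit lines, the linear case is an ordinary least-squares fit of one tag against another. The sketch below shows that calculation in plain Python; `linear_fit` is a hypothetical helper, and the source does not say which fitting method the product uses internally.

```python
def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

The returned slope and intercept are the "formula" that could then be applied to new values of the x tag.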

Excel Add-in

dataPARC’s Excel add-in was built with a high degree of ease-of-use and speed. 

PI and dataPARC both have in-cell functions that can pull data directly into Excel. The dataPARC add-in has multiple other functions.

There is a sheet that can pull multiple tags in the same time range without dealing with formulas. Users can import tag lists from already created dataPARC displays instead of searching for the tags again.

Besides the value gained in legacy Excel add-in tools, dataPARC’s is highlighted by the following:

  • Drag groups of tags/data into Excel from multiple data sources
  • Filter data based on multiple tags values
  • Cross Correlation/R2 matrix generation
  • CUSUM & MSR charting

Additionally, users can display time series-based data from Excel into PARCview trends and displays. This can be used to trend or compare data from outside the company right next to process data.

Evaluate the top alternatives to Processbook & PI Vision in our PI Server Data Visualization Tools Buyer’s Guide.

Get the Guide

Operations Management

Real-time operations management is necessary to keep a plant running at peak efficiency and to be able to respond quickly to process excursions that result in unplanned downtime or product loss.

This is facilitated by dataPARC in a variety of ways:

  • Graphical process displays
  • KPI and Lab data dashboards
  • Manual data entry (MDE) tools
  • Automated reporting
  • Process alarms & notifications
  • & more

When comparing dataPARC vs PI, both offer the creation of dynamic, information packed graphical dashboards, but only dataPARC has the Centerline display.


Centerline is a powerful monitoring tool unique to dataPARC. It is a real-time display that reports run-based statistics for tags. The runs can be grade- or time-based, and the statistics include time average, standard deviation, Cpk, min, max, etc.

Centerline displays data for time periods or runs to ensure process conditions are the same run after run.

The purpose of a centerline display is to help determine the best operational settings for production, and to ensure those settings are normally being used during production.
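To make the run-based statistics concrete, here is a small sketch of the kind of per-run summary described above. It is an assumption-laden illustration: it uses a plain arithmetic average rather than a time-weighted one, the sample standard deviation, and the common definition of Cpk; the function names and spec limits are hypothetical and not taken from dataPARC.

```python
from statistics import mean, stdev


def run_statistics(values, lower_spec, upper_spec):
    """Summarize one run of a tag: average, spread, extremes, and Cpk."""
    avg = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two values
    if sigma > 0:
        cpk = min(upper_spec - avg, avg - lower_spec) / (3 * sigma)
    else:
        cpk = float("inf")  # no variation in this run
    return {"avg": avg, "std": sigma, "min": min(values), "max": max(values), "cpk": cpk}


def centerline_table(runs, lower_spec, upper_spec):
    """Map each run label (a grade or a time period) to its statistics."""
    return {label: run_statistics(vals, lower_spec, upper_spec) for label, vals in runs.items()}
```

Comparing the rows of such a table run after run is what shows whether the process is being operated at the same settings each time.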

Centerline is one of dataPARC’s powerful data analysis tools for which there is no PI equivalent.

Alarms and Notifications

dataPARC’s alarm and notification system can send emails, text notifications or trigger workflows when an alarm is detected or closed. Once an alarm is detected, an alarm event is created. These events can be viewed and acknowledged in a trend, centerline, graphic or alarm list. Users can acknowledge the event by assigning a reason from the reason tree and/or typing a comment to the event. Quick analysis can be done in dataPARC with the Pareto chart to determine the top reasons saved for an alarm or create a tabular report sorted by reason with all comments visible.
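The Pareto analysis of alarm reasons described above amounts to counting events per reason and ranking them. A minimal sketch, assuming the acknowledged reasons are available as a simple list of strings (the `pareto` function and the example reasons are hypothetical):

```python
from collections import Counter


def pareto(reasons):
    """Return (reason, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(reasons)
    total = sum(counts.values())
    rows, running = [], 0
    for reason, count in counts.most_common():
        running += count
        rows.append((reason, count, round(100 * running / total, 1)))
    return rows
```

For example, `pareto(["valve", "valve", "sensor", "valve", "pump"])` ranks "valve" first with 3 of the 5 events.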

Similarly, PI can create event frames and send notifications. Once event frames are detected and a reason assigned, users can see this data as a table in PI Vision, but further analysis or reporting is required to take place in the PI Excel Add-in DataLink. dataPARC’s Excel Add-in also has features to pull in Alarm event data.

More dataPARC Excel Add-in features are covered in the Excel Add-in section above.

Manual Data Entry (MDE)

dataPARC’s MDE display is quick to configure and allows users to enter and save manual data to the database rather than on a piece of paper or in Excel.

Manually entered data is represented by tags, thus they can be used in PARCview trends, dashboards, and displays like any other tag.

Need to get better data into the hands of your process engineers? Check out our real-time process analytics tools & see how better data can lead to better decisions.


When users don’t have the perfect tag to help manage a process, a calc tag or MDE is often used. dataPARC and PI are both able to perform simple calculations such as adding tags, If/Then statements, or unit conversions.
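The kinds of simple calculations mentioned above (adding tags, If/Then logic, unit conversions) are shown below as plain Python functions. Calc tags in dataPARC or PI are not written this way; the function names and the speed threshold are invented purely to illustrate the logic.

```python
def total_flow(flow_a, flow_b):
    """Adding tags: combine two flow measurements into one calculated value."""
    return flow_a + flow_b


def fahrenheit_to_celsius(temp_f):
    """Unit conversion: degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def running_state(line_speed, threshold=1.0):
    """If/Then: report whether the line is running based on its speed."""
    return "RUNNING" if line_speed > threshold else "DOWN"
```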

With PI Vision, PI no longer supports VB scripting. VB scripting opens the door to custom solutions, and dataPARC leverages it for applications such as database reads, file parsing, web service calls, and much more.

Predictive Analytics

dataPARC’s PARCmodel offers a degree of predictive analysis with PLS (Partial Least Squares) and PCA (Principal Component Analysis) modeling capabilities.


The PLS package has been described by one of the world’s top practical modeling engineers as “…bar-none, better than anything I’ve ever seen before.”  In the processing industry, one of the applications for PLS modeling is in building inferential property predictors (IPPs).

Control engineers in operating companies report that a PLS model generation for one IPP can take more than 8 hours to re-model (longer for the initial model) using multiple tools and off-line activity. dataPARC integrates it all into one tool and the re-model effort can be as little as 5 minutes.

This snappy model generation allows multiple solutions to be generated for comparison to find the best option. The speed of remodeling allows for wider application and benefit of PLS.  Practical engineering methods and even process “hunches” can now be backed with a quick validation by a PLS mathematical session in 2 to 5 minutes. 

dataPARC’s predictive modeling tools

dataPARC delivers huge time savings, better learning environment, better collaboration environment, more useful applications – these all accelerate value to the company’s key business drivers. 


PCA uses the same modeling advantages that dataPARC’s PLS offers, allowing for easy model generation.  The difference between the two modeling methods is that PLS seeks to model and mimic a single variable using adjacent variables as model inputs.  PCA doesn’t model a single variable but models a whole process. 

The value comes when comparing the current process with the modeled process.  PCA gives the user the ability to know when the current process is off (when compared to the modeled process) and identifies the “offending” process variable(s). 

PCA makes use of two parameters (available to the PLS model as well): DMODX (error from model) and HT2N (Hotelling T2 Normalized – off norm). The PCA model input variables are all graded and staff can see which variable(s) is/are causing the problem.  PCA can be used as an early warning system to help operations see a problem before it happens. 

PARCmodel is separately licensed but incorporated into PARCview and easily accessed in the trend right click menu. PI does not have similar analytics tools.

Looking to replace ProcessBook? See why PARCview is regarded as the #1 ProcessBook alternative.

Customer-Centric Development & Support

At dataPARC, above everything is the customer and their very real, timely, practical needs. dataPARC’s strategy involves a high attentiveness to the customer’s needs and solving problems quickly.

dataPARC employs many SMEs serving in key process engineering support roles for operating companies in the industry. Over the years, dataPARC’s user features and overall system architecture have been shaped by the SMEs and customers. dataPARC is built by end users for end users.

At dataPARC we sell more than software; we sell our services to help build trends, graphics, and other displays to get your system off the ground. Our engineers and support staff are available to help implement new projects and offer continual support.

With PI, to get the same displays created, customers would have to outsource to a 3rd party. dataPARC is a one stop shop.


dataPARC and PI have a lot in common; however, dataPARC has the upper hand where it counts – user experience, speed of data, and cost. dataPARC is simple, fast, and effective.

The advantages of dataPARC over PI continue to grow with every new feature and update, and those features are driven by users and customers.

Download the Guide

Discover top alternatives to PI’s ProcessBook and PI Vision analytics toolkits.

Download PDF

“Where do we start in digitizing our manufacturing operations?” one may ask. While there is no easy answer, the solution lies in starting not from the top down, but from the ground up, focusing on the digital transformation roles and responsibilities of the key people in your plant.

Digital transformation in process manufacturing is no longer just a priority; it is an essential step forward as manufacturers adapt to a more digital world. To put it simply, if you do not adjust your processes to embrace digital change, your competitors will outproduce, outshine, and outsell you (and may already have).

Integrating manufacturing data at your plant? Let our Digital Transformation Roadmap guide your way.

get the guide

Transformation Teams

Digital change has been slow but steady. PLC and DCS systems were manufacturing’s digital beginnings, and thankfully there is so much more available now to further digitize operations, minimize downtime, improve your process, enhance data management, data sharing, and reporting, and increase profitability. A truly connected enterprise will be adaptable and agile, allowing it to keep abreast of changes in the operating environment.

Plant roles play an essential part in the digitization of process manufacturing, and all can contribute to a seamless digital transformation within your facility. Each role embraces digital change and transforms the process from the inside out. By focusing on these roles and the duties and responsibilities within each of them, plant digitization can produce a well-oiled machine in which every role both depends on and benefits from the others.

“Where do we start in digitizing our operations?” one may ask. While there is no easy answer, the solution does lie in starting not from the top down, but from the ground up, with each role’s responsibilities and contributions enhancing the other, adding to and building on the next, for a comprehensive digital enterprise and solid, data-based reporting.

Integrating sources of plant data is a good place to start, along with the processes themselves becoming digitized for maximum outcomes. In this article we will focus on the various roles in the plant, their responsibilities and how each one can contribute to digital transformation.

Digital Transformation Roles & Responsibilities

The Operator

The Operator’s Role in Digital Transformation

Checking process conditions (temperatures, pressures, line speed, etc.) is an essential task for an operator. These process conditions could have readings directly on the machine, with valves or buttons to adjust as needed. With more and more digital transformation in manufacturing, these process variables are being set up with PLCs to create a digital tag. This tag can be read through an OPC DA server and visualized throughout the plant on computers in offices, control rooms, and meeting rooms. The variables can also be set up with a DCS to control the process from the control room rather than having to walk the floor to adjust speeds or valves.

The process variables need to be monitored to produce quality products. There are ranges for each process variable and additive when making a product; if these get out of range, the final product could be outside the final specification. Limits can be drawn on gauges, written in an SOP (Standard Operating Procedure), or set up as limits for alarming. These alarms could appear on either the DCS or the data visualization screen to alert the operator that a variable needs attention.
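Limit-based alarming of this kind reduces to comparing each reading with its configured range. The sketch below assumes readings and limits are held in simple dictionaries keyed by tag name; the `Limits` class, the `check_limits` function, and the tag names are hypothetical and not tied to any particular DCS or visualization product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    low: float
    high: float


def check_limits(readings, limits):
    """Return alarm messages for tags whose latest reading is outside its range."""
    alarms = []
    for tag, value in readings.items():
        lim = limits.get(tag)
        if lim is None:
            continue  # no limits configured for this tag
        if value < lim.low:
            alarms.append(f"{tag} LOW: {value} < {lim.low}")
        elif value > lim.high:
            alarms.append(f"{tag} HIGH: {value} > {lim.high}")
    return alarms
```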

To consistently make quality product, operators must communicate with the lab tech to verify the product is within spec. This communication between the lab and operators has traditionally been done verbally, by walkie-talkie, phone call, etc. To digitize this process, the lab tech enters tested values into a data visualization program or a lab information management system (LIMS) database. These values can be displayed on a dashboard with the specifications next to them. The operator can then see when values are out of spec and adjust the process, or see when values are trending up or down and adjust the process to keep the product within specification before making off-quality product.
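One simple way to express "trending up or down" is a run rule: flag a tag when its most recent lab values move in the same direction several times in a row. The sketch below is one such rule under assumed settings (a run of four values); the `is_trending` name and the run length are illustrative choices, not a method described in the source.

```python
def is_trending(values, run_length=4):
    """True if the last `run_length` values are strictly rising or strictly falling."""
    if run_length < 2:
        raise ValueError("run_length must be at least 2")
    recent = values[-run_length:]
    if len(recent) < run_length:
        return False  # not enough results yet
    diffs = [b - a for a, b in zip(recent, recent[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)
```

A dashboard could highlight a value while it is still within spec but `is_trending` returns True, prompting an adjustment before the product goes out of specification.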

Operators are also responsible for keeping track of a product and lot being produced. This can be done manually with pen and paper or entered digitally into a database.

At the end of the shift operators need to pass key information to the next shift. This can be done with a hand off meeting to verbally discuss, a physical notebook to log key points or a digitalized version of a notebook. With digitalized versions of reports there is opportunity to relay information to multiple control rooms or locations of the company’s operations at once.

The Lab Technician

The Lab Technician’s Role in Digital Transformation

Lab quality testing is an essential part of process manufacturing. Thorough testing of each batch’s quality allows for production of the scheduled product. Because other roles such as process engineer and operator rely on the outcomes of lab testing, getting the lab quality data seamlessly disseminated is essential to smooth operations.

The lab technician’s duties include testing multiple variables of the product, recording the results, and comparing the finished product to specifications, all of which may be done manually. If the lab tech is entering data into a digital system, limits can typically be saved for different products, speeding things up.

The lab tech might manually test the product and enter the results in a program, and the LIMS would flag any result that is out of spec. Alternatively, the lab tech can set up the test and a machine conducts it; the result is then fed to the LIMS, where the value is flagged if it is out of spec. Performing these tasks digitally is a tremendous time saver.

In summary, lab techs are ultimately responsible for testing the final product and passing or failing it to be sold. Digitizing these tests and the corresponding data streamlines and accelerates the entire lab test process.

On the road to digital transformation? Get our Free Digital Transformation Roadmap, a step-by-step guide to achieving data-driven excellence in manufacturing.

The Process Engineer

The Process Engineer’s Role in Digital Transformation

Process engineers, who often go by other titles such as chemical engineer, have a range of duties including product development, process optimization, documentation of SOPs, setting up automatic controls/PLCs, ensuring equipment reliability, and communicating with superintendents, operators, lab techs, maintenance managers, and customers.

Process engineers monitor the entire manufacturing process on a daily, weekly, and monthly basis to identify improvement opportunities and evaluate the condition of the assets and processes.

Most sites have an existing system for maintenance requests. A physical system may exist where staff handwrite the issue, area, and other important information and hand-deliver it to the maintenance department. Alternatively, there could be a system set up to email the maintenance department with pictures attached, or a program may be used to submit maintenance requests. Such a program would provide a unique ticket number, automated status updates, and other key information. It would also allow engineers or the maintenance department to see the request history and so identify repetitive issues, such as a part needing replacement. Digitizing maintenance can help create a preventative maintenance schedule to replace the part before it stops performing and causes sub-par product quality.

Another way for engineers to monitor the process is through data visualization. When data is stored, the history can be viewed, and users can identify irregularities, trends, and cycles in the process to help identify root cause when upsets occur. Engineers might set up their own alarms, separate from operator alarms, to keep track of events and determine if an optimization project is possible.

Process optimization and product development are important tasks for process engineers. Engineers may develop and conduct trials to continually optimize the process and develop new products. They often use the Six Sigma DMAIC (Define, Measure, Analyze, Improve, Control) method to do this. The Define step is typically completed by a stakeholder, a superintendent, or a plant manager. Once the project is defined, the engineer moves into the Measure step.

The Measure step can take many forms: physically measuring, counting, or documenting a process. Collecting the necessary data can be time-consuming. With more of the data being digitized, much of the data collection is already done.

Engineers need to collect and organize data to analyze it. Once the data is collected, it can be put into Excel, Minitab, or other programs to be analyzed. By doing comparisons and statistical analysis, with the help of process knowledge, an improvement plan can be created.

Engineers will work with operators and lab techs to work through their improvement plan. Typically, the plans will include information that the operators and lab techs will have to record and give back to the engineer to determine if an improvement was made. The plans can be printed off and handed to those involved, and the necessary data collected on sheets of paper.

If a program, graphic, or database were used, the engineer could create the improvement plan within that program, and the operator or lab tech could enter the necessary values directly, making the data accessible instantly to the engineer. After the project is complete and an improvement has been made, an SOP is written and saved.

In this role, the engineer needs to communicate this change to all necessary personnel. The SOP could be saved locally on each computer, in a shared file, on SharePoint, or as a link within a program that has versioning so users can go back and see what changes were made and when. To alert others of the changes, an email can be sent out to supervisors to communicate to their shift, or if a digital notebook is available, a message can be sent to the necessary areas with a link to the newly updated SOP.

As mentioned above, engineers can be responsible for writing and maintaining SOPs. SOPs can be stored in binders in the control room, saved on control room computers, or a shared folder. There are also programs that can save versions of documents so users can see what changed and when. Operators and lab techs would then use the SOPs when performing a task or testing. It is important for operators to be notified of changes made to the SOP. This could be the engineer sending out an email, or a program with a preset list sending updates to emails. Engineers could also have a notification set up on the operator’s computer.

The Plant Manager

The Plant Manager’s Role in Digital Transformation

Plant managers wear many hats and the hats they wear continue to multiply as plants face complexities and pressure to produce more with increased profitability.

Hiring good people – the key to running a digital forward organization is staffing with people in mind. Good, productive people run plants with data, not hunches or best guesses. They make data driven decisions that are the best for the organization and identify root causes through careful anomaly detection and analysis.

Good leaders know that to truly digitize operations at a plant you must start from the bottom and that every role is an important component to the whole and every person’s contribution important.

Ron Baldus, CTO at dataPARC, advises that “clean data” is the key to successful digital operations. What exactly does clean data mean, one might ask? Clean data is data-driven rather than hunch-driven, and it is the one version of the truth. With clean data, plant managers and those who work for them can continue to make data- and profit-driven decisions. Good data visualization software that connects all data sources is a good place to start. With this connected software, extensive reports drawing on many data sources can be run to give the plant manager a key report with important information visible. If there is a problem in the operations, this reporting can allow the plant manager to identify the problem and task their engineers and operators with getting to the source and making the necessary adjustments, all based on fact and not best guesses.

Plant managers know that there are many important moving parts to a plant operation and getting reliable data is the lifeblood of a successful, profitable operation. The more digital the plant becomes, the cleaner data flows to all departments and roles and allows troubleshooting, reporting, and forecasting to be more and more seamless.

Another advantage of digitization at the plant manager level is the transfer of skill, information, and expertise at the subject matter expert (SME) level. Many SMEs are getting close to retirement, and with them goes a wealth of information, experience, and methodology that is at risk of being lost. Through the digitization of reports and operations, the methods can be preserved and passed on to the next person assuming the role and responsibility, whether it is an operator, an engineer, or another essential role.

Looking Forward

Whether it is the operator, the engineer, the lab tech, or the plant manager, all digital transformation roles and responsibilities in manufacturing contribute to the transformation of the plant. From the bottom up, with effective communication and consistent data, downtime can be minimized, golden runs can become more common, and seamless operations can become a daily reality.

Want to Learn More?

Download our Digital Transformation Roadmap and learn what steps you can take to achieve data-driven success in manufacturing.

Integrating manufacturing data in a plant is necessary for many reasons. Among the most important is getting relevant data to various departments quickly. In doing so, downtime is reduced, anomalies are identified and corrected, and quality is improved.

So often integrations are delayed due to fears around losing data quality during integration or simply finding the time in a 24/7 environment. There are pros and cons to each integration type. In this article we will walk you through the different integrations and what to look out for as well as tips and best practices.

Integrating Historian & ERP Data

Enterprise Resource Planning (ERP) is software used by accounting, procurement, and other groups to track orders, supply chain logistics and accounting data. By adding historian data, ERP systems have a fuller picture of the comprehensive plant operations.

Integrating your historian and ERP data can provide great insight into which processes are affecting quality.

ERP users have access to more information about finished goods such as the exact time of any major production step or if there were an issue with production. For instance, if the texture of newsprint is slippery and not up to spec and because of that cannot be cut properly on the news producer’s rollers, the specific lot can be identified and the challenge of finding out which lot produced poor quality paper is no longer a roadblock.

In a nutshell, the historian to ERP integration means departments outside of production get all the data they need without engaging another resource. The challenges include a time-consuming integration in which erroneous values can have a wide-ranging impact, so double-checking values is essential.

Integrating Historian & MES Data

Connecting a historian to an MES (Manufacturing Execution System) expands the capabilities of the MES. Manufacturing execution systems are computerized systems used in manufacturing to track and document the transformation of raw materials to finished goods, an essential component of manufacturing data capture. The historian provides a historical log of all production data, rather than only current or near-past values. Being able to pull large amounts of historical data along with data from an MES when needed allows for projections that are not possible without this long-term perspective and additional data.

A relevant example of an MES to historian benefit is an ethanol plant that would like to examine seasonal (winter vs. summer) variability in fermentation rates. The historian has all this data, and the MES allows the user to pull out data only for relevant times.

Integrating MES data into a historian provides access to years and years of data and allows for long-term analysis.

Using product definitions from the MES and the comprehensive history of production runs for a given product line or product type without manual filters of all historical data is key to fast troubleshooting with this integration type. Historian to MES integrations help to reduce waste and decrease the time it takes to solve an issue. Like the Historian to ERP solution, the Historian to MES integration takes significant work and resources but the benefits are immediately evident and realized.
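The filtering described above, using MES run records to select only the relevant historian data, can be illustrated with a short sketch. It assumes the MES exposes runs as `(product, start, end)` tuples and the historian returns `(timestamp, value)` samples; the `samples_for_product` function and both data shapes are hypothetical.

```python
from datetime import datetime


def samples_for_product(
    samples: list[tuple[datetime, float]],
    runs: list[tuple[str, datetime, datetime]],
    product: str,
) -> list[tuple[datetime, float]]:
    """Keep historian samples recorded while the MES says `product` was running."""
    windows = [(start, end) for name, start, end in runs if name == product]
    return [
        (ts, value)
        for ts, value in samples
        if any(start <= ts < end for start, end in windows)
    ]
```

The same selection could then be repeated for winter and summer runs to compare the two seasons without manually filtering years of history.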

Integrating LIMS & ERP Data

A Laboratory Information Management System (LIMS) contains all testing and quality information from a plant’s testing labs. Plants often have labs for quality testing. Federal regulations and standards often dictate the test values or results on which a batch’s success and quality ultimately depend. Testing is done at various stages of production, including the final stage, which is the most important. Certificates of analysis are common documents that ensure the safety and quality of tested batches. LIMS to ERP integration is especially important for the food and beverage industry, which depends upon testing to ensure its product is safe for human consumption.

integrating LIMS and ERP data in a trend

By integrating LIMS and ERP data it’s easy to identify a specific out-of-spec batch or product run for root cause analysis.

Batch quality data from LIMS systems allows ERP users, who could be in accounting or procurement departments, to build documents and reports that share data to certify the quality of shipped product. This integrated data also gives customer reps immediate access to data about shipped product. The LIMS to ERP integration is very important because so many LIMS departments still rely on a paper trail, which can be a tremendous holdup to production. As with the historian to ERP integration, the LIMS to ERP integration must have accurate data to provide site-wide value, so double checking is necessary.

Integrating LIMS & Historian Data

Just like historical process data from assets, testing data is very useful information when troubleshooting a production issue. The LIMS, as explained earlier, stores all of the testing data from the lab. Sending LIMS data to the historian lets users view production process data and lab values together, providing a fuller picture and better analysis of the issue.

integrating LIMS and historian data in a trend

Integrating LIMS and historian data is one of the most effective ways to analyze how a process affects product quality.

For example, when paper brightness is out of spec, lab data can shed light on the part of the process that needs adjustment. Alerts within the historian can be set up, giving engineers more time to adjust the process to meet quality targets. Past testing values are useful when comparing production runs and bring awareness to patterns that production data alone may not reveal. As with any integration, LIMS to Historian requires planning, an engaged team, and milestones to check in on the progress and success of the integration.

Check out our real-time process analytics tools & see how better data can lead to better decisions.

Check out PARCview

Integrating CMMS (Computerized Maintenance Management Systems) & ERP Data

Maintenance is a large and necessary part of plant operations. Maintenance records and work order information are often stored in a Computerized Maintenance Management System (CMMS). The CMMS system has comprehensive information that by itself cannot be accessed by departments that may need to learn more details about the specifics of the maintenance.

By connecting the CMMS to an ERP system, ERP users will have access to more data about the finished product. Users can check to see if there were any maintenance issues around the time of the production. Facilities with lengthy scheduled shutdowns like an oil refinery will need to plan out how much gasoline or other fuel to keep in storage to meet their customer obligations.

integrating CMMS and ERP data in a trend

By integrating maintenance and ERP data we’re able to investigate an out-of-spec product run and note that there was a maintenance event that likely caused the issue.

Knowing about shutdowns, both planned and unplanned, allows the user to better plan out both customer orders and shipping. Anticipating the schedule for planned repairs is also useful for financial planning and forecasting. Users with access to historical work order information can better understand any issues that might come up and get a fuller view of the physical repair and its associated costs and impact. These can prove to be some of the hardest integrations simply because the data types can vary so much.

The key to a successful CMMS to ERP integration is getting the necessary leadership on board and having a detailed roadmap and plan with regular team check-ins so that obstacles can be addressed immediately.

Integrating Field Data Capture System & Historian Data

The remote nature of field data capture systems means that this data is often siloed and very difficult and slow to access. Field data is just that: data captured in the field, which often must be pieced together from manual entries, often on paper. Various roles collect this data, and it must be utilized collectively to have any value. Field data types such as temperature, quality and speed must be consistent when entered and even more so when moved on to a historian.

Though often cumbersome to collect, compile and enter, field data in a historian can be enormously empowering to an engineer. For example, oil wells in the Canadian oil sands can be 50 to 200 miles from the nearest human operator. The more the operator knows about these wells, the less time they spend traveling to check up on each one.

Integrating Field data and Historian Data

Integrating field data into a historian provides reliable access to long-term data from previously siloed wells.

Connecting field data to a historian also increases the amount of data an engineer can use during troubleshooting. The data in the field is vital to reducing downtime and managing product quality. Sharing that data with the historian gives the data a broader audience where comparison and analysis can be made, resulting in less downtime and greater productivity.

Looking Forward

When integrating manufacturing data, the overriding theme and result is digital data empowerment. When important plant data can flow seamlessly from one person, system or department to another, better decisions can be made through better analysis, which ultimately leads to better operations, less downtime and greater profitability. It is important to understand the full data management and connectivity options available and the pros and cons of each. Various brands of each solution are on today’s market. Ideally, all sources of plant data can be connected and disseminated effectively for maximum efficiency and profitability.

digital transformation guide

Want to Learn More?

Download our Digital Transformation Roadmap and learn what steps you can take to achieve data-driven success in manufacturing.

Download PDF

Dashboards & Displays, Data Visualization, Process Manufacturing, Troubleshooting & Analysis

The digital transformation – everyone and everything is a part of it in some way. In the 20th century, breakthroughs in technology allowed for the ever-evolving computing machines that we now depend upon so totally that we rarely give them a second thought. Even before the advent of microprocessors and supercomputers, there were certain notable scientists and inventors who helped lay the groundwork for the technology that has since drastically reshaped every facet of modern life.


Data Visualization

Predictive models are extremely useful in monitoring and optimizing manufacturing processes. Predictive modeling in manufacturing, when combined with an alarm system, can be used to alert teams to changes in processes or equipment performance and prevent downtime or quality issues before they occur.

A process engineer or operator might keep an eye on real-time dashboards or trends throughout the day to monitor the health of processes. Predictive modeling, when combined with a proper alarm system, is an incredibly effective method for proactively notifying teams of impending system issues that could lead to waste or unplanned downtime.

In this article, we’re going to review two examples of predictive modeling in manufacturing. First we’ll describe and build an example of a PLS model, and then we’ll describe and build an example of a PCA model. PLS vs. PCA. Why choose one or the other? We’ll cover that as well.

Identify process issues before they occur. Prevent unplanned downtime, reduce waste, & improve efficiency.

Learn More

What is PLS in Manufacturing

PLS stands for “Partial Least Squares”. It’s a linear model commonly used in predictive analytics.

PLS models are developed by modeling or simulating one unknown system parameter (y) from another set of known system parameters (x’s).

In manufacturing, for example, if you have an instrument that is sometimes unreliable, but you have a span of time in which it was very reliable, it is possible to simulate, or model, that parameter from other system parameters. So, when it moves into an unreliable state, you have a model that will approximate, or simulate, what that instrument should be reading, were it functioning normally.

PLS Model Formula

We promise we’re not going to get too deep into the math here, but this is a PLS model formula:

$y = m_1 x_1 + m_2 x_2 + \dots + m_n x_n + b$

In this formula, the single (y) is approximated from the (x’s) by multiplying each by a coefficient and adding an intercept at the end.
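As a quick illustration of the arithmetic, the sketch below evaluates that formula for one observation. The coefficients and readings are made up; in practice the coefficients come out of the model-building step rather than being typed in by hand.

```python
def predict(coefficients, intercept, xs):
    """Evaluate y = m1*x1 + m2*x2 + ... + mn*xn + b for one observation."""
    if len(coefficients) != len(xs):
        raise ValueError("need exactly one coefficient per x")
    return sum(m * x for m, x in zip(coefficients, xs)) + intercept

# Hypothetical coefficients and readings, chosen only to show the arithmetic
y = predict(coefficients=[2.0, 0.5], intercept=1.0, xs=[3.0, 4.0])
```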

PLS Analysis Use Cases

Some potential uses for PLS models include:

Simulating flow from valve position, power, or delta pressure (dP)

An example provided by one of our customers involved modeling flow from pump amps.

In this particular case, they had a condensate tank in which the flow kept reading zero on their real-time production trend, even though they knew their pump was pumping condensate.

Using dataPARC’s predictive modeling tools, they looked at the historical data and found periods of time when there was a flow reading, and they modeled the flow based on the pump amps during those same periods.

So, when the flow itself got so low that the flow meter wouldn’t register it, they still had a model of flow based on pump amps, because the pump was still pumping and registering pump amps.
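The pump-amps idea can be sketched in a few lines. This is not dataPARC’s implementation: it uses an ordinary least squares line as a stand-in for a one-variable model, and the readings are hypothetical. The key step is fitting only on periods where the flow meter was actually registering.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b for a single x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, y_bar - m * x_bar

# Hypothetical (pump amps, metered flow) pairs. A flow of 0.0 means the
# meter failed to register even though the pump was drawing current.
history = [(10.0, 21.0), (12.0, 25.0), (14.0, 29.0), (6.0, 0.0), (16.0, 33.0)]

# Train only on periods where the flow meter was actually reading
valid = [(amps, flow) for amps, flow in history if flow > 0.0]
m, b = fit_line([a for a, _ in valid], [f for _, f in valid])

# Estimate flow for the low-flow period the meter missed
estimated_flow = m * 6.0 + b
```

Note that the estimate at 6 amps is an extrapolation below the range the model was trained on, so it deserves more caution than a prediction inside that range.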

Producing discrete test results modeled from a set of continuous process measurements

For example, there may be something you only test every four to six hours. But, you’d like to know, between those tests, if you’re still approximately on-line, or still approximately the same.

If you have continuous measurements that can be used to approximate that value that you’re going to test in four to six hours, you can build a model of those discrete test results based on what those readings were when the previous test was conducted.

Those are just a couple of examples of how you can use PLS for predictive modeling in manufacturing.

How can predictive analytics work for you? Prevent unplanned downtime, reduce waste, & optimize processes with dataPARC’s predictive modeling software.

How to build a PLS Model

So, now let’s look at building a PLS model. We’ll use the example we discussed where we simulate flow using delta pressure data. First we need to identify the tags or variables we’ll be working with.

Identify Variables

Using dataPARC, we build these models from trends. Here we have a trend showing Flow (blue), Square Root of dP (green), and Specific Gravity (pink). This data is being pulled from our data historian software.

A basic trend with the three variables we’ll be using for our PLS model


Flow

Flow is the variable we want to model, or predict. It’s the “y” in the formula we described above.

Square Root of dP

Flow is related linearly to the square root of dP. Not to dP itself. So, since the PLS model is a linear model, we’ll create a calculated tag in dataPARC by subtracting the downstream pressure from the upstream pressure and taking the square root of that difference. This will be our Square Root of dP variable that we can use in this linear model.
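Outside of dataPARC, the same calculated tag could be sketched like this. The pressure values are hypothetical, and clamping negative differences to zero is an assumption added here to keep sensor noise from producing a math error; it is not something the original calculation specifies.

```python
import math

def sqrt_dp(upstream, downstream):
    """Square root of differential pressure; negative differences clamp to zero."""
    return math.sqrt(max(upstream - downstream, 0.0))

# Hypothetical pressure readings (same units for both tags)
upstream_pressure = [50.0, 66.0, 41.0]
downstream_pressure = [34.0, 41.0, 41.5]

sqrt_dp_tag = [sqrt_dp(u, d) for u, d in zip(upstream_pressure, downstream_pressure)]
```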

Specific Gravity (SG)

We’ll use Specific Gravity as our second x. In this example we’re not sure if Specific Gravity is necessary for this model, but it’s really easy to add tags to this equation, determine their importance, and remove them if they’re not needed. We’ll include it for now.

Establish Time Periods for Evaluation

So, on the left we’ll select data from Jan 24 – Feb 3. This is the data we’ll use to build our model. On the right we’ll select data from Feb 8 – Feb 17. This is the data we’ll use to run our model against to evaluate its viability.

On the right side of this split graph, we have the same flow tag, but for a different period of time. This is how we’ll evaluate the accuracy of the model we’ve built.

It is very important to evaluate a model against a time period that is not included in the dataset used to build it. This shows whether the model will be valid going forward, because as time goes on, it will be working with data it never saw.
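If you are doing this by hand, the split might look something like the sketch below. The year, the records, and the helper name are hypothetical; only the two date windows come from the example above.

```python
from datetime import date

def in_window(day, start, end):
    """True when day falls inside the inclusive window [start, end]."""
    return start <= day <= end

# Hypothetical daily records: (date, sqrt_dP, flow). The year is arbitrary.
records = [
    (date(2024, 1, 24), 4.0, 40.5),
    (date(2024, 2, 3), 5.0, 50.5),
    (date(2024, 2, 5), 4.5, 45.0),   # falls in neither window
    (date(2024, 2, 8), 4.2, 42.5),
    (date(2024, 2, 17), 4.8, 48.5),
]

# Data used to build the model
train = [r for r in records if in_window(r[0], date(2024, 1, 24), date(2024, 2, 3))]
# Data held back to evaluate the model
holdout = [r for r in records if in_window(r[0], date(2024, 2, 8), date(2024, 2, 17))]

# No record may appear in both sets, or the evaluation is not independent
assert not set(train) & set(holdout)
```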

Generate the Modeled Data

With dataPARC’s predictive modeling tools, building the model is as simple as adjusting some configuration settings and clicking “Create New PLS Model”. The model will be generated using the data from the tags in our trend we looked at previously. Of course, with more effort, this data can also be produced and managed in Excel.

Creating a PLS model with dataPARC

Evaluate the PLS Model

The first thing you want to do when you build a PLS model is clean up the data. Or at least look for opportunities to clean up the data.

T1 vs. T2

First let’s look at the T1 vs. T2 graph. Again, we don’t want to get too deep into the math, but what we’re looking for here is a single grouping of data points within the circles on the graph. A single “clump” of data points indicates we’re looking at a single parameter, or operating regime. If it appeared we had two or more clusters of data points, it’d be a good indication we have multiple operating regimes represented in our model. In that case we’d want to go back and build distinct models to represent each regime.

Everything looks good here, though, so let’s proceed.

Looking pretty good so far.

If a lot of your data is outside these circles it’s an indication that your model isn’t going to be very good. Maybe there are some additional tags that you need to include in the model, or maybe the time period you selected is not good.

Y to Y

Using a common Y to Y plot, we can view the original y and the predicted y plotted against each other. In this example they’re very close together and you can see that the R-squared value is ridiculously high, which we’d expect when we’re modeling Flow from the Square Root of dP.

Check out that R-squared. 0.994.
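For reference, R-squared compares the model’s squared errors with the total variation in the measured values. A minimal sketch with hypothetical flow numbers (not the data behind the 0.994 above):

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and modeled flow values
actual_flow = [10.0, 20.0, 30.0, 40.0]
predicted_flow = [11.0, 19.0, 31.0, 39.0]

r2 = r_squared(actual_flow, predicted_flow)
```

A value close to 1 means the predictions track the measurements closely; a value near 0 means the model does little better than always guessing the average.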

So, you’d think with that kind of R-squared value we’d be ready to call it a day, but using dataPARC’s predictive modeling software, we like to take a look at one more thing.

Variable Importance

As expected, our Square Root of dP variable is extremely important, with a value of .942 – roughly 94% important to our model.

Looks like we did good with our Square Root of dP calculation.

If you recall, we added another tag, or variable, into the mix at the beginning – Specific Gravity (SG). Now, this was primarily to illustrate this Variable Importance feature.

As you can see less than 6% of the model is dependent on Specific Gravity. We expected this. Specific Gravity isn’t really useful in this model, and this Variable Importance feature backs that up. To simplify our model and perhaps enable it to run faster, we’d want to eliminate Specific Gravity and any other variables that aren’t highly important.
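How PARCmodel calculates Variable Importance isn’t covered here, so the sketch below is only a simple stand-in to build intuition: it weights each coefficient by how much its variable actually moves, then normalizes the weights so they sum to 1. The coefficients and data are hypothetical.

```python
from statistics import pstdev

def importance_shares(coefficients, columns):
    """Share per variable: |coefficient| * std(x), normalized to sum to 1."""
    weights = [abs(m) * pstdev(col) for m, col in zip(coefficients, columns)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical fitted coefficients and training columns
sqrt_dp = [3.0, 4.0, 5.0, 6.0]
specific_gravity = [1.00, 1.02, 1.00, 1.02]

shares = importance_shares([10.0, 5.0], [sqrt_dp, specific_gravity])
```

In this made-up data Specific Gravity barely changes, so even a sizable coefficient leaves it with a tiny share, which is the same kind of signal that would justify dropping a variable.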

Save Your PLS Model

Now that our model is complete, we’ll want to save it so we can apply it later. In dataPARC’s PARCmodel predictive modeling software, you get this little dialog here where you can put in a project name and model name.

Saving our model in PARCmodel

How to Apply a PLS Model

So, now that we’ve built our model and saved it, we’re going to want to apply it and see if it works.

Remember earlier, when we chose two time periods for evaluation? Well, now, going back to our trending application, we can import the model we built from our source data and lay that over real data from that second time period to see how accurately it would have predicted the flow for that period of time.

Our predicted data, on the right, in red, falls right in line with real historical production data.

Well, well, well. It appears we have a valid test.

We used an 11-day period from late January into early February (the trend on the left) to create a model, and now, the predicted values of the Flow (the red line on the trend on the right) over the Feb 8 – Feb 17 evaluation period are nearly identical to the actual values from that time period. Perfect!

What is PCA in Manufacturing

PCA is one of the more common forms of predictive modeling in manufacturing. PCA stands for Principal Component Analysis. A PCA model is a way to characterize a system or piece of equipment.

A PCA model differs from a PLS model in that, with a PCA model, there is no “y” variable that you’re trying to predict. A PCA model doesn’t attempt to simulate a single variable by looking at the values of a number of other variables (x’s).

Instead, each “x” is modeled from all other x’s. A PCA model is a way of showing the relationship between all the x’s, creating a “fingerprint” of what the system looks like when it’s running.

With a PCA model, you’re trying to say “I have a system or a piece of equipment, and I want to know if it has shifted, or moved into a different operating regime.” You want to know if it is operating differently today than it was during a different period of time.
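To give a feel for what that “fingerprint” is, here is a minimal sketch that finds the first principal component of some known-good data using power iteration. It is a simplification: it uses two hypothetical variables, keeps one component, and only mean-centers the data, whereas real tools typically also scale each variable.

```python
def first_component(rows, iterations=200):
    """Leading principal direction of mean-centered rows via power iteration."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    # Sample covariance matrix (k x k)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    v = [1.0] * k
    for _ in range(iterations):
        # Multiply by the covariance matrix, then rescale to unit length
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

# Hypothetical known-good pump data: (amps, flow) move together
good_rows = [(10.0, 20.0), (11.0, 22.0), (12.0, 24.0), (13.0, 26.0)]
means, loading = first_component(good_rows)
```

The returned center and loading describe how the variables normally move together. New observations that no longer follow that direction are what the statistics discussed later pick up on.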

PCA Analysis Use Cases

Some potential uses for PCA models include:

Diagnosing instrument or equipment drift

For example, you may have an instrument in the field that you know scales up over time, or something that is subject to drift, like a pH meter that you have to calibrate all of the time. When reviewing the values from that instrument, it can sometimes be difficult to know if changes in values are due to drift or if they’re a symptom of more significant equipment or process issues.

If you have a period of time during which you know all of your instruments were good and your process was running optimally, you can use that as your “thumbprint”. This is what you build your PCA model from, and then your PCA statistics that you trend into the future can tell you if something is shifting.

Flagging significant process alterations

A common example here is when a manual valve that is always open, or should always be open, somehow gets closed. Since there’s no indication in a DCS or PLC that a manual valve has been closed, all the operator sees is that something is different. They don’t know what it is, but they recognize that something is different.

A PCA model can help here by automatically triggering an alarm or flagging significant changes in a process. The model can’t specifically see that the valve has been closed, but what it does see, for example, is that a pressure reading related to the flow is now different. Or, the control valve used to have x impact on flow or x impact on temperature, and it’s no longer affecting those variables.

PCA can tell you that something in the relationship between components or parts of a process is off, and it can help you get to the root cause of the issue.

How to Build a PCA Model

So, let’s take a look at building a PCA model for a pump. It’s a small system, and we’re going to set up a model to see when it deviates from its normal operating regime.

These steps will be nearly identical to those we covered in how to build a PLS model above. The one major difference is that we don’t have a y value that we’re trying to predict, so we’ll just need to select as many x variables as we need to represent this particular system.

Identify Variables

We’re going to be using the following tags (x’s) to build our pump model:

  • Amps
  • Flow
  • Speed
  • Specific Gravity (SG)
  • Vibration (Vib)
  • Total Dynamic Head (TDH)
  • Temp

Establish Time Periods for Evaluation

Again, as we did with our PLS model, we’ll have our split trend that shows the data on the left that we’ll use to build our model, and the data on the right that we’ll use to evaluate the accuracy of the model.

Source data from our model on the left, and the data we’ll check it against on the right.

Generate the Modeled Data

A couple clicks here and bam. We have our PCA model.

PARCmodel makes predictive modeling in manufacturing easy.

Evaluate the PCA Model

So, how’s our model shaping up?

T1 vs. T2

Looking at T1 vs. T2 we appear to be off to a good start. All of our data seems to be grouped pretty tightly together, so that’s a good indication we’re looking at a single operating regime here.


DModX

Now let’s look at our DModX trend. This is particular to our PCA model.

DModX represents the distance from an observation to the Model in “x” space. “X” meaning how many dimensions we have. So, in this case we have seven x’s, or seven “dimensions.” There are thousands of “observations” that make up this DModX trend.

In our DModX trend, we can see that there are a few observations that are higher than the red line, which we can think of as the point of statistical significance. When we start getting a lot of observations above this line, it’s an indication that our model isn’t very good.

In this case, we have a few points bouncing around the red line, and on occasion going above it, but this is acceptable. This is what an accurate model generally looks like in DModX.
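The “distance to the model” idea can be sketched as follows. This computes only the raw residual distance for a one-component model; whatever normalization and significance limit (the red line) a given tool applies are not reproduced, and the model values are hypothetical.

```python
def residual_distance(row, means, loading):
    """Distance from one observation to a one-component model's line."""
    centered = [x - m for x, m in zip(row, means)]
    # Score: how far along the model direction the observation sits
    score = sum(c * p for c, p in zip(centered, loading))
    # Residual: the part of the observation the model cannot explain
    residual = [c - score * p for c, p in zip(centered, loading)]
    return sum(e * e for e in residual) ** 0.5

# Hypothetical model: center and unit-length loading from a prior fit
means = [0.0, 0.0]
loading = [0.6, 0.8]

on_model = residual_distance([3.0, 4.0], means, loading)    # lies along the loading
off_model = residual_distance([4.0, -3.0], means, loading)  # perpendicular to it
```

An observation that follows the modeled relationship has a residual near zero no matter how large it is, while one that breaks the relationship shows up as a large distance.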

Hotelling’s T-Squared Normalized (HT2N)

Unlike DModX, HT2N isn’t showing us how the model is performing, or how the observations fit within the model. Instead it’s showing us how the observations fit within the range of all the other x’s. HT2N is also particular to our PCA model.

For example, it looks like there was a period of time here where there was something in the system – maybe multiple x’s – that was significantly different in range from all of the other periods of time before and after.

However, if we see a high HT2N it isn’t necessarily an indication that our model is bad. For instance, even though there were some parameters that had an unusual range, this spike in HT2N clearly falls within acceptable parameters of the corresponding DModX trend. As we see below, they fit within the model just fine.

So, sometimes it’s ok to leave a high HT2N set of data in there because you’re leaving the range of your data expanded. And at times there’s a reason you’ll want to do that.

Let’s say one of your x’s is a production rate. The “model set” models production between 500 and 800. And one day, your production rate went above 750. That might result in a spike like we see in the trend above.
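Hotelling’s T-squared is conventionally computed from an observation’s component scores, each scaled by the variance of that score in the model set. A minimal sketch with hypothetical scores follows; how the “normalized” version in HT2N is scaled is not shown here.

```python
from statistics import variance

def hotelling_t2(scores, training_scores):
    """T^2 = sum over components of score^2 / variance of that score in training."""
    t2 = 0.0
    for a, t in enumerate(scores):
        s2 = variance([row[a] for row in training_scores])
        t2 += t * t / s2
    return t2

# Hypothetical component scores from the training ("model set") period
training_scores = [(-2.0, 1.0), (0.0, -1.0), (2.0, 1.0), (0.0, -1.0)]

typical = hotelling_t2((1.0, 0.5), training_scores)
extreme = hotelling_t2((6.0, 0.5), training_scores)   # far out along component 1
```

The second observation still lies on the model (its scores are valid), it is just unusually far from the center of the training data. That is the situation described above: a high T-squared alongside an acceptable DModX.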

How to Apply a PCA Model

Ok. So, we’ve created our pump model and, in our case, saved it using dataPARC’s predictive modeling software. Now we’re going to go back out to our split trend and apply the model to the timeframes we identified earlier.

We’ll use a 4-up view in our PARCview trending application, and isolate the DModX and HT2N tags in the bottom two trends.

PARCmodel automatically adds “limits” to a PCA model when it’s created, so if we turn visibility for limits on in our trending application, we can easily see where our data is going outside of our model.

With the limit data now identified, we can dig in using our favorite analytics toolkit and perform root cause analysis to determine if there’s an issue with this pump assembly.

Predictive Modeling in Manufacturing

So, there you have it. If you’re looking for good examples of applied predictive modeling in manufacturing, PLS and PCA are two common models useful in monitoring and optimizing manufacturing processes.

An engineer or operator might keep an eye on real-time dashboards or trends throughout the day, but it can be difficult to spot potential process issues in time to avoid production loss. Predictive modeling software, when combined with an alarm system, provides process manufacturers with an incredibly effective and reliable method for identifying issues before they occur – preventing unplanned downtime, reducing waste, and optimizing their manufacturing processes.

Want to Learn More?

Download the datasheet and see how dataPARC’s predictive modeling tools can help you identify process issues before they occur.

Download PDF


Deviation analysis is a routine form of troubleshooting performed at process manufacturing facilities around the world. When speed is imperative, a robust deviation detection system, along with a good process for analyzing the resulting data, is essential for solving problems quickly.

A properly configured deviation detection system allows nearly everyone involved in a manufacturing process to collaborate and quickly identify the root causes of unexpected production issues.

In a previous post we wrote about time series anomaly detection methods, and how to set up deviation detection for your process. In this article, we’re going to be focusing on how to actually analyze the data to pinpoint the source of a deviant process.

deviation detection webinar signup

Watch the webcast to see us use deviation detection to troubleshoot process issues.

Watch the Webcast

Deviation Analysis: Reviewing the Data

In our other article about anomaly detection methods, we covered setting up deviation detection, including the following steps:

  1. Selecting tags
  2. Filtering downtime
  3. Identifying “good” operating data
  4. Identifying “bad” operating data

The fifth step is to actually analyze the data you’ve just produced, so you can identify where your problem is occurring.

But, before we get into analysis, let’s review the data we’ve produced.

The examples below show the data we’ve produced with dataPARC’s process data analytics software, but the analysis process would be similar if you were doing this in your own custom-built Excel workbook.

Selecting Tags

Here we have the tags we identified. In our case, we were able to just drag over the entire process area from our display graphic and they all ended up in our application here. We could have also added the tags manually or even exported the data from our historian and dumped it into a spreadsheet.

deviation analysis - getting the tags

We pulled data from 363 tags associated with our problematic process.

Good Data

Next, we have our “good” data. The data when our process was running efficiently. You’ll see that the values here are averages over a one-month period.

deviation analysis - good data example

Average data from a month when manufacturing processes were running smoothly.

Bad Data

This is our problem data. Narrowed down to a specific two-day period where we first recognized we had an issue.

deviation analysis - bad data example

Bad doggie! I mean… Bad data. Bad!


Methods of Deviation Detection

Again, you can refer to our article on anomaly detection methods for more details, but in this next part we’ll be using 4 different methods of analysis to try and pinpoint the problem.

The four deviation detection methods we’ll be using are:

  1. Absolute Change (%Chg) – The simplest form of deviation detection. Comparing a value against the average.
  2. Variability (COVChg) – How much the data varies or how spread out the data is relative to the average.
  3. Standard Deviation (SDChg) – A standard for control charts. Measures how much the data varies over time.
  4. Multi-Parameter (DModX) – Advanced deviation detection metric showing the difference between expected values and real data, to evaluate the overall health of the process. The ranges are often rate-dependent.

In the image below you’ll see the deviation values for each method of calculation. Here red means a positive change, and blue means a negative change.

deviation analysis methods

Our four deviation detection methods. Red is positive change in values. Blue is negative value change.

So, if we’re looking for a trouble spot within our manufacturing process, the first thing we’re going to want to do is start to look at the deviation values.

By sorting by the different detection methods, we can begin to identify some patterns. And, we can really pare down our list of potential culprits. Just an initial sort by deviation values eliminates all but about a dozen of our tags as suspects.

So, let’s look at tags where the majority of the models show high deviation values. That gives us a place to begin troubleshooting.
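If you were doing this in your own spreadsheet or script, the “majority of the models” screen might look something like the sketch below. The scores, the Feed Rate tag, and the threshold of 2.0 are all hypothetical values picked for illustration.

```python
# Hypothetical deviation scores per tag for the four methods
deviations = {
    "Cooling Water": {"%Chg": 3.1, "COVChg": 2.4, "SDChg": 2.2, "DModX": 0.8},
    "6X dT":         {"%Chg": 2.6, "COVChg": 3.9, "SDChg": 3.3, "DModX": 2.7},
    "Feed Rate":     {"%Chg": 0.4, "COVChg": 0.2, "SDChg": 0.5, "DModX": 0.3},
}

def flagged_count(scores, threshold=2.0):
    """Number of methods whose absolute deviation exceeds the threshold."""
    return sum(1 for value in scores.values() if abs(value) > threshold)

# Keep tags flagged by at least three of the four methods, worst first
suspects = sorted(
    (tag for tag in deviations if flagged_count(deviations[tag]) >= 3),
    key=lambda tag: flagged_count(deviations[tag]),
    reverse=True,
)
```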

Applied Deviation Analysis

For instance, here we have our Cooling Water tag, and in three of the four models we’re seeing that it has a fairly high deviation value. It’s a prime suspect.

deviation analysis - cooling water data

So, let’s analyze that, and take a closer look.

Need to get better data into the hands of your process engineers? Check out our real-time process analytics tools & see how better data can lead to better decisions.

Within our deviation detection application we can just select the tag and click the “trend” button to bring up the data trend for the Cooling Water tag.

Looking at the trend, it’s definitely going up, and deviating from the “good” operating conditions. But we also know our process. And we know that the cooling water comes from the river, and we know that the river temperature fluctuates with the seasons. So, we’ll add our River Temp tag to the trend, and sure enough – it looks like it’s just a seasonal change.

cooling water vs river temp image

Pairing our Cooling Water Tmp tag with our River Temp tag. Nope, that’s not it!
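Eyeballing the two lines on a trend is usually enough, but the same check can be put into numbers with a correlation coefficient. A small sketch using hypothetical monthly averages:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical monthly averages (same temperature units for both tags)
river_temp = [4.0, 6.0, 10.0, 15.0, 19.0]
cooling_water_temp = [9.0, 11.0, 15.5, 20.0, 24.5]

r = pearson(river_temp, cooling_water_temp)
```

A coefficient close to 1 supports the idea that the cooling water is simply following the river, though process knowledge is still what tells you the relationship is a sensible one.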

So, the Cooling Water isn’t our culprit. What can we look into next? This 6X dT tag looks like a problem, with multiple indications of high variation. This represents the temperature change across the sixth section of the extraction train.

deviation analysis - looking at the 6xt data

This looks like the source of our problem.

It’s likely that this is going to be our problem tag. Putting our heads together with the rest of the team, we can pretty quickly get anecdotal evidence to either confirm or deny that, say, maintenance was performed in this part of the process recently. If it’s still unclear, we can pull it up on a trend, like we did with our Cooling Water tag, and see if we are indeed seeing some erratic behavior with the values from this tag.

Looking Ahead

Really, this is routine troubleshooting that is done daily at process facilities around the world. But, when speed is imperative, and you need a quick answer for management when they’re asking why their machine is down or the product quality is out-of-spec, having a robust deviation detection system in place, and a good process for analyzing the resulting data, can really help make things clear quickly.

deviation detection webinar signup

Watch the webcast

In this recorded webcast we discuss how to use deviation detection to quickly understand and communicate issues with errant processes, and in some cases, how to identify problems before they even occur.

Watch the Webcast


One of the problems in process manufacturing is that processes tend to drift over time. When they do, we encounter production issues. Immediately, management wants to know, “what’s changed, and how do we fix it?” Anomaly detection systems can help us provide some quick answers.

When a manufacturing process deviates from its expected range, there are several problems that arise. The plant experiences production issues, quality issues, environmental issues, cost issues, or safety issues.

One or more of these issues will present itself, and the question from management is always, “what changed?” Of course, they’d really like to know exactly what to do to go and fix it, but fundamentally, we need to know what changed to put us in this situation.

Usually the culprit is either the physical equipment – maybe maintenance that’s been performed recently that threw things off – or it’s in the way we’re operating the equipment.

From a process engineer or a process operator’s perspective, we need to quickly identify what changed. We’re possibly in a situation where the plant is losing money every minute we’re operating like this, so operators, engineers, supervisors… everyone is under pressure to fix the problem as soon as possible.

In order to do this, we need to understand how the value has changed, and the frequency of those changes. Or rather, how big are the swings and how often are they occurring?

Time Series Anomaly Detection Methods

Let’s begin by looking at some time series anomaly detection (or deviation detection) methods that are commonly used to troubleshoot and identify process issues in plants around the world.

Absolute Change

time series anomaly detection - absolute change

This is the simplest form of deviation detection. For Absolute Change, we take a baseline average from a period when things are running well. Later, when things aren’t running so hot, we look back and see how far the current values have moved from that baseline average.

Absolute change is used to see if there was a shift in the process that has made the operating conditions less than ideal. This is commonly used as a first pass when troubleshooting issues at process facilities.
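
As a rough illustration, the absolute change for a single tag can be sketched in a few lines of Python. The readings below are made-up numbers for a hypothetical tag, and the function name is ours for illustration, not part of any dataPARC tool.

```python
from statistics import mean

def absolute_change(good, bad):
    """Return how far the average moved between a good period and a bad period."""
    baseline = mean(good)   # average while things were running well
    current = mean(bad)     # average during the period under investigation
    return current - baseline

# Hypothetical readings for one tag; not real plant data.
good_period = [100.0, 101.0, 99.0, 100.0]
bad_period = [104.0, 106.0, 105.0, 105.0]

shift = absolute_change(good_period, bad_period)
print(f"Average shifted by {shift:+.1f}")
```

A positive result means the tag is running higher than it did during the baseline period, and a negative result means it is running lower.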


Variability

time series anomaly detection - variability

Here we want to know if the variability has changed in some way. In this case, we’ll show the COV change between a good period and a bad period. COV (coefficient of variation) is the standard deviation divided by the average, which normalizes the variation to the size of the value. That way, a tag that runs at high values doesn’t automatically look more variable than a tag that runs at low values.

Variability charts are commonly used to identify less consistent operating conditions and perhaps more variations in quality, energy usage, etc.
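
Here is a minimal sketch of the COV comparison, assuming COV is calculated as the sample standard deviation divided by the average. The data and function names are hypothetical.

```python
from statistics import mean, stdev

def cov(values):
    """Coefficient of variation: sample standard deviation divided by the mean.

    Not meaningful when the mean is at or near zero.
    """
    return stdev(values) / mean(values)

def cov_change(good, bad):
    """Change in COV from the good period to the bad period."""
    return cov(bad) - cov(good)

# Hypothetical readings: same average, much bigger swings in the bad period.
good_period = [100.0, 101.0, 99.0, 100.0]
bad_period = [100.0, 110.0, 90.0, 100.0]

print(f"COV good: {cov(good_period):.4f}")
print(f"COV bad:  {cov(bad_period):.4f}")
print(f"Change:   {cov_change(good_period, bad_period):+.4f}")
```

Because COV is a ratio, rescaling a tag (say, changing its engineering units by a constant factor) leaves the result unchanged, which is what makes it handy for comparing tags of very different magnitudes.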

Standard Deviations

time series anomaly detection - standard deviation

Anyone who’s done control charts in the past 30 years will be familiar with standard deviations. Here we take a period of data, get the average, calculate the standard deviation, and put limits up (+/- 3 standard deviations is pretty typical). Then you evaluate where the process falls outside those limits.

Standard deviation is probably the most common way to identify how well the process is being controlled, and is used to define the operating limits.
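
The limit calculation described above might look something like this in Python. This is a sketch with hypothetical data, using the sample standard deviation and the +/- 3 standard deviation limits mentioned earlier as a default.

```python
from statistics import mean, stdev

def control_limits(baseline, n_sigma=3.0):
    """Return (lower, upper) limits at the average +/- n_sigma standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    return center - n_sigma * spread, center + n_sigma * spread

def out_of_limits(values, lower, upper):
    """Return (index, value) pairs for readings outside the limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical readings; the baseline comes from a well-controlled period.
baseline = [50.0, 51.0, 49.0, 50.0, 52.0, 48.0, 50.0, 50.0]
recent = [50.5, 49.0, 56.0, 50.0, 44.0]

lower, upper = control_limits(baseline)
print(f"Limits: {lower:.2f} to {upper:.2f}")
print("Out of limits:", out_of_limits(recent, lower, upper))
```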


Multi-Parameter

time series anomaly detection - multi-parameter

This is a more advanced method of deviation detection that we at dataPARC refer to as PCA Modelling. Here we take all the variables, put them together, and model them against each other to narrow the range. Instead of flat ranges, the resulting limits are often rate-dependent.

The benefit of PCA Modelling over the other anomaly detection methods is that it gives us the ability to narrow the window and get an operating range that is specific to the rate and other current operating conditions.
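
To show what a rate-dependent range means in practice, here is a deliberately simplified sketch. It is not PCA and not dataPARC's model: it swaps in an ordinary least-squares line fitted between one tag and the production rate, then places limits around that line instead of around a flat average. The tags and numbers are hypothetical.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y as a function of x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def rate_dependent_range(x, y, n_sigma=3.0):
    """Return a function that maps a rate to (lower, upper) limits for y."""
    slope, intercept = fit_line(x, y)
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    band = n_sigma * stdev(residuals)  # scatter around the fitted line

    def limits(rate):
        expected = slope * rate + intercept
        return expected - band, expected + band

    return limits

# Hypothetical good-period data: steam use rises with production rate.
rate = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
steam = [21.0, 39.0, 61.0, 79.0, 101.0, 119.0]

limits = rate_dependent_range(rate, steam)
lo, hi = limits(30.0)
print(f"Expected steam range at rate 30: {lo:.1f} to {hi:.1f}")
```

With this data, a steam reading of 80 at a rate of 30 falls well outside the rate-dependent range, even though it sits comfortably inside flat +/- 3 standard deviation limits calculated on steam alone. A multi-parameter model applies the same idea across many variables at once.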

Setting up Anomaly Detection

Now that we have a basic understanding of some methods for detecting anomalies in our manufacturing process, we can begin setting up our detection system. The steps below outline the process we usually take when setting anomaly detection up for our customers, and we typically advise them to take a similar approach when doing it themselves.

1. Select Your Tags

Simple enough. For any particular process area you’re going to have at least a handful of tags that you’re going to want to review to see if you can spot the problem. Find them, and, using your favorite time series data trending application (if you have one), or Excel (if you don’t), gather a fairly large set of data. Maybe a month or so.

At dataPARC, we’ve been performing time series anomaly detection for customers for years, so we actually built a deviation detection application to simplify a lot of these routine steps.

For instance, if we want, we can grab an entire process unit from a display graphic and drag it into our app without having to take the time to hunt for the individual tags themselves. Pretty cool, right?

If we just pull up the process graphic for this part of the plant…

…we can quickly compile all the tags we want to review.

2. Filter out Downtime

This is a CRITICAL step, and should be applied before you even identify your good and bad periods. In order to accurately detect anomalies in your process data, you need to filter out any downtime you may have had at your plant, because it will skew your numbers.

anomaly detection - filter downtime


dataPARC’s PARCview application allows you to define thresholds to automatically identify and filter out downtime, so if you’re using a process analytics toolkit like PARCview, that’ll save you some time. If your analytics tools or your historian don’t have this capability, you can also just filter out the downtime by hand in Excel. Regardless of how you do it, it’s a critical step.
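
If you’re working with exported data rather than a tool with built-in downtime filtering, a threshold filter is easy to sketch. The example below is a generic illustration, not how PARCview does it: it assumes each row is a timestamped set of readings and that one hypothetical tag (machine speed here) indicates whether the process was running.

```python
def filter_downtime(rows, running_tag, min_running_value):
    """Keep only rows where the running indicator is at or above the threshold."""
    return [row for row in rows if row[running_tag] >= min_running_value]

# Hypothetical exported rows; tag names and values are made up.
rows = [
    {"time": "08:00", "machine_speed": 1200.0, "steam_flow": 80.0},
    {"time": "08:01", "machine_speed": 0.0, "steam_flow": 5.0},     # down
    {"time": "08:02", "machine_speed": 150.0, "steam_flow": 20.0},  # ramping up
    {"time": "08:03", "machine_speed": 1195.0, "steam_flow": 79.0},
]

# Assumption: anything below a speed of 500 counts as down or starting up.
running_rows = filter_downtime(rows, "machine_speed", 500.0)
print([row["time"] for row in running_rows])
```

Picking the threshold is a judgment call for your process. Setting it a little above zero also screens out startup and shutdown transitions, which can skew averages just as badly as the downtime itself.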

3. Identify Good Period

Now you’re going to want to review your data. Look back over the month or so of data you pulled and identify a period of time during which everyone agrees the process was running “good”. This could be a week, two weeks… whatever makes sense for your process.

anomaly detection - good time series data

Things are running well here.

4. Identify Bad Period

Now that we have the base built, we need to find our “bad” period. We may be waiting for a bad period to occur, or we may be proactively looking for bad periods as time goes on.

anomaly detection - bad time series data

Here we’re having some trouble.

5. Analyze the Data

Yes, it’s important to understand the different anomaly detection methods, and yes, we’ve discussed the steps we need to take to build our very own time series anomaly detection system, but perhaps the most critical part of this whole process is analyzing the data after we’ve become aware of the deviations. This is how we pinpoint which tags – which parts of our process – are giving us problems.

Deviation Analysis is a pretty big topic that we’ve covered extensively in another post.
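
As a starting point, one simple way to narrow down the suspects is to rank every tag by how far its average moved between the good and bad periods, measured in good-period standard deviations. This is only one possible approach, sketched with hypothetical tag names and made-up data. It is not the full deviation analysis workflow.

```python
from statistics import mean, stdev

def rank_tags_by_shift(good, bad):
    """Rank tags by the size of their average shift, in good-period standard deviations."""
    scores = {}
    for tag, good_values in good.items():
        spread = stdev(good_values)
        if spread == 0:
            continue  # flat signal in the good period; skip to avoid dividing by zero
        scores[tag] = abs(mean(bad[tag]) - mean(good_values)) / spread
    # Largest shift first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical tags and readings for a good period and a bad period.
good = {
    "steam_flow": [80.0, 81.0, 79.0, 80.0],
    "headbox_ph": [7.0, 7.1, 6.9, 7.0],
    "machine_speed": [1200.0, 1205.0, 1195.0, 1200.0],
}
bad = {
    "steam_flow": [80.0, 82.0, 79.0, 81.0],
    "headbox_ph": [7.6, 7.5, 7.7, 7.6],
    "machine_speed": [1201.0, 1199.0, 1204.0, 1196.0],
}

for tag, score in rank_tags_by_shift(good, bad):
    print(f"{tag}: {score:.2f}")
```

In this made-up example the pH tag floats to the top of the list, which tells you where to start looking first.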

Looking Ahead

Anomaly detection systems are great for quickly identifying key process changes, and really the system should be available to people at nearly every level of your operation. For effective troubleshooting and analysis, everyone from the operator to the process engineer, maintenance, and management needs to have visibility into this data and the ability to provide input.

Properly configured, you should be able to identify roughly what your problem is, within 5 tags of the problem, in 5 minutes.

So, when management asks “what’s changed, and how do we fix it?”, just tell them to give you 5 minutes.

Dashboards & Displays, Data Visualization, News, Uncategorized

If you’re like us, you’ve likely been sent an email or told in meetings that part or much of your company staff will now work remotely. Testing for remote computer access and data volume traffic are ongoing as plans are being worked out for this new structure. For most companies, that means VPN or other remoting methods. Virtual meetings are replacing face-to-face ones and pseudo to full quarantines are on the rise. Phone conversations will go on but this won’t fully suffice to cover staffing roles. And besides, you’re talking to neighbors and wondering if you should take one more trip to the grocery store. In the midst of all the chaos, your company still needs you to not only do your job but to excel at it.


Dashboards & Displays, Data Visualization, Process Manufacturing

Most modern manufacturing processes are controlled and monitored by computer-based control and data acquisition systems. This means that one of the primary ways that an operator interacts with a process is through computer display screens. These screens may simply passively display information, or they may be interactive, allowing an operator to select an object and make a change which will then be relayed to the actual process. This interface where a person interacts with a display, and consequently the process, is called a Human-Machine Interface, or HMI.