Data Integrity Is of Prime Importance

Written by Brian Crandall on January 06, 2021

Data integrity is of prime importance to companies in the process industries because data is valuable and therefore must be managed and protected. This encompasses information security (infosec), which covers issues like authentication and authorization so that only the right people can get into the system and see the data they are entitled to see.

Tracking changes to calculations on data is another key factor, especially in the regulated pharmaceutical, food, and beverage industries. In these environments, which often store what is referred to as validated data, administrators must provide evidence that data is used properly to make decisions before releasing a production run, changing parameters during the manufacturing process, or making other adjustments.

The main challenge in addressing these and other issues is providing secure access to live, streaming data while also providing controls for traceability and auditing. Seeq provides the required features, along with IT architecture recommendations, to meet the needs of infosec, Good Manufacturing Practices (GMP), validation, and overall data integrity.

Data Integrity Approach Recommendations

Seeq recommends one of two approaches to achieving acceptable data integrity. The first is a Single-System Architecture, where the engineering environment exists on the same system as the GMP environment. The engineering system, sometimes called the production system, is used for ad-hoc creation of content. The second is a Two-System Architecture, with one Seeq Server for engineering and a separate, validated Seeq Server for GMP decisions. The following table lists the aspects of each option, along with the pros and cons.


Description

  Single System: Content is managed on a single Seeq Server. Seeq’s folder structure and authorization system are used to provide a separation between ad-hoc content and GMP content for auditing.

  Two-System: Content is managed on two Seeq Servers. Ad-hoc, engineering content is moved to the GMP server using a Standard Operating Procedure (SOP). Content is audited on the GMP server only.

Architecture

  Single System: Single Seeq Server with data connections to the source data.

  Two-System: Two Seeq Servers with independent, but similar, connections to the source data. Requires two mirrored sets of Seeq Remote Agents.

System Load

  Single System: High, with frequent access by engineering and GMP.

  Two-System: Mixed, with frequent access by engineering and sporadic access by GMP.

SOP Focus

  Single System:
  1. Managing user credentials, authentication, and authorization.
  2. Managing user content within Seeq’s folders, e.g. Engineering, GMP.

  Two-System:
  1. Managing user credentials, authentication, and authorization.
  2. Managing user content between the Seeq Servers: Engineering and GMP.

Validation Controls

  Single System: Separate folder and user login credentials for GMP content. Audit Trail enabled for system.

  Two-System: Separate login credentials on each server. Audit Trail enabled on GMP server only.

Pros

  Single System:
  1. Simple architecture
  2. Less IT maintenance

  Two-System:
  1. Audit Trail reports are very simple; few changes are made on the GMP system.

Cons

  Single System: System is critical; 24/7 uptime required.

  Two-System: Increased IT support for two systems.

Table 1: Data Integrity System Comparison

The decision on which architecture approach is best depends on the company’s internal IT support, QA preferences, and other factors. The above table acts as a quick guide to start the discussions. More details on each approach are provided below.

Single System Details

The engineering and GMP content exist on the same system. The barrier between the ad-hoc and GMP-approved Seeq items (Analyses, Topics, Projects) is provided by Seeq’s folders with proper user authorization. 

Figure 1: Single System Architecture for Data Integrity

After the Seeq server is operational, SOPs are used to move content from the engineering (ENG) folders to the GMP folders in a manner that maintains the integrity of the GMP content. Each folder could optionally have a separate login, even for the same person at the company. With this approach, when a “GMP User” logs in, they will see only the GMP-approved content. GMP content is not usually modified, which controls versioning and ensures compliance. If content needs modification, GMP users can copy, move, and update existing items as new GMP content.

A common SOP example is a Workbench Analysis created by an engineering user and deemed useful for GMP decisions. First, the content is copied or moved to a “GMP Proposal” folder. Both engineering and GMP users can access and review this content. Next, the GMP user approves this content by moving it to the GMP folder. At this point, the Analysis is locked down with the Audit Trail enabled, a system-wide feature.
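
The promotion flow in this SOP can be sketched as a small state machine. This is a minimal illustration in plain Python, not Seeq's API. The folder names follow the example above, while the role names, the `ALLOWED_MOVES` table, the `Analysis` class, and the `move` function are hypothetical. It also assumes that either kind of user may propose content, which the SOP example does not specify.

```python
from dataclasses import dataclass, field

# Hypothetical SOP rules: which role may move content between which folders.
ALLOWED_MOVES = {
    ("Engineering", "GMP Proposal"): {"engineer", "gmp_user"},
    ("GMP Proposal", "GMP"): {"gmp_user"},
}


@dataclass
class Analysis:
    name: str
    folder: str = "Engineering"
    locked: bool = False
    history: list = field(default_factory=list)


def move(analysis: Analysis, target: str, user: str, role: str) -> None:
    """Move an analysis to another folder if the SOP allows it for this role."""
    if analysis.locked:
        raise PermissionError(f"{analysis.name} is locked GMP content")
    allowed_roles = ALLOWED_MOVES.get((analysis.folder, target))
    if allowed_roles is None or role not in allowed_roles:
        raise PermissionError(f"{role} may not move {analysis.folder} -> {target}")
    # Record who moved what, then apply the move.
    analysis.history.append((user, analysis.folder, target))
    analysis.folder = target
    # Content that reaches the GMP folder is treated as locked down.
    analysis.locked = target == "GMP"


analysis = Analysis("Batch release check")
move(analysis, "GMP Proposal", user="alice", role="engineer")
move(analysis, "GMP", user="bob", role="gmp_user")
```

In this sketch an engineer cannot skip the proposal step or approve content, and any further move of the approved item raises `PermissionError`, mirroring the idea that GMP content is not modified in place.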

Two-System Details

The Engineering and GMP Seeq Systems independently connect to the underlying data sources, which should be validated for any proposed GMP content built from these data sources. Seeq provides live connections that query the data sources on demand rather than duplicating the validated data. This limits the validation effort to the connection to the validated data source, with no duplicate databases to validate and manage.

Due to Seeq’s deployment methods, this requires separate remote agents on-premises because a remote agent can connect to only a single Seeq Server. Sometimes, the data sources are split between engineering and GMP as well.

In either case, there must be a set of identical mirrored signals, conditions, scalars, and assets so that investigations on one system can be cross-referenced with the other. See the figure below for a high-level two-system Seeq architecture with connected data sources.
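
One way to reason about the mirroring requirement is as a set comparison between the item inventories of the two servers. The sketch below is plain Python and does not call Seeq. The `mirror_gaps` function, the `(type, name)` item keys, and the example items are all hypothetical, and how the inventories are exported from each server is left open.

```python
def mirror_gaps(engineering_items: set, gmp_items: set) -> dict:
    """Return items present on one server but missing on the other."""
    return {
        "missing_on_gmp": sorted(engineering_items - gmp_items),
        "missing_on_engineering": sorted(gmp_items - engineering_items),
    }


# Made-up inventories; each item is identified by (type, name).
engineering = {
    ("signal", "Reactor Temp"),
    ("signal", "Reactor Pressure"),
    ("condition", "Batch Active"),
}
gmp = {
    ("signal", "Reactor Temp"),
    ("condition", "Batch Active"),
}

gaps = mirror_gaps(engineering, gmp)
print(gaps)
```

An empty result in both lists would indicate that the two inventories mirror each other under this simplified notion of identity.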

Figure 2: Two-System Architecture for Data Integrity

After the Seeq servers are operational, the users must have an SOP to move content between them. A common example is a Workbench Analysis created on the Engineering Server that is useful for executing a GMP batch release.

First, there should be a means to review and approve content. Next, Seeq provides a mechanism for the user, with the proper authentication and authorization, to copy the Analysis to the GMP Server. From there it connects to the underlying, live data using the proper mapping so that the same signals are used as in the Engineering Server. At this point, on the GMP Server, access to the Workbench Analysis is locked down with the Audit Trail enabled. Again, the data sources themselves could be separate, but Seeq can do this mapping at the time of the copy. 
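
The mapping step can be illustrated with a simple lookup that translates the signals referenced by an Analysis from their Engineering identifiers to the mirrored GMP identifiers. This is a conceptual sketch in plain Python, not Seeq's copy mechanism. The `remap_signals` function and the identifier format are hypothetical. Refusing the copy when a signal has no mapping is a design assumption of this sketch, chosen so that GMP content never points at an unmapped source.

```python
def remap_signals(signal_ids: list, signal_map: dict) -> list:
    """Translate Engineering signal IDs to GMP IDs; fail if any is unmapped."""
    unmapped = [s for s in signal_ids if s not in signal_map]
    if unmapped:
        raise KeyError(f"No GMP mapping for: {unmapped}")
    return [signal_map[s] for s in signal_ids]


# Made-up identifiers for signals mirrored on both servers.
signal_map = {
    "eng/TI-101": "gmp/TI-101",
    "eng/PI-205": "gmp/PI-205",
}

gmp_ids = remap_signals(["eng/TI-101", "eng/PI-205"], signal_map)
print(gmp_ids)
```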

Figure 3: Moving Content Between Seeq Servers

Once the content is moved to the GMP Server, the quality team can use the Audit Trail to track the list of calculation changes to meet validation requirements. Also, the user roles can be tightened so that changes are limited to individuals with proper access. In this way, Seeq can be used to meet infosec and data integrity requirements for advanced analytics.
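
As a rough illustration of how a quality team might review calculation changes, the sketch below groups change records by item for a review period. It is plain Python under stated assumptions: the record fields (`item`, `user`, `date`, `change`), the example records, and the `changes_by_item` function are hypothetical and do not represent the actual format of Seeq's Audit Trail.

```python
from collections import defaultdict
from datetime import date

# Made-up change records standing in for an exported audit report.
records = [
    {"item": "Batch Release Analysis", "user": "bob",
     "date": date(2021, 1, 4), "change": "formula edited"},
    {"item": "Batch Release Analysis", "user": "carol",
     "date": date(2020, 12, 15), "change": "condition added"},
    {"item": "Yield Summary", "user": "bob",
     "date": date(2021, 1, 5), "change": "signal replaced"},
]


def changes_by_item(records: list, since: date) -> dict:
    """Group changes made on or after `since` by item, oldest first."""
    summary = defaultdict(list)
    for record in sorted(records, key=lambda r: r["date"]):
        if record["date"] >= since:
            summary[record["item"]].append(
                (record["date"], record["user"], record["change"])
            )
    return dict(summary)


review = changes_by_item(records, since=date(2021, 1, 1))
```

Because few changes are expected on the GMP system, a summary like this should stay short, which is the point made in the comparison table about simple Audit Trail reports.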

The Outlook on Data Integrity Moving Forward

One of the biggest challenges for data integrity in highly integrated software systems is enabling validation and ad-hoc analysis in live, streaming systems. Gone are the days of setting up completely different, disconnected systems for data validation. Such an arrangement is highly onerous due to the need to investigate and audit based on constantly updating data sources. 

At Seeq, we have two recommended approaches, each with pros and cons, to meet the needs of our customers’ IT and QA organizations. 

We would like to hear opinions from our community of Seeq users. Do you feel like these approaches are sound? Any comments or changes? Please e-mail me at