Guideline for Excel spreadsheet validation

  • Published on: Apr 19, 2024

Excel spreadsheet validation involves developing and testing spreadsheet applications and confirming that the information generated through the spreadsheet is accurate, complete, intended, and reproducible.

While working in a GMP facility, I had to use many Excel spreadsheets developed with formulas, macros, and many other functions.

As I entered raw data, these spreadsheets processed it and generated useful information. I relied on the accuracy of that information to decide whether to accept or reject a pharmaceutical batch.

We had to ensure that those Excel spreadsheets were processing GMP data accurately and reproducibly so that our decision did not adversely affect the patient’s health. 

This article will provide a practical framework and approach for the Excel spreadsheet validation used in direct or indirect GMP-related activities, empowering you in your decision-making process.

We will show how to validate Excel spreadsheets designed and used for making decisions in GMP environments, which directly impact product quality, safety, and identity of medicinal products.

Why is Excel spreadsheet validation required?

Excel spreadsheets are widely used in pharmaceutical operations, from simple data entry in the registers to complex data processing using functions and formulas.

Excel spreadsheet offers many functions to make our lives easy. We use statistical and mathematical functions routinely. Some of those are ingrained in our minds, such as auto-sum or showing the median, mode, average, range, trend, standard deviations, etc.

Inside an Excel spreadsheet, you can create Lookup functions such as ‘Vlookup,’ ‘Hlookup,’ and ‘And Index,’ or create logical functions like If, And, Or, Not, etc., to generate useful information from the initial data points.

You can also create macros, conditional formatting, data validation, pivot tables, and many other features as a handy tool for presenting data.

When Excel spreadsheets facilitate our decision-making on product quality, it is critical to validate them to ensure the information is accurate, consistent, complete, and true.

Accuracy is crucial to guarantee the data entered and the calculations performed are correct.

Any inaccuracy in the formula can lead to incorrect conclusions, severely affecting product quality.

The results generated through Excel spreadsheets must be consistent. Consistency ensures that data and calculations are uniform across the spreadsheet.

For example, if a column is programmed to generate results based on the data from two other columns, this calculation should be consistently applied across all rows.

Data integrity ensures the relationships between different data sets within the spreadsheet are preserved. For example, if a column references values in another column, all the referenced values should exist.

Lastly, data trustworthiness is also critical to complying with regulatory requirements. Validated data is more reliable and can be trusted to make informed decisions.

Regulatory requirements for Excel spreadsheet validation

Many of us regularly use Excel spreadsheets to record and manipulate GMP data (i.e., change, delete, add). Therefore, it is necessary to manage these spreadsheets to ensure the data’s continued integrity and security.

While some spreadsheets are no more than fancy calculators, others are embedded with special features and use intricate logic in Macros, which are, in effect, computer programs.

Good Automated Manufacturing Practice (GAMP) by ISPE provides a detailed framework for Excel spreadsheet validation, which is primarily adopted by international regulatory authorities.

Excel spreadsheets can be classified into GAMP categories depending on their use and contents.

GAMP categorization dictates the level of applicable qualification and validation effort, just as with other computerized systems.

Below are the GAMP categories commonly interpreted for spreadsheets.

Spreadsheets with different levels of GAMP category

GAMP 3: These spreadsheets are used solely to produce paper documents.

GAMP 4: These spreadsheets are more complex, such as those used as templates that incorporate process steps.

GAMP 5: These spreadsheets use macros, lookup tables &/or sophisticated logic, etc.

Spreadsheets that fall into GAMP category 3 guidelines do not require full validation activities but should be held under appropriate security conditions.

Spreadsheets with calculations fully explained on the spreadsheet that undergo verification with each use do not need to be fully validated.

Spreadsheets that fall into GAMP category 5, having complex formulas, macros, lookup tables, or sophisticated functions and logic, will require complete data validation effort to ensure these spreadsheets generate results free from error and bias.

Where spreadsheets are used as templates, you should maintain appropriate security conditions, including password protection and secure storage.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

How do you plan for Excel spreadsheet validation?

When you plan to validate a complex Excel spreadsheet, you must follow a framework similar to those used in computer system validation.

However, the validation scope is much narrower and specific to spreadsheet development and use.

Step 1: Develop functional requirements of Excel spreadsheet

An Excel spreadsheet’s functional requirements are prepared considering user requirements and functional specifications. The developer should write the functional requirements.

The functional requirements should be approved before the initial design of the Excel spreadsheet.

Functional requirements can also be developed retrospectively after a prototype spreadsheet has been prepared.

The functional requirements document should contain the following areas:

– The purpose of the spreadsheet application (such as calculating results, re-ordering results for management reports, etc.).

– The formulas included in the spreadsheet application, including logical formulas used for branching or calculations.

– A description of macros that will be included.

– A description of any column or row rearrangements/sorting required to format a new report from included data.

– A description of the report that will be generated.

– A statement defining electronic storage requirements for the executed spreadsheet -application (results).

– Where applicable, issues such as data formats and numerical precision, numeric rules such as rounding or floating points, flagging of limits, boundary conditions, filters, or outliers.

– Performance requirements for the application and program (i.e., calculations or reports within ‘X’ seconds, applications/program availability, etc.).

– Workflow needs such as the source of the data, a statement regarding raw data, and “who will perform what actions and make what decisions based on what data and criteria” may be required.

Step 2: Develop the Excel spreadsheet application

Sometimes, internal employees (e.g., Senior laboratory analysts) are responsible for developing spreadsheet applications.

However, you should make sure that the employees are adequately trained and qualified to work with Excel spreadsheets.

Could you make sure the employees know the internal work process, which should be correctly reflected in the new application?

The developer should be able to devise the spreadsheet parameters, such as how the table should look, data formats, data sorting parameters, adding fields to the spreadsheet, reports, formulas, and interfaces with other programs and instrumentation.

i. Consider how raw data will be entered into the spreadsheet

The spreadsheet developer must define how raw data will be entered.

Data can be fed directly into a spreadsheet from adjoining instruments, or data can be entered manually.

Raw data will be fed directly from the instrument to the spreadsheet application through the digital interface for automated operations.

If data sets are entered manually, a second individual must review all entered data to verify the correctness.

The initials of the reviewer and the date of review should be indicated.

The transfer link between the spreadsheet application and the data source must be validated if data sets are entered automatically through an electronic interface.

On the other hand, data from a spreadsheet must only be exported to a validated computer system or application if the transfer link has been validated and the importation process is approved via change control for the receiving system.

ii. Develop necessary formulas in the spreadsheet

Each formula described in the requirements documents must be embedded in the spreadsheet application.

The formulas must be checked for errors during development.

Diagnostic tools that allow error checking during development are sometimes available with spreadsheet programs.

Examples of error detection tools include:

– Cell error messages that are created when cell rules are violated

– Cells referred to by a formula in another cell may be audited for alignment, with the spreadsheet displaying tracer arrows between cells.

– When a formula returns an error, another auditing tool can be used to track it back to its source. This tool is sometimes called an error tracer.

Excel formula example
Excel formula example

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

iii. Rearrange the spreadsheet columns and rows

Spreadsheets are also used to sort and rearrange data for reports and presentations.

Spreadsheets are not limited to a single source of data. They can handle information from various programs, databases, and even instrument systems, such as reporting data from a chromatographic software system.

I’d like you to describe the procedures for sorting or rearranging the data using the spreadsheet application.

iv. Program of Macro within the Excel spreadsheet

One of the most powerful features of Excel is the ability to automate repetitive tasks.

By recording these tasks as macros, you can play them back whenever necessary, saving you time and effort.

Macros may be associated with toolbar buttons, keystrokes, and menu entries.

Manual sorting and rearranging of data are actions that might be automated. Still, if the macro is recorded, the actions must be carefully documented during development and tested during the qualification process.

Excel marcos example
Excel marcos example

v. Generate reports from the spreadsheet

As the developer, your crucial role is to create or modify the spreadsheet. This is essential for generating automated reports, which are based on the calculations performed by the spreadsheet application.

A typical report from the spreadsheet may contain some of the following attributes but are not limited to:

– The method (i.e., Laboratory Analytical) used during data creation.

– A unique identifier for the analyst and the checker responsible for the correctness of entries into the spreadsheet application.

– The date the spreadsheet application was executed.

– The version of the spreadsheet application program.

– Version control for the report.

– The unique name of the electronic version of the file.

When data is received through an electronic interface, the report must identify the instrument system or software program that communicated the data.

This information is necessary for comprehensive reporting.

Step 3: Qualifications for Excel spreadsheet validation

After you develop the spreadsheet, the file will be transferred to the validation team for qualification testing.

The validation engineer must prepare qualification test protocols for installation, operational, and performance qualifications.

i. Installation qualification of Excel spreadsheet

Installation qualification (IQ) will provide documented evidence that the Excel spreadsheet has been designed, developed, supplied, and installed following user requirements and functional specifications.

The installation qualification tests aim to demonstrate that all the Excel spreadsheet’s primary features are available as designed.

The equipment is one step closer to satisfactory operation.

The IQ test protocol will include the spreadsheet name, version number, location, and access control. During the initial installation of the secure directory, implement appropriate security restrictions.

The validation team needs to store a master copy of the spreadsheet in a secure environment.

Each time the spreadsheet is required for use on a computer in the user area, it must be opened from the read-only master copy.

Lock cells containing formulas to protect their contents. Protect the spreadsheet with a password. Maintain the access control information (including the passwords) securely.

Successful installation qualification will help you achieve the following:

– Verify that all necessary features, add-ins, or extensions are installed and operational.

– The spreadsheet is compatible with the memory, storage, processing speed, and other necessary hardware components.

– Verification that the network configurations are correct if the spreadsheet needs to access remote data or if multiple users need to collaborate on the same document

– Ensure that only authorized users can access the spreadsheet. Please verify that data is encrypted and that an audit trail is active.

– Ensures that there are systems in place to back up the spreadsheet and recover data in case of any failures

ii. Operation qualifications of Excel spreadsheet

The operational qualification (OQ) will confirm that the correctly installed spreadsheet can operate repeatedly according to its functional specifications.

An Excel spreadsheet’s operational qualification (OQ) includes the following tests.

You should establish acceptance criteria for each test and write it into the operation qualification protocol.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

1. Calculation test or functional test

Calculation tests are performed to verify that correct formulas, macros, and functions are entered.

Enter sample values in all fields included in the input range of all mathematical functions.

Use different values for each field to restrict the input to whole numbers for aggregate functions (sum, max, etc.).

The tests also verify the data input mechanisms (e.g., dropdown lists and data validation rules) are functioning correctly.

Any visualization aids, such as charts, graphs, etc., are rendered correctly and updated with data changes.

Please document the step, include a printout of the spreadsheet indicating the test values, and compare the printout to that obtained from a calculator.

Acceptance criteria for functional tests

Calculated values from the spreadsheet application must be identical to those obtained from a calculator (preferably with printout capabilities).

2. Logical values test

Logical values tests are performed to verify Boolean algebra formulas.

Logical values often refer to conditions that are evaluated as TRUE or FALSE.

In the context of Excel spreadsheets validation, you might be testing cells or formulas that rely on logical functions (like IF, AND, OR) to see if they produce the expected results under various scenarios, including extreme or boundary conditions.

Print the spreadsheet cells containing Boolean (logical values) and calculate the results manually. Print the results from the spreadsheet application calculation.

Acceptance criteria for logical value tests

For logical values (e.g., Pass/Fail or True/False), verify the displayed value is the same as the expected value, calculated manually.

3. Range/precision test

Range/precision tests are performed to verify calculations will still be accurate when extremely large or small values are entered.

Enter extremely large values at the correct precision (e.g., enter 99.99 for 0.01 precision).

For aggregate functions, enter 0 in one or more fields with non-zero values in other fields.

Could you duplicate the formula calculations using a calculator and compare the results/printouts?

Repeat this test with extremely small values and report the results.

Acceptance criteria for precision tests

Calculated values from the spreadsheet application must be identical to those calculated manually using a calculator (with correct precision and consistent with the agreed rounding principle).

4. Challenge condition (boundary and extreme condition) test

The challenge condition test is also called the Boundary and Extreme Condition Testing.

The test involves inputting values at the extreme ends of expected ranges to ensure the spreadsheet handles them appropriately.

For instance, if a cell expects a value between 1 and 100, what happens if you input 0, 101, or a text string?

The challenge condition test can help detect errors in formulas or functions that might not be evident under regular conditions.

Acceptance criteria for challenge condition

The spreadsheet should either correctly process boundary and extreme inputs or produce appropriate error messages.

No crashes or unintended behaviors should occur when such values are entered.

5. Performance test

Suppose your spreadsheet is required to handle large datasets. In that case, it’s essential to check its performance to ensure that calculations, data processing, or visualizations stay consistent and prevent the software from crashing.

Acceptance criteria for performance test

For large datasets or complex calculations, the spreadsheet should complete tasks within an acceptable timeframe (e.g., recalculations or data processing should be completed within 10 seconds). The software should not crash or freeze during these operations.

6. Error handling and recovery test

Error handling and recovery test involves introducing intentional errors to see how the spreadsheet responds.

For example, what happens if a required input is left blank?

It’s essential to check if appropriate error messages are displayed and if the spreadsheet can recover without data loss or corruption.

Acceptance criteria error handling test

When intentional errors occur, the spreadsheet should either manage them gracefully (e.g., by substituting a default value) or provide a clear error message.

No data corruption should occur due to these errors.

7. Integration and interface test

If your spreadsheet pulls data from external sources, databases, or other software, it’s essential to verify that these integrations work correctly.

This involves checking the accuracy of imported data and ensuring timely data refreshes.

Acceptance criteria for integration and interface test

Data imported from external sources should match the source data wholly and accurately.

There should be no errors or significant delays during data imports/refreshes.

8. Audit trail and version control test

Since your spreadsheets will be used in regulated environments such as GMP, it is necessary to ensure a robust audit trail is implemented for the Excel spreadsheets.

The audit trail and version control test verify that changes, edits, or deletions are appropriately logged and that a traceable history of spreadsheet versions is maintained.

Acceptance criteria for audit trail

All changes, edits, or deletions should be logged with a timestamp, user ID, and a change description.

Previous versions of the spreadsheet should be retrievable, and changes between versions should be traceable.

9. Backup and restoration test

The backup and restoration test involves intentionally deleting or corrupting the spreadsheet and trying to restore it from backups to ensure that backup mechanisms are effective.

Acceptance criteria for backup test

After intentional deletion or corruption, the spreadsheet should be fully restorable from backups.

Restored data should match the original data wholly and accurately without any loss.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

iii. Performance qualification of Excel spreadsheet 

Performance qualification (PQ) during Excel spreadsheet validation ensures that the tool consistently delivers accurate results in real-world scenarios.

By testing under actual user conditions and with genuine data sets, PQ validates that the spreadsheet meets theoretical specifications and performs reliably in daily operations.

Performance qualification (PQ) confirms that the spreadsheet fits its intended purpose.

Following is a list of performance qualification tests you may try, including the acceptance criteria.

1. End-user operation test

The end-user operation test verifies that the spreadsheet performs as expected when used by its intended users.

This would involve typical day-to-day tasks, such as data entry, calculations, generating reports, or any other regular function the spreadsheet is intended for.

Acceptance criteria

The spreadsheet must complete all standard operational tasks (e.g., calculations, report generation) without errors within defined time frames when end-users operate.

2. Long-term stability and performance test

This test verifies how the spreadsheet behaves with prolonged use.

If data is added to the spreadsheet over time, does it continue to perform well, or do issues arise as it becomes more populated?

Acceptance criteria

The spreadsheet should maintain consistent performance levels over an extended period (e.g., six months) without showing signs of lag, crashes, or errors even as data accumulates.

3. Real-world data test

This test involves entering actual, real-world data sets in the spreadsheet rather than test or synthetic data. This ensures that it can handle the nuances and complexities of genuine data.

Acceptance criteria

The spreadsheet should correctly process and display real-world data sets, maintaining data integrity and producing expected outputs or results.

4. Scalability test

The scalability test checks how the spreadsheet performs as data grows.

This applies to spreadsheets that are expected to handle increasing amounts of data over time; ensuring that performance doesn’t degrade is vital.

Acceptance criteria

The spreadsheet should not crash, freeze, or significantly slow down as data volume or complexity increases to the maximum expected level.

Calculations or data manipulations should still be completed within acceptable time frames.

5. Environmental conditions test

While this test is not common for standard office environments, the environment test is critical in certain scenarios.

For example, if the spreadsheet is used in manufacturing environments with the potential for network interruptions or power fluctuations, how does the spreadsheet behave?

Acceptance criteria

The spreadsheet should maintain consistent performance and data integrity under varying environmental conditions, such as network fluctuations or power interruptions.

Any auto-save or recovery features should work as intended.

6. User acceptance test

Users work with the spreadsheet during this test to ensure it meets their needs and expectations.

Feedback from these users can provide valuable insights into potential improvements or adjustments.

Acceptance criteria

End-users should confirm that the spreadsheet meets or exceeds their operational needs. There should be no critical feedback points that would hinder the regular use of the spreadsheet in its intended environment.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

Step 4: Execution of qualification protocols for Excel spreadsheet

After the validation, the team will develop the installation, operational, and performance qualification protocols as detailed above, execute each testing step appropriately, and record the results.

If the qualification protocol has not been followed correctly, raise a protocol deviation report.

Evaluate each deviation with in-depth root cause analysis. Identify and implement the corrective and preventive actions.

If the protocol needs to be changed, update the protocol revision register. The revised protocol must be approved before execution.

Compare all results against the pre-determined acceptance criteria. If any tests fail, document the results on a protocol deviation report with all the evidence.

Could you conduct an error analysis and make the necessary changes on the spreadsheet to resolve the failure? Typically, all tests must then be repeated.

If you find the error resulted from an incorrect formula, the spreadsheet should be used for error analysis to determine the cause of the error.

For example, Excel has error messages for cells and auditing tools for determining precedents and dependent tracers.

Could you perform an impact assessment of the error? Then, please modify the template appropriately, receive approval, and repeat all tests.

Step 5: Prepare final report on Excel spreadsheet validation

After the execution of qualification tests, prepare a final report on the Excel spreadsheet validation summarizing the testing results. Explain any deviations during the testing and provide rationales for resolving any deviations.

Could you attach all the associated records and submit the final report for review and approval?

Validated spreadsheet
Validated spreadsheet

How to manage validated Excel spreadsheets?

You must take appropriate measures to prevent unwanted modifications to validated spreadsheets that may compromise the file’s validation status.

i. Prevent unauthorized access to the validated spreadsheet

You must protect the master copy of the validated unauthorized modification. For example, the spreadsheet shall be available to authorized users as “read-only.”

Authorized users will be given access after they have been trained on the use of the spreadsheet.

Where networks are employed for the spreadsheet, it will typically be stored in a network directory that allows authorized users to access it.

In this instance, storing a copy of the spreadsheet application on the local computer’s hard drive must be prohibited.

ii. Training of users

Please train your employees on the proper use of validated spreadsheets. The objective of the training should include:

– How to use the spreadsheet application, including how to print the spreadsheet application report

– The procedure for storing spreadsheet application reports (if applicable)

iii. Storage of completed electronic files

When you enter new data in the spreadsheet, it processes it according to the formula, and the calculated results become a new electronic file.

The new electronic file should be stored under a new name in a designated area of the LAN server.

Follow these rules when storing completed electronic files.

– Files should be named using a convention approved by the quality assurance.

– The file name of the completed spreadsheet electronic report should be recorded in a notebook or other similar documentation system as a cross-reference.

– During the review of notebook or other similar documentation system data, the reviewer should ensure that the electronic file has been stored correctly.

Re-validation of Excel spreadsheet formulas

Review the spreadsheet at least once every three years to ensure the effectiveness of the validation parameters and the integrity of the formula.

Re-validation may also be triggered by installing an upgrade of the spreadsheet program (Change Control System) or by changes in the LAN (e.g., system crashes).

If changes are required in a validated spreadsheet, it should be managed by a change committee in line with the change control procedures.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

Conclusion

For many of us, Excel spreadsheets are indispensable tools in the pharmaceutical industry.

Depending on the use and criticality, some complex Excel spreadsheets require rigorous validation to ensure their outcomes are accurate, reliable, and consistent.

But what does this “validation” entail, and why is it so critical for professional applications?

Validation of Excel spreadsheets ensures that they perform as intended, are free from errors, and comply with regulatory standards.

Whether the spreadsheet is used for laboratory data analysis, batch production records, or inventory management, validation is crucial to maintaining the integrity and accuracy of the information and outcomes it presents.

It is important to develop clear and meaningful functional requirements for a complex spreadsheet before it is handed down to the experts to develop the entire spreadsheet application.

Once the spreadsheet is created using complex formulas, macros, lookup tables, or sophisticated functions and logic, it falls into the GAMP 5 category, which requires a full set of qualification tests.

For instance, installation qualification (IQ) ensures that the spreadsheet has been designed, developed, supplied, and installed according to user requirements and functional specifications.

Operational qualification (OQ) tests the spreadsheet’s functionalities under defined conditions to verify that it operates as designed.

The performance qualification (PQ) verifies that the spreadsheet consistently delivers accurate results in real-world scenarios, offering confidence in its day-to-day operations.

Once an Excel spreadsheet is validated, the users should be trained on its correct use. User access should be controlled, and the file should be stored securely to avoid accidental modifications.

A validated spreadsheet should be re-validated at least once every three years to check the effectiveness of validation parameters.

By understanding and applying this Excel spreadsheet validation approach we have established in this article, you can safeguard your work from potential pitfalls and errors, ensuring that decisions made using the spreadsheet are sound and reliable.

Picture of Author: Kazi Hasan

Author: Kazi Hasan

Kazi is a seasoned pharmaceutical industry professional with over 20 years of experience specializing in production operations, quality management, and process validation.

Kazi has worked with several global pharmaceutical companies to streamline production processes, ensure product quality, and validate operations complying with international regulatory standards and best practices.

Kazi holds several pharmaceutical industry certifications including post-graduate degrees in Engineering Management and Business Administration.

Leave a Reply

Your email address will not be published. Required fields are marked *