Guideline for the validation of Excel spreadsheets

  • Published on: Dec 18, 2017

Validation of Excel spreadsheets involves the development and testing of spreadsheet applications and confirming that the information generated through the spreadsheet is accurate, complete, intended, and reproducible.

In this article, we will provide a framework and approach for the validation of Excel spreadsheets used in direct, or indirect, GMP-related activities.

We will show how to validate Excel spreadsheets that are designed and used for making decisions in GMP environments which directly impact product quality, safety, and identity of medicinal products.

Why is validation of Excel spreadsheets required?

Excel spreadsheets are widely used in pharmaceutical operations from simple data entry in the registers to complex data processing using functions and formulas.

Excel spreadsheet offers to use many functions to make our life easy. We use statistical and mathematical functions routinely. Some of those are ingrained in our mind such as auto-sum or to show median, mode, average, range, trend, standard deviations, etc.

Inside an Excel spreadsheet, you can create Lookup functions such as ‘Vlookup’, ‘Hlookup’, ‘Index’, or create logical functions like If, And, Or, Not, etc. to generate useful information from the initial data points.

You can also create macros, conditional formatting, data validation, pivot tables, and many other features as a handy tool for presenting data.

When Excel spreadsheets are used to facilitate our decision-making on product quality it is critical to validate those Excel spreadsheets to ensure the information is accurate, consistent, complete, and true.

Accuracy is crucial to guarantee that the data entered, and the calculations performed are correct. Any inaccuracy in the formula can lead to incorrect conclusions, which can have severe repercussions on product quality.

The results that you generate through Excel spreadsheets must be consistent. Consistency ensures that data and calculations are uniform across the spreadsheet.

For example, if a column is programmed to generate results based on the data from two other columns, this calculation should be consistently applied across all rows.

Data integrity ensures the relationships between different sets of data within the spreadsheet are preserved. For example, if there is a column that references values in another column, all the referenced values should exist.

Lastly, the trustworthiness of data is also critical to comply with regulatory requirements. Validated data is more reliable and can be trusted for making informed decisions.

Regulatory requirements for validation of Excel spreadsheets

Many of us regularly use spreadsheets to record and manipulate GMP data (i.e., change, delete, add). Therefore, there is a necessity to manage these spreadsheets in a manner to ensure the continued integrity and security of that data.

While some spreadsheets are no more than fancy calculators’ others are embedded with special features and use intricate logic in the form of Macros that are in effect computer programs.

Good Automated Manufacturing Practice (GAMP) by ISPE provides a detailed framework for the validation of Excel spreadsheets which is primarily adopted by the international regulatory authorities.

Excel spreadsheets can be classified into GAMP categories depending on the use and contents of the spreadsheet. GAMP categorization dictates the level of qualification and validation effort that is applicable just as with other computerized systems.

Below are the GAMP categories commonly interpreted for spreadsheets.

Spreadsheets with different levels of GAMP category

GAMP 3: These spreadsheets are used solely to produce paper documents.

GAMP 4: These spreadsheets are more complex such as those used as templates incorporating process steps.

GAMP 5: These spreadsheets use macros, lookup tables &/or sophisticated logic, etc.

Spreadsheets that fall into GAMP category 3 guidelines do not require full validation activities but should be held under appropriate security conditions.

Spreadsheets that have calculations that are fully explained on the spreadsheet, and which undergo verification with each use do not need to be fully validated.

Spreadsheets that fall into GAMP category 5 having complex formulas, macros, lookup tables, or sophisticated functions and logic will require complete data validation effort to ensure these spreadsheets generate results that are free from error, and bias.

Where there are spreadsheets that are used as templates, you should maintain an appropriate level of security conditions, including password protection and secure storage.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

How do you validate an Excel spreadsheet?

When you validate a complex Excel spreadsheet you have to follow a specific framework similar to those in computer system validation. However, the scope of validation is much narrower and specific to spreadsheet development and use.

Step 1: Develop functional requirements of Excel spreadsheet

The functional requirements of an Excel spreadsheet are prepared keeping in mind both user requirements and functional specifications. The functional requirements should be written by the developer.

The functional requirements should be approved prior to the initial design of the Excel spreadsheet.

Functional requirements can also be developed retrospectively after a prototype spreadsheet has been prepared.

The functional requirements document should contain the following areas:

– The intended purpose of the spreadsheet application (such as to calculate results, to re-order results for management reports, etc.)

– The formulas that will be included in the spreadsheet application, including logical formulas used for branching or for calculations.

– A description of macros that will be included.

– A description of any column or row rearrangements/sorting that will be required to format a new report from included data.

– A description of the report that will be generated.

– A statement defining electronic storage requirements for the executed spreadsheet -application (results).

– Where applicable, issues such as data formats and numerical precision, numeric rules such as rounding or floating points, flagging of limits, boundary conditions, filters, or outliers.

– Performance requirements for the application and program (i.e., calculations or reports within ‘X’ seconds, applications/program availability, etc.).

– Workflow needs such as the source of the data, a statement regarding raw data, and “who will perform what actions and make what decisions based on what data and criteria” may be required.

Step 2: Develop the Excel spreadsheet application

Sometimes internal employees are given the responsibility to develop the spreadsheet applications (e.g., Senior laboratory analysts).

However, you should make sure that the employees are adequately trained and qualified to work with Excel spreadsheets. Ensure the employees have sufficient knowledge of the internal work process which should be correctly reflected on the new application.

The developer should be able to devise the spreadsheet parameters, such as the way the table should look, data formats, data sorting parameters, the addition of fields to the spreadsheet, reports, formulas, and interfaces with other programs and instrumentation.

i. Consider how raw data will be entered into the spreadsheet

The spreadsheet developer must define how raw data will be entered.

Data can be fed directly into a spreadsheet from adjoining instruments or data can be entered manually.

For automated operations, raw data will be fed directly through the digital interface from the instrument to the spreadsheet application.

If data sets are entered manually, a second individual must review all entered data to verify the correctness. The initials of the reviewer and the date of review should be indicated.

If data sets are entered automatically through an electronic interface, then the transfer link between the spreadsheet application and the source of the data must be validated.

On the other hand, data from a spreadsheet must not be exported to a validated computer system or application unless the transfer link has been validated and the importation process is approved via change control for the receiving system.

ii. Develop necessary formulas in the spreadsheet

Each formula described in the requirements documents must be embedded in the spreadsheet application.

The formulas must be checked for errors during development.

Diagnostic tools that allow error checking during development are sometimes available with spreadsheet programs.

Examples of error detection tools include:

– Cell error messages that are created when cell rules are violated

– Cells that are referred to by a formula in another cell may be audited for alignment, with the spreadsheet displaying tracer arrows between cells.

– When a formula returns an error, another auditing tool can be used to track it back to its source. Sometimes called error tracer.

Excel formula example
Excel formula example

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

iii. Rearrange the spreadsheet columns and rows

Spreadsheets are also used to sort and rearrange data for reports and presentations.

The source of the data may be from another program or database. The source can also be an instrument system such as reporting data from a chromatographic software system.

The procedures to be used for subsequent sorting or rearranging of the data via the spreadsheet application should be described.

iv. Program of Macro within the Excel spreadsheet

You can automate the frequently performed tasks by recording them as macros and playing them back whenever necessary.

Macros may be associated with toolbar buttons, keystrokes, and menu entries.

Manual sorting and rearranging of data are actions that might be automated. Still, if the macro is recorded, the actions must be carefully documented during development and tested during the qualification process.

Excel marcos example
Excel marcos example

v. Generate reports from the spreadsheet

It is the responsibility of the developer to create or modify the spreadsheet to generate automated reports with the use of the results of calculations performed by the spreadsheet application.

A typical report from the spreadsheet may contain some of the following attributes but are not limited to:

– The method (i.e., Laboratory Analytical) used during the creation of data.

– A unique identifier for the analyst and the checker who are responsible for the correctness of entries into the spreadsheet application.

– The date the spreadsheet application was executed.

– The version of the spreadsheet application program.

– Version control for the report.

– The unique name of the electronic version of the file.

If data is received through an electronic interface, the report must identify the instrument system or software program communicating the data.

Step 3: Qualifications for Excel spreadsheet validation

After you finish developing the spreadsheet the file is transferred to the validation team for qualification testing. The validation engineer needs to prepare qualification test protocols for installation, operational, and performance qualifications.

i. Installation qualification of Excel spreadsheet

Installation qualification (IQ) will provide you with documented evidence that the Excel spreadsheet has been designed, developed, supplied, and installed in accordance with user requirements and functional specifications.

The installation qualification tests aim to demonstrate that all the Excel spreadsheet’s primary features are available as designed. The equipment is one step closer to satisfactory operation.

The IQ test protocol will include the spreadsheet name, version number, location, and access control. Implement appropriate security restrictions at the time of initial installation to the secure directory.

The validation team needs to store a master copy of the spreadsheet in a secure environment. Each time the spreadsheet is required for use on a computer in the user area, it must be opened from the read-only master copy.

Lock cells containing formulas to protect the contents of the cells. Protect the spreadsheet with a password. Maintain the access control information (including the passwords) securely.

Successful installation qualification will help you achieve the following:

– Verification that all necessary features, add-ins, or extensions are correctly installed and operational.

– The memory, storage, processing speed, and any other necessary hardware components are compatible with the spreadsheet.

– Verification that the network configurations are correct if the spreadsheet needs to access remote data or if multiple users need to collaborate on the same document

– Provide assurance that only authorized users can access the spreadsheet. Verifies that data is encrypted if necessary and an audit trail is active.

– Ensures that there are systems in place to back up the spreadsheet and recover data in case of any failures

ii. Operation qualifications of Excel spreadsheet

The operational qualification (OQ) will confirm that the correctly installed spreadsheet is capable of operating according to its functional specifications all the time, repeatedly.

The operational qualification (OQ) of an Excel spreadsheet includes the following tests.

You should establish acceptance criteria for each test and write it into the operation qualification protocol.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

1. Calculation test or functional test

Calculation tests are performed to verify that correct formulas, macros, and functions, are entered.

Enter sample values in all fields that are included in the input range of all mathematical functions.

Restrict the input to whole numbers at this time for aggregate functions (sum, max, etc.) using different values for each field.

The tests also verify the data input mechanisms (e.g., dropdown lists, and data validation rules) are functioning correctly. Any visualization aids such as charts, graphs, etc. are rendered correctly and updated with data changes.

Document the step, include a printout of the spreadsheet indicating the test values, and compare the printout to that obtained from a calculator.

Acceptance criteria

Calculated values from the spreadsheet application must be identical to those obtained from a calculator (preferably with printout capabilities).

2. Logical values test

Logical values tests are performed to verify Boolean algebra formulas.

Logical values often refer to conditions that are evaluated as TRUE or FALSE.

In the context of validation of Excel spreadsheets, you might be testing cells or formulas that rely on logical functions (like IF, AND, OR) to see if they produce the expected results under various scenarios, including extreme or boundary conditions.

Print the spreadsheet cells containing Boolean (logical values) and calculate the results manually. Print the results from the spreadsheet application calculation.

Acceptance criteria

For logical values (e.g. Pass/Fail or True/False) verify the displayed value is the same as the expected value, calculated manually.

3. Range/precision test

Range/precision tests are performed to verify calculations will still be accurate when extremely large or small values are entered.

Enter extremely large values at the correct precision (e.g., enter 99.99 for 0.01 precision).

For aggregate functions, enter 0 in one or more fields with non-zero values in other fields.

Duplicate the formula calculations using a calculator and compare the results/printouts.

Repeat this test with extremely small values and report the results.

Acceptance criteria:

Calculated values from the spreadsheet application must be identical to those calculated manually using a calculator (with correct precision and consistent with the agreed rounding principle).

4. Challenge condition (boundary and extreme condition) test

The challenge condition test is also called the Boundary and Extreme Condition Testing.

The test involves inputting values at the extreme ends of expected ranges to ensure the spreadsheet handles them appropriately.

For instance, if a cell expects a value between 1 and 100, what happens if you input 0, 101, or a text string?

The challenge condition test can help detect potential errors in formulas or functions that might not be evident under regular conditions.

Acceptance criteria:

The spreadsheet should either correctly process boundary and extreme inputs or produce appropriate error messages.

No crashes or unintended behaviors should occur when such values are entered.

5. Performance test

If your spreadsheet is required to handle large datasets, it’s essential to check its performance to ensure that calculations, data processing, or visualizations don’t lag excessively or cause the software to crash.

Acceptance Criteria:

For large datasets or complex calculations, the spreadsheet should complete tasks within an acceptable timeframe (e.g., recalculations or data processing should be completed within 10 seconds). The software should not crash or freeze during these operations.

6. Error handling and recovery test

Error handling and recovery test involves introducing intentional errors to see how the spreadsheet responds.

For example, what happens if a required input is left blank?

It’s important to check if appropriate error messages are displayed and if the spreadsheet can recover without data loss or corruption.

Acceptance criteria:

On encountering intentional errors, the spreadsheet should either manage the error gracefully (e.g., by substituting a default value) or provide a clear error message.

No data corruption should occur due to these errors.

7. Integration and interface Test

If your spreadsheet pulls data from external sources, databases, or other software, it’s essential to verify that these integrations work correctly.

This involves checking the accuracy of imported data and ensuring timely data refreshes.

Acceptance Criteria:

Data imported from external sources should match the source data in accuracy and completeness.

There should be no errors or significant delays during data imports/refreshes.

8. Audit trail and version control test

Since your spreadsheets will be used in regulated environments such as GMP, it is necessary to ensure a robust audit trail is implemented for the Excel spreadsheet.

The audit trail and version control test verify that changes, edits, or deletions are appropriately logged and that a traceable history of spreadsheet versions is maintained.

Acceptance criteria:

All changes, edits, or deletions should be logged with a timestamp, user ID, and a description of the change.

Previous versions of the spreadsheet should be retrievable, and changes between versions should be traceable.

9. Backup and restoration test

The backup and restoration test involves intentionally deleting or corrupting the spreadsheet and trying to restore it from backups to ensure that backup mechanisms are effective.

Acceptance criteria:

After intentional deletion or corruption, the spreadsheet should be fully restorable from backups.

Restored data should match the original data in accuracy and completeness without any loss.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

iii. Performance qualification of Excel spreadsheet 

Performance qualification (PQ) during the validation of Excel spreadsheets ensures that the tool consistently delivers accurate results in real-world scenarios.

By testing under actual user conditions and with genuine data sets, PQ validates that the spreadsheet not only meets theoretical specifications but also performs reliably in daily operations.

Performance qualification (PQ) confirms that the spreadsheet is fit for its intended purpose.

Following is a list of performance qualification tests you may try including the acceptance criteria.

1. End-user operation test

End-user operation test verifies that the spreadsheet performs as expected when used by its intended users.

This would involve typical day-to-day tasks, such as data entry, calculations, generating reports, or any other regular function the spreadsheet is intended for.

Acceptance Criteria:

The spreadsheet must successfully complete all standard operational tasks (e.g., calculations, report generation) within defined time frames without errors when end-users operate.

2. Long-term stability and performance test

This test verifies how the spreadsheet behaves with prolonged use.

If data is added to the spreadsheet over time, does it continue to perform well, or do issues arise as it becomes more populated?

Acceptance Criteria:

Over an extended period (e.g., six months), the spreadsheet should maintain consistent performance levels without showing signs of lag, crashes, or errors even as data accumulates.

3. Real-world data test

This test involves entering actual, real-world data sets in the spreadsheet, rather than just test or synthetic data. This ensures that it can handle the nuances and complexities of genuine data.

Acceptance Criteria:

The spreadsheet should correctly process and display real-world data sets, maintaining data integrity and producing expected outputs or results.

4. Scalability test

The scalability test checks how the spreadsheet performs as the amount of data grows.

This is applicable for spreadsheets that are expected to handle increasing amounts of data over time, it’s vital to ensure that performance doesn’t degrade.

Acceptance Criteria:

As data volume or complexity increases to the maximum expected level, the spreadsheet should not crash, freeze, or significantly slow down.

Calculations or data manipulations should still be completed within acceptable time frames.

5. Environmental conditions test

While this test is not common for standard office environments, in certain scenarios, the environment test is critical.

For example, if the spreadsheet is used in manufacturing environments with the potential for network interruptions or power fluctuations, how does the spreadsheet behave?

Acceptance Criteria:

The spreadsheet should maintain consistent performance and data integrity under varying environmental conditions, such as network fluctuations or power interruptions.

Any auto-save or recovery features should work as intended.

6. User acceptance test

During this test, actual users work with the spreadsheet to ensure it meets their needs and expectations.

Feedback from these users can provide valuable insights into potential improvements or adjustments.

Acceptance Criteria:

End-users should confirm that the spreadsheet meets or exceeds their operational needs. There should be no critical feedback points that would hinder the regular use of the spreadsheet in its intended environment.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

Step 4: Execution of qualification protocols for Excel spreadsheet

After the validation team will develop the installation, operational, and performance qualification protocols as detailed above, execute each testing step appropriately, and record the results.

If the qualification protocol has not been followed correctly, raise a protocol deviation report. Evaluate each deviation with in-depth root cause analysis. Identify and implement the corrective and preventive actions.

If changes are required on the protocol, update the protocol revision register. Get approval of the revised protocol before execution.

Compare all results against the pre-determined acceptance criteria. If any of the tests fail, document the results on a protocol deviation report with all the evidence.

Conduct error analysis and make necessary changes on the spreadsheet to resolve the failure. Typically, all tests must then be repeated.

If you find the error has resulted due to an incorrect formula, the spreadsheet should be used for error analysis to determine the cause of the error. For example, Excel has error messages for cells and has auditing tools for determining precedents and dependent tracers.

Perform an impact assessment of the error. Then, if required, modify the template appropriately, receive approval, and repeat all tests.

Step 5: Prepare final report on validation of Excel spreadsheets

After the execution of qualification tests prepare a final report on the validation of Excel spreadsheets summarizing the testing results. Explain any deviations that occurred during the testing and provide rationales for the resolution of any deviations.

Attach all the associated records and submit the final report for review and approval.

Validated spreadsheet
Validated spreadsheet

How to manage validated Excel spreadsheets?

You have to take appropriate measures to prevent the validated spreadsheets from unwanted modifications which may compromise the validation status of the file.

i. Prevent unauthorized access to the validated spreadsheet

You must protect the master copy of the validated unauthorized modification. For example, the spreadsheet shall be available to authorized users as “read-only”.

Authorized users will be given access after they have been trained on the use of the spreadsheet.

Where networks are employed for the use of the spreadsheet, it will typically be stored in a network directory that allows access by authorized users. In this instance, the storage of a copy of the spreadsheet application on the hard drive of the local computer must be prohibited.

ii. Training of users

Train your employees on the proper use of validated spreadsheets. The objective of the training should include:

– How to use the spreadsheet application, including how to print the Spreadsheet Application Report

– The procedure for storing Spreadsheet Application Reports (if applicable)

iii. Storage of completed electronic files

When you enter new data in the spreadsheet, it processes those as per the formula and the calculated results become a new electronic file.

The new electronic file should be stored under a new name in a designated area of the LAN server. Follow these rules when storing completed electronic files.

– Files should be named using a convention approved by the quality assurance.

– The file name of the completed spreadsheet electronic report should be recorded in a notebook or other similar documentation system as a cross-reference.

– During the review of notebook or other similar documentation system data, the reviewer should ensure that the electronic file has been stored correctly.

Re-validation of Excel spreadsheet formulas

Review the spreadsheet at least once every 3 years to address the effectiveness of the validation parameters which includes the integrity of the formula.

Re-validation may also be triggered by the installation of an upgrade of the spreadsheet program (Change Control System), or by changes in the LAN (e.g., system crashes).

If changes are required in a validated spreadsheet, it should be managed by a change committee in line with the change control procedures.

240 SOPs, 197 GMP Manuals, 64 Templates, 30 Training modules, 167 Forms. Additional documents included each month. All written and updated by GMP experts. Checkout sample previews. Access to exclusive content for an affordable fee.

Conclusion

For many of us Excel spreadsheets are indispensable tools used in the pharmaceutical industry.

Depending on the use and criticality, some complex Excel spreadsheets require rigorous validation to ensure their outcomes are accurate, reliable, and consistent.

But what does this “validation” entail, and why is it so critical for professional applications?

Validation of Excel spreadsheets ensures that they perform as intended, are free from errors, and in compliance with regulatory standards.

Whether the spreadsheet is used for laboratory data analysis, batch production records, or inventory management, validation of a spreadsheet is crucial to maintain the integrity and accuracy of the information and outcomes it presents.

It is important to develop clear and meaningful functional requirements for a complex spreadsheet before it is handed down to the experts to develop the entire spreadsheet application.

Once the spreadsheet is created with the use of complex formulas, macros, lookup tables, or sophisticated functions and logic, it falls into the GAMP 5 category which requires a full set of qualification tests.

For instance, installation qualification (IQ) ensures that the spreadsheet has been designed, developed, supplied, and installed in accordance with user requirements and functional specifications.

While operational qualification (OQ) tests the spreadsheet’s functionalities under defined conditions to verify it operates as designed.

The performance qualification (PQ) verifies that the spreadsheet consistently delivers accurate results in real-world scenarios, offering confidence in its day-to-day operations.

Once an Excel spreadsheet is validated, the users should be trained on its correct use. User access should be controlled, and the file should be stored securely to avoid accidental modifications.

A validated spreadsheet should be subjected to re-validation at least once in three years to check the effectiveness of validation parameters.

By understanding and applying this validation process which we have established in this article, you can safeguard your work from potential pitfalls and errors, ensuring that decisions that are made using the spreadsheet are sound and reliable.

Picture of Author: Kazi Hasan

Author: Kazi Hasan

Kazi is a seasoned pharmaceutical industry professional with over 20 years of experience specializing in production operations, quality management, and process validation.

Kazi has worked with several global pharmaceutical companies to streamline production processes, ensure product quality, and validate operations complying with international regulatory standards and best practices.

Kazi holds several pharmaceutical industry certifications including post-graduate degrees in Engineering Management and Business Administration.

Leave a Reply

Your email address will not be published. Required fields are marked *