All.Net

Fri Apr 8 06:49:41 PDT 2016

Control Architecture: Change management: How are changes to ICS and supporting infrastructure handled?

Options:

Option 1: Just let change happen and hope you can handle it.
Option 2: Make backups before changes in case you have to revert.
Option 3: Use change control prior to changes but move forward and count on your expertise to make it work.
Option 4: Use sound change control with full reversion capabilities.
Option 5: Recertify systems in context whenever significant changes are made.
Option 6: ICS-specific change controls should be used.

Decision:

Risk	Maturity	Change approach
High	Managed+	Use sound change control with full reversion capabilities. AND Recertify systems in context whenever significant changes are made. AND ICS-specific change controls should be used.
High	Defined-	Do not operate at high risk with this level of maturity.
Medium	Managed+	[Based on risk management and executive decision-making, Use change control prior to changes but move forward and count on your expertise to make it work OR Use sound change control with full reversion capabilities OR Make backups before changes in case you have to revert.] AND ICS-specific change controls should be used.
Medium	Repeatable or Defined	Make backups before changes in case you have to revert. AND ICS-specific change controls should be used.
Medium	Initial-	Initial is not mature enough to operate at medium risk; increase the maturity level.
Low	Repeatable+	Make backups before changes in case you have to revert.
Low	Initial-	Just let change happen and hope you can handle it.
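
The decision table can be read as a simple lookup. The following is a minimal sketch, not part of the original guidance. It assumes a CMM-style ordering of maturity levels (the NONE and OPTIMIZING levels, and the names Maturity, Approach, and change_approach, are illustrative assumptions) and refers to the options by their numbers from the Options list.

```python
from enum import IntEnum
from typing import NamedTuple


class Maturity(IntEnum):
    # Assumed CMM-style ordering; NONE and OPTIMIZING are not named in the table.
    NONE = 0
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMIZING = 5


class Approach(NamedTuple):
    required: tuple    # option numbers that all apply (joined by AND)
    choose_one: tuple  # option numbers to pick one from (joined by OR)
    note: str          # guidance when no option applies


def change_approach(risk: str, maturity: Maturity) -> Approach:
    """Look up the decision table row for a risk level and a maturity level."""
    risk = risk.strip().lower()
    if risk == "high":
        if maturity >= Maturity.MANAGED:
            return Approach((4, 5, 6), (), "")
        return Approach((), (), "Do not operate at high risk with this level of maturity.")
    if risk == "medium":
        if maturity >= Maturity.MANAGED:
            # The choice among 2, 3, and 4 rests on risk management
            # and executive decision-making.
            return Approach((6,), (2, 3, 4), "")
        if maturity >= Maturity.REPEATABLE:
            return Approach((2, 6), (), "")
        return Approach((), (), "Increase maturity level before operating at medium risk.")
    if risk == "low":
        if maturity >= Maturity.REPEATABLE:
            return Approach((2,), (), "")
        return Approach((1,), (), "")
    raise ValueError(f"unknown risk level: {risk!r}")
```

For example, change_approach("Medium", Maturity.DEFINED) yields options 2 and 6 as required, matching the "Repeatable or Defined" row.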

Change management architecture

ALSO Specific sound change control and recertification requirements should be used for specific ICS systems.

For SCADA systems, risk is the aggregate of all risks of all directly and indirectly controlled components, including aggregation risks and common mode failure risks. Normally this places them in the sound change control and recertification realm. This in turn implies that they should not be connected to non-ICS networks except through the change control process, and that after changes are made, the full set of recertification tests should be undertaken.
For PLCs, ladder logic and other code changes should not be permitted except in maintenance periods, and this should be enforced by a "maintenance mode" switch on the PLC which permits such changes. The PLC logic should prohibit changes to memory areas where control functions are performed, and should disable such changes except when the maintenance mode setting is active. Changes to all data values should also be limited to acceptable process values as a function of state. Ladder logic should include the state of the maintenance switch in input from non-DCS components, and should limit operating time in maintenance mode by requiring manual action to continue controlling (sending output) while in maintenance mode. After maintenance involving connections to external systems, PLCs should be tested against the full suite of acceptance and regression tests associated with recertification before being put back into an operational mode.
DCS devices, intelligent sensors and actuators, and similar equipment should not permit reprogramming except in maintenance mode of their controlling PLCs or SCADA systems. Changes should also be limited at the PLC and, if possible with redundant controls, in the intelligent DCS, actuator, or sensor. Recertification should be done after any significant changes to DCS devices.
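
The PLC rules above (a maintenance switch gating logic changes, state-dependent limits on data values, and a time limit on maintenance mode that requires manual action to continue) can be illustrated with a small model. This is a hedged sketch in Python rather than ladder logic. The class PlcGuard, its method names, the 900-second default, and the (state, tag) limit table are all hypothetical and chosen only for illustration.

```python
class PlcGuard:
    """Toy model of the maintenance-mode rules; not real PLC code."""

    def __init__(self, limits, max_maintenance_s=900.0):
        # limits: {(state, tag): (low, high)} acceptable values per process state
        self.limits = limits
        self.max_maintenance_s = max_maintenance_s  # assumed time limit
        self.switch_on = False
        self.last_manual_action = None

    def set_switch(self, on, now):
        """Turn the physical maintenance switch on or off at time `now` (seconds)."""
        self.switch_on = on
        self.last_manual_action = now if on else None

    def confirm(self, now):
        """Manual action that extends maintenance mode while the switch is on."""
        if self.switch_on:
            self.last_manual_action = now

    def maintenance_active(self, now):
        return self.switch_on and (now - self.last_manual_action) <= self.max_maintenance_s

    def allow_logic_change(self, now):
        # Writes to control-function memory are permitted only in maintenance mode.
        return self.maintenance_active(now)

    def outputs_enabled(self, now):
        # In operational mode outputs run normally; in maintenance mode they
        # stop once the time limit passes without a manual confirmation.
        return (not self.switch_on) or self.maintenance_active(now)

    def allow_data_write(self, state, tag, value):
        # Data values are limited to acceptable process values for the state;
        # tags with no declared limit in that state are rejected.
        bounds = self.limits.get((state, tag))
        if bounds is None:
            return False
        low, high = bounds
        return low <= value <= high
```

Rejecting writes to any tag that has no declared limit in the current state is a deliberately conservative design choice in this sketch. A real controller would enforce these checks in its own logic and hardware, not in an external script.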

Basis:

Just let change happen and hope you can handle it.
Fast and loose is often the best approach for smaller businesses or for portions of businesses where risks are low. The cost of sound change control is often on the order of twice the cost of not having it, so low risk and low cost make sense together. But for medium and high risk ICS, change control is absolutely required.

Make backups before changes in case you have to revert.
Backups are used to provide a modicum of recoverability, and changes are made with the knowledge that reversion is, at least theoretically, possible. It is prudent to also have a tested recovery procedure and to verify that the backups are a workable solution for recovery in desired time frames.
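
As an illustration of verifying that a backup is workable, the sketch below copies a file, checks the copy against the original by SHA-256 digest, and performs a trial restore into a scratch directory. The function names and the ".bak" suffix are assumptions made for this example. It covers a single file only and does not measure whether recovery fits the desired time frame.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_check(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and confirm the copy matches it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / (source.name + ".bak")
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target


def trial_restore(backup: Path, scratch_dir: Path, expected_sha256: str) -> bool:
    """Restore into a scratch area and report whether the content is as expected."""
    scratch_dir.mkdir(parents=True, exist_ok=True)
    restored = scratch_dir / backup.name
    shutil.copy2(backup, restored)
    return sha256_of(restored) == expected_sha256
```

Recording the digest before the change is made lets the trial restore be checked against the pre-change state even after the original file has been modified.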

Research and development is separated from change control, which is separated from production, with testing at each step.

Use change control prior to changes but move forward and count on your expertise to make it work.
In forward-only environments, such as financial transaction systems, many enterprises choose to never go back once a substantial change is made. This means that they accept the risk of failure and save the cost of full reversion. It places additional pressure on testing and process to get it right, and means that experts have to be available to deal with the issues if and when they arise, and in real time.

Use sound change control with full reversion capabilities.
Sound change control implies:

A system for requesting, specifying, implementing, and testing changes,
A method for tracking and backing out of changes,
Separation of duties between research and development, testing, change control, and operations,
Databases that track these different elements of the process,
Approval processes and work flows to assure operational execution,
Integration of changes into the detection and response process to prevent false positives and potentially harmful responses,
Notification of audit so they can adapt their auditing to meet the new requirements,
Updated documentation to reflect operational changes and user changes,
Training to adapt the people to the changes,
HR and legal approval of changes impacting those areas, and
Policies, standards, and procedures that must be followed along the way.
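
Several of the elements listed above (a tracked workflow, separation of duties, and the ability to back out) can be sketched as a small record type. This is an illustrative assumption, not a prescribed design. The stage names, role names, and the rule that one person may act in only one role per change are hypothetical simplifications.

```python
from dataclasses import dataclass, field

# Hypothetical stage order and the role allowed to complete each stage.
STAGES = ["requested", "specified", "implemented", "tested", "approved", "deployed"]
ROLE_FOR_STAGE = {
    "specified": "r&d",
    "implemented": "r&d",
    "tested": "testing",
    "approved": "change_control",
    "deployed": "operations",
}


@dataclass
class ChangeRequest:
    change_id: str
    description: str
    stage: str = "requested"
    history: list = field(default_factory=list)  # (stage, person) tracking records
    actors: dict = field(default_factory=dict)   # person -> role first acted in

    def advance(self, person: str, role: str) -> None:
        """Move to the next stage if the role fits and duties stay separated."""
        if self.stage not in STAGES[:-1]:
            raise ValueError(f"cannot advance from {self.stage!r}")
        next_stage = STAGES[STAGES.index(self.stage) + 1]
        if ROLE_FOR_STAGE[next_stage] != role:
            raise PermissionError(
                f"{next_stage!r} requires role {ROLE_FOR_STAGE[next_stage]!r}")
        # Separation of duties: one person may act in only one role per change.
        if self.actors.setdefault(person, role) != role:
            raise PermissionError(f"{person} already acted as {self.actors[person]!r}")
        self.stage = next_stage
        self.history.append((next_stage, person))

    def back_out(self, person: str) -> None:
        """Record the reversion of a deployed change."""
        if self.stage != "deployed":
            raise ValueError("only deployed changes can be backed out")
        self.stage = "backed_out"
        self.history.append(("backed_out", person))
```

In practice the history would live in a database and feed the approval, audit notification, documentation, and training steps; the list here only stands in for that tracking.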

Recertify systems in context whenever significant changes are made.
In high-consequence systems, particularly where process failures can cause harm that is not rapidly and automatically detected and readily mitigated, significant changes (i.e., anything other than well-tested variations in parameters) should result in a system recertification in context. This includes a thorough retest of the systems undergoing changes, equivalent to the acceptance tests performed initially; regression testing of all known and resolved subsequent issues; and tests equivalent to acceptance and regression tests of interoperation, to assure that changes don't have adverse effects on related systems.
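
One way to picture recertification in context is as a test plan derived from what changed and what it touches. The sketch below is an assumption-laden illustration: the function names, the "kind" labels, and the dictionary of related systems are hypothetical, and a real plan would come from engineering analysis of the specific plant.

```python
def is_significant(kind: str, well_tested: bool) -> bool:
    """Anything other than a well-tested parameter variation is significant."""
    return not (kind == "parameter" and well_tested)


def recertification_plan(changed, related, significant):
    """Return (system, test suite) pairs to run for a change.

    changed: set of changed system names
    related: dict mapping a system to the systems it interoperates with
    """
    if not significant:
        return []
    plan = []
    # Retest each changed system as at initial acceptance, plus regressions.
    for system in sorted(changed):
        plan.append((system, "acceptance"))
        plan.append((system, "regression"))
    # Related systems get interoperation tests of equivalent depth.
    neighbours = set()
    for system in changed:
        neighbours.update(related.get(system, ()))
    for system in sorted(neighbours - set(changed)):
        plan.append((system, "interoperation-acceptance"))
        plan.append((system, "interoperation-regression"))
    return plan
```

For instance, a logic change to one PLC that talks to a SCADA system and a valve would yield acceptance and regression tests for the PLC, plus interoperation tests for both neighbours.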

ICS-specific change controls should be used.
For ICS systems that directly contact and control physical systems, changes are very closely related to the specifics of the physical system under control. As such, issues like stability and the ability to maintain control over the process are central to change management. These are therefore engineering decisions that rely on engineering calculations and analysis. Such changes cannot be made in bulk or based on a generic patch management approach, and cannot be tested with common system testing methods. Plant shutdown is often required for such changes, and simulations may also be used to test some classes of changes.