TechArch: Workflows: How are workflows used, controlled, and assured?
The following workflow controls are suggested:
-work certification and completion
-authentication appropriate to the risk
-control points integrated with enterprise authorities
-changes of personnel, processes, and authorities
-separation of duties
-approval processes appropriate to risks controlled
-risk aggregations limited per risk management
-exceptions, overrides, appeals, and escalations processing
-authorization and context limitations
-proper prioritization of work to be done
-timeliness and deadlines integrated into process controls
-integrated into provisioning
-notification requirements and notices
-documentation of history and basis for decisions
-process and timeliness failure notices
-outputs protected from alteration and availability assured
-integrated with inventory and process mechanisms
-automated / manual / mixed
-audited and reviewed [frequency and diligence appropriate to risks controlled]
-support process analysis and improvement
-risk aggregation limits on workflow system itself
-workflow system change controlled and synchronized to the protection program
-engine(s) controlled, verified, validated, tested, reviewed, and tracked
-surety matches risk controlled
-changes of personnel, processes, and authorities corrected during work flow processes
The protection process is typically implemented as a set of work flows: standardized event sequences with inputs, state, and outputs, together with systems that take state and input to produce output and next state, with the explicit purpose of carrying out the processes identified for protection. There are many work flow systems available. They typically handle help desk operations or similar ticketing functions, and similar mechanisms have been around for many years in the legal profession, medical systems, aerospace, and other fields. Manual work flow systems were commonplace up until the last several years, and many persist and will for a long time to come.
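The description of a work flow as a system that takes state and input to produce output and next state can be sketched as a small transition table. This is only an illustration: the state names, inputs, and outputs below are hypothetical and do not come from any particular work flow product.

```python
from typing import Dict, List, Tuple

# Hypothetical transition table: (state, input) -> (next state, output).
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("requested", "approve"): ("approved", "notify provisioning"),
    ("requested", "reject"): ("rejected", "notify requester"),
    ("approved", "provision"): ("provisioned", "request certification"),
    ("provisioned", "certify"): ("closed", "archive record"),
}


def step(state: str, event: str) -> Tuple[str, str]:
    """Return (next_state, output), refusing any out-of-sequence input."""
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"input {event!r} is not allowed in state {state!r}")
    return TRANSITIONS[(state, event)]


def run(events: List[str], state: str = "requested") -> Tuple[str, List[str]]:
    """Feed a sequence of inputs through the work flow and collect outputs."""
    outputs = []
    for event in events:
        state, output = step(state, event)
        outputs.append(output)
    return state, outputs
```

Because every transition must appear in the table, an attempt to provision before approval is rejected rather than silently accepted, which is the sequencing property the text attributes to work flow systems.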
The advantages of automated work flow systems for security come in several areas. They (1) assure that work gets done in the proper sequence, (2) can act to assure that approvals are properly undertaken prior to actions, (3) can provide automated provisioning integration for automatable work flows like adding user identities based on roles and similar functions, (4) can document the entire process, (5) allow verification, (6) help to reduce the work load for audit, and (7) provide support for process improvement. However, because of their central role in operational aspects of protection, they also form risk aggregation points that pose significant risk. For example, identity management solutions that automate some limited components of the security work flow associated with access controls can be attacked to cause all access to cease, to grant access to unauthorized individuals, to destroy the information functions of an organization, or to disrupt operations in automated manufacturing or processing facilities. Providing adequate surety for these systems and disaggregating risks by creating sets of these systems with zones of control and potentially overlapping authorities is complex and problematic, but necessary for the enterprise that wishes to succeed in light of the realities of threats in the information world.
Work to be done:
Many facets of information protection exist and the work that has to be done for all of these facets comprises a very significant portion of the total effort in information protection. Work has to be described and standardized in order to fit into work flow systems and this itself is a very substantial effort. There are some partial work flow systems that exist for security but they are nowhere near the level of completeness required for an enterprise and they cover only a small subset of the overall work flow of the enterprise security operation. Hundreds of processes may all have to be codified into work flows in order for them to be properly handled in a systematic manner for an enterprise. For the small or medium sized business this can be combined with metrics to form a set of checklists for many of the common functions, and they have been used for that purpose.
Process for completion and options:
For each item of work to be done a process for completion should be defined including the conditions for its invocation, times associated with different actions to be undertaken, primary and auxiliary contacts for performing the identified tasks, optional processes for emergency, standard, and exceptional conditions including appeals processes and overrides, and enough details to allow any authorized and properly trained and competent person to carry out the work. The processes should identify points for workers to certify that work has been done, and for those who certify work to do so and notify the system of the verification. Timeliness requirements are also often critical for legal, regulatory, or contractual reasons.
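As a rough sketch of how such a process definition might be recorded, the following uses a hypothetical `WorkItem` record holding the invocation time, the time allowed, the contacts, and the two certification points (work done, work verified). The field and method names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class WorkItem:
    """Hypothetical record of one item of work and its completion process."""
    name: str
    invoked_at: datetime
    time_allowed: timedelta
    primary_contact: str
    auxiliary_contacts: List[str] = field(default_factory=list)
    completed_by: Optional[str] = None
    verified_by: Optional[str] = None

    def certify_done(self, worker: str) -> None:
        """The worker certifies that the work has been done."""
        self.completed_by = worker

    def certify_verified(self, verifier: str) -> None:
        """A verifier confirms the work, but only after it was certified done."""
        if self.completed_by is None:
            raise ValueError("work has not yet been certified as done")
        self.verified_by = verifier

    def is_overdue(self, now: datetime) -> bool:
        """True if the deadline has passed without verification."""
        return self.verified_by is None and now > self.invoked_at + self.time_allowed
```

A work flow system could poll `is_overdue` to issue the timeliness failure notices listed among the suggested controls.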
Control points and approval requirements:
Most processes have control points of one sort or another. For example, a worker may prepare all of the elements for a building to be wired for electrical systems, but until the building inspector comes and approves of the plan of the building ready to be wired, the wiring waits. In information protection there are similar control points defined, typically when risks beyond thresholds of the level of the current worker are reached. The approval process should identify someone with adequate authority and knowledge to make a reasonable and prudent decision about the risk, identify the risk and the options to the authorized person or people, and seek their approval or rejection or optional paths. In some cases multiple approvals or more complex voting systems may be used and timeliness issues may require actions be taken urgently. These are sometimes called multi-person controls. Presumably the overall system has to be able to handle this in order to be effective in these cases.
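One simple way to picture a control point is as a routing rule that sends a decision to someone whose authority covers the risk. The sketch below assumes a hypothetical numeric risk scale and a hypothetical `Authority` record; real enterprises would define both through their own risk management process.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Authority:
    name: str
    risk_limit: int  # highest risk level this person may approve (assumed scale)


def select_approver(risk: int, authorities: List[Authority]) -> Optional[Authority]:
    """Return the authority with the lowest limit that still covers the risk.

    Returns None when nobody has adequate authority, which signals that the
    work must wait or be escalated rather than proceed.
    """
    eligible = [a for a in authorities if a.risk_limit >= risk]
    if not eligible:
        return None
    return min(eligible, key=lambda a: a.risk_limit)
```

Choosing the lowest adequate authority keeps routine decisions away from senior approvers while still stopping work whose risk exceeds every available limit.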
Appeals processes and escalations:
Work flows have to have suitable provisions for appeals and escalations when something that one person wants to have done is at odds with someone in the approval path. While most processes don't get appealed in hierarchical systems because of the nature of the structure, in matrix organizations there may be many paths to getting work done. In networked organizations the organic nature of the process often allows many paths to getting something done. But even in a hierarchical process there will be times when escalation is used, for example, when timeliness is an issue and normal approval paths are not available in a timely enough fashion.
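A time-based escalation can be sketched as follows, assuming a hypothetical ordered escalation path and a fixed waiting period per level. Both are illustrative parameters, not a prescribed policy.

```python
from datetime import datetime, timedelta
from typing import List


def current_approver(path: List[str], submitted: datetime, now: datetime,
                     wait_per_level: timedelta) -> str:
    """Pick who should be handling a request given how long it has waited.

    The request moves one step up the path each time wait_per_level elapses
    and stays with the last person once the top of the path is reached.
    """
    levels_elapsed = max(0, (now - submitted) // wait_per_level)
    return path[min(levels_elapsed, len(path) - 1)]
```

For example, with a four-hour wait per level, a request submitted at 09:00 stays with the first approver until 13:00 and then moves to the next person in the path.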
Authentication requirements & mechanisms:
The quality and quantity of authentication associated with different functions typically varies across a wide spectrum. For example, a simple lookup of the work to be done might require only a user identity and password, while the ability to change a work order may require an additional authentication such as the presentation of a time variant password from a secure token. For some actions physical presence may be required and this may mandate a third party authentication to certify presence along with biometric data and other similar methods. The work flow system has to support the use of different authentication mechanisms to support the different levels of surety required to perform different operations.
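The idea that different operations demand different strengths of authentication can be sketched with an ordered set of levels. The level names, the operation names, and the simplifying assumption that levels form a single ordered scale are all hypothetical.

```python
from enum import IntEnum


class AuthLevel(IntEnum):
    PASSWORD = 1    # user identity and password
    TOKEN = 2       # plus a time-variant password from a secure token
    IN_PERSON = 3   # plus certified physical presence


# Hypothetical mapping of operations to the minimum level they require.
REQUIRED = {
    "view_work_order": AuthLevel.PASSWORD,
    "change_work_order": AuthLevel.TOKEN,
    "release_media": AuthLevel.IN_PERSON,
}


def is_permitted(operation: str, presented: AuthLevel) -> bool:
    """Allow an operation only if the presented authentication is strong enough."""
    required = REQUIRED.get(operation)
    if required is None:
        return False  # unknown operations are denied by default
    return presented >= required
```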
Authorization and context limitations:
Authorizations associated with identified subjects under different levels of authentication may change with context (see details of context elsewhere) and with different situations within work flows. The work flow system has to be capable of handling complexities associated with the specific identified needs of data owners for access to the resources necessary to do work, and in some cases, alternative sources with different authentication requirements may be sought because of circumstance. For example, if time is of import and any two of eight approvers are adequate to the need for a process to continue (i.e., multi-person control), the work flow might request responses from all eight authorizers and, once two have approved, notify the remaining authorizers that the work has been approved so that they don't have to look at the issue if it is already settled. Similarly, context may change during the process, thus changing approval requirements. Appropriate methods must be used to properly deal with these situations. A submit-commit cycle, in which one person or mechanism submits a request for action and a separate mechanism or person commits the action (approving it prior to execution), is often used to authorize higher risk actions. For example, this is commonly used in financial transactions over specified thresholds. The work flow system should also help to prioritize work so that more important or time critical work is given proper priority.
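The two-of-eight example and the submit-commit cycle can be sketched as follows. The class name `QuorumApproval` and the function `may_commit` are hypothetical, and the sketch ignores authentication and changing context for brevity.

```python
from typing import Iterable, List


class QuorumApproval:
    """Track approvals until a required number of distinct approvers agree."""

    def __init__(self, approvers: Iterable[str], required: int) -> None:
        self.approvers = set(approvers)
        if required < 1 or required > len(self.approvers):
            raise ValueError("required must be between 1 and the number of approvers")
        self.required = required
        self.approved_by = set()

    def approve(self, who: str) -> bool:
        """Record an approval; return True once the quorum is reached."""
        if who not in self.approvers:
            raise PermissionError(f"{who} is not an authorized approver")
        self.approved_by.add(who)  # a repeated approval does not count twice
        return self.is_settled()

    def is_settled(self) -> bool:
        return len(self.approved_by) >= self.required

    def to_notify(self) -> List[str]:
        """Approvers who can be told the matter is settled and need not respond."""
        if not self.is_settled():
            return []
        return sorted(self.approvers - self.approved_by)


def may_commit(submitter: str, committer: str) -> bool:
    """Submit-commit check: the committer must differ from the submitter."""
    return submitter != committer
```

Once the second distinct approval arrives, `to_notify` lists the six remaining authorizers so the system can tell them the issue is already settled.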
Work flow documentation and audit:
The work flow system should provide documentation of what was done and what is to be done and allow this information to be read for audit, transparency, and chain of custody purposes as appropriate. Detail should be available down to the specific actions taken by specific individuals at specific times and the approvals required and obtained. The work flow requirements of the situation at the time should be documented so that all of the information needed to validate an action after the fact can be made available to the reviewer or auditor. Thus everything needed to determine what was done, why, when, how, where, and under what situational circumstances should be available to check on any specific process undertaken or all of the processes of the system.
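One possible way to keep such a record so that later alteration is detectable is a hash-chained log, in which each entry includes a digest of the one before it. This technique is an assumption introduced here for illustration, not something the text prescribes, and the field names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


def _digest(body: Dict[str, str]) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, str]], who: str, action: str,
                 when: str, basis: str) -> None:
    """Append a record of who did what, when, and on what basis."""
    prev = log[-1]["digest"] if log else ""
    body = {"who": who, "action": action, "when": when, "basis": basis, "prev": prev}
    log.append({**body, "digest": _digest(body)})


def verify(log: List[Dict[str, str]]) -> bool:
    """Check that no entry was altered and that the chain is unbroken."""
    prev = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "digest"}
        if entry["prev"] != prev or entry["digest"] != _digest(body):
            return False
        prev = entry["digest"]
    return True
```

A chain like this only detects changes. Protecting the log from deletion or wholesale replacement still requires separate controls, such as storing copies under a different authority.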
Control and validation of the engine(s):
Whether the work flow mechanisms are manual or automated, the mechanisms that control the processes have to be controlled, verified, validated, tested, reviewed, and tracked to assure that they do what they are supposed to do. This includes both the normal operation of these mechanisms and all of the exception conditions and malicious sequences that might circumvent the system at every level of its operation. For example, if work flows are implemented using a paper system to cover regular backups of systems, the process will typically involve the use of a piece of paper that indicates what to do on a given shift. The shift workers then use the checklist, perhaps doing a backup and reflecting that on the checklist with date, time, tape number, and initials. The verification may be done by going to the proper tape number and restoring its contents to a test system to verify that it has the data it should have from that time and date and that it properly restores. Verification of this activity by random sample will validate that the mechanism is being used and operating properly. Additional malicious abuse testing might include seeing whether making a false entry causes a backup to not be done (for example, a worker could claim to have done the work on a prior shift even though they did not do it and cause backups to go undone), or taking away the sheets of paper and determining whether a work around is used to still do the backups and how the escalation process works in that circumstance.
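The random-sample verification described for the backup checklist might be sketched like this. The checklist entry layout and the `restore_ok` callback, which stands in for the manual restore test on a test system, are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional

Entry = Dict[str, object]  # e.g. {"date": ..., "tape": ..., "initials": ...}


def sample_for_restore_test(entries: List[Entry], k: int,
                            seed: Optional[int] = None) -> List[Entry]:
    """Choose up to k checklist entries at random for a restore test."""
    rng = random.Random(seed)
    return rng.sample(entries, min(k, len(entries)))


def failed_checks(sample: List[Entry],
                  restore_ok: Callable[[Entry], bool]) -> List[Entry]:
    """Return the sampled entries whose tape did not restore as recorded."""
    return [entry for entry in sample if not restore_ok(entry)]
```

Any entry returned by `failed_checks` indicates either a faulty backup or a false checklist entry, and both cases call for follow-up through the escalation process.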
Risk aggregation in the engine(s):
Automated work flow systems tend to aggregate risk by centralizing and unifying the processes that the system supports, by combining the information and capabilities of the work flows into a single computer or at a single location, by unifying the administrative aspects of managing those systems, by using common operating environments with common mode failure mechanisms, by combining previously separate mechanisms, and by creating dependencies on the work flow system for proper execution of work. At the same time these systems reduce costs, increase efficiency, improve auditability and accountability, reduce time to get many tasks done by using computer communications to replace paper processes, provide for more efficient and effective backups of the work flows, and so forth. The question for executive and risk management to answer is how much risk can be aggregated before additional protective measures are required. As a rule of thumb, and based on the notion that the surety should match the risk, as risk gets to the medium level, medium surety techniques should be used. As the work flow system reaches to risk levels where single individuals can no longer be permitted to make decisions, multi-person control must be added, and risk disaggregation by multiple work flow systems or the use of other compensating controls must be used.
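The rule of thumb above can be written down as a small lookup. The three-level scale and the default point at which single-person decisions stop being acceptable are assumptions for illustration, since the text leaves that threshold to executive and risk management.

```python
from typing import List

LEVELS = ["low", "medium", "high"]  # assumed ordered risk scale


def required_controls(aggregated_risk: str,
                      single_person_limit: str = "medium") -> List[str]:
    """List the controls implied by the rule that surety should match risk."""
    if aggregated_risk not in LEVELS or single_person_limit not in LEVELS:
        raise ValueError("risk levels must be one of: " + ", ".join(LEVELS))
    controls = [f"{aggregated_risk} surety techniques"]
    # Above the level at which one individual may decide, add further controls.
    if LEVELS.index(aggregated_risk) > LEVELS.index(single_person_limit):
        controls.append("multi-person control")
        controls.append("risk disaggregation or compensating controls")
    return controls
```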
Integrated into provisioning:
Provisioning systems automate the deployment of controls across complex environments. For example, when enabling those with a role to follow a new rule, the deployment of that rule for that role may involve many different steps in many different systems. When workflow is integrated into provisioning, steps associated with carrying out the work required to implement changes are automated.
This has the advantage of assuring that all of the steps are undertaken, can produce detailed audit trails and exceptions, can allow the work to be done on many systems very quickly, and tends to be more reliable and consistent for large numbers of similar systems. Work that once took person-months of effort is done automatically in seconds to minutes.
This has the disadvantage that, to the extent that provisioning is imperfect, systems are not in the state they should be in when the provisioning happens, or insiders act maliciously, the resulting hazard is amplified by the same automation. It is possible for a single act by an individual to disrupt an enormous organization if that individual can cause the provisioning system to function in undesired ways at large scale. Thus the risk of integrated provisioning is the aggregate of the risk of all potential effects on all provisioned systems.
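To make the aggregation concrete, the sketch below sums a hypothetical exposure figure per provisioned system and then splits the systems into separate provisioning zones so that no zone exceeds a chosen limit. The exposure values, their units, and the greedy first-fit grouping are all assumptions. A real disaggregation would also have to respect organizational and technical boundaries.

```python
from typing import Dict, List


def aggregate_exposure(exposure: Dict[str, float]) -> float:
    """Total exposure if one provisioning system controls every listed system."""
    return sum(exposure.values())


def split_into_zones(exposure: Dict[str, float], limit: float) -> List[List[str]]:
    """Group systems into zones whose summed exposure stays within the limit.

    Uses a greedy first-fit pass over systems sorted from largest exposure
    to smallest. It is a heuristic and may not find the fewest zones.
    """
    zones: List[List[str]] = []
    totals: List[float] = []
    for name, value in sorted(exposure.items(), key=lambda kv: kv[1], reverse=True):
        if value > limit:
            raise ValueError(f"{name} alone exceeds the aggregation limit")
        for i, total in enumerate(totals):
            if total + value <= limit:
                zones[i].append(name)
                totals[i] += value
                break
        else:
            zones.append([name])
            totals.append(value)
    return zones
```

Each resulting zone would be served by its own provisioning mechanism, so that a single failure or malicious act affects only the systems in that zone.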