© Raymond Pompon 2016

Raymond Pompon, IT Security Risk Control Management, 10.1007/978-1-4842-2140-2_24

24. Post Audit Improvement

Raymond Pompon

(1)Seattle, Washington, USA

Everything flows and nothing abides. Everything gives way and nothing stays fixed.

—Heraclitus

So now you’re done. Your security program is up and running, you’ve made it through your audit, and everyone is happy. Take a vacation and rest. You can forget about security, forever. Yeah, you know I’m kidding. As long as there are threats in the world, the security team can never close their eyes.

Now that you’ve seen how your security architecture has stood up against adversity and auditors, it’s time to tackle the final step of the Plan-Do-Check-Act cycle. This is when the ISMS committee tweaks and augments the program. However, you needn’t wait until after the audit for this. There’s nothing wrong with making changes to running processes in the middle of an audit or business cycle; it just means a little more paperwork as controls transition from one state to another. In reality, the end of an audit is just an illusion, as most organizations run back-to-back audit periods. One audit period ends at midnight on December 31st and the next one starts at 12:01 a.m. on January 1st. Like security, the auditing truly never ends.

Nevertheless, people often use the end of an audit or a business year as a natural break point to reevaluate and adjust. Whenever you do it, you need to take a hard look at your program at least once a year. Over time, your IT security program must evolve and adapt, or it will fall into the trap of Fort Pulaski and be blindsided by new technology and new threats. You may also find that many controls aren’t working as well as you had hoped and process improvements could save money or provide new advantages. Security is an iterative process. The security program you built may end up looking radically different in a few years. There is never a single right solution and there are plenty of wrong ones.

Reviewing Everything

Before you make any big decisions, it’s a good idea to look at everything in perspective. We often have a strong bias to look at only the failures. We naturally focus on audit findings, security incidents, and outages. When security is working perfectly, nothing happens and business runs as usual. Aberrations and failures stick out like smoke plumes across a golden meadowland. There will be plenty of analysis of those problems, but don’t forget what didn’t happen and why it didn’t happen.

Reviewing What Worked

The best evidence of your success is your passing audit report. Every control without auditor comment or finding counts as a win. These are the processes that are working and have a good chance of continuing to work in the near future. These are the positive examples that you can build upon for other controls, or that you can expand as needed. Not every control survives unaltered over time, but at least you know something worked this time. Some technologies may work better in your organization than others. Sometimes the success comes from the design; sometimes it comes from the people involved; and sometimes it’s the process used. If the team was particularly successful in implementation or operation, consider spreading their expertise by having them train, mentor, lead, or document their lessons learned.

Near misses are when you almost had an audit finding or a serious security incident, but made it through with minimal impact. It’s likely you’ve had a few of those over the audit period. This could be a failed control that a secondary control caught, a security policy violation reported before it caused any damage, or a malware infection that was stopped mid-install because of a particular technological tool. These are also good things to examine, replicate, expand, or share throughout the enterprise.

Similarly, you should look at averted and potential attacks. This can come from log analysis work looking for attacks thwarted by timely patches, malware blocked by filters, and firewall blocks. You are looking for potential new threats or rising attack trends. Sometimes these kinds of successes are better looked at in aggregate with statistics, so let’s get down to numbers.

Chapter 12 talked about the quality of controls. Control quality breaks down into control coverage (assets covered) and control effectiveness (risk stopping power). Here’s a list of the ways that you can measure control quality:

  • Asset management effectiveness and coverage: Actual inventory vs. inventory records (do an annual hand count).

  • Vulnerability scan coverage: Scanned hosts vs. actual hosts; last scan date.

  • Vulnerability management effectiveness: Percentage of critical systems with no high/medium vulnerabilities.

  • Vulnerability management effectiveness: Vulnerabilities exploited in pen-test and/or hacks.

  • Vulnerability management coverage: Percentage of servers meeting hardening standard.

  • Vulnerability management effectiveness: Average time to close vulnerabilities (break down by severity, server type, etc.).

  • Disaster recovery coverage: Percentage of critical systems with recovery plan/tested recovery plan.

  • Scope containment: Pieces of scoped data found outside of scope.

  • Incident response effectiveness: Amount of time to detect; amount of time to respond vs. goals.

  • Access control effectiveness: Number of unauthorized users found.

  • Change control coverage: Number of authorized changes vs. total changes made.

  • Documentation coverage for a control: Policy, standard, procedures in place per control.

This is just a small list of ideas. You should deploy metrics and measures alongside the controls so that you can collect and review them at this time. Over time, you can see how a control matures and becomes more effective (or not). These are the kinds of statistics that are useful when you need to do executive presentations (here’s what we did this past year), customer demonstrations (here’s how we protect you), and budget justifications (here’s how we spent your money; we’d like some more).
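Many of these coverage metrics boil down to comparing one list of assets against another. Here is a minimal sketch in Python; the host names, the data sources named in the comments, and the `coverage` helper are illustrative assumptions, not anything prescribed by the text:

```python
# Illustrative asset lists; in practice these would come from your asset
# register, a physical hand count, and the vulnerability scanner's host list.
actual_hosts = {"web01", "web02", "db01", "db02", "app01"}   # annual hand count
inventory_records = {"web01", "web02", "db01", "app01"}      # asset register
scanned_hosts = {"web01", "db01", "db02", "app01"}           # vuln scanner

def coverage(covered: set, total: set) -> float:
    """Percentage of assets in `total` that also appear in `covered`."""
    return 100.0 * len(covered & total) / len(total)

asset_mgmt_coverage = coverage(inventory_records, actual_hosts)
scan_coverage = coverage(scanned_hosts, actual_hosts)

print(f"Asset inventory coverage: {asset_mgmt_coverage:.0f}%")   # 80%
print(f"Vulnerability scan coverage: {scan_coverage:.0f}%")      # 80%
```

Collected on a schedule, numbers like these give you the trend lines for the executive presentations and budget justifications mentioned above.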

Reviewing What Didn’t Work

Now the bad news. You need to look at where the wheels came off. Like successes, there are many rich sources of problems to review. If you really want to improve and future-proof your security program, then you should broaden your idea of what went wrong to include those aforementioned near misses, audit observations, and anywhere you fell short of expectations. The purpose isn’t to flog people for blame, but to look for weakening seals and rusty gears. Here’s a list of things to examine:

  • Audit findings (control failures, control objective failures)

  • Auditor observations

  • Security incidents

  • Security policy issues, such as violations, policy exceptions, and difficult enforcements/reminders

  • Unplanned outages and service failures

  • Vulnerability scan and penetration test results

  • Scoped information leakages

  • Scope changes without analysis and approval

  • Deadline violations, such as not patching within policy required period

It can be useful to keep a standardized register of all of these things. You should already be keeping your security incidents in a register, but you can add in these other things and create a summary timeline for analysis. The ISMS committee will want to review things at a higher level rather than rehash every detail. Table 24-1 shows an example of how to do this.

Table 24-1. Sample Security Incident Register

Date

Relevant Process

Incident

Impact

Feb

HR termination

Employee left firm—IT did not remove account

Internal audit finding

Apr

Backup

Service outage—backups failed

No restore capability for 3 days

May

Log review

IDS logs not reviewed for two days

Audit finding

May

Vuln scanning

Quarterly vuln scan 1 month late

Audit control failure in report

June

IT

Lost laptop at airport

Cost of laptop $1,000

July

HR onboarding

New-hire checklist records lost/not filled out

Audit finding

Sep

Access controls

Two terminated users not removed after two months

Internal audit finding

Nov

Change controls

Change control ticket missing approval

Audit finding

Dec

Access controls

Two servers had wrong password length settings

Audit finding

Dec

HR training

Two employees did not attend training

Audit finding

You can even take this a step further and use a formal incident framework like the Vocabulary for Event Recording and Incident Sharing (VERIS)1 to track and analyze incidents.
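Whether you use VERIS or a simple spreadsheet, the value of a standardized register is that you can summarize it mechanically for the ISMS committee. A small sketch using the entries from Table 24-1; the field names are my own shorthand, not a formal schema:

```python
from collections import Counter

# Incident register entries transcribed from Table 24-1.
register = [
    {"date": "Feb",  "process": "HR termination",  "impact": "Internal audit finding"},
    {"date": "Apr",  "process": "Backup",          "impact": "No restore capability for 3 days"},
    {"date": "May",  "process": "Log review",      "impact": "Audit finding"},
    {"date": "May",  "process": "Vuln scanning",   "impact": "Audit control failure"},
    {"date": "June", "process": "IT",              "impact": "Cost of laptop $1,000"},
    {"date": "July", "process": "HR onboarding",   "impact": "Audit finding"},
    {"date": "Sep",  "process": "Access controls", "impact": "Internal audit finding"},
    {"date": "Nov",  "process": "Change controls", "impact": "Audit finding"},
    {"date": "Dec",  "process": "Access controls", "impact": "Audit finding"},
    {"date": "Dec",  "process": "HR training",     "impact": "Audit finding"},
]

# Tally incidents per process so repeat offenders stand out.
by_process = Counter(entry["process"] for entry in register)
for process, count in by_process.most_common():
    print(f"{process}: {count}")
```

Summarized this way, the repeat offender (access controls, with two incidents) is immediately visible, which is exactly the kind of higher-level view the committee wants.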

Analyzing the Data

All control failures are not alike, and more importantly, they are often just the tip of the iceberg. You should treat every major audit finding and control failure as a symptom of a larger problem. Sure, dumb luck is going to turn against you every now and then, but your controls should be resilient enough to withstand random misfortunes. Remember, there are people out there working relentlessly on creating bad luck for their victims. There are many reasons why a control might have failed. Here are some ideas and examples:

  • Failure to follow defined process… Why?

    • The importance of the process was not communicated well by management.

    • The documentation was unclear, insufficient, unavailable, or not used.

    • The training was insufficient; people weren’t trained.

    • Missing resources (didn’t have the people to do this).

    • The process was not practical, was incomplete, and/or was too difficult.

  • Technological failures… Why?

    • Lack of needed maintenance

      • Too much maintenance needed for our resources

    • Lack of needed oversight on technology

      • Too much oversight needed to work technology

      • Too complicated to operate, which yielded poor performance

    • Improperly deployed/installed

      • Insufficient training/documentation

      • Difficult to deploy/install

      • Does not provide feedback on status

    • Software bug/component failure

    • Missing feature

    • Misunderstood feature

  • Insufficient design… Why?

    • No applicable control to cover this risk adequately

    • Process/attacker bypassed existing controls

    • Missing needed actions/steps/inputs

    • Did not provide adequate feedback on effectiveness

In addition to the possible causes, how the problem unfolded is also relevant. Was it a singular occurrence (or a short burst of occurrences)? Or does the problem recur every now and then? How was the problem detected? Do all the relevant processes and controls have sufficient mechanisms to provide feedback on their status? Did the maintenance needs of a control or process change unexpectedly or silently? How well did the organization recover from the incident, in both time and effort? All of this information can be useful in doing a root cause analysis (as described in Chapter 20).

There is a saying: No matter how it looks at first, it’s always a people problem. You should look at how your organization’s personnel acted before, during, and after the problem. Did they need help but not ask for it? Were they prevented from asking for help or providing a warning by internal culture, a lack of awareness, or no clear channel to communicate? When uncertainty or external pressure ramps up, people often retreat into their documented roles and make defensive decisions. If the process doesn’t call out for doing something, no matter how rational, employees may not deviate or speak up for fear of retribution. This can be a process design issue but also an education issue. The organization and the people involved need to understand the reason for the process and work toward the common goal.

Looking for Systematic Issues

When doing this analysis, look for what permitted the problem to happen and offer a solution at the right layer. Designs and systems can have different shearing layers,2 which means that they age and function at different rates. These layers can include organizational culture, physical facilities, IT infrastructure, ongoing IT projects, user behavior, and business process. All of these layers interact and change at different rates. Often, technological layers go obsolete faster than the people processes.

Sometimes the problem is conceptual. The control design itself doesn’t match the reality of the process it’s protecting. This could be because of a bad model of the activity, a control that doesn’t fit the process (a common problem with off-the-shelf technology), or a system that has grown so complex that it no longer works well.

Sometimes the problem is resource constraints. This goes beyond having enough people or money. It can mean that the people involved do not have the skill set or correct mental framework to understand the process that they need to navigate. It could mean that the technology is being pushed past its limit, in terms of load or feature set. There could also be policy constraints, which go beyond the current published policies to include cultural issues, regulatory mismatches, or informal processes baked into the culture.

Look for Things That Aren’t Broken Yet, but Will Be

Beyond looking for existing problems, you should also be on guard for emerging problems. You need to keep an eye on processes that are slowly growing or requiring more resources. Without high-level oversight and intervention, some solutions can encroach on other useful work and become more trouble than they’re worth.

A likely suspect for potential future problems is persistent temporary solutions and work-arounds. These are the good-enough solutions that someone slapped on until a better solution could be found. They worked well enough and people got busy, so the temporary solution is still there, years later. What began as a state of exception has become persistent, embedded, and worst of all, expanding. You do not want these duct-tape solutions to become part of the permanent infrastructure.

You should review all major security policies as well. Although policies are supposed to be high-level and not very specific, it is possible that they have drifted from the original goals. Policies need to reflect the current practices and needs of the organization. If a policy goes slowly out of alignment, there could be control failures or wasted resources.

Lastly, you should ensure that the security critical analyses are updated for the new period. This includes asset analysis, threat analysis, impact analysis, and compliance analysis. All of these analyses should converge into your updated risk analysis. Your chosen risk analysis framework should have processes to ensure regular updates. Like policy, if the risk analysis has not kept abreast of the organization’s environment, there could be serious problems.

Making Changes

The final part of the ISMS PDCA process is taking corrective action on the IT security program for processes that aren’t meeting goals. Now that you’ve collected data and done your analysis, it’s time to consider changes.

Look Before You Leap

Before you start making changes, you should first consider whether the change is worth it. You don’t have to solve every problem at once. If a control is mostly working, brainstorm on whether the change will make things better or worse. Remember the thresholds of risk acceptance; your controls may work fine up to a point, and any more work comes with diminishing returns. Consider the problem of lost laptops. Let’s say that you lost three this year, but because of your controls, they were all encrypted. You could do more security awareness training and add laptop-tracking controls. Perhaps this will drive down the number of lost laptops—but at what cost? Also, don’t forget to consider the opportunity costs. The money you spend on new controls is the same money that the organization could use to further its mission.

Just like your original proposed controls, every control change or new control should include a business case describing the intended benefits and both the direct and indirect costs. When making a change to a running control, you should enumerate your assumptions about the change being made and its impacts. If feasible, you can run trials and tests of the new control or process to check the results. Don’t forget the possibility of unintended consequences and how the organization itself may react to the change. Remember that there may be some lag time before the effects (both positive and negative) of changes become apparent. In addition to risk and control cost, you can use your confidence in the effectiveness of the change to prioritize your efforts.
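One way to make that kind of prioritization concrete is a simple expected-value score: estimated risk reduction, discounted by your confidence in the change, per dollar of cost. The formula, the change names, and the dollar figures below are illustrative assumptions, not a method from the text:

```python
# Hypothetical proposed changes:
# (change, estimated annual risk reduction in $, confidence 0-1, annual cost in $)
proposed_changes = [
    ("Laptop tracking software",          3_000, 0.5, 10_000),
    ("Extra awareness training",          5_000, 0.6,  4_000),
    ("Automated account deprovisioning", 40_000, 0.8, 15_000),
]

def priority(risk_reduction: float, confidence: float, cost: float) -> float:
    """Confidence-weighted risk reduction per dollar spent; higher is better."""
    return (risk_reduction * confidence) / cost

# Rank the candidate changes by score, best first.
ranked = sorted(proposed_changes, key=lambda c: priority(*c[1:]), reverse=True)
for name, rr, conf, cost in ranked:
    print(f"{name}: score {priority(rr, conf, cost):.2f}")
```

Note how the laptop-tracking option scores below 1.0 in this made-up data: it costs more than the expected benefit, echoing the lost-laptop example above, where more control is not automatically worth it.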

Improving the Controls

When looking to improve your controls, look at the trade-off between administrative and technical controls. A control could be more effective if you replaced it with a technical control, or perhaps the opposite. Don’t limit yourself to just expanding and tweaking an existing process. Sometimes the best way forward is to redesign a control from scratch. There are other, simpler changes that you can try, such as additional training, reassigning responsibility for the control, adding new verification steps, or modifying the frequency of human interactions.

Over time, your controls need to get stronger and your analyses more thorough. Your adversaries will only be improving their attack technology and methods, so you need to keep improving too. Most successful organizations are constantly growing, so your security program needs to consider this as well. Can your security program stretch and manage twice its current load? Large-scale expansion is a good group exercise for the ISMS committee to consider. The exercise may also uncover new ideas to improve a control’s stopping power, coverage, and cost to run and deploy.

For SSAE 16 SOC 1 audits, you can also look at redesigning entire control objectives. It’s not unheard of to move controls around between objectives to strengthen coverage and build more defense in depth. Similarly, you may have used a single control to support two different objectives; for example, security policy for both access control and human resource security. If an audit finds problems with that control, its failure endangers two different control objectives. It may be efficient to share controls between objectives, but if that control has problems, you could risk having twice the audit findings.

Speaking of audit findings, those findings should definitely feed into the work of the internal audit team. Anything that the auditor finds needs to be checked and rechecked for a long while. Sometimes audit findings and observations aren’t about control failures themselves but about the accompanying records and proof. Record management for running controls should also be improved, with internal audits keeping an eye on things.

Ideas for Improvement

How do you specifically improve or fix a control or process? Here are some ideas to consider for your improvement plans :

  • Additional training and testing on that training

  • Create and require the use of checklists

  • Automate all or part of the process

  • Split up a complex process step into substeps

  • Add additional supervision, monitoring, or oversight

  • Add an end-of-process review

Another improvement idea to enforce attention to detail on critical but tedious tasks is to create mechanisms for deliberate actions. For example, if you need someone to do daily log review, require him to fill out a short blank form with a summary of what was in the logs. This goes beyond just checking the box that he viewed the alerts and charts, but moves him to absorb, think, and explain the data. In high-stress, low-tolerance jobs like piloting an aircraft or ship, deliberate action is enforced through the use of verbalization of actions being performed; for example, the helmsman who says “I intend to make right full rudder in 3-2-1…” before he turns. The idea isn’t just to communicate to everyone what is going on, but force the operator to be specific and careful in his or her intentions and actions.
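A mechanism like the log-review form above can be enforced in software by refusing to accept a bare checkbox. This is a minimal sketch under my own assumptions: the `record_log_review` helper and the word-count threshold are hypothetical, standing in for whatever ticketing or GRC tool you actually use:

```python
def record_log_review(reviewer: str, summary: str, min_words: int = 10) -> dict:
    """Accept a daily log review only if it includes a substantive summary.

    Rejecting short summaries forces the reviewer to absorb, think about,
    and explain the log data rather than just ticking a box.
    """
    if len(summary.split()) < min_words:
        raise ValueError("Summary too short: describe what the logs actually showed.")
    return {"reviewer": reviewer, "summary": summary, "complete": True}

# A substantive summary is accepted; "looks fine" would raise ValueError.
review = record_log_review(
    "asmith",
    "Reviewed IDS alerts today; two blocked port scans, nothing anomalous "
    "in auth or web logs",
)
print(review["complete"])
```

The threshold here is crude by design; the point is the forcing function, not the word count.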

Bridge Letters

For SSAE 16 Type 2 audits, it is possible to have gaps between audit periods. Some firms align their audit periods with their fiscal year rather than the calendar year. For example, an audit may begin on June 1st and run for twelve months into the next year. However, some external parties will want an audit based on a calendar year, so there is a gap in audit coverage until the next report is issued. Similarly, some firms do only six- or nine-month SSAE 16 Type 2 audits for cost reasons, also leaving gaps in coverage. In these cases, the audited party can issue a formal attestation, called a bridge letter, to cover the gap. This letter defines the coverage gap and attests that the organization has not made any material changes to the controls in place. It’s not as good as an audit, but if sandwiched between two good audit reports, it’s usually sufficient.

Rolling out a Change Plan

Once you’ve decided what is going to be changed, you need to build a prioritized project plan. In addition to having the ISMS committee budget resources and assign responsibility for the project work, they should also consider timelines, feedback, and progress milestones. When you change or replace controls, you want the controls to overlap so that the risk and compliance needs remain fully covered. Riskier, but more common, is a direct transition at a particular time: from this date forward, we will use the new process to do background checks and retire the old one. If you can get control overlap, then you have the opportunity to analyze the feedback and adjust the new control before dropping the old one.

We Can Never Stop Trying to Improve

For the organization, the executive team, and yourself, the goal is no unpleasant surprises. You don’t want to be surprised by a risk that you hadn’t thought about, an asset sitting unprotected in danger, a control failure that you hadn’t anticipated, or an audit report full of findings. If you work hard and grind away at your assessments and control work, then you are doing the best that you can to reduce unforeseen calamities.

In the end, nothing that is functional can also be impregnable. The Internet age has moved into a time of cyber cold war. Our organizations are being covertly targeted and attacked by governments, industrial spies, and political groups. Large-scale industrial espionage, denial of service, ransomware, and revenge hacking are becoming normal. A security program that can meet these challenges requires relentless attention to detail and a deep commitment to reevaluating your assumptions. You need to do your best—and then push to be better—every day. In the end, what you protect needs to feel like an extension of you. You need to know it, feel it, and defend it as if it’s where you live.

What I have described in this book is by no means the final word for building or maintaining a security program. Every organization is different and unique. I challenge you to build things better than what I’ve described here. This is only the beginning. The rest is up to you. Never give up. The world needs your good work.
