Managing Network Security

Penetration Testing?

by Fred Cohen


Series Introduction

Over the last several years, computing has changed to an almost purely networked environment, but the technical aspects of information protection have not kept up. As a result, the success of information security programs has increasingly become a function of our ability to make prudent management decisions about organizational activities. Managing Network Security takes a management view of protection and seeks to reconcile the need for security with the limitations of technology.


Why People Do Penetration Testing

Bill Murray, a well-known computer security consultant for one of the major global CPA firms, says something to the effect of: If you have to ask, you're not secure. He goes on to explain: You don't get computer security by accident. In fact, you can just barely get it if you work really hard at it. And if, by some accidental miracle or magic, you were secure today, you would not be secure tomorrow, because things change. I have to agree with him on this one, and in light of it, I find it strange that so many people choose penetration testing as the first step in building a computer security program.

It may seem like I've gone over the edge, but I'm convinced that the reason for this is that top management simply does not trust their information protection experts. Let me explain by analogy. If a house inspector came to check a house you were about to buy and told you that, after a detailed examination, there was clear evidence of termites, would you: (a) buy the house anyway? (b) ask them to introduce new termites to see if the house fell down within the next week? (c) have them exterminate the termites? or (d) not buy the house?

Relating this back to the subject at hand, answer (b) indicates a lack of belief in the inspection. If you believe you have substantial holes in your information protection program, it seems redundant and foolish to ask people that you don't really know to break in and cause the very things you are trying to avoid.

I have heard many people describe strong advocates of security as paranoid. In fact, many people rate how seriously somebody takes information protection by saying that they are more or less paranoid. The term may seem appropriate for anyone who would worry about somebody guessing a password and bringing down the entire corporate network, but if guessing a single password would actually do this (don't laugh - I have consulted for more than one major corporation where this was the case), it is not paranoia. Paranoia is irrational fear. Serious concern about such weak protection is neither irrational nor fear.

The reason I bring this up is that there is a cultural creation (Ahhhh! A conspiracy theory!) surrounding information technology that, in essence, asserts that computers are perfect. You may recognize this phenomenon when dealing with almost any business that uses computers to aid the people who answer the telephone. Mr. Jones, the computer says you are dead, and therefore you cannot possibly be talking to me. (This, or things like it, has actually happened to people.) As people are increasingly exposed to computers, they lose some of their blind trust, but it is still an issue to be dealt with.

Getting back to the security expert who tells top management that corporate information technology has inadequate protection: top management does not want to hear such things, there is a cultural bias against believing that computers are imperfect, and there is a cultural theme of calling people who identify vulnerabilities paranoid. So management takes what it believes is a prudent step before spending all that money on security - it asks for proof.


Penetration Testing vs. Protection Testing

One of my clients in the San Francisco Bay Area recently asked me to outline a protection testing program that would allow them to make prudent decisions about purchasing vulnerability testing software. In my report, I explained that, since they didn't have well-defined control requirements, protection testing was essentially impossible. Now this may seem fairly harsh, but I think it is right on point. What is a bit confusing about it is that few people understand the difference between protection testing and penetration testing. To clarify: penetration testing is the process of trying to defeat protection in order to demonstrate vulnerabilities, while protection testing is the process of verifying that the controls that are supposed to be in place are actually in place and operating as they should be.

The difference is quite stark when you think about it. For example, it is quite possible that penetration testing will succeed at detecting a vulnerability even though controls are functioning as they should be. Similarly, it is quite common for penetration testing to fail to detect a vulnerability even though controls are not operating at all as they should be.

The key distinction, to my mind, is this: just because somebody can break into a facility doesn't mean that they should not be able to, and just because they cannot break in doesn't mean that the facility is appropriately protected. I have been to many companies where the back door is left wide open in the summer to let air flow through the shipping area. A physical penetration is often simple under these conditions, but just because I can enter the loading dock doesn't mean that I can do any real harm to the corporation - or that it would be worth their cost to secure that particular loading dock. The same loading dock may be quite secure in the middle of winter, and if my penetration test happens to be done when the door is locked, the test might show that entry through that door failed.

If we took a protection testing view of the same loading dock, we would first have to ask what controls were supposed to be in place. Suppose that the relevant policies were (briefly) that (1) only authorized people are allowed in the loading area while automatic loading equipment is operating (for safety reasons) and (2) corporate assets should be protected to a level commensurate with their value. Suppose that the controls put in place to assure this were that (1) signs would be posted indicating the safety hazards and barring unauthorized individuals from entry and (2) no items valued at more than $10,000 that can be easily carried away would be left unattended in the shipping area. The penetration test described earlier does not match up very well with these controls: it may succeed without testing the controls at all, and it may fail without indicating that the controls are working as they should be.


Testing Methods and Limitations

Testing brings up one of the many issues that was first formally taught to me in a philosophy of science course at the University of Pittsburgh in the early 1980s. The particular issue I have in mind was taught under the subject heading of the testability of scientific theories, with the best citation pointing to Karl Popper, who formalized the idea in the twentieth century. The idea is that for any universal proposition about infinite sets (e.g., all birds have three toes), no matter how many examples you find that confirm the proposition, it is never proven, but a single example that refutes it makes the proposition as a whole invalid. Put another way, no matter how many easily portable $10,000 items we send through the loading dock, we can never prove that the loading dock people will always follow the rules simply by watching them follow them again and again - but if we ever see the people at the loading dock leave one of those packages unattended, we have demonstrated a failure.

This example is particularly worthwhile because it raises the question - how good do we have to be? (This is also the title of an interesting recent book about how some people feel guilty over the most minor of mistakes and other such life-related topics.) Many of us would probably agree that after a few successful tests, we could reasonably conclude that the people at the loading dock are doing their job properly. The more picky among us might do some sort of risk analysis and use a statistically valid sample to assess the quality of the loading dock protection program. Thus we can reasonably view protection testing as an audit function designed to assure that risk management is being properly carried out in practice.
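
To put rough numbers on what a statistically valid sample might mean for the loading dock, here is a minimal sketch in Python. It uses the standard binomial bound (the so-called rule of three); the confidence level and check counts are illustrative assumptions of mine, not figures from any real audit.

    # Sketch: if we spot-check the loading dock n times and every check passes,
    # what is the most pessimistic failure rate still consistent with that result?
    # From (1 - p)^n >= 1 - confidence it follows that
    # p <= 1 - (1 - confidence) ** (1 / n), which is roughly 3/n at 95% confidence.

    def failure_rate_upper_bound(passed_checks: int, confidence: float = 0.95) -> float:
        """Upper confidence bound on the per-check failure rate when all checks pass."""
        if passed_checks <= 0:
            raise ValueError("need at least one passed check")
        return 1.0 - (1.0 - confidence) ** (1.0 / passed_checks)

    if __name__ == "__main__":
        for n in (10, 30, 100, 300):
            bound = failure_rate_upper_bound(n)
            print(f"{n:4d} clean spot checks -> failure rate at most {bound:.1%} (95% confidence)")

The particular numbers matter less than the principle: a fault model plus a sample size tells us how much confidence a string of passing tests actually buys us.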

This approach to verifying the proper operation of controls can also be described in terms of a notion called coverage. In the language of digital circuit designers, coverage indicates which portion of the modeled faults are tested by the testing technique. The notion of coverage is of particular interest in understanding the effectiveness of penetration testing, and we will use the term rather liberally from now on.
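
As a minimal sketch of that borrowed notion, coverage is simply the fraction of the modeled faults that a test suite actually exercises. In the Python below, the fault and test names are made up for illustration; the ratio is the point.

    # Coverage in the circuit-testing sense: the fraction of the faults in our fault
    # model that the test suite actually exercises. All names here are illustrative.

    def coverage(modeled_faults: set, tested_faults: set) -> float:
        """Fraction of the modeled faults exercised by the tests."""
        if not modeled_faults:
            raise ValueError("an empty fault model makes coverage meaningless")
        return len(modeled_faults & tested_faults) / len(modeled_faults)

    if __name__ == "__main__":
        fault_model = {"password guessing", "sendmail exploit", "open loading dock",
                       "social engineering", "dial-in modem"}
        test_suite = {"password guessing", "sendmail exploit", "ping of death"}
        print(f"coverage = {coverage(fault_model, test_suite):.0%}")  # 40%
        # Note that "ping of death" contributes nothing here: it is outside the fault
        # model, so testing for it says nothing about the faults we actually care about.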

For example, suppose we use one of the off-the-shelf commercial Internet test suites as our tester. If we assume that attackers will be using off-the-shelf attack tools that the designer of the test program has included in their fault model, and that protection in practice matches protection in testing, the coverage may be nearly 100%. But these assumptions, while common, are usually quite faulty.

The first flaw in this fault model is that, while many hackers use off-the-shelf tools for their attacks, even the mildly clever ones augment or combine them, and the more serious attackers create far more advanced tools. The second flaw is the assumption that the designer of the testing tool has access to all of the attack tools that the real attackers have access to. In most of the cases I have seen, the testers are not even close to covering a substantial portion of the available attack tools. The third flaw is the assumption that the test results will be reflected in the practical system. The most common example of this is the use of an attack tool launched from only one or a few IP addresses. Because of the way many modern protection methods operate, the results of tests performed from one IP address may not match those from other IP addresses. Thus the tool could indicate numerous flaws where none actually existed (e.g., if operated from an address that is authorized to perform those functions) or indicate no flaws where there were many (e.g., if operated from an address locked out of all access when other addresses were not).

In order to make sensible decisions about the effectiveness of testing, it appears that we need to have a notion of coverage relative to a valid fault model. Coverage can often be mathematically derived once a fault model is created, while creating a fault model is often done by doing a threat assessment, an analysis of vulnerabilities, and an analysis of the impact of different attack mechanisms used by the threat model against resources being protected. This is not a simple matter, which is why many organizations simply take the off-the-shelf tests and sell them to themselves as the 80% solution.


The 80% Solution - NOT!!!

One of the phrases I hear most often with regard to testing these days is that the client only wants an 80% solution. They say this because they believe that a 100% solution is very expensive - or perhaps unattainable. They conclude that an off-the-shelf tester is their 80% solution. In order to understand how flawed this notion is, we should first consider the question: 80% of what? If we don't have a fault model, how can we claim that this solution covers 80% of it? And if we do have a fault model and take the time to show 80% coverage, it might be easy to get to 99.999% coverage with only a little more work. The 80% solution sounds good, but in this case it simply covers up our lack of knowledge about the actual environment.

But suppose we actually did an analysis and discovered that the solution covered 80% of something. Let us say, for simplicity, that it's 80% of the off-the-shelf attack methods used by common hackers over the Internet. An average attack tool tries something on the order of 25 different attack methods in a period of a few minutes. So if 80% of these attacks are covered by the testing tool, that means that 5 of the 25 attack methods used in the typical break-in attempt will never be tested by our tool. In other words, the average attack attempt will include 5 attacks that we never tested for because we only used an 80% solution. Even if only one in five of the things we did not test for succeeds against our system, the average attack suite will still include one successful method of breaking into our system. So our 80% solution for the defender is, in effect, a 100% solution for the attacker. After all, the attacker's goal is to find any single way to bypass controls, and it doesn't matter to the attacker that only one of their 25 methods works, or which one it is.
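
Here is the same back-of-the-envelope arithmetic written out so the reader can vary the assumptions. The 25-method attack tool and the one-in-five success rate for untested methods are the illustrative figures used above, not measurements.

    # Arithmetic behind the "80% solution" argument; all figures are illustrative.
    attack_methods = 25        # methods tried by a typical off-the-shelf attack tool
    test_coverage = 0.80       # fraction of those methods our "80% solution" tests for

    untested = attack_methods * (1 - test_coverage)
    print(f"methods never tested: {untested:.0f}")                                     # 5

    # Even if only one in five of the untested methods works against our systems,
    # the average attack run still includes a working break-in method.
    success_rate_untested = 0.20
    print(f"working attacks per average run: {untested * success_rate_untested:.0f}")  # 1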


The Risks of Penetration Testing

Many people think that running a penetration tester is a safe way to quickly get a handle on obvious vulnerabilities and is an effective tool for gaining management's attention so that they will apply more resources toward protection. In some cases, I agree with this, but there are also some substantial hazards associated with these tests that must be balanced against their potential benefits.

The three major hazards introduced by such tests appear to be: (1) the tests may corrupt information, deny services, or leak information, thus causing some of the very harm whose potential they are intended to detect; (2) the tests may leave the systems more vulnerable than they were before testing; and (3) the results may be something that management does not want to hear.

Any decent test of Internet protection today should be testing for things like UDP viruses. If it does so, it may also risk bringing down the system under test, since ineffective protection will likely lead to a system failure. If the designers are careful, these problems can usually be avoided, but most designers are not careful, and many systems have crashed, had information deleted, or had information leaked as a direct result of testing. Perhaps more interestingly, the way most test designers avoid these problems is by not including tests for the more dangerous attacks. Thus some of the things not covered by many such tests are the most dangerous things that attackers can do to systems. Even in human-based penetration tests, the people performing the tests often gain access to substantial amounts of company-confidential information. There is a serious risk that this information will be leaked by the testers or used by them to gain some other financial advantage.

One of the more interesting things that can be observed in protection testing is that the state of the system under test is often different after testing than it was before. This means that the test has somehow affected the system. In many cases, these effects may be minor and have no impact on protection, but in other cases, these state changes may introduce new vulnerabilities into systems. As a good example, suppose that the attack involved creating a file in a temporary area, setting it to be a setUID program owned by root (under Unix, setUID programs are granted the capabilities of their owners when executed, and the root account is the superuser, which is granted unlimited access to all aspects of the system), and then using it to mail a copy of the hidden password file to an outside attacker. Suppose now that the mail program was misconfigured and failed to send out the result. This is an example of an attack that succeeds in penetrating a system but fails to return the intended result because of some unrelated configuration error. The residual setUID program that remains on the system can then be exploited by other users or attackers to gain unlimited access.
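
A small post-test cleanup check of the kind this example argues for might look like the following sketch, which walks the temporary areas and flags any root-owned setUID files left behind. The directories listed are common Unix defaults of my choosing and would need adjusting for a real system.

    # Sketch: look for root-owned setUID regular files left in temporary areas.
    import os
    import stat

    TEMP_DIRS = ("/tmp", "/var/tmp")   # common Unix defaults; adjust for a real system

    def find_setuid_root(paths=TEMP_DIRS):
        """Yield paths of root-owned setUID regular files under the given directories."""
        for top in paths:
            for dirpath, _dirnames, filenames in os.walk(top, onerror=lambda err: None):
                for name in filenames:
                    full = os.path.join(dirpath, name)
                    try:
                        st = os.lstat(full)
                    except OSError:
                        continue                    # vanished or unreadable; skip it
                    if (stat.S_ISREG(st.st_mode)
                            and st.st_mode & stat.S_ISUID
                            and st.st_uid == 0):
                        yield full

    if __name__ == "__main__":
        for leftover in find_setuid_root():
            print("possible test residue:", leftover)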

In the case where management doesn't want to hear about it, they may choose to kill the messenger. I understand that there are a lot of openings in information protection these days, so this should not be too great a deterrent. In the United States, the Department of Defense announced the results of such tests done on its own systems and, partially as a result of this effort, the United States is acting to mitigate some of these vulnerabilities. It has only taken three years since the test results were announced to get to the point where the United States is seriously considering acting on this issue, and I think you can expect even better performance than this from major corporations.


Summary and Conclusions

Penetration testing is commonly used, but its overall effectiveness in improving protection is quite questionable. While it can be effective in raising awareness and in demonstrating select system weaknesses, it also has a number of negative side effects that should be considered before use. A more serious protection testing program should be considered as an alternative to, or an enhancement of, penetration testing efforts.


About The Author

Fred Cohen is a Principal Member of Technical Staff at Sandia National Laboratories and a Senior Partner of Fred Cohen and Associates in Livermore, California, an executive consulting and education group specializing in information protection. He can be reached by sending email to fred at all.net.