Troubleshooting techniques are critical for today’s networks, and network technicians need to recognize and
repair problems as fast as possible. The variety of data traversing the network is quite different today than it
was a few years back. With the addition of voice and video applications, understanding the most overlooked
causes of trouble in Cisco networks will aid in correcting most problems quickly and easily. This white paper is a
guide to the top ten causes of trouble in a Cisco network and how to address them, but is not intended to be a
1. Not Knowing Where To Start
Getting information from the users is critical in understanding where to start troubleshooting any network problems.
Asking questions that will assist in defining the problem is just as important as correcting the problem.
Failing to ask users whether what is broken ever worked is one of the biggest causes of trouble network administrators
face. Users often report network issues with an application only after all of their own troubleshooting has been exhausted,
and it is up to the network administrator to glean this information when first contacted by the user. If network
administrators do not ask the right questions, it could lead to wasted time troubleshooting perceived network
issues that are really not network issues. Network administrators could also cause problems for other users
by performing troubleshooting steps that are unnecessary and do not lead to the cause of the
original problem.
2. Out-of-Date Documentation
In today’s market, organizations are forced to do more with less. When projects are being completed, it is imperative
to take the time to complete the documentation that goes along with that project. However, there are
instances where time is not available to get the documentation completed. When troubleshooting a problem, it
is critical to have the proper updated documentation. Having the updated documentation available helps ensure
that unnecessary troubleshooting steps are avoided. Proper documentation helps network administrators to
focus on where the trouble is.
The following diagram shows a simple network. If a user connected to Switch A reports problems connecting
to a device on Switch B, then there is no reason to include the HQRouter in the troubleshooting steps. However,
if there were no documentation or the documentation was not up to date, network administrators could waste
time troubleshooting the HQRouter and delay resolution of the true issue. In this case,
troubleshooting should be limited to Switch A, Branch, and Switch B.
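The scoping decision described above can be sketched in code. The following Python is a minimal sketch, assuming the topology implied by the text (Switch A and Switch B both attached to Branch, with Branch uplinked to HQRouter). The adjacency map and the function name devices_in_path are hypothetical, since they are inferred from the description rather than taken from the diagram.

```python
from collections import deque

# Hypothetical adjacency map, assumed from the text rather than the diagram.
TOPOLOGY = {
    "Switch A": ["Branch"],
    "Switch B": ["Branch"],
    "Branch": ["Switch A", "Switch B", "HQRouter"],
    "HQRouter": ["Branch"],
}


def devices_in_path(topology, source, destination):
    """Return the shortest documented path between two devices (breadth-first search)."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbor in topology.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return []  # no documented path between the two devices


scope = devices_in_path(TOPOLOGY, "Switch A", "Switch B")
print(scope)  # ['Switch A', 'Branch', 'Switch B']
```

With accurate documentation behind the map, the HQRouter never appears in the returned scope, which is the point the example above makes. If the map is stale, the result is only as good as the documentation it was built from.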
3. Misunderstanding the Processes
In order to understand possible causes of trouble, a network administrator needs to understand the processes
involved in sending and receiving data. Many hours of troubleshooting time are lost
to a lack of process knowledge. Not knowing how a process works can lead to inefficient
troubleshooting. An example would be troubleshooting an application issue when the real problem is that
the wrong IP address is assigned to the device whose application is having problems. Almost everything in
networking is a process, and network administrators who are not aware of those processes are asking for extra,
unnecessary work. The processes that network administrators need to be aware of are related to the OSI model.
Layer two processes are different from layer three processes, and knowing how they differ can assist in narrowing the
focus of troubleshooting.
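The wrong-IP-address example can be checked before any application-level troubleshooting begins. The following Python is a minimal sketch using the standard ipaddress module. The function name check_host_addressing and all addresses shown are hypothetical, and the check assumes an ordinary IPv4 subnet (not a /31 or /32).

```python
import ipaddress


def check_host_addressing(host_ip, prefix_length, gateway_ip):
    """Return a list of layer three addressing problems (empty if none are found)."""
    problems = []
    interface = ipaddress.ip_interface(f"{host_ip}/{prefix_length}")
    gateway = ipaddress.ip_address(gateway_ip)
    network = interface.network

    # A host cannot reach a default gateway that sits outside its own subnet.
    if gateway not in network:
        problems.append(f"gateway {gateway} is outside {network}")

    # The network and broadcast addresses are not usable host addresses.
    if interface.ip in (network.network_address, network.broadcast_address):
        problems.append(f"{interface.ip} is not a usable host address in {network}")

    return problems


# Hypothetical values for illustration only.
print(check_host_addressing("192.168.10.25", 24, "192.168.10.1"))
print(check_host_addressing("192.168.11.25", 24, "192.168.10.1"))
```

The first call reports no problems. The second reports that the gateway is outside the host's subnet, which points to an addressing error at layer three rather than an application fault.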
4. Complex Designs
In large networks, the design can be very complex. In this type of environment, it is very easy to
make troubleshooting mistakes. Cisco devices are versatile enough to provide, in some circumstances, many different
methods to accomplish the same goal. When methods are not standardized and more than one method is used,
troubleshooting can become more detailed and involved.
5. Redundant Troubleshooting
Redundant troubleshooting occurs when either more than one network administrator performs the same
task or one administrator performs the same task over and over. Although redundant troubleshooting is not a
cause of a network problem, it extends the duration of the problem. Good communication skills are essential
between network administrators in order to avoid redundant troubleshooting. Two network administrators could
be troubleshooting different problems with common devices. For example, two different users report problems
connecting to the same application server. This situation should cause network administrators to look to the
common denominator, which may be the connection from the network device to the server or the path between
the users and the server. By using good communication skills, the two network administrators could decrease
the time needed to isolate and correct the problem.
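Finding the common denominator between two reported problems amounts to intersecting the two data paths. The following Python is a minimal sketch, assuming each administrator has already written down the path for their user. The device names and the function name common_devices are hypothetical.

```python
def common_devices(path_a, path_b):
    """Return devices that appear in both paths, in the order they appear in path_a."""
    shared = set(path_b)
    return [device for device in path_a if device in shared]


# Hypothetical paths from two users to the same application server.
user1_path = ["PC1", "Switch A", "Branch", "Switch B", "AppServer"]
user2_path = ["PC2", "Switch C", "Branch", "Switch B", "AppServer"]

print(common_devices(user1_path, user2_path))
# ['Branch', 'Switch B', 'AppServer']
```

Devices that only one user traverses (Switch A and Switch C here) become lower-priority suspects, so both administrators can focus on the shared segment instead of repeating each other's checks.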
6. Lack of Communication
Some troubles in the network will need to be passed along to others who can continue troubleshooting. In some
of these cases, troubleshooting steps may be duplicated due to a lack of communication between administrators.
This lack of communication can lead to hours of wasted time duplicating efforts without getting any
closer to resolving the problem. Having a method of turning over problems is very important for administrators
who work in networks with a need for 24/7 support. This turnover should be as detailed as possible to ensure
that efforts are not duplicated and more focus can be applied to the issue.
7. Complex Configurations
Complex designs sometimes require complex configurations on network devices. Understanding the design
helps network administrators understand and interpret the output of show and debug commands. Network
administrators need to understand the processes required to implement and support these types of configurations.
When administrators understand why things are the way they are, they can successfully support
and troubleshoot them. However, today’s networks support a lot more than the networks of just a few short years
ago. Things like VoD (Video on Demand), video conferencing, streaming video and audio, and VoIP (Voice over
Internet Protocol) have added to the complexity of the configurations. The need to understand how the different
processes impact the network is very important so network administrators can recognize the types of configurations
implemented and why they are necessary.
8. Misunderstanding of Device Output
When complex configurations are required, output from the verification commands can be intimidating and
confusing. Network administrators should be aware of and have access to tools that assist in understanding
the meaning of output from a show or debug command. Even when debugging a single voice call, depending
on the call control model being used, the output may show up in different formats. One command may output
pages of information that must be sifted through to find a single detail pointing to the possible cause
of a problem, while another command outputs only a few clear, concise lines. A firm
understanding of the processes of the protocols also assists in understanding the output from certain
show and debug commands. Network administrators should have access to and use Cisco’s Output Interpreter
(on the Cisco web site) to assist in understanding the output from show and debug commands.
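Sifting pages of output for a single detail is a task that a short script can assist with. The following Python is a minimal sketch that scans text in the style of show interfaces output for interfaces with a line protocol that is not up or with input errors. The sample text is abbreviated and illustrative, not captured from a real device, and the function names are hypothetical. Actual output varies by platform and software version, so the patterns should be treated as assumptions to verify.

```python
import re

# Abbreviated, illustrative sample in the style of "show interfaces" output.
SAMPLE_OUTPUT = """\
GigabitEthernet0/1 is up, line protocol is up
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
GigabitEthernet0/2 is up, line protocol is down
     57 input errors, 42 CRC, 0 frame, 0 overrun, 0 ignored
"""

# Assumed patterns; confirm them against output from the device in question.
HEADER = re.compile(r"^(\S+) is (.+?), line protocol is (\S+)")
ERRORS = re.compile(r"(\d+) input errors, (\d+) CRC")


def summarize_interfaces(output):
    """Map each interface name to its line protocol state and error counters."""
    summary = {}
    current = None
    for line in output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            summary[current] = {
                "line_protocol": header.group(3),
                "input_errors": 0,
                "crc": 0,
            }
            continue
        errors = ERRORS.search(line)
        if errors and current:
            summary[current]["input_errors"] = int(errors.group(1))
            summary[current]["crc"] = int(errors.group(2))
    return summary


def suspect_interfaces(summary):
    """Return interfaces whose line protocol is not up or that report input errors."""
    return [
        name
        for name, info in summary.items()
        if info["line_protocol"] != "up" or info["input_errors"] > 0
    ]


print(suspect_interfaces(summarize_interfaces(SAMPLE_OUTPUT)))
# ['GigabitEthernet0/2']
```

A filter like this does not replace understanding the output. It only shortens the search so that attention goes to the lines most likely to matter.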
9. Lack of Documentation of the Fix
Documenting what corrected a problem is as important as having accurate documentation. There are many
different types of applications moving across the network, and one application may be experiencing problems
when another is not. When the cause of the problem is identified, document it so that when the same or a similar
problem occurs, network administrators can quickly recognize or eliminate it as the cause.
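One way to make documented fixes easy to recognize later is to record them in a consistent, searchable structure. The following Python is a minimal sketch. The FixRecord fields, the find_similar helper, and the sample entries are all hypothetical and would need to be adapted to whatever ticketing or documentation system is actually in use.

```python
from dataclasses import dataclass, field


@dataclass
class FixRecord:
    """One documented problem and the change that corrected it."""
    symptom: str
    cause: str
    fix: str
    devices: list = field(default_factory=list)


def find_similar(records, keyword):
    """Return records whose symptom or cause mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        record
        for record in records
        if needle in record.symptom.lower() or needle in record.cause.lower()
    ]


# Hypothetical entries for illustration only.
records = [
    FixRecord(
        symptom="VoIP calls drop on Switch B",
        cause="duplex mismatch on uplink",
        fix="set both ends of the uplink to full duplex",
        devices=["Switch B", "Branch"],
    ),
    FixRecord(
        symptom="Users cannot reach application server",
        cause="wrong default gateway on server",
        fix="corrected the gateway address",
        devices=["AppServer"],
    ),
]

for match in find_similar(records, "voip"):
    print(match.cause, "->", match.fix)
```

Searching by symptom lets the next administrator either confirm a known cause quickly or rule it out and move on.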
10. Forgetting to KISS (Keep It Simple, Sir)
KISS is still one of the most important approaches when it comes to network design. The more complex the
design, the more complex the configuration. Complex designs add layer upon layer of complexity to the configuration,
which in turn adds to troubleshooting headaches. Keeping it simple helps administrators narrow the focus
and pinpoint the cause of problems in the network. With the diverse applications running in today’s
networks, following this rule is more and more difficult. Occam’s razor is a principle that states “when you have
two competing theories that make exactly the same predictions, the simpler one is the better.” Although in
networking this is not always the case, when problems arise in complex network implementations it is wise to
keep it in mind.
Problems in today’s complex networks that support diverse applications require network administrators to
have good communication skills with both their peers and their customers. These communication skills help
administrators gather the information required to isolate possible causes, while an understanding of the processes
helps them correct any issues.
Proper documentation shortens any downtime due to unnecessary troubleshooting of devices that are not a part
of the data path and, therefore, not a part of the problem. Remember the value of Occam’s razor and start with
the easiest causes, working your way up to the more complex ones. Documenting the solution is also
very important and can be overlooked due to the volume of problems that need attention, but when solutions
are documented, the next administrator can more easily rule possible causes in or out. Failing to keep it simple may
not be classified as a cause of network problems, but it can extend network outages due to the complexity and
the requirements of the network.