2008-12-28

Troubleshooting Guidelines

During the early days of my career I found myself constantly being called to customer sites for troubleshooting.

On arriving, I would usually discover that the so-called complex problem could be solved with a short, simple analysis session.

I tried to lay out some guidelines so that others could do the same.

What it basically comes down to is asking the right set of questions in the right order.

The questions which guided me were:
  1. Which step or component is the first to indicate a problem?
  2. What is the CPU/memory utilization?
  3. Which application is consuming the most resources?
  4. Open the log files.
  5. Look for the most recent error message.
  6. What does it say?
  7. What does it mean?

Once you have answered these last two questions, you can generally solve most problems.
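
Steps 4 to 6 can be partly automated. The following is only a minimal sketch in Python: it assumes that error lines carry a severity keyword such as ERROR, FATAL or CRITICAL, which will not hold for every product, and the sample log lines are made up for illustration. The function name most_recent_error is my own and not part of any tool.

```python
import re

# Assumption: error lines contain one of these severity keywords.
# Adjust the pattern to whatever your product actually writes.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL|CRITICAL)\b")


def most_recent_error(lines):
    """Return the last line that looks like an error, or None if there is none."""
    last = None
    for line in lines:
        if ERROR_PATTERN.search(line):
            last = line.rstrip("\n")
    return last


# Hypothetical log lines, invented for this example.
sample = [
    "2008-12-01 10:00:01 INFO gate started\n",
    "2008-12-01 10:00:05 ERROR unmapped input type 17\n",
    "2008-12-01 10:00:06 INFO heartbeat\n",
]

print(most_recent_error(sample))
```

For a real file you could pass the open file object directly, since iterating over it yields lines. Reading the message is the easy part; working out what it means still takes a person.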

I once traveled to one of our customer sites in the east. The problem report said that the system was malfunctioning and missing several input responses. When I arrived on site, I noticed that the gate component was running at high CPU utilization (~90% steady state). Reviewing the logs, I found a configuration error. After I fixed it, however, the CPU utilization did not drop. I reviewed the logs again and noticed that the same error message was being printed at a very high rate, and this was what caused the high CPU utilization. I turned to the programmer and asked to have the debug level for that specific message reduced, since it pointed to an unmapped input type and not to a system problem. Once this was done, all the other problems disappeared.

Problem solved.
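
A message that floods the log, as in this story, is easy to spot by counting. Here is a rough sketch in Python, assuming each log line starts with two whitespace-separated fields (a date and a time) followed by the message. That layout, the function name top_repeated_message and the sample lines are assumptions made for illustration, not the format of the actual system.

```python
from collections import Counter


def top_repeated_message(lines, prefix_fields=2):
    """Return (message, count) for the most frequent message, or None.

    The first `prefix_fields` whitespace-separated fields of each line
    (assumed here to be date and time) are ignored, so that identical
    messages with different timestamps are counted together.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(None, prefix_fields)
        if len(parts) > prefix_fields:
            counts[parts[prefix_fields].strip()] += 1
    if not counts:
        return None
    return counts.most_common(1)[0]


# Hypothetical log lines, invented for this example.
sample_log = [
    "2008-12-01 10:00:05 ERROR unmapped input type 17",
    "2008-12-01 10:00:05 ERROR unmapped input type 17",
    "2008-12-01 10:00:06 INFO heartbeat",
    "2008-12-01 10:00:06 ERROR unmapped input type 17",
]

print(top_repeated_message(sample_log))
```

If one message accounts for most of the lines in the log, it is worth asking whether the logging itself is the load on the system.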

If the field engineer or support person follows these rules, they can generally get to the bottom of the problem, or at least point R&D in the right direction.

This, in my opinion, is the true meaning of on-site field engineering. Otherwise, the field engineer would be expected only to perform "Next, Next, Next, ..." type installations. Is this what field engineers are for?

I don't think so.

