By
Martin Moore, Leader of the UNIX Expert Support Team, Hewlett Packard Corporation.
Steven Hancock, File Systems Support Engineer, HP's Tru64 UNIX Support Engineering Group.
Description
Dealing with system problems, from user login failures to server crashes, is a critical part of a system administrator's job. A down system
can cost a business thousands of dollars per minute. But there is little or no information available on how to troubleshoot and correct
system problems; in most cases, these skills are learned in an ad-hoc manner, usually in the pressure-cooker environment of a crisis.
This is the first book to address this lack of information.
The authors (both experienced Tru64 UNIX support engineers for Compaq)
systematically present the techniques and tools needed to find and fix system problems. The first part of the book presents the general
principles and techniques needed in system troubleshooting. These principles and techniques are useful not only for UNIX system administrators,
but for anyone who needs to find and fix system problems. After this foundation, the authors describe troubleshooting tools used in the
UNIX environment. The remainder of the book covers specific areas of the Tru64 UNIX operating system in detail, listing common problems,
their causes, how to detect them, and how to correct them. Each chapter includes a "Before You Call Support" section that details the
most important things to check and correct before it's necessary to call Compaq technical support. The authors also include decision
trees to help the reader systematically isolate particular problem types.
Audience:
Tru64 system administrators, system engineers, and support specialists; students in Tru64 system administration and advanced system administration training courses