Applies to LANDESK Management Suite 9.0 and 9.5
OSD Logs
LANDESK OSD creates logs for every job that is run. Any failures are noted in the log and can be used to troubleshoot and resolve the problem.
Finding the Logs
The logs will be located on the core server. Locally they are in \Program Files\LANDESK\ManagementSuite\log\. Remotely they can be found at \\CORE\ldmain\log\. The logs will be named CJ-OSD-ScriptName-0-MMDDYYYY-HHMMSS.log.
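When many jobs have run, it can help to sort or filter the log files by script name and timestamp. The following is a minimal sketch of parsing the file name pattern described above. The helper name parse_log_name is hypothetical. Treating the "0" as a generic numeric field and reading HHMMSS as a 24-hour time are both assumptions.

```python
import re
from datetime import datetime

# Pattern for CJ-OSD-ScriptName-0-MMDDYYYY-HHMMSS.log.
# Assumption: the "0" is a numeric field; the article only ever shows 0.
LOG_NAME = re.compile(
    r"^CJ-OSD-(?P<script>.+)-(?P<index>\d+)-(?P<date>\d{8})-(?P<time>\d{6})\.log$"
)

def parse_log_name(name):
    """Return (script_name, timestamp) for an OSD log file name, or None."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    # Assumption: HHMMSS is a 24-hour time.
    stamp = datetime.strptime(match["date"] + match["time"], "%m%d%Y%H%M%S")
    return match["script"], stamp
```

Because the script name itself may contain hyphens, the pattern anchors on the trailing numeric fields rather than splitting on "-".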
Reading the Logs
OSD jobs are processed by the Custom Job Processor, so the logs will follow the format of other Custom Job logs. Please see the following example of a deployment job.
Note: Because of the web format, the lines may wrap. In the actual log, each step/command is on one line and will only wrap if word wrap is enabled.
Each line in the log indicates a new command run on a client. The line will only appear in the log once the command is complete, either successful or failed. This means that the machine is currently running the line AFTER the last line in the log.
If the job was run on a single device, or from the PXE boot menu, it will only have one device listed. If it was run on multiple devices as part of a larger job, each line will list the computer and the command so the full script for each computer is scattered throughout the log.
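For a multi-device job, regrouping the lines by machine makes each computer's script readable again. The following is a rough sketch, assuming the log is comma-separated with the header shown in the example below and that lines starting with ";" are summary lines. The function name commands_by_machine is hypothetical.

```python
import csv
from collections import defaultdict

def commands_by_machine(lines):
    """Group command rows by machine, preserving the order they were logged."""
    grouped = defaultdict(list)
    # Skip blank lines and the ";" job status line at the end of the log.
    rows = (l for l in lines if l.strip() and not l.lstrip().startswith(";"))
    reader = csv.reader(rows)
    next(reader, None)  # skip the "Machine","CbaStatus",... header row
    for row in reader:
        machine, status, exit_code, command = row[0], row[1], row[2], row[-1]
        grouped[machine].append((status, int(exit_code), command))
    return dict(grouped)
```

It could be called with an open log file, for example commands_by_machine(open(path)), and returns a dictionary keyed by machine name or MAC address.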
Log
"Machine","CbaStatus","ExitCode","Duration","Begin","End","Command"
"001E4F53C173","OK",0,0:00:00,2/8/2010 12:30:49 PM,2/8/2010 12:30:49 PM,"WINPE, TIMEOUT=1800"
"001E4F53C173","ERR_Fail",-2147481753,0:00:23,2/8/2010 12:30:49 PM,2/8/2010 12:31:12 PM,"diskpart /s X:\LDClient\rmvol.txt"
; "Job Complete","0 Done","1 Failed","0 Off","0 Unknown"
Key
- Computer Name (the "Machine" column) - This is the name that the core has for the client. If the client is not in the database, or cannot be identified, this will be the MAC address.
- Status - This is the status of the command. OK is good. ERR_Fail or ERR_Timeout are bad.
- Exit Code - This is the actual exit code returned by the command. It can be used to track down and resolve the problem.
- Duration - This is how long the command took to run. This can be useful in determining if the command behaved as expected.
- Begin and End - These are the date and time stamps when the action started and ended. They are rarely useful.
- Command - This is the actual command that was run.
- Job Status - This is the final status of the job, with machine counts for each condition.
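The fields in the key map naturally onto a typed record. The following is a minimal sketch of parsing one command line from the log, assuming the column order of the sample above and a month/day/year date order (the sample does not make the order unambiguous). LogRow and parse_row are hypothetical names.

```python
import csv
from datetime import datetime, timedelta
from typing import NamedTuple

class LogRow(NamedTuple):
    machine: str
    status: str
    exit_code: int
    duration: timedelta
    begin: datetime
    end: datetime
    command: str

# Assumption: dates are month/day/year with a 12-hour clock, as in the sample.
STAMP = "%m/%d/%Y %I:%M:%S %p"

def parse_row(line):
    """Parse a single command line of the log into a LogRow."""
    fields = next(csv.reader([line]))
    hours, minutes, seconds = (int(part) for part in fields[3].split(":"))
    return LogRow(
        machine=fields[0],
        status=fields[1],
        exit_code=int(fields[2]),
        duration=timedelta(hours=hours, minutes=minutes, seconds=seconds),
        begin=datetime.strptime(fields[4], STAMP),
        end=datetime.strptime(fields[5], STAMP),
        command=fields[6],
    )
```

The csv module is used rather than a plain split on commas because a command such as "WINPE, TIMEOUT=1800" contains a comma inside its quotes.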
Troubleshooting
There are a few different values and clues in the log that can be used to identify the problem. The most common are the exit code, status, and the duration.
Status
The status will usually be either OK, ERR_Fail or occasionally ERR_Timeout. These are largely self-explanatory. If the Custom Job Processor ever gets an ERR_Fail or ERR_Timeout, it will stop processing the script and the job will be marked as failed. The final line in the log (before the job status line) will be the line that failed. Usually you can search for more information about the command or the exit code to find possible resolutions.
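In a long or multi-device log, the failing lines can be pulled out by filtering on the status column. The following is a small sketch, assuming the header names from the sample log ("Machine", "CbaStatus", "ExitCode", "Command") and that lines starting with ";" are job status lines. The function name failed_rows is hypothetical.

```python
import csv

def failed_rows(lines):
    """Return (machine, status, exit_code, command) for every row that is not OK."""
    # Drop blank lines and the trailing ";" job status line before parsing.
    data = [l for l in lines if l.strip() and not l.lstrip().startswith(";")]
    reader = csv.DictReader(data)  # the first remaining line is the header
    return [
        (row["Machine"], row["CbaStatus"], int(row["ExitCode"]), row["Command"])
        for row in reader
        if row["CbaStatus"] != "OK"
    ]
```

The command and exit code it returns are the values to search on when looking for a resolution.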
Sometimes the job completes successfully and every status is OK, but in reality the job didn't work correctly or as expected. In this case, each line has to be evaluated and ideally compared to a known good log. Because of the variety of utilities that can be used in an OSD script and their widely varying exit codes, LANDESK does not always correctly interpret an exit code as a failure. In these cases, use the exit code of the line that you think failed to look for solutions. Examples of where this can happen include diskpart and imagew (depending on the failure reason).
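Comparing against a known good log can be partly automated. The following is a rough sketch that reports commands whose exit code differs between two logs. It assumes both logs are single-device logs of the same script, that each command text appears only once per log, and that the header names match the sample log. Both function names are hypothetical.

```python
import csv

def command_exit_pairs(lines):
    """Return [(command, exit_code), ...] for the command rows of a log."""
    data = [l for l in lines if l.strip() and not l.lstrip().startswith(";")]
    return [(row["Command"], int(row["ExitCode"])) for row in csv.DictReader(data)]

def exit_code_differences(good_lines, suspect_lines):
    """Return (command, good_exit, suspect_exit) where the exit codes differ."""
    good = dict(command_exit_pairs(good_lines))
    return [
        (command, good[command], code)
        for command, code in command_exit_pairs(suspect_lines)
        if command in good and good[command] != code
    ]
```

An empty result does not prove the job worked; it only means no command returned a different exit code than in the known good run.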
Exit Code
The exit code is the actual value that the command returned. For example, if diskpart is run and the command line arguments were wrong, diskpart returns a 2. The OSD log will therefore contain a 2 as the exit code.
Normally an exit code of 0 indicates success and anything else is a failure. However, some vendors stray from this pattern. If a command has a non-zero exit code, the best place to look for an explanation of that code is the vendor's documentation. In the case of diskpart, a list of exit codes (0-5) can be found about a third of the way down the page in the following article:
A Description of the Diskpart Command-Line Utility
Note: LANDESK OSD uses diskpart several times for operations on the disk. 0 is success, but not all "failures" are real failures. For example, one command goes through all drive letter mappings from A to Z and removes each mount point so that drives can be mapped correctly. However, no one has 26 drives, so at some point diskpart will try to remove a drive letter mapping that doesn't exist. In this case, diskpart returns a non-zero value, but there is no "actual" failure. Other utilities may behave in a similar fashion. Information from the vendor and an understanding of the command will help identify these cases.
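The log records exit codes as signed decimal numbers, so a large negative value such as the -2147481753 in the sample log may be easier to search for in its unsigned or hexadecimal form. The following is a small sketch of that conversion, assuming the value is a signed 32-bit integer. The function name exit_code_forms is hypothetical, and the sketch does not interpret what any particular code means.

```python
def exit_code_forms(code):
    """Return the signed, unsigned 32-bit, and hexadecimal forms of an exit code."""
    # Assumption: the logged value is a signed 32-bit integer.
    unsigned = code & 0xFFFFFFFF
    return code, unsigned, f"0x{unsigned:08X}"

print(exit_code_forms(-2147481753))  # (-2147481753, 2147485543, '0x80000767')
print(exit_code_forms(2))            # (2, 2, '0x00000002')
```

Searching for all three forms covers the different ways a vendor might document the same value.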
Duration
The duration listed in the log is the amount of time from start to finish that the command took. This is most useful for the drvmap (mapping drives) and actual image utility commands (imagex, ghost, etc).
Most actions in the OSD script don't take much longer than a few seconds to complete. The obvious exception is the command that is actually restoring (or capturing) the disk or files. If the duration seems wrong, it probably is, and that command should be looked at carefully. Some examples would be:
- drvmap - This is the utility that maps the needed drives. It should only take a few seconds. It will time out at about 2 minutes. If it times out, the most likely cause is incorrect credentials, or the client cannot contact the target with the path specified. For example, the hostname alone might not work; the client may need the FQDN of the target.
- Imaging utility (imagew, imagex, ghost, etc.) - Sometimes the imaging tool runs in a minute or less. Unless the target computer is connected to a fiber-optic network with the world's fastest disks, something is probably not right. Possible causes include a problem accessing the image file, an invalid or corrupt image file, the client being unable to reach the disk (drivers), a disk that isn't big enough, and a variety of others. Use the exit code to figure out exactly what is happening. Note: In these cases, the OSD job will usually continue and possibly complete "successfully". In reality, the machine didn't end up how it should have, so the only way to verify that the image deployed correctly is to check the machine.
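The duration checks described above can be sketched as a simple filter. The thresholds here are illustrative assumptions, not LANDESK values: an imaging command that finishes in a minute or less is flagged, and a drvmap that takes 30 seconds or more is flagged. Matching the tool by a substring of the command text is also an assumption, as is the function name duration_warning.

```python
from datetime import timedelta

# Assumed thresholds, loosely based on the guidance above.
IMAGING_TOOLS = ("imagew", "imagex", "ghost")
IMAGING_TOO_FAST = timedelta(minutes=1)
DRVMAP_TOO_SLOW = timedelta(seconds=30)

def parse_duration(text):
    """Convert a log duration such as 0:00:23 into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def duration_warning(command, duration_text):
    """Return a warning string if the duration looks suspicious, else None."""
    lowered = command.lower()
    duration = parse_duration(duration_text)
    if any(tool in lowered for tool in IMAGING_TOOLS) and duration <= IMAGING_TOO_FAST:
        return "imaging finished suspiciously fast; check the exit code and the machine"
    if "drvmap" in lowered and duration >= DRVMAP_TOO_SLOW:
        return "drvmap was slow; check the credentials and the target path (try the FQDN)"
    return None
```

A warning is only a prompt to look closer; the exit code and the state of the machine remain the real evidence.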