Copyright
© Ericsson AB 2010. All rights reserved. No part of this document may be reproduced in any form without the written permission of the copyright owner.
Disclaimer
The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.

1 Introduction
This document describes how to collect troubleshooting data from the SmartEdge® router. This data is required for problems that are escalated to Ericsson support and can be used to determine system status.
1.1 Prerequisites
To use this document, you should be fully trained and experienced in operating the SmartEdge system. You must understand the basic system architecture and be familiar with navigating the command-line interface (CLI) modes.
Although it is not required, expertise and experience with the UNIX operating system are useful.
1.2 Terms
In the following sections, the term controller card applies to any version of the Cross-Connect Route Processor (XCRP) Controller card, including the controller carrier card, unless otherwise noted.
For commonly-used abbreviations, see the Glossary.
2 Mandatory Data Collection Tasks
Whenever a system or network problem requires data collection for troubleshooting, you must define the problem and collect comprehensive system information.
For information on accessing the primary and secondary XCRP cards and the NetBSD shell, and on enabling logging, see Access the SmartEdge System Components.
2.1 Define the Problem
Before you begin collecting troubleshooting information, define the problem by working through the following points:
- Define the problem. Data gathering begins at this stage, and continues to problem resolution. What log messages or errors occurred?
- Describe and consider the symptoms. Are the symptoms signs of known issues? Research this in the Resolved and Known Issues documents for the software release.
- Were there any recent changes to the system or network? For example, was software upgraded, did the configuration change, or did the problem occur by itself?
- What are possible causes of the problem? Consider the possibilities, based on your experience and knowledge.
- Are the variables contributing to the symptoms controllable?
2.2 Collect Comprehensive System Information
The show tech-support command, an all-in-one information collection tool, provides the foundation for data collection. This macro runs many frequently used show commands and displays both hardware- and software-oriented information. Always run the show tech-support macro (in exec mode) and provide the output to the support organization when opening a case or escalating one to upper-level support.
The commands it runs provide comprehensive, current system information and include the following:
- show backplane-status
- show chassis
- show configuration
- show crashfiles
- show disk
- show hardware detail
- show history global
- show memory
- show process
- show release
- show version
To restrict output to commands relevant to the Advanced Services Engine (ASE) card, add the ase keyword to the show tech-support command.
2.3 Determine Basic Hardware and Software Configuration
- Note:
- Most of the commands in this section are included in the show tech-support command.
You may need to determine the software release, hardware platform, and traffic card configuration in your SmartEdge router deployment. This section describes how to collect detailed information about software and hardware configurations. For more information about the commands in this section, see Command List.
To collect software version and hardware information, enter the show version, show chassis, or show hardware commands (plain or with keywords) in any mode. Each command presents the hardware and software configuration from a different angle.
For the procedures to send the output from these commands to the support organization, see Section 3.
2.3.1 Software Versions
Use the show version command to collect:
- SmartEdge version
- Minikernel version
- Bootstrap version
The output also reports how long the system has been continuously running since the last reboot.
Non-matching system components can lead to unexpected problems. SmartEdge OS Releases 6.1.5.1, 6.2.1, and 6.3.1 require the Open Firmware (OFW) and minikernel components to run the specific versions listed in Table 1 (6.1.5.1), Table 2 (6.2.1), and Table 3 (6.3.1).
Category | XCRP4 | SmartEdge 100 | ASE Card |
---|---|---|---|
Open Firmware | 2.0.2.37 | 2.0.1.4 | 2.0.2.33 |
Minikernel | 11.7 | 2.7 | |
Category | XCRP4 | SmartEdge 100 | ASE Card |
---|---|---|---|
Open Firmware | 2.0.2.42 | 2.0.1.4 | 2.0.2.42 |
Minikernel | 11.7 | 2.7 | |
Category | XCRP4 | SmartEdge 100 | ASE Card |
---|---|---|---|
Open Firmware | 2.0.2.45 | 2.0.1.4 | 2.0.2.45 |
Minikernel | 11.7 | 2.7 | 13.5 |
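For scripted checks, the version requirements in Tables 1 through 3 can be held in a small lookup. The following Python sketch is illustrative only: the dictionary layout and the helper name check_component are assumptions made for this example, the values are copied from the tables, and cells for which the tables list no value are stored as None.

```python
# Required component versions per SmartEdge OS release, copied from Tables 1 to 3.
# None marks a cell for which the tables list no value.
REQUIRED = {
    "6.1.5.1": {
        "Open Firmware": {"XCRP4": "2.0.2.37", "SmartEdge 100": "2.0.1.4", "ASE Card": "2.0.2.33"},
        "Minikernel": {"XCRP4": "11.7", "SmartEdge 100": "2.7", "ASE Card": None},
    },
    "6.2.1": {
        "Open Firmware": {"XCRP4": "2.0.2.42", "SmartEdge 100": "2.0.1.4", "ASE Card": "2.0.2.42"},
        "Minikernel": {"XCRP4": "11.7", "SmartEdge 100": "2.7", "ASE Card": None},
    },
    "6.3.1": {
        "Open Firmware": {"XCRP4": "2.0.2.45", "SmartEdge 100": "2.0.1.4", "ASE Card": "2.0.2.45"},
        "Minikernel": {"XCRP4": "11.7", "SmartEdge 100": "2.7", "ASE Card": "13.5"},
    },
}


def check_component(release, category, platform, found):
    """Compare one reported version with the table value (hypothetical helper)."""
    required = REQUIRED.get(release, {}).get(category, {}).get(platform)
    if required is None:
        return "unknown"  # release, category, or platform not covered by the tables
    return "match" if found == required else "mismatch"


# An XCRP4 running Open Firmware 2.0.2.42 under Release 6.3.1 is reported as a mismatch.
print(check_component("6.3.1", "Open Firmware", "XCRP4", "2.0.2.42"))
```

The version strings passed to the helper would come from the show version output; how they are extracted is left out because the output format is not shown in this document.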
2.3.2 Chassis Information
Use the show chassis command to collect:
- Platform type (SmartEdge 100, 400, 800, 1200)
- Installed hardware modules and their status
- Configured hardware modules
- Hardware module status flags, reporting such details as on-board component and software status for cards.
2.3.3 Chassis Power Status
Use the show chassis command with the power keyword to display a summary of power allocation for the current SmartEdge chassis configuration. It displays the required and allocated power for each slot.
2.3.4 Chassis Power Consumption For All Modules
Use the show chassis command with the power inventory keywords to determine the power consumption for all supported modules (even those not present) in the current platform.
The following example displays the power consumption output for a SmartEdge 400 chassis:
[local]SmartEdge#show chassis power inventory
Chassis Type        Power Capacity
--------------------------------------------
SE400               801.60 Watts    16.70 A@-48V

XCRP Type           Power Consumption
-----------------------------------------------
xcrp3                41.28 Watts     0.86 A@-48V

Traffic Card Type   Power Consumption
--------------------------------------------------
10ge-1-port         130.56 Watts     2.72 A@-48V
atm-ds3-12-port      88.32 Watts     1.84 A@-48V
atm-oc12-1-port      92.16 Watts     1.92 A@-48V
atm-oc12e-1-port     86.40 Watts     1.80 A@-48V
atm-oc3-2-port       92.16 Watts     1.92 A@-48V
atm-oc3-4-port       90.24 Watts     1.88 A@-48V
ch-ds3-12-port       90.24 Watts     1.88 A@-48V
ch-ds3-3-port        90.24 Watts     1.88 A@-48V
ch-e1ds0-24-port     92.16 Watts     1.92 A@-48V
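In the example output, each wattage figure equals the listed current multiplied by 48 V (for example, 16.70 A x 48 V = 801.60 W). The following Python sketch parses a few of the rows and confirms that relationship. The regular expression and the names ROW and parse_power are assumptions made for this illustration and are not part of the SmartEdge OS.

```python
import re

# A few rows copied from the example output above.
SAMPLE = """\
SE400               801.60 Watts    16.70 A@-48V
xcrp3                41.28 Watts     0.86 A@-48V
10ge-1-port         130.56 Watts     2.72 A@-48V
atm-ds3-12-port      88.32 Watts     1.84 A@-48V
"""

# name, watts, amps; header and separator lines do not match this pattern.
ROW = re.compile(r"^(\S+)\s+([\d.]+) Watts\s+([\d.]+) A@-48V$")


def parse_power(text):
    """Return {name: (watts, amps)} for every line that looks like a power row."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return rows


rows = parse_power(SAMPLE)
for name, (watts, amps) in rows.items():
    # Power (W) = current (A) x 48 V, within display rounding.
    consistent = abs(watts - amps * 48) < 0.01
    print(f"{name}: {watts:.2f} W, {amps:.2f} A, consistent={consistent}")
```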
2.3.5 Hardware Information
Use the show hardware command to collect:
- Running status of fan tray and power supply
- Hardware information about all installed modules
- Serial number
- Revision version
- Manufacturing date
- Voltage
- Temperature
2.3.6 More-Detailed Hardware Information
Use the show hardware command with the detail keyword to collect:
- Base MAC address
- On-demand diagnostics (ODD) status
- Voltage
- LED
- More detailed hardware module information (including FPGA versions)
- Current information about 10 Gbps SFP (XFP), or small form-factor pluggables (SFPs), for Packet Processing ASIC, version 2 (PPA2).
- Note:
- Unsupported transceivers, such as gigabit interface converters (GBICs), SFPs, or XFPs, have not been tested with the SmartEdge router and may cause card crashes. To verify that transceivers are supported, use the show hardware command.
2.3.7 Card-Related Hardware Information
To collect information on a specific card, use the show hardware card slot command, optionally with the detail keyword.
Without the detail keyword, the command provides information that includes:
- Configured card type and installed type
- Initialized state
- Administrative state (shutdown or no shutdown)
The detail keyword adds ASE card information shown in the following example:
[local]Redback#show hardware card 2 detail
Slot              : 2
Type              : ase
Serial No         : E102D1609D0AKQ
Hardware Rev      : 02
EEPROM id/ver     : 0x5a/4
Mfg Date          : 20-APR-2009
Voltage 1.200V    : 1.206 (+1%)
Voltage 1.800V    : 1.807 (+0%)
Voltage 2.500V    : 2.524 (+1%)
Voltage 3.300V    : 3.271 (-1%)
Temperature       : NORMAL (51 C)
Card Status       : HW initialized
POD Status        : Success
ODD Status        : Not Available
Fail LED          : Off
Active LED        : Blink
Standby LED       : On
Chass Entitlement : All (0x0)
Ports Entitled    : All
Active Alarms     : NONE
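The percentage shown next to each voltage reading is the deviation of the measured value from the nominal rail voltage. The following Python sketch recomputes that deviation with more precision from the four voltage lines in the example. The pattern and the function name voltage_deviations are assumptions made for this illustration, and no acceptance limit is implied because this document does not state one.

```python
import re

# Voltage lines copied from the example output above.
SAMPLE = """\
Voltage 1.200V : 1.206 (+1%)
Voltage 1.800V : 1.807 (+0%)
Voltage 2.500V : 2.524 (+1%)
Voltage 3.300V : 3.271 (-1%)
"""

# Captures the nominal rail voltage and the measured value.
VOLT = re.compile(r"Voltage\s+([\d.]+)V\s*:\s*([\d.]+)")


def voltage_deviations(text):
    """Return [(nominal, measured, deviation_percent)] for each Voltage entry."""
    result = []
    for nominal, measured in VOLT.findall(text):
        nominal, measured = float(nominal), float(measured)
        result.append((nominal, measured, (measured - nominal) / nominal * 100))
    return result


for nominal, measured, deviation in voltage_deviations(SAMPLE):
    print(f"{nominal:.3f} V rail: measured {measured:.3f} V ({deviation:+.2f}%)")
```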
2.4 System Alarms Information
- Note:
- The command in this section is included in the show tech-support command.
The active alarms in the system (minor, major, or critical) at the time a problem occurs provide important evidence of root causes. Critical alarms should be handled with highest priority.
2.4.1 Active Alarm Information
Use the show system alarm command to display current active system-level, card-level, port-level, channel-level, or subchannel-level alarms, including:
- Currently active alarms
- Type and source of the alarm: for example, a GE-10-port alarm triggered by slot 10/1
- Timestamps when the alarm was initially triggered
- Alarm severity: minor, major or critical
- Short description of the alarm
To display historical alarms since the last reboot, add the all keyword.
2.5 System Load and Resource Usage Information
- Note:
- Most of the commands in this section are included in the show tech-support command.
System load and resource usage are important factors when evaluating system status. As part of daily maintenance, make sure the system is running under an acceptable load and consuming a reasonable amount of resources. Because controller cards hold most of the essential software modules and provide most of the intelligence to the system, monitoring should focus on the controller cards.
The most important resources are:
- Memory
- CPU usage
- Disks: internal and external compact flash cards
This section includes some commands used to check the current system load and resource usage.
2.5.1 Memory Status
Use the show memory command to display the following from XCRP controller cards:
- Total usable memory
- Current usage
- Current free memory
- Reserved memory
- Note:
- The system should have more than 100 MB free memory available. If the reserve is lower than that, the system is unstable. With very low available memory, the controller card can crash while attempting to free up more memory for running system software.
2.5.2 Information on SmartEdge OS Processes
Use the show process command to display:
- Current overall average system load
- PID, Spawn number, running status, and continuous running time for each SmartEdge OS process
- Current memory consumption and CPU usage time by each process
- Current process status
- Note:
- A process spawn number of 0 or 1 is normal; values greater than 1 indicate that the process might have been restarted.
2.5.3 Routing Table Statistics
Use the show ip route summary all or show ip route summary all-context command to display the routing table statistics in a matrix for every configured routing protocol. Output includes the statistics for total routes, current active routes, and historical data, such as the maximum route numbers reached by the system (Max Ever Reached). It also shows the current load-balance strategy the system is using.
- Note:
- The Max Ever Reached counter shows the maximum number of routing entries the whole system has accumulated, which is useful when handling cases of memory oversubscription triggered by card crashes.
The following example displays the output for the two commands:
[local]SmartEdge#show ip route summary all
load-balance: Use the built-in default hash function
Total unicast routes summary in all contexts:
Route Source          Tot-Routes   Act-Routes   Max Ever Reached
Connected             8            8            13
Subscriber Address    0            0            1
Static                0            0            2
Ospf-IntraArea        1            0            1

[local]SmartEdge#show ip route summary all-context
Context :local Context id : 0x40080001
--------------------------------------------------------
load-balance: Use the built-in default hash function
Rt Tbl Version: 50, Nh Tbl Version: 73
FIB Rt Tbl Version: 50
Route Source          Tot-Routes   Act-Routes   Max Ever Reached
Connected             4            4            4
Ospf-IntraArea        1            0            1

Context :SAT Context id : 0x40080002
---------------------------------------------------------
load-balance: Use the built-in default hash function
Rt Tbl Version: 0, Nh Tbl Version: 0
FIB Rt Tbl Version: 0
No routes in Table
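When many route summaries have to be compared, the three counters per route source can be extracted with a short script. The following Python sketch parses rows like those in the example and lists the sources whose Max Ever Reached value exceeds the current total. The pattern and the function name parse_route_summary are assumptions made for this illustration.

```python
import re

# Rows copied from the first command in the example output above.
SAMPLE = """\
Route Source          Tot-Routes   Act-Routes   Max Ever Reached
Connected             8            8            13
Subscriber Address    0            0            1
Static                0            0            2
Ospf-IntraArea        1            0            1
"""

# Route source (may contain spaces) followed by exactly three integers.
ROW = re.compile(r"^(.+?)\s+(\d+)\s+(\d+)\s+(\d+)$")


def parse_route_summary(text):
    """Return {source: (total, active, max_ever)}; non-data lines are skipped."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group(1)] = tuple(int(match.group(i)) for i in (2, 3, 4))
    return rows


summary = parse_route_summary(SAMPLE)
for source, (total, active, max_ever) in summary.items():
    if max_ever > total:
        print(f"{source}: currently {total}, peak {max_ever}")
```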
2.5.4 Disk Usage on CF Cards
Use the show disk command to display:
- The current disk usage (status) of the internal CF card
- The current disk usage (status) of the external mass-storage device card, if installed
- Note:
- The external CF card should always have free space reserved to hold core-dump files and log files for future use. In the SmartEdge OS, when internal core-dump files (usually large files) are generated, the system moves the files to a preconfigured external File Transfer Protocol (FTP) server automatically, thus minimizing storage requirements on the local CF card. Enable this feature with the service upload-coredump ftp:// command in global configuration mode.
The following example displays the output of the show disk command:
[local]SmartEdge#show disk
Location    512-blocks     Used    Avail  Capacity  Mounted on
Internal       1940606  1513514   330062     82%    /
External        922558   266488   609944     30%    /md
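The show disk output reports sizes in 512-byte blocks, so the available space in bytes is the Avail value multiplied by 512. The following Python sketch converts the example values to MiB and flags locations above a usage limit. The function names and the 80 percent default are hypothetical choices for this illustration; this document does not define a threshold.

```python
# Rows copied from the example output above.
SAMPLE = """\
Location    512-blocks     Used    Avail  Capacity  Mounted on
Internal       1940606  1513514   330062     82%    /
External        922558   266488   609944     30%    /md
"""


def parse_disk(text):
    """Return {location: {...}} for each data row (the header row is skipped)."""
    rows = {}
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 6:
            continue
        location, _blocks, _used, avail, capacity, mount = fields
        rows[location] = {
            "avail_mib": int(avail) * 512 / (1024 * 1024),  # 512-byte blocks to MiB
            "capacity_pct": int(capacity.rstrip("%")),
            "mount": mount,
        }
    return rows


def nearly_full(rows, limit_pct=80):
    """List locations whose usage exceeds limit_pct (an arbitrary example limit)."""
    return [name for name, row in rows.items() if row["capacity_pct"] > limit_pct]


disks = parse_disk(SAMPLE)
for name, row in disks.items():
    print(f"{name}: {row['avail_mib']:.1f} MiB free, {row['capacity_pct']}% used")
print(nearly_full(disks))
```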
2.5.5 CF Card Detailed Information
Use the show disk command with the internal detail or external detail keywords to collect additional information on the internal or external compact flash (CF) cards.
With the internal keyword, the command produces information about internal CF cards:
- Brand
- Status
- Soft/hard errors
- Current usage
For more information on checking CF cards or external storage devices, see General Troubleshooting Guide.
2.6 Collect Hardware Diagnostic Information
- Note:
- Most of the commands in this section are included in the show tech-support command.
Efficient hardware diagnosis can greatly speed up troubleshooting. This section describes commands for hardware diagnosis and provides hardware-related information.
2.6.1 Display POD History Logs
Use the show diag pod command to display the Power-On-Diagnostic (POD) history logs. The command reports the PASS/FAIL status of each module, according to the result of the POST procedure, as in the following example:
[local]SmartEdge#show diag pod
Slot   Type              POD status(Enabled)
-------------------------------------------
N/A    backplane
N/A    fan tray
1      ge-4-port         PASS
3      ether-12-port     PASS
5      ge3-4-port        PASS
7      xcrp              PASS
8      xcrp              PASS
11     oc3-8-port        FAIL
3      atm-oc3-2-port    PASS
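On a fully populated chassis, a failed module is easy to miss in the POD listing. The following Python sketch extracts the slots that report FAIL from rows like those in the example. The pattern and the function name failed_slots are assumptions made for this illustration.

```python
import re

# Rows copied from the example output above.
SAMPLE = """\
N/A    backplane
N/A    fan tray
1      ge-4-port         PASS
3      ether-12-port     PASS
11     oc3-8-port        FAIL
"""

# Slot number, card type, and POD result; rows without a result do not match.
ROW = re.compile(r"^(\d+)\s+(\S+)\s+(PASS|FAIL)$")


def failed_slots(text):
    """Return [(slot, card_type)] for every row whose POD status is FAIL."""
    failures = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(3) == "FAIL":
            failures.append((int(match.group(1)), match.group(2)))
    return failures


print(failed_slots(SAMPLE))
```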
For more information on using the show diag pod command, see Command List.
2.6.2 Display ODD Logs
Use the show diag on-demand command to display the on-demand diagnostic (ODD) log of every module.
- Note:
- If this is the first time ODD has been performed, there will be no ODD report for the current show command.
For more details about the SmartEdge ODD technique, see Section 2.7.
2.6.3 Access Boot Logs From a Console Session
Occasionally, an XCRP controller card fails to boot up to the normal running level, so it is unreachable from a remote location. To determine whether the issue is temporary or a permanent hardware failure, check the boot logs from the console session. To display this information, attach a console cable to the XCRP controller card before it boots.
2.7 SmartEdge ODD
SmartEdge ODD enables hardware diagnosis at your site. This reduces hardware troubleshooting time and also reduces the need to understand complex internal hardware architecture.
There are four levels of ODD diagnosis; see Table 4.
Level | Device | Tests |
---|---|---|
1 | All | Duplicates the POD tests; complete in 5 to 10 seconds. |
2 | Standby controller and traffic cards in SmartEdge routers, controller carrier card, I/O carrier card, and MICs in SmartEdge 100 routers | Includes level 1 tests: Tests all on-board active units in the line interface module (LIM) of the board, including memory, registers, PPA Dual Inline Memory Modules (DIMMs) and static RAM (SRAM), PPA and other on-board processors; complete in 5 to 10 minutes. |
3 | Traffic cards, I/O carrier card, MICs, and the standby controller card only if it is an XCRP4 controller card | Includes level 2 tests: Tests and verifies the card data paths for the entire card with internal loopbacks; complete in 10 to 15 minutes. |
4 | Traffic cards, I/O carrier card, MICs, and the standby controller card only if it is an XCRP4 controller card | Includes level 3 tests: Tests the entire card using external loopbacks; must be run on site with external loopback cables installed. Note: To run external loopback tests on the Fast Ethernet-Gigabit Ethernet traffic card, install external loopback plugs on the FE and GE ports. Alternatively, connect the GE ports back to back. |
2.7.1 Display Line Card Diagnostic Information (ODD)
The following example shows how to perform ODD on a line card in slot 2 (using ODD level 3):
[local]SmartEdge(config)#card ether-12-port 2
[local]SmartEdge(config-card)#shutdown
[local]SmartEdge(config-card)#on-demand-diagnostic
[local]SmartEdge(config-card)#exit
[local]SmartEdge(config)#exit
[local]SmartEdge#diag on-demand card 2 level 3 loop 2
Wait until the ODD diagnosis is complete, and then issue the following command:
[local]SmartEdge#show diag on-demand card 2 detail
If you see anything that appears abnormal in the output (such as FAIL, SKIPPED, or ERROR status), contact the support organization and provide the ODD results.
2.7.2 Display XCRP Controller Cards Diagnostic Information (ODD)
The following example shows how to perform ODD on the standby controller card:
[local]SmartEdge#diag on-demand 2
[local]SmartEdge#show diag on-demand standby detail
If you see anything that appears abnormal in the output (such as FAIL, SKIPPED, or ERROR status), contact the support organization with the ODD results.
2.8 System Status and Logs Information
Use the show system nvlog command to check the content of nonvolatile RAM (NVRAM) on the current active controller card.
If a system crash occurred and core-dump files were generated, display them using the show crashfiles command.
You can collect more information on crashes by using the show process crash-info command. However, this command does not record any information about manual process restarts, and it does not retain any information across a system reboot.
2.8.1 Show Logs of Trap and Panic-Related Messages
The nonvolatile RAM (NVRAM) usually stores logs of trap and panic-related messages from the OS. Use the show system nvlog command to debug system crashes in the absence of a local console. It provides:
- Boot history with version information
- Boot reasons
2.8.2 Display Core-Dump Files
Use the show crashfiles command to display core-dump files generated when a system module encounters serious issues or when a manual core-dump is taken for diagnosis purposes.
It produces output similar to the following:
[local]SmartEdge#show crashfiles
15717 Jun 12 01:17 /md/exec_cli_290.mini.core
15783 Jun 12 01:28 /md/exec_cli_298.mini.core
807252 Jul 18 12:45 /md/vxnzram.dat
9437184 Jul 18 12:45 /md/vxcore.gz
10182196 Aug 15 14:26 /md/crashSlot01IppaDram.gz
590894 Sep 16 02:41 /md/crashSlot01Eppa.gz
2.8.3 Display Process Crash Information
Use the show process crash-info command to collect information about process crashes, including when they occurred, which process crashed, and why. The UNIX signal code indicates what might have caused a crash internally. Crash files produce useful information from which the real cause may be traced, similar to the following:
[local]SmartEdge#show process crash-info
ME      TIME                        STATUS
ospf    Mon Jan 27 14:05:43 2001    Kill (9)
ism     Mon Jan 27 14:28:26 2001    Kill (9)
ism     Mon Jan 27 14:28:50 2001    Kill (9)
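A process that crashes repeatedly is usually the first one to investigate. The following Python sketch counts the crash records per process from rows like those in the example. The function name crash_counts and the field-count test used to skip the header are assumptions made for this illustration.

```python
from collections import Counter

# Rows copied from the example output above.
SAMPLE = """\
ME      TIME                        STATUS
ospf    Mon Jan 27 14:05:43 2001    Kill (9)
ism     Mon Jan 27 14:28:26 2001    Kill (9)
ism     Mon Jan 27 14:28:50 2001    Kill (9)
"""


def crash_counts(text):
    """Count crash records per process name.

    A data row has the process name, a five-field timestamp, and a status,
    so rows with fewer than seven fields (such as the header) are ignored.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 7:
            counts[fields[0]] += 1
    return counts


for process, count in crash_counts(SAMPLE).most_common():
    print(f"{process}: {count} crash record(s)")
```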
2.9 System Core-Dump Information
In most UNIX-like systems, if a fatal OS error occurs, the system prints to the console a message that describes the error, and then generates a crash report (called a core dump) and sends it to a predetermined dump device.
You can initiate a manual core dump by forcing a crash on any SmartEdge process or card.
Core dumps are an important component of the troubleshooting information used by support to determine the cause of a failure.
A core-dump file is a snapshot of the RAM that was allocated to a process when the crash occurred. The copy is written to a more permanent medium, such as a hard disk. A core-dump file is also a disk copy of the address space of a process, when the crash occurred, that provides information such as the task name, task owner, priority, and instruction queue that were active at the time the core file was created.
On SmartEdge routers, core-dump files are placed in the /md directory in the /flash partition (a directory under root FS mounted on the internal CF card), or in the /md directory on a mass-storage device (the external CF card located in the front of XCRP), if it is installed in the system.
Depending on the active component of the system at the time of a crash, there are various types of core-dump files; the most common are listed in Table 5:
Component | File |
---|---|
PPA | crashSlotSSEppaDram.gz, crashSlotSSEppa.gz |
SmartEdge OS process | aaad_NN.core, aaad_NN.mini.core |
BSD | netbsd.0.core.gz, netbsd.0.gz |
VxWorks | vxcore.gz, vxnzram.dat |
In Table 5, the notation is as follows:
- SS is the slot number; for example, 01 indicates that the crash occurred on the module in slot 1.
- NN is a process ID; when the same process crashes multiple times, this number uniquely identifies the current process version.
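The naming conventions in Table 5 can be used to sort a list of crash files by the component that produced them. The following Python sketch is one way to do that. The regular expressions are assumptions derived from the file names in Table 5 and in the show crashfiles example: the PPA pattern accepts both Eppa and Ippa because the example lists crashSlot01IppaDram.gz, and the BSD pattern accepts any digit where Table 5 shows 0.

```python
import re

# Patterns inferred from Table 5 and the show crashfiles example (assumptions).
PATTERNS = [
    ("PPA", re.compile(r"^crashSlot(\d{2})[EI]ppa(Dram)?\.gz$")),
    ("SmartEdge OS process", re.compile(r"^(\w+)_(\d+)(\.mini)?\.core$")),
    ("BSD", re.compile(r"^netbsd\.\d+(\.core)?\.gz$")),
    ("VxWorks", re.compile(r"^(vxcore\.gz|vxnzram\.dat)$")),
]


def classify(path):
    """Return the component name for a core-dump file path, or 'unknown'."""
    name = path.rsplit("/", 1)[-1]  # drop the directory part, for example /md/
    for component, pattern in PATTERNS:
        if pattern.match(name):
            return component
    return "unknown"


for path in ("/md/exec_cli_290.mini.core", "/md/vxcore.gz", "/md/crashSlot01Eppa.gz"):
    print(path, "->", classify(path))
```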
2.9.1 Check Existing Core Dump Files
To display information about existing core-dump files, use the show crashfiles command.
- Note:
- This command does not display information about crash files that have been transferred to a bulkstats receiver, which is a remote file server.
2.9.2 Enable Automatic Upload for Core Dumps
Because core-dump files are large, we recommend configuring the SmartEdge OS to send them to a preconfigured external FTP server automatically, thus minimizing the storage used on the local CF card. Enable this feature by entering the service upload-coredump ftp:url command in global configuration mode, as in the following example:
[local]SmartEdge(config)#service upload-coredump ftp://root:admin@192.168.1.3/ftp-root
where 192.168.1.3 is a remote FTP server with write privileges configured. The FTP URL has the following format:
//username[:passwd]@{ip-addr | hostname}[:port][//directory]
Use initial double slashes (//) if the pathname to the directory on the remote server is an absolute pathname; use a single slash (/) if it is a relative pathname (for example, under the hierarchy of the username account home directory).
You can add the optional context ctx-name construct to name a SmartEdge OS context for reachability.
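When the same upload target has to be configured on many routers, the command line can be generated from its parts. The following Python sketch assembles the command according to the URL format described above. The function name upload_coredump_command and its parameters are hypothetical, and the optional context construct is not handled.

```python
def upload_coredump_command(user, host, directory, password=None, port=None):
    """Build a 'service upload-coredump' line (hypothetical helper).

    The directory is always appended after a single slash. An absolute
    directory already starts with '/', which yields the double slash the
    format requires; a relative directory yields a single slash.
    """
    auth = user if password is None else f"{user}:{password}"
    target = host if port is None else f"{host}:{port}"
    return f"service upload-coredump ftp://{auth}@{target}/{directory}"


# Relative path under the account home directory, as in the example above.
print(upload_coredump_command("root", "192.168.1.3", "ftp-root", password="admin"))
# Absolute path on the server; the directory name is only an illustration.
print(upload_coredump_command("root", "192.168.1.3", "/var/cores", password="admin"))
```

Note that the password appears in clear text in the generated line, as it does in the example command.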
2.9.3 Force a Manual Core Dump
Sometimes when a process is suspected to be in an abnormal state, the support organization may ask you to produce core-dump files proactively and send them for further analysis. This section discusses how to produce manual core-dump files from the SmartEdge OS process.
In addition to displaying the core-dump files that system components have already produced (by using the show crashfiles command), you can produce a manual core dump.
To force a core dump for a process without restarting the process, enter the following command:
[local]SmartEdge#process coredump process name
2.10 XCRP Redundancy Problems
For a number of reasons, the standby XCRP sometimes cannot synchronize with the primary one. Use the show redundancy, show redundancy detail, and show system redundancy commands to collect information about current redundancy status.
2.10.1 Display Redundancy Information
Use the show redundancy command to collect:
- Current XCRP synchronization status
- XCRP switchover history and reason
2.10.2 Show System Redundancy Information
Use the show system redundancy command to aggregate the output of a number of other commands and provide additional information, such as:
- Active controller alarms details provided by the show system alarm command
- Output of show redundancy command
- XCRP controller card details included in the output of the show hardware detail command
- Controller protection internal logs
- Controller error logs
2.11 System Logs
System logs usually contain comprehensive information about a variety of system components. Use the show log command to check system logs.
It is a good practice to check what commands were used and when they were entered. By comparing the timestamps of commands that were entered to timestamps of events or symptoms that occurred, you may find clues to causes of the problem. You can use two methods to view command history:
- Run the show history command to display all commands entered since the last system reboot.
- In shell mode in the /var/log directory, a file called cli_commands records all commands that have ever been entered, even if they have been refused because of syntax errors. The information in this file is retained after a system reboot.
2.11.1 Display System Logs
Use the show log command to display information about system event logs or a previously saved log file.
2.11.1.1 Commonly Used Keywords and Arguments
The tdm-console, since, until, and level keywords are only available after specifying the active keyword or the file filename construct.
The show log active tdm-console command displays field-programmable gate array (FPGA) mismatched log messages. For example, if you power on the system or perform a system or traffic card reload, and that card's FPGA file revision and FPGA chip revision do not match, an FPGA mismatch log message is generated.
The show log active all command prints all current active logs in the buffers. When the buffer is used up, the log is wrapped out of the buffer and written into a series of archive files named messages.x.gz. These files can be found in the /var/log directory through NetBSD shell mode, as shown in the following example:
[local]SmartEdge#start shell
# cd /var/log
# ls -l
total 56
-rwxr-xr-x 1 11244 44 0 Jun 3 21:37 authlog
-rw-r--r-- 1 root 44 12210 Aug 12 18:46 cli_commands
-rw-r--r-- 1 root 44 415 Aug 12 18:47 commands
-rwxr-xr-x 1 11244 10000 1178 Sep 6 17:58 messages
To view the file messages, enter the less messages command:
# less messages
Sep 6 07:43:36 127.0.2.6 Sep 6 07:39:52.327: %LOG-6-SEC_STANDBY: Sep 6 07:39:52.214: %SYSLOG-6-INFO: ftpd[83]: Data traffic: 0 bytes in 0 files
Sep 6 07:44:51 127.0.2.6 Sep 6 07:39:52.328: %LOG-6-SEC_STANDBY: Sep 6 07:39:52.326: %SYSLOG-6-INFO: ftpd[83]: Total traffic: 1047 bytes in 1 transfer
To view the messages.x.gz files, use the 'gzip -cd messages.x.gz [|less]' command:
# gzip -cd messages.0.gz | less
Sep 6 00:03:21 127.0.2.6 Sep 5 23:53:46.600: %LOG-6-SEC_STANDBY: Sep 5 23:53:46.600: %CSM-6-CARD: slot 12, ALARM_MAJOR: Circuit pack backplane failure
Sep 6 00:04:36 127.0.2.6 Sep 5 23:53:46.601: %LOG-6-SEC_STANDBY: Sep 5 23:53:46.601: %CSM-6-CARD: slot 14, ALARM_MAJOR: Circuit pack backplane failure
Sep 6 00:05:51 127.0.2.6 Sep 5 23:53:46.603: %LOG-6-SEC_STANDBY: Sep 5 23:53:46.602: %CSM-6-CARD: slot 1, ALARM_MAJOR: Circuit pack backplane failure
Sep 6 00:07:06 127.0.2.6 Sep 5 23:53:46.604: %LOG-6-SEC_STANDBY: Sep 5 23:53:46.603: %CSM-6-CARD: slot 2, ALARM_MAJOR: Circuit pack backplane failure
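After the archived log files have been copied off the router, they can also be searched offline. The following Python sketch reads a gzip-compressed archive with the standard gzip module, which is comparable to running gzip -cd, and returns only the lines that contain ALARM_MAJOR. The function name major_alarms is an assumption made for this illustration, and the script is intended for a workstation, not for the router shell.

```python
import gzip


def major_alarms(path):
    """Yield the lines of a gzip-compressed log archive that contain ALARM_MAJOR."""
    # Text mode decompresses on the fly; undecodable bytes are replaced, not fatal.
    with gzip.open(path, "rt", errors="replace") as archive:
        for line in archive:
            if "ALARM_MAJOR" in line:
                yield line.rstrip("\n")
```

For example, list(major_alarms("messages.0.gz")) returns the matching lines from a local copy of the first archive file.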
2.11.1.2 Preserving Logs Across System Reload
When you enter the reload command from the CLI, or the reboot command from the boot ROM, the system copies its log and debug buffers into the following files:
/md/loggd_dlog.bin
/md/loggd_ddbg.bin
As an aid to debugging, you can display these files using the show log command:
show log file /md/loggd_dlog.bin
show log file /md/loggd_ddbg.bin
2.11.1.3 Set Logs to Second and Millisecond Timestamps
By default, the timestamps in all logs (and debug output) are accurate to the second. You can configure accuracy to the millisecond by entering the following commands:
[local]SmartEdge#configure
[local]SmartEdge(config)#logging timestamp millisecond
[local]SmartEdge(config)#commit
2.11.2 Access Logs From a Console
When the controller cards cannot boot up, you cannot access them remotely. To determine the cause, collect the information shown in the console while the controller card is booting:
- Connect the console cable between your PC and the XCRP controller card console port.
- Set connection parameters such as baud rate, data bits, parity, and stop bits correctly (for XCRP4, typically 9600, 8, N, 1).
- Enable the capture function in your terminal emulation software.
- Start the controller card by powering on the power supply or inserting the card back into the slot.
The boot output is collected in a predefined capture file.
2.11.3 Display Historical Operation Record
To show a list of commands entered during the current session, enter one of the following commands:
[local]SmartEdge#show history global
Dec 31 21:33:25 show process rcm
Dec 31 21:33:28 show process rcm
Dec 31 21:34:16 reload
# less /var/log/cli_commands
Dec 31 21:33:25 show process rcm
Dec 31 21:33:28 show process rcm
Dec 31 21:34:16 reload
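Comparing command timestamps with event timestamps, as recommended in Section 2.11, can be scripted once the history has been captured. The following Python sketch parses history lines like those in the example and lists the commands entered within a time window before an event. The function names, the 60-second window, the event time, and the year are hypothetical; the history does not record a year, so it has to be supplied, and month names are parsed according to the current locale (English is assumed).

```python
from datetime import datetime, timedelta

# Lines copied from the example output above.
HISTORY = """\
Dec 31 21:33:25 show process rcm
Dec 31 21:33:28 show process rcm
Dec 31 21:34:16 reload
"""


def parse_history(text, year):
    """Return [(timestamp, command)]; the year is supplied by the caller."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 3)  # month, day, time, command
        if len(parts) < 4:
            continue
        month, day, clock, command = parts
        stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
        entries.append((stamp, command))
    return entries


def commands_before(entries, event, window):
    """Return the commands entered no more than `window` before `event`."""
    return [cmd for stamp, cmd in entries if event - window <= stamp <= event]


entries = parse_history(HISTORY, year=2009)  # the year is an assumption
event = datetime(2009, 12, 31, 21, 34, 30)   # hypothetical time of a symptom
print(commands_before(entries, event, timedelta(seconds=60)))
```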
2.12 SmartEdge Debugging
The SmartEdge system provides a comprehensive collection of debug commands to assist troubleshooting.
Because the SmartEdge OS supports multiple contexts, debugging can be a context-specific task or a context-independent (global) task.
To enable collaboration on serious and complex issues, the debug function is separate for each administrator logged on to the same router.
- Note:
- If you forget to turn off the debug function, terminating a Telnet or SSH session automatically turns it off.
For more information on debugging, see the General Troubleshooting Guide and the specific debug commands in Command List.
2.12.1 Global Debugging
To debug all contexts on your SmartEdge router, enter the commands in the system-wide local context. If you enter the debug aaa authentication command from the local context, the output contains information about all contexts.
2.12.2 Context-Specific Debugging
To debug functions within specific contexts, enter the debug commands from the context you are investigating.
For example, to display the debug ospf packet output in the terminal monitor from the SmartEdge context ABC:
- Enter the ABC context and direct the debug messages to the current session:
[local]SmartEdge#context ABC
[ABC]SmartEdge#terminal monitor
- Activate the debugging for the Open Shortest Path First (OSPF) module:
[ABC]SmartEdge#debug ospf packet
Collect the screen output for the enabled debug function.
- Deactivate the enabled debug function:
[ABC]SmartEdge#no debug ospf all
Or alternatively, turn off all debug functions on all modules:
[ABC]SmartEdge#no debug all
- Disable the terminal monitor function when debugging is complete:
[ABC]SmartEdge#no terminal monitor
2.12.3 Separated Administrator-Specific Debugging
To enable multiple administrators to debug different functions or contexts at the same time, debugging in the SmartEdge OS is administrator-specific; two or more administrators can log on and perform different debug activities without interfering with each other. For example, one administrator can debug Open Shortest Path First (OSPF) in a context while another debugs Border Gateway Protocol (BGP) in the same context. Each person sees only the output triggered by the debug functions that they enable. Administrator-specific debugging enhances troubleshooting efficiency and collaboration.
2.12.4 Rule Breakers
There are two exceptions to the rule that debugging is global in the local context and context specific in other contexts:
2.12.4.1 Global Debug Context in the Local Context
If you enter debug commands in the local context, some will produce debug output from the same instances in all contexts.
For example, if you enter the debug ospf lsdb command in the local context, it produces output similar to the following:
Apr 18 12:21:05: [0002]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 Update Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72
Apr 18 12:21:05: [0003]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.2 Update Router LSA 200.1.2.1/200.1.2.1/80000009 cksum ce79 len 36
Apr 18 12:21:05: [0004]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.3 Update Sum-Net LSA 0.0.0.0/200.1.3.1/80000001 cksum bb74 len 28
The numbers in square brackets are internal context ID numbers; to display the context names, enter the show context all command:
[local]SmartEdge#show context all
Context Name    Context ID    VPN-RD    Description
------------------------------------------------------
local           0x40080001
Rb-1            0x40080002
Rb-2            0x40080003
Rb-3            0x40080004
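When a saved debug log is analyzed offline, the bracketed numbers can be translated to context names with a short script. The following Python sketch is illustrative only. It assumes that the bracketed number equals the low-order 16 bits of the context ID shown by show context all, which is how the sample values above line up ([0002] and 0x40080002); confirm this on your system. The function names are hypothetical.

```python
import re

# Sample text copied from the examples above; real output may differ in spacing.
SHOW_CONTEXT_ALL = """\
Context Name    Context ID    VPN-RD    Description
------------------------------------------------------
local           0x40080001
Rb-1            0x40080002
Rb-2            0x40080003
Rb-3            0x40080004
"""

DEBUG_LINE = ("Apr 18 12:21:05: [0002]: %OSPF-7-LSDB: OSPF-1: Area 0.0.0.0 "
              "Update Router LSA 200.1.1.1/200.1.1.1/80000013 cksum 26f1 len 72")


def context_names_by_short_id(show_context_output):
    """Map the low 16 bits of each context ID (as 4 hex digits) to its name.

    ASSUMPTION: the bracketed number in a debug line equals the low-order
    16 bits of the context ID shown by 'show context all'.
    """
    names = {}
    for line in show_context_output.splitlines():
        # Only lines of the form '<name> 0x<hex id>' match; headers are skipped.
        match = re.match(r"\s*(\S+)\s+0x([0-9a-fA-F]+)\b", line)
        if match:
            context_id = int(match.group(2), 16)
            names[f"{context_id & 0xFFFF:04x}"] = match.group(1)
    return names


def context_for_debug_line(line, names):
    """Return the context name for one debug line, or None if it is unknown."""
    match = re.search(r"\[([0-9a-fA-F]{4})\]:", line)
    return names.get(match.group(1).lower()) if match else None


names = context_names_by_short_id(SHOW_CONTEXT_ALL)
print(context_for_debug_line(DEBUG_LINE, names))  # prints "Rb-1"
```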
2.12.4.2 Global Debug Functions
Some debug functions are global everywhere they are entered, so the context where you enter the command is not important. If you enter the debug aaa authentication command in any context, the output contains information about all contexts. If you enter the debug pppoe exception command in every configured context, the output is the same for every configured context:
Jan 1 10:48:44: [1/8:1023:63/3/2/23445]: %PPPOE-7-PKT_E: queueing up packet
Jan 1 10:48:44: [1/6:1023:63/6/2/29542]: %PPPOE-7-ISM_E: Received event: CCT state sub_event: SUB down cplt encap: ether-dot1q-tunnel-pppoe-ppp for unknown circuit
Jan 1 10:48:44: [1/8:1023:63/6/2/29576]: %PPPOE-7-DISC_E: [6082] Received PADT: already terminating
Jan 1 10:48:44: [1/8:1023:63/6/2/29576]: %PPPOE-7-ISM_E: Received event: CCT state sub_event: SUB down cplt encap: ether-dot1q-tunnel-pppoe-ppp for unknown circuit
2.13 Collect Information About Packet Drops
Packet drops can occur on every type of transport media. While a small number of packet drops are usually considered normal, a large number of packet drops that continues to increase often indicates a problem that should be investigated further.
The SmartEdge OS provides several levels of packet statistics; this document describes methods to collect port- and circuit-level packet statistics. For detailed descriptions of the counters, see the commands in Command List.
2.13.1 Port and Circuit Level Statistics
You can monitor port-level packet statistics by using the show port counters S/P [ detail | live | queue ] command.
Display circuit-level packet statistics by entering the show circuit counters [ detail | live | queue ] command.
For descriptions of the output for the show port counters detail or the show circuit counters detail commands, see the commands in Command List.
3 Send Data to the Support Organization
Save the collected data in a zip file with a unique filename, and send it to the support organization (using FTP or e-mail) according to the instructions from your support representative.
Although you should check with support for the preferred delivery method, in many cases, you can send your collected data by FTP to:
- IP address: 155.53.3.26
- Directory: /incoming
- Login: anonymous
- Password: anonymous
Inform your support representative that the files have been uploaded, and provide the filenames.
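If you package and upload data regularly, the steps above can be scripted. The following Python sketch uses only the standard library. The site-name-plus-timestamp naming scheme and the function names are assumptions made for illustration; the FTP address, directory, and login are the ones listed above and should be confirmed with your support representative before use.

```python
import time
import zipfile
from ftplib import FTP
from pathlib import Path


def bundle_logs(files, site_name, out_dir="."):
    """Zip the collected files under a site-specific, timestamped name.

    The naming scheme (site name plus timestamp) is a hypothetical way to
    keep filenames unique; use whatever convention support asks for.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(out_dir) / f"{site_name}-{stamp}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in map(Path, files):
            # Store each file under its base name, without directories.
            bundle.write(path, arcname=path.name)
    return archive


def upload(archive, host="155.53.3.26", directory="/incoming"):
    """Upload the archive by anonymous FTP to the address listed above."""
    with FTP(host) as ftp:
        ftp.login("anonymous", "anonymous")
        ftp.cwd(directory)
        with open(archive, "rb") as handle:
            ftp.storbinary(f"STOR {Path(archive).name}", handle)


# Example (not run here): the file names are placeholders.
# archive = bundle_logs(["tech-support.log", "debug-ospf.log"], "site-name")
# upload(archive)
```

The script prints nothing; report the archive name returned by bundle_logs to your support representative.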
4 Optional: Collect Information for Specific Problems
To troubleshoot specific problems, you may need to collect additional data related to your problems. For example, if PPPoE subscribers are unable to connect to the SmartEdge router, or an OSPF neighbor is continually flapping, system-wide data may not be enough to resolve the issue. You will need to collect information focused on the specific area. Use the debug ? command to list the possible debug commands. For complete syntax and usage guidelines for specific commands, see Command List.
For more information about troubleshooting, see the General Troubleshooting Guide.
4.1 Software Problems
When the symptoms of an issue indicate that there may be a problem with a SmartEdge module, many debug commands can provide information that you can send to Customer Support for analysis.
4.1.1 NTP
If you have encountered error logs for NTP, use the following methods to collect further information:
Use the show ntp associations detail command to display the NTP synchronization status:
[local]SmartEdge#show ntp associations detail
remote 155.53.12.12, local 10.192.17.246
hmode client, pmode unspec, stratum 4, precision -18
leap 00, refid [130.100.199.242], rootdistance 0.20607, rootdispersion 0.09840
ppoll 6, hpoll 6, keyid 0, version 3, association 1508
valid 7, reach 377, unreach 0, flash 0x0000, boffset 0.00000, ttl/mode 0
timer 0s, flags system_peer, config, bclient
reference time: cf15e293.0af6a289 Thu, Feb 4 2010 16:19:31.042
originate timestamp: cf15e8eb.30cf72d9 Thu, Feb 4 2010 16:46:35.190
receive timestamp: cf15e8ea.2e3065f9 Thu, Feb 4 2010 16:46:34.180
transmit timestamp: cf15e8e9.c66f2e8c Thu, Feb 4 2010 16:46:33.775
filter delay: 0.40527 0.33043 0.10486 0.11139 0.17230 0.12062 0.16760 0.15277
filter offset: 1.212877 1.044107 1.050654 0.985800 0.714899 0.650443 0.531281 0.360664
filter order: 2 3 5 7 6 4 1 0
offset 1.050654, delay 0.10486, error bound 0.05133, filter error 0.36215
context id: 0x40080001
Use the show ntp status command to display the NTPD version and the synchronized peer, precision, and reference time:
[local]SmartEdge#show ntp status
Ntpd version 4.0.98f
system peer: 192.168.0.151
system peer mode: client
leap indicator: 00
stratum: 5
precision: -18
root distance: 0.00212 s
root dispersion: 0.04713 s
reference ID: [192.168.0.151]
reference time: cc787efc.454f7a91 Mon, Sep 15 2008 14:28:12.270
system flags: bclient monitor ntp kernel stats kernel_sync
jitter: 0.019791 s
stability: 0.599 ppm
broadcastdelay: 0.003998 s
authdelay: 0.000000 s
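When reviewing saved show ntp associations detail output, the reach field is a quick indicator of recent connectivity to the NTP peer. In standard NTP, reach is an 8-bit shift register printed in octal, so 377 means that the last eight polls all received a reply. The following Python sketch decodes that field. It assumes that the SmartEdge output follows the standard NTP convention, and the function name is hypothetical.

```python
import re

# One line copied from the sample 'show ntp associations detail' output above.
SAMPLE = "valid 7, reach 377, unreach 0, flash 0x0000, boffset 0.00000, ttl/mode 0"


def reach_history(assoc_output):
    """Return the last eight poll results, newest first (True = reply received).

    ASSUMPTION: 'reach' is the standard NTP reachability register, an 8-bit
    shift register printed in octal whose lowest bit is the most recent poll.
    """
    # \b keeps the pattern from matching the 'unreach' field.
    match = re.search(r"\breach\s+([0-7]{1,3})\b", assoc_output)
    if match is None:
        raise ValueError("no 'reach' field found")
    register = int(match.group(1), 8)
    return [bool((register >> bit) & 1) for bit in range(8)]


history = reach_history(SAMPLE)
print(f"{sum(history)} of the last 8 polls were answered")
```

For the sample output, all eight entries are True. A value such as 376 would mean that only the most recent poll went unanswered.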
4.1.2 MPLS
If you have attempted to troubleshoot MPLS using Troubleshooting MPLS and the problem persists, you can collect debug information for further analysis by technical support representatives using the following debug commands:
- debug ip routing all
- debug ip routing message
- debug ldp all
- debug ldp filter {interface if-name | neighbor ip-addr | prefix ip-addr/prefix-length [exact-match]} —Use the interface, neighbor, or prefix keywords to enable LDP debugging filter on an interface, for neighbors, or for prefixes.
- debug ldp message {msg-type [detail] | {dump | receive | send}}
- debug lm download
- debug lm in-label
- debug lm lsp [ip-addr/prefix-length]
- debug lm msg [ip-addr/prefix-length]
- debug lm rib [ip-addr/prefix-length]
- debug rsvp
Run these debug commands for a short time, save the data, and send it to your technical support representatives.
4.1.3 SNMP
To collect information on SNMP functions, use the show snmp server command. It displays information such as the SNMP agent server status (listening port 161) and SNMP packet statistics, with output similar to the following:
[local]SmartEdge#show snmp server
snmp server is listening on port 161
authentication failure traps are enabled
0 packets received
0 bad versions
0 unknown community names
0 bad community uses
0 packets sent
0 get responses sent
0 traps sent
0 silent drops
0 ASN parse errors
0 not in time window
0 unknown user names
0 unknown engineIDs
0 wrong digests
0 decryption errors
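To spot problem counters quickly in saved show snmp server output, you can filter for nonzero error counters. The following Python sketch assumes that each counter appears on its own line in the form "number label". The set of labels treated as error indicators is this sketch's own classification, not an official one, and the sample values are made up for illustration.

```python
import re

# Labels treated as error indicators in this sketch (an assumption); the
# label text is copied from the sample output above.
ERROR_LABELS = {
    "bad versions", "unknown community names", "bad community uses",
    "silent drops", "ASN parse errors", "not in time window",
    "unknown user names", "unknown engineIDs", "wrong digests",
    "decryption errors",
}


def parse_counters(output):
    """Return {label: value} for every line of the form '<number> <label>'."""
    counters = {}
    for line in output.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def nonzero_errors(counters):
    """Keep only the error counters whose value is greater than zero."""
    return {label: value for label, value in counters.items()
            if label in ERROR_LABELS and value > 0}


# The values below are made up for illustration.
sample = """\
snmp server is listening on port 161
120 packets received
3 unknown community names
0 wrong digests
"""
print(nonzero_errors(parse_counters(sample)))  # prints {'unknown community names': 3}
```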
4.2 Hardware Problems
If you suspect that a problem is related to hardware, you can enter show and debug commands to provide evidence that may enable Customer Support to understand the root cause. For example, if symptoms point to a problem with an ASE card, you can focus data collection on that specific card.
4.2.1 ASE Cards
When you enter the show tech-support command, use the ase keyword to restrict output to that relevant to the ASE card; see Section 2.2.
Use the show version command to check the underlying software. If the SmartEdge OS, Open Firmware, and minikernel software are not the correct version, the SmartEdge OS may not recognize the ASE card. For the compatible software versions for recent releases, see Section 2.3.1.
Enter the show chassis command to collect information about the ASP status and FPGA software status for the card, as well as the readiness of the card components.
[local]SmartEdge#show chassis
Current platform is SE800s
(Flags: A-Active Crossconnect B-Standby Crossconnect C-SARC Ready
D-Default Traffic Card E-EPPA Ready G-Upgrading FPGA
H-Card Admin State SHUT I-IPPA Ready M-FPGA Upgrade Required
N-SONET EU Enabled O-Card Admin State ODD P-Coprocessor Ready
P1-ASP1 Ready P2-ASP2 Ready R-Traffic Card Ready
S-SPPA Ready U-Card PPAs/ASP UP W-Warm Reboot
X-XCRP mismatch)
Slot: Configured-type   Slot: Installed-type   Initialized   Flags
----------------------------------------------------------------
 1 : none                1 : none              No
 2 : none                2 : none              No
 3 : none                3 : oc48-1-port       No
 4 : ge-4-port           4 : ge-4-port         Yes           IEUDR
 5 : ase                 5 : ase               Yes           P1P2UR
 6 : none                6 : none              No
 7 : xcrp                7 : xcrp              Yes           A
 8 : xcrp                8 : xcrp              Yes           B
 9 : ge-4-port           9 : ge-4-port         Yes           IEUR
10 : ge-4-port          10 : ge-4-port         Yes           IEUR
11 : ase                11 : ase               Yes           P1P2UR
12 : none               12 : none              No
13 : none               13 : none              No
14 : none               14 : none              No
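The Flags column packs several status codes into one string, for example P1P2UR for an ASE card. The following Python sketch expands such a string using the legend printed at the top of the show chassis output. It assumes that the two-character codes P1 and P2 take precedence over the single-character code P, and the function name is hypothetical.

```python
import re

# Legend copied from the 'Flags:' header of the show chassis output above.
FLAG_MEANINGS = {
    "A": "Active Crossconnect", "B": "Standby Crossconnect", "C": "SARC Ready",
    "D": "Default Traffic Card", "E": "EPPA Ready", "G": "Upgrading FPGA",
    "H": "Card Admin State SHUT", "I": "IPPA Ready",
    "M": "FPGA Upgrade Required", "N": "SONET EU Enabled",
    "O": "Card Admin State ODD", "P": "Coprocessor Ready",
    "P1": "ASP1 Ready", "P2": "ASP2 Ready", "R": "Traffic Card Ready",
    "S": "SPPA Ready", "U": "Card PPAs/ASP UP", "W": "Warm Reboot",
    "X": "XCRP mismatch",
}


def decode_flags(flags):
    """Expand a Flags string such as 'P1P2UR' into a list of descriptions.

    ASSUMPTION: 'P1' and 'P2' are matched before the single letter 'P'.
    """
    tokens = re.findall(r"P[12]|[A-Z]", flags)
    return [FLAG_MEANINGS.get(token, f"unknown flag {token}") for token in tokens]


print(decode_flags("P1P2UR"))
# prints ['ASP1 Ready', 'ASP2 Ready', 'Card PPAs/ASP UP', 'Traffic Card Ready']
```

For an ASE card, the absence of P1 or P2 in the decoded list is a cue to look more closely at the corresponding ASP.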
You can also collect ASP logging, statistics, and system information using one of the following constructs of the show security asp command:
- For ASE card logs—show security asp slot/port logging
- For ASE card statistics—show security asp slot/asp-id statistics [packet slot | system]
- For ASE card system information—show security asp slot/port system
5 Access the SmartEdge System Components
To perform the tasks in this document, you may need to access the SmartEdge system components on the primary and secondary XCRP controller cards.
Each controller card runs both the NetBSD and VxWorks operating systems. Each OS is located on a dedicated compact flash (CF) card and runs on a dedicated processor in the SmartEdge 400 or 800 platform (PowerPC) or on a dedicated core of the multicore processor in the SmartEdge 1200 platform (MIPS).
NetBSD is the OS on which the SmartEdge OS runs. You may be asked by your support representative to access the NetBSD OS to perform such tasks as reloading NetBSD processes and generating core dumps of their memory at the time of a failure.
VxWorks is the OS that is responsible for most low-level processing, such as driving or monitoring traffic cards.
5.1 Access Primary and Secondary XCRP Controller Cards
Table 6 describes the XCRP controller card terms used in this document.
Term | Description |
---|---|
Primary controller card | XCRP Controller card installed in slot 7 on a SmartEdge 800 router and slot 6 on a SmartEdge 400 router |
Secondary controller card | XCRP Controller card installed in slot 8 on a SmartEdge 800 router and slot 5 on a SmartEdge 400 router |
Active controller card | Controller that is currently active or working |
Standby controller card | Controller that is currently in standby mode |
To enable access to controller cards from the CLI, the SmartEdge OS provides default addresses (IP addresses and ports) for each controller card; for the default slots, IP addresses, and ports for the SmartEdge 400 platform, see Table 7; for the SmartEdge 800 or 1200 platform, see Table 8.
SmartEdge 400 Slot | IP Address and Port | Destination |
---|---|---|
XCRP 5 | 127.0.2.6 23 | SmartEdge OS CLI |
XCRP 6 | 127.0.2.5 23 | SmartEdge OS CLI |
- Note:
- The XCRP that comes up first in slot 5 or slot 6 on a SmartEdge 400 chassis is the primary, active XCRP.
SmartEdge 800 and 1200 Slot | IP Address and Port | Destination |
---|---|---|
XCRP 7 | 127.0.2.5 23 | SmartEdge OS CLI |
XCRP 8 | 127.0.2.6 23 | SmartEdge OS CLI |
- Note:
- The XCRP that comes up first in slot 7 or slot 8 on a SmartEdge 800 or 1200 chassis is the primary, active XCRP.
- Note:
- Descriptions and output examples of most commands in this document are based on commands entered on the active controller card; unless noted, the commands also apply to the standby controller card.
5.2 Log On to the Standby XCRP Controller Card
To collect information or to perform recovery tasks on the standby controller card, log on to it from the active controller card and use the same commands that you would on the active one. The following example shows how to log on to the standby controller card from the active one, assuming that slot 8 contains the active controller card and slot 7 contains the standby one. The standby prompt indicates that you are now working on the standby controller.
[local]SmartEdge#show chassis | include xcrp
7 : xcrp 7 : xcrp Yes B
8 : xcrp 8 : xcrp Yes A
[local]SmartEdge#telnet 127.0.2.5
Trying 127.0.2.5...
Connected to 127.0.2.5
Escape character is '^]'
SmartEdge login:the same login name as with active XCRP
Password:the same password as with active XCRP
[local]standby#
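The choice of telnet address in this example follows from the Flags column (B marks the standby controller) and the default addresses in Table 8. The following Python sketch shows that lookup for a SmartEdge 800 or 1200 chassis. It is illustrative only: the function name is hypothetical, and the SmartEdge 400 addresses from Table 7 would need a separate mapping.

```python
import re

# Default addresses from Table 8 (SmartEdge 800 and 1200 platforms).
XCRP_ADDRESS = {7: "127.0.2.5", 8: "127.0.2.6"}

# Output of 'show chassis | include xcrp' from the example above.
SAMPLE = """\
7 : xcrp 7 : xcrp Yes B
8 : xcrp 8 : xcrp Yes A
"""


def standby_address(show_chassis_xcrp):
    """Return the default telnet address of the standby XCRP, or None.

    The standby card is the initialized xcrp row whose Flags column
    contains 'B' (Standby Crossconnect).
    """
    for line in show_chassis_xcrp.splitlines():
        match = re.match(r"\s*(\d+)\s*:\s*xcrp\b.*\bYes\s+(\S+)\s*$", line)
        if match and "B" in match.group(2):
            return XCRP_ADDRESS.get(int(match.group(1)))
    return None


print(standby_address(SAMPLE))  # prints 127.0.2.5
```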
5.3 Access NetBSD Shell Mode
To access the NetBSD OS level from the SmartEdge CLI, use the following command in exec mode:
[local]SmartEdge#start shell
#
The # prompt indicates you are now at the NetBSD OS level.
5.3.1 Access Open Firmware (OpenBoot) Mode
To access the Open Firmware mode (also known as the BootROM or OK mode) CLI through the console port on the front of each controller card:
- Enter the reload command (in exec mode) from the console port.
- Watch the reload progress messages carefully. When the following message appears, type se* within five seconds:
Auto-boot in 5 seconds - press se* to abort, ENTER to boot:
- If you typed se* within 5 seconds, the OpenBoot ok prompt appears. The system normally sets the autoboot time limit to 5 seconds; however, during some operations, such as a release upgrade, the system sets the time limit to 1 second to speed up the process, then returns it to 5 seconds when the system reboots. (If you miss the time limit, the reload continues; start again with Step 1.)
5.4 Enable Logging
Enabling logging is useful for performing offline analysis and providing information for further escalation. Most logon software now supports automatic logging.
The following procedure shows how to configure SecureCRT software to enable automatic logging. (Different versions of SecureCRT may require different steps.) You can also use PuTTY or other terminal emulation software.
To configure login information and enable automatic logging in SecureCRT software:
- Start the SecureCRT software and click File > Connect.
- Click new session and then Connection.
- In the Connection screen Name field, type a meaningful name. We recommend that you use the site name concatenated with the node IP address, such as site-name-61.130.33.6.
In the Protocol field, type the logon method; for example, Telnet.
- Click Logon Scripts and specify the logon sequences.
- Some nodes are protected by one or two intermediate jumphosts for enhanced security. Click Automate logon and enter the parameters required during the logon procedure.
- Note:
- Logon automation may take several attempts to configure properly, and may not be possible if user names and passwords change frequently.
- Click the Telnet tab.
In the Hostname field (or, if the login protocol is Secure Shell (SSH), in the SSH1 or SSH2 fields), type the IP address for the direct connection. This can be the node IP, or the springboard IP if one is used.
- For minimum setup, leave the other options at their default settings and click Log File.
- To predefine a log file for this node, in the Log File Name field, type a meaningful name. For flexibility and ease of log file maintenance, especially when you have hundreds of nodes set up, you should use the same name as in the Name field on the Connection screen. Each log file is then uniquely associated with a node by its name.
We recommend selecting Prompt for filename, Start log upon connection, and Append to file.
6 References
For information about SmartEdge hardware and software, see related documentation on https://ebusiness.ericsson.net.
Glossary
ASE | Advanced Services Engine |
DIMMs | Dual Inline Memory Modules |
FPGA | Field Programmable Gate Array |
FTP | file transfer protocol |
GBICs | gigabit interface converters |
IS-IS | Intermediate System-to-Intermediate System |
LIM | line interface module |
MIPS | Microprocessor without Interlocked Pipelined Stages |
NTP | Network Time Protocol |
NTPD | NTP daemon |
NVRAM | nonvolatile RAM |
ODD | On-Demand-Diagnostics |
OFW | Open Firmware |
OpenBoot | Open Firmware |
OSPF | Open Shortest Path First |
POD | Power-On-Diagnostic |
PPA | Packet-Processing ASIC |
PPA2 | Packet Processing ASIC, version 2 |
SFPs | small form-factor pluggables |
SNMP | Simple Network Management Protocol |
SRAM | static RAM |
SSH | Secure Shell |
XCRP | Cross-Connect Route Processor |
XFP | 10 Gbps SFP |