Troubleshooting Guide: When Your 100G Transceivers Won’t Link
100G transceivers are now widespread and essential for maintaining high-capacity links. However, their complexity means that problems such as link failures, signal degradation, or hardware incompatibility can be challenging to troubleshoot. This article provides a structured approach to diagnosing, testing, and resolving common 100G transceiver problems.
Common causes of link failures
Overall, link failures can be separated into five main groups:
- Fiber connectivity issues
- Power budget problems
- FEC mismatch between endpoints
- Auto-negotiation failures
- Wavelength mismatches in WDM systems
Let’s start with the easy one: if the 100G transceivers you plan to use have been lying on your desk for a while without a dust plug, they might not link because of dirty connectors. Another fiber connectivity issue we have seen is the use of the wrong fiber type. For example, a 100G LR4 (100G-QSFP28-10) module must be used with single-mode fiber (SMF); it will not work over multi-mode fiber (MMF).
The next group is power budget problems. The power budget is the difference between the transmitter’s output power and the minimum required receiver sensitivity.
- If the received optical power is too low (often due to excessive fiber length, dirty connectors, or high insertion loss), the receiver may fail to detect the signal, resulting in a link-down condition.
- If the received power is too high, such as when a short fiber is used without proper attenuation (for a transceiver that is meant for longer distances), the receiver can become overloaded, leading to data errors or link instability.
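As a rough illustration of the two cases above, the following Python sketch computes a power budget and classifies a received power reading. The function names and the example figures are our own placeholders, not datasheet values; always take the real limits from the datasheet of the transceiver you use.

```python
def power_budget_db(tx_min_dbm: float, rx_sensitivity_dbm: float) -> float:
    """Power budget: minimum Tx output power minus Rx sensitivity (both in dBm)."""
    return tx_min_dbm - rx_sensitivity_dbm


def classify_rx_power(rx_dbm: float, rx_sensitivity_dbm: float,
                      rx_overload_dbm: float) -> str:
    """Classify a received power reading against the receiver's working window."""
    if rx_dbm < rx_sensitivity_dbm:
        return "too low"   # receiver may not detect the signal
    if rx_dbm > rx_overload_dbm:
        return "too high"  # receiver may be overloaded
    return "ok"


# Illustrative numbers only (assumed, not taken from a specific datasheet).
budget = power_budget_db(tx_min_dbm=-4.3, rx_sensitivity_dbm=-10.6)
status = classify_rx_power(rx_dbm=-12.0, rx_sensitivity_dbm=-10.6,
                           rx_overload_dbm=4.5)
print(f"budget: {budget:.1f} dB, Rx status: {status}")
```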
Next up is FEC mismatch. Forward error correction (FEC) is a critical feature that detects and corrects bit errors in high-speed optical links; you can read more about it in our blog post about FEC. For FEC to function correctly, both ends of the link must have compatible FEC settings: the same FEC type enabled on both sides, or FEC disabled on both sides. A mismatch, such as one side using RS-FEC and the other using no FEC or a different type, can prevent the link from establishing or cause excessive error rates.
Another frequent cause of link failures is the auto-negotiation setup. Auto-negotiation is the process by which two connected devices exchange capabilities and agree on parameters like speed, FEC, and lane alignment. If one end does not support auto-negotiation, is misconfigured, or uses a different protocol (such as passive DACs vs. active optics), the link may fail to come up or may remain in a faulty, inactive state.
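The matching rule can be sketched in a few lines of Python. This is only an illustration: the alias table below is an assumption, since the keywords for FEC modes differ between vendors (RS-FEC is often shown as Clause 91 and FC-FEC/BASE-R as Clause 74), and the function names are hypothetical.

```python
# Assumed aliases; adjust them to the keywords your devices actually print.
FEC_ALIASES = {
    "rs": "rs", "rs-fec": "rs", "cl91": "rs",
    "fc": "fc", "fc-fec": "fc", "base-r": "fc", "cl74": "fc",
    "none": "none", "off": "none", "disabled": "none",
}


def normalize_fec(mode: str) -> str:
    """Map a vendor-specific FEC keyword to one canonical name."""
    key = mode.strip().lower()
    if key not in FEC_ALIASES:
        raise ValueError(f"unknown FEC mode: {mode!r}")
    return FEC_ALIASES[key]


def fec_matches(local_mode: str, remote_mode: str) -> bool:
    """True when both link ends run the same FEC type (or both run none)."""
    return normalize_fec(local_mode) == normalize_fec(remote_mode)


print(fec_matches("RS-FEC", "cl91"))  # same FEC type under different names
print(fec_matches("rs-fec", "none"))  # mismatch: link may stay down
```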
Wavelength mismatches in WDM systems are another cause of link failure for 100G transceivers. Each transceiver in a WDM system is designed to operate at a specific wavelength (or channel), and both ends of the link must use matching wavelengths to establish proper communication.
Quality indicators to check
When diagnosing 100G transceiver issues, you can review the quality indicators through DDM (Digital Diagnostics Monitoring). On many switches, the command to check DDM is “show interfaces transceiver detail”, although the exact syntax depends on the vendor.
Start by checking the optical power readings, both transmitter (Tx) and receiver (Rx), to ensure they fall within the expected range specified in the transceiver datasheet. Insufficient Rx power may indicate fiber loss or poor connections, while excessive power could lead to receiver overload. The datasheet also lists the Rx overload value; if the measured Rx power exceeds it, the transceiver has most likely been damaged. Temperature monitoring is also crucial, as transceivers operating outside their rated temperature range may exhibit instability or failure. Additionally, inspect supply voltage levels and bias current values, which can reveal power delivery issues or laser degradation. DDM data provides access to all of these optical diagnostics in real time.
SFP Detail Diagnostics Information (internal calibration)
----------------------------------------------------------------------------
                 Current       Alarms                     Warnings
Measurement                    High         Low           High         Low
----------------------------------------------------------------------------
Temperature      50.30 C       75.00 C      -5.00 C       70.00 C      0.00 C
Voltage          3.15 V        3.63 V       2.97 V        3.47 V       3.13 V
Current          59.81 mA      120.00 mA    5.00 mA       110.00 mA    10.00 mA
Tx Power         2.52 dBm      6.29 dBm     -3.90 dBm     5.29 dBm     -2.90 dBm
Rx Power         4.27 dBm      6.29 dBm     -10.22 dBm    5.29 dBm     -9.20 dBm
Transmit Fault Count = 0
----------------------------------------------------------------------------
Note: ++ high-alarm; + high-warning; -- low-alarm; - low-warning
Step-by-step 100G troubleshooting procedures
Step 1: Physical Layer Verification
Inspect and clean all fiber connectors using appropriate tools to remove dust that can cause signal loss. Make sure the transceivers and fiber cables are properly seated and that the wavelengths match on both ends.
Step 2: Optical Power Level Measurements
Use either an optical power meter or DDM data to check both the transmit (Tx) and receive (Rx) optical power at each end of the link. Compare the readings against the transceiver specifications. As mentioned, low Rx power may indicate dirty connectors, long distances, or poor splicing. Excessively high power might point to a lack of required attenuation on short links.
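The comparison against thresholds can be sketched as follows. The readings and limits are copied from the sample DDM output shown earlier, and the returned markers follow the note under that output. The strict comparisons and the function name are our assumptions; a real device may flag a value that sits exactly on a threshold.

```python
from typing import NamedTuple


class Thresholds(NamedTuple):
    high_alarm: float
    low_alarm: float
    high_warning: float
    low_warning: float


def ddm_flag(value: float, t: Thresholds) -> str:
    """Return '++', '+', '--', '-' or '' (in range) for one DDM reading."""
    if value > t.high_alarm:
        return "++"
    if value < t.low_alarm:
        return "--"
    if value > t.high_warning:
        return "+"
    if value < t.low_warning:
        return "-"
    return ""


# Values from the sample output: (current, high/low alarm, high/low warning).
SAMPLE = {
    "Temperature": (50.30, Thresholds(75.00, -5.00, 70.00, 0.00)),
    "Voltage": (3.15, Thresholds(3.63, 2.97, 3.47, 3.13)),
    "Current": (59.81, Thresholds(120.00, 5.00, 110.00, 10.00)),
    "Tx Power": (2.52, Thresholds(6.29, -3.90, 5.29, -2.90)),
    "Rx Power": (4.27, Thresholds(6.29, -10.22, 5.29, -9.20)),
}

flags = {name: ddm_flag(value, t) for name, (value, t) in SAMPLE.items()}
for name, flag in flags.items():
    print(f"{name:12s} {flag or 'in range'}")
```

A reading of -9.50 dBm on the same Rx thresholds would return "-" (low warning), and -11.00 dBm would return "--" (low alarm).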
Power budget verification
To verify the power budget and ensure a reliable 100G link, first calculate the link loss budget; our article on how to calculate the optical power budget covers this in detail. Next, measure the actual link loss using an optical power meter and compare it against your calculated budget. If the measured loss exceeds the budget, investigate potential sources of excessive loss (dirty connectors, damaged fiber, etc.).
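As a minimal sketch of this comparison, the Python snippet below estimates the expected loss of a link and checks a measured loss against a budget. The per-kilometer, per-connector, and per-splice figures are common planning assumptions rather than values from this article, and the function names are hypothetical; use the numbers from your fiber and component datasheets.

```python
def estimated_link_loss_db(fiber_km: float, connectors: int, splices: int,
                           fiber_db_per_km: float = 0.35,  # assumed, SMF near 1310 nm
                           connector_db: float = 0.5,      # assumed per mated pair
                           splice_db: float = 0.1) -> float:  # assumed per splice
    """Estimate total link loss from fiber length, connectors and splices."""
    return (fiber_km * fiber_db_per_km
            + connectors * connector_db
            + splices * splice_db)


def measured_link_loss_db(tx_dbm: float, rx_dbm: float) -> float:
    """Actual loss: power launched at one end minus power received at the other."""
    return tx_dbm - rx_dbm


def loss_within_budget(loss_db: float, budget_db: float,
                       margin_db: float = 0.0) -> bool:
    """True if the loss plus an optional safety margin fits into the budget."""
    return loss_db + margin_db <= budget_db


expected = estimated_link_loss_db(fiber_km=10, connectors=2, splices=0)
actual = measured_link_loss_db(tx_dbm=1.0, rx_dbm=-4.0)
print(f"expected {expected:.2f} dB, measured {actual:.2f} dB")
print("within budget:", loss_within_budget(actual, budget_db=6.3))
```

If the measured loss is clearly higher than the estimate, that gap is a hint to look for dirty connectors or damaged fiber before suspecting the transceiver.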
For very short links, especially in lab or data center environments, the received power may exceed the transceiver's maximum input level. In such cases, optical attenuators should be used to reduce the power and prevent receiver overload.
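Choosing an attenuator can be sketched like this. The list of available attenuator values, the 1 dB headroom, and the function name are illustrative assumptions; the sensitivity and overload limits must come from the transceiver datasheet.

```python
from typing import Optional, Sequence


def pick_attenuator_db(rx_dbm: float, rx_sensitivity_dbm: float,
                       rx_overload_dbm: float, headroom_db: float = 1.0,
                       available_db: Sequence[float] = (1, 2, 3, 5, 7, 10),
                       ) -> Optional[float]:
    """Return the smallest attenuator that keeps Rx power inside the window.

    Returns 0.0 if no attenuation is needed and None if no available
    value brings the power into range.
    """
    upper = rx_overload_dbm - headroom_db
    lower = rx_sensitivity_dbm + headroom_db
    if rx_dbm <= upper:
        return 0.0
    for att in sorted(available_db):
        if lower <= rx_dbm - att <= upper:
            return float(att)
    return None


# Illustrative values only: 5.0 dBm received, overload limit assumed at 4.5 dBm.
print(pick_attenuator_db(rx_dbm=5.0, rx_sensitivity_dbm=-10.6,
                         rx_overload_dbm=4.5))
```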
Step 3: Loopback Transceiver Testing
Perform loopback testing to determine whether the transceiver hardware itself is the root cause. More on how to perform a loopback test can be found in our step-by-step guide.
Step 4: Systematic Swap Testing
Replace the transceiver with a known good one, test with a different fiber patch cable, and, if needed, try another port on the same device. Swapping and testing other transceivers helps to pinpoint whether the failure is due to a faulty module, cable, or port hardware.
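The logic of the swap test can be written down as a small decision function. This is only a sketch with a hypothetical function name, and it assumes that you change one component at a time and restore the original before the next swap.

```python
def suspect_component(up_after_module_swap: bool,
                      up_after_cable_swap: bool,
                      up_after_port_swap: bool) -> str:
    """Name the most likely faulty part from three one-at-a-time swap results."""
    if up_after_module_swap:
        return "transceiver module"
    if up_after_cable_swap:
        return "fiber patch cable"
    if up_after_port_swap:
        return "switch port"
    return "undetermined: check the configuration and the far end"


# Example: the link only came up after the patch cable was replaced.
print(suspect_component(False, True, False))
```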
Step 5: Configuration Verification Checklist
Check the configuration settings on both endpoints. Verify that the FEC settings, auto-negotiation mode, and speed configuration match. Ensure that the transceiver types are supported by the device you are using them in. Usually, if a transceiver is not supported by the equipment, the device raises an “unsupported” or “unapproved” alarm. Review DDM data to confirm that temperature, voltage, and bias current are within normal operating ranges (the ranges can be checked in the transceiver datasheet).
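If you collect the port settings of both endpoints, the checklist can be automated with a simple comparison. The dictionary keys below ("speed", "fec", "autoneg") are hypothetical field names chosen for this sketch, not the output format of any particular switch.

```python
from typing import Any, Dict, List, Sequence


def config_mismatches(local: Dict[str, Any], remote: Dict[str, Any],
                      keys: Sequence[str] = ("speed", "fec", "autoneg"),
                      ) -> List[str]:
    """Return the settings that differ between the two link endpoints."""
    return [key for key in keys if local.get(key) != remote.get(key)]


# Example endpoint settings, entered by hand for illustration.
side_a = {"speed": "100G", "fec": "rs", "autoneg": False}
side_b = {"speed": "100G", "fec": "none", "autoneg": False}

print(config_mismatches(side_a, side_b))  # settings to align before retesting
```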
Conclusions
When 100G transceivers fail to establish a link, effective troubleshooting requires a methodical, layered approach. In our experience, the most common issue with 100G optics is an incorrect FEC setting: FEC is enabled where it should be disabled, or the other way around. PAM4 transceivers should have FEC disabled on the host device because they have FEC built in. NRZ transceivers should have FEC enabled on the host device in order to reach the desired distance and avoid unnecessary link errors (CRC errors).
If you still have issues with the transceivers after completing these steps, be sure to contact the supplier for further technical support!
FAQ:
My 100G transceivers show 'link down' even though everything is connected. What should I check first?
Start with these most common checks:
- Clean the connectors
- Check FEC settings
- Verify fiber type
- Check DDM (Tx/Rx power) readings
How do I know if my received optical power is too high or too low?
Check the Digital Diagnostics Monitoring (DDM) data using your switch’s CLI; there you will find the actual Rx readings. Alarm and warning flags are usually raised when the Rx power crosses the low-warning, low-alarm, high-warning, or high-alarm threshold. Additionally, you can compare the readings to the acceptable limits listed in our product datasheets. If the power is too high on a short link, add optical attenuators; otherwise the receiver may be damaged. If it is too low, clean the connectors, check for fiber damage, and verify that the link distance is within specifications.
I've tried everything but my 100G link still won't come up. What's a systematic approach to isolate the problem?
You can follow these steps to determine the root cause of the failure you are experiencing:
- Loopback test – Follow our step-by-step loopback test guide
- Known-good swap – Swap the faulty module with one that you know works without issues and check if the link comes up
- Cable swap – Swap the patch cables used
- Port swap – Test the suspect module in a different port on the same switch or in another switch
- Configuration double-check – Verify that the switch port configuration supports the module used
If all these steps fail, collect the DDM data showing temperature, voltage, current, and optical power readings, then contact our technical support team with this information.