Add Using AT Commands to Verify PDP Context and QMI Sync
74
Using-AT-Commands-to-Verify-PDP-Context-and-QMI-Sync.md
@@ -0,0 +1,74 @@
Cross-verify the modem's internal state via AT commands, to detect any desynchronization between QMI and the actual modem connection (e.g. a “ghost” PDP context where the modem is still connected but QMI/host think it's down, or vice versa).
Quectel modems expose multiple AT command ports (e.g. /dev/ttyUSB2) which allow you to query and control the modem at a low level. While QMI (via uqmi) is managing the data connection, you can simultaneously use AT commands to peek at PDP context status without interfering (as long as you stick to querying commands, or carefully issued context deactivation if needed).
Check PDP context activation with AT+CGACT: This standard command shows which PDP contexts are active. For example, AT+CGACT? might return a list like:
+CGACT: 1,1
+CGACT: 2,0
OK
This means context ID 1 is activated (1,1) and context 2 is inactive (2,0). By default, Quectel modems use profile 1 for the first data connection unless configured otherwise. A result of 1,1 indicates that the modem's cellular data context is currently active, which normally means an IP address is assigned. Run the command again right after a drop and interpret the result as follows:
If it still shows 1,1 (active) even though your Linux interface is down, that's a ghost connection: the modem never dropped the PDP session on the radio side. This can happen if the QMI client lost its handle, or if the host network interface went down due to a race condition while the modem remained attached. In that case, you'd likely want to send AT+CGACT=0,1 to deactivate context 1. (Quectel's vendor-specific AT+QIDEACT=1 also deactivates context 1, but it belongs to the modem's internal TCP/IP stack command set, so it may not affect a data call that was started over QMI.) Otherwise, the modem might refuse new connections because it thinks one is still up.
If it shows 1,0 after drop, that confirms the PDP context is indeed disconnected at the modem level. That means the modem did tear down the data call (whether by its own decision or network command).
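The two outcomes above can be checked mechanically. Here is a minimal Python sketch, assuming you have already captured the raw text of the AT+CGACT? response from the AT port; the function names are made up for illustration, and how you learn whether the host link is up is left to you.

```python
import re


def parse_cgact(response: str) -> dict:
    """Map each context ID in an AT+CGACT? response to True (active) or False."""
    states = {}
    pattern = r"^\+CGACT:\s*(\d+)\s*,\s*(\d+)\s*$"
    for match in re.finditer(pattern, response, re.MULTILINE):
        cid, state = int(match.group(1)), int(match.group(2))
        states[cid] = (state == 1)
    return states


def is_ghost_context(response: str, host_link_up: bool, cid: int = 1) -> bool:
    """True when the modem reports the context active while the host link is down."""
    return parse_cgact(response).get(cid, False) and not host_link_up
```

A True result from is_ghost_context corresponds to the first case (modem still connected, host down); a False result with the link down corresponds to the second case (context really torn down).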
Check IP address with AT+CGPADDR: This command shows the PDP address assigned to a given context. AT+CGPADDR=1 returns the IP address (IPv4 and/or IPv6; gateway and DNS are not part of this response) for context 1 if it's active. If the modem still has an IP here after your Linux interface dropped, it's another sign that the modem didn't realize it needs to disconnect. Compare it to what uqmi --get-current-settings shows:
If uqmi says “Out of call” but AT+CGPADDR=1 returns an IP, the modem is still connected but the QMI client considers itself out of session. You likely have a client ID mismatch or lost QMI state. In this case, you should manually clear the context (via AT, or by reacquiring a QMI client and issuing a stop). The --sync command in uqmi is meant for this scenario: it releases all QMI client IDs, which should also make the modem drop any lingering data sessions that a previous client didn't clean up.
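As a rough illustration of that comparison, the Python sketch below takes the captured text of both tools and flags the mismatch. It assumes the uqmi output contains the string "Out of call" when QMI considers itself disconnected (as described above) and that an all-zero address means "no address"; the function names are hypothetical.

```python
import re


def parse_cgpaddr(response: str, cid: int = 1) -> list:
    """Return the non-empty PDP addresses reported for one context ID."""
    for line in response.splitlines():
        match = re.match(r"\+CGPADDR:\s*(\d+)\s*(?:,(.*))?$", line.strip())
        if not match or int(match.group(1)) != cid:
            continue
        raw = (match.group(2) or "").split(",")
        addresses = [a.strip().strip('"') for a in raw]
        # Treat empty strings and all-zero addresses as "no address assigned".
        return [a for a in addresses if a.strip("0.:")]
    return []


def modem_host_desync(uqmi_output: str, cgpaddr_response: str, cid: int = 1) -> bool:
    """True when QMI reports no session but the modem still holds an address."""
    return "Out of call" in uqmi_output and bool(parse_cgpaddr(cgpaddr_response, cid))
```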
Use AT to query network registration and signal: You can also do AT+CGREG? (or AT+CEREG? for LTE) to check registration status from the AT perspective, and AT+CSQ for a basic signal reading (though CSQ in LTE is scaled differently, it's not as informative). Quectel's extended command AT+QENG="servingcell" can dump the serving cell info including RSRP/RSRQ too. If for some reason uqmi --get-serving-system isn't giving you the full picture, these AT commands can confirm if the modem itself thinks it's registered or not when the drop happens. In the EG25 case on forums, they had to use AT+QENG and AT+CEREG to investigate a similar “disconnect after a few seconds” issue.
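If you log these responses during a drop, a small parser makes them easier to compare over time. The Python sketch below decodes the registration status from an AT+CEREG? or AT+CGREG? query response (the "<n>,<stat>" form, not the unsolicited form) and converts the AT+CSQ value to dBm using the standard 3GPP mapping. The function names are illustrative only.

```python
import re

# Registration status codes shared by +CGREG and +CEREG (3GPP TS 27.007).
REG_STATUS = {
    0: "not registered, not searching",
    1: "registered, home network",
    2: "not registered, searching",
    3: "registration denied",
    4: "unknown",
    5: "registered, roaming",
}


def parse_registration(response: str):
    """Return the <stat> value from a +CEREG/+CGREG/+CREG query response, or None."""
    match = re.search(r"\+C[EG]?REG:\s*(\d+)\s*,\s*(\d+)", response)
    return None if match is None else int(match.group(2))


def is_registered(stat) -> bool:
    """Home (1) and roaming (5) both count as registered."""
    return stat in (1, 5)


def csq_to_dbm(response: str):
    """Convert the +CSQ rssi index (0..31) to dBm; 99 means 'not known'."""
    match = re.search(r"\+CSQ:\s*(\d+)\s*,\s*(\d+)", response)
    if match is None:
        return None
    rssi = int(match.group(1))
    return None if rssi > 31 else -113 + 2 * rssi
```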
Check for “ghost bearer” after reconnects: A ghost bearer means the modem kept the data session alive internally even though the host dropped the interface. This often happens if the QMI client (uqmi) exited without issuing a stop, and autoconnect is enabled (modem keeps it alive), or due to some hiccup where the host assumes it's down but modem didn't. The result is that a subsequent uqmi --start-network might fail (because the modem says “already have an active PDN”). If you suspect this, use the AT checks above to verify. The cure is:
Disable autoconnect (if it's on) so the modem won't immediately try to resurrect it.
Issue a stop via QMI or a manual deactivate via AT.
Sync and then start fresh.
Also, ensure that your recovery script, when it detects a drop, first issues uqmi --stop-network <pdh> --autoconnect (with the last known PDH) before starting a new session. If you lost the PDH, try uqmi --stop-network 0xffffffff --autoconnect. Here 0xffffffff acts as a wildcard handle; OpenWrt's own qmi.sh uses this same form in its cleanup, although behavior can vary with modem firmware. If that fails, use AT to kill the context as a fallback. The patch we discussed earlier added --sync to the QMI bring-up sequence to handle exactly this sort of case.
Can AT commands and QMI calls conflict? Generally, querying status via AT is safe and does not interfere with QMI data traffic. Commands like AT+CGACT=0,1 will drop the connection from the modem side, which is akin to pulling the rug under QMI's feet (QMI will get a disconnect event). That's okay if done intentionally (like as part of recovery or testing). Just avoid doing drastic changes via AT (like resetting the modem or changing modes) while the QMI session is up, unless you're prepared for it to drop.
Verify modem isn't in a weird state: There are cases where the modem might hang in a partially disconnected state (for example, the radio interface “dropped” but the IP interface on the modem is still assigned). Using AT+CFUN? to ensure the modem is in full functionality (should return 1) and not unexpectedly in airplane mode, etc., is a sanity check (though if you can reconnect fine each time, CFUN is likely okay).
Proper cleanup sequence (best practice): When bringing the link down intentionally (or in your script's recovery logic when a drop is detected):
1. Call uqmi --stop-network <pdh> [--autoconnect]. Include --autoconnect if autoconnect was enabled, to turn it off for that context.
2. If uqmi --stop-network hangs or fails (which can happen if the QMI handle was lost), use AT: AT+CGACT=0,1 to force the PDP down.
3. Call uqmi --sync to clear any residual client state in QMI.
4. Now issue uqmi --start-network ... to bring it up again (or let autoconnect handle it if you left it enabled).
This sequence prevents “bearer ghosting” by explicitly telling both the modem and QMI driver that the session is done, rather than just, say, ifdowning the interface and leaving the modem thinking it's still connected.
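Steps 1 to 3 of the sequence above could be scripted roughly as follows. This is a sketch under assumptions, not a tested recovery tool: send_at is a hypothetical callable that writes one command to your AT port, the device path you pass (for example /dev/cdc-wdm0) depends on your system, and step 4 is left to the caller because the start arguments depend on your APN setup.

```python
import subprocess


def run_uqmi(device, *args, timeout=10):
    """Run one uqmi command; True on exit status 0, False on failure or timeout."""
    try:
        result = subprocess.run(
            ["uqmi", "-d", device, *args], capture_output=True, timeout=timeout
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def clean_teardown(device, pdh, send_at, autoconnect=False, cid=1, run=run_uqmi):
    """Tear down a data session and return the list of steps that were taken."""
    steps = []
    stop_args = ["--stop-network", pdh]
    if autoconnect:
        stop_args.append("--autoconnect")  # also turns autoconnect off
    if run(device, *stop_args):
        steps.append("qmi-stop")
    else:
        # Step 2: QMI stop hung or failed, so force the context down via AT.
        send_at(f"AT+CGACT=0,{cid}")
        steps.append("at-deactivate")
    run(device, "--sync")  # Step 3: release stale QMI client IDs
    steps.append("sync")
    return steps
```

The timeout matters here: a hung uqmi --stop-network is treated the same as a failed one, so the AT fallback still runs.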
By verifying things at the AT level, you effectively double-check the modem's perspective. This can often reveal issues like “Oh, the modem never dropped the IP context - it's the router that gave up.” or “The modem lost registration (CEREG went to not registered) at the moment of drop, so it was a network/radio loss.” Each finding steers the investigation (toward RF/network causes vs host/QMI causes).
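To make that steering explicit, you could reduce the three observations (registration at the time of the drop, PDP context state, host link state) to a rough hint. The Python sketch below is only one plausible way to order the checks; the labels are invented for illustration and are not a definitive diagnosis.

```python
def triage(registered_at_drop: bool, context_active: bool, host_link_up: bool) -> str:
    """Turn three observations into a rough hint about where to look next."""
    if not registered_at_drop:
        # CEREG went to "not registered": look at RF and network causes first.
        return "radio/network: modem lost registration"
    if context_active and not host_link_up:
        # The modem kept the IP context; the router side gave up.
        return "host/QMI: modem kept the context, host dropped the link"
    if not context_active:
        return "modem/network: context torn down while still registered"
    return "no drop visible at the modem"
```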
System-Level Root Causes (OpenWrt/GL.iNet)
Goal: Determine if any software on the router (OpenWrt system or GL.iNet add-ons) is inadvertently causing the disconnects or modem resets, and ensure we have a “clean” environment for manual control.
OpenWrt's network management (netifd) and GL.iNet's custom scripts can sometimes interfere with manual control. Let's address those:
Netifd auto-management: If your EP06-A interface is defined in /etc/config/network with option proto 'qmi' (or 'wwan' depending on GL.iNet's naming), then netifd will try to manage it automatically. This means at boot or on link loss, netifd might call /lib/netifd/proto/qmi.sh to bring it up or down. Since you are running manual uqmi commands, you likely want to avoid netifd doing anything on its own. The simplest way is to remove or disable that interface config:
Either set option auto '0' in the config (so it isn't brought up automatically) and refrain from running ifup on it, or change the protocol to none (unmanaged) so that OpenWrt ignores it and only your script brings it up.
Also be aware of mwan3 (Multi-WAN manager) if it is installed. It can mark an interface offline and possibly trigger scripts when that happens. In the log snippet for GL-A1300, mwan3 saw the interface drop and started switching routes. While mwan3 wouldn't directly disconnect a modem, it might interfere with your test by rapidly cycling interfaces or adding load. If you don't need multi-WAN features, you could disable mwan3 during troubleshooting.
On OpenWrt, if an interface is configured with proto qmi, netifd spawns /lib/netifd/proto/qmi.sh which in turn runs uqmi commands (including a persistent uqmi -s ... --autoconnect typically). You'll want to ensure that is not running. Use ps to search for any uqmi processes aside from your own. If you find one, it's likely netifd's instance. Bringing the interface down via ifdown or removing the config will stop it. Some routers have scripts to reset the USB port if certain conditions occur (like if the TTY goes away or if no IP after X seconds). Check /etc/hotplug.d/usb/ or /etc/hotplug.d/iface/ for such entries. For example, Rooter (an OpenWrt mod) and perhaps GL.iNet might reset the modem if no connection after a timeout. Disable any such logic to avoid it messing with your manual process.
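If you'd rather check for stray uqmi instances from a script than eyeball ps output, the Python sketch below does a similar job by reading /proc/<pid>/comm directly instead of parsing ps. It assumes Python is available on the router (often not the case on a stock image), and the function name is made up for illustration.

```python
import os


def find_uqmi_pids(proc_root="/proc"):
    """Return the PIDs of running processes whose command name is exactly 'uqmi'."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                name = handle.read().strip()
        except OSError:
            continue  # the process exited while we were scanning
        if name == "uqmi" and int(entry) != os.getpid():
            pids.append(int(entry))
    return sorted(pids)
```

Any PID returned while your own script is idle is probably netifd's instance or a leftover from an earlier run.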
Eliminate firewall or DHCP issues: This is a long shot, but sometimes the “disconnect” is not a disconnect at the modem layer but a networking issue. For instance, if the DHCP lease on wwan0 is very short (say 30 seconds) and a renewal fails, the interface could lose its IP even though the modem is still connected. In your manual setup, if you run uqmi --start-network and then call udhcpc on that interface, make sure the DHCP client isn't what's dropping things. You can bypass DHCP by taking the IP reported by uqmi --get-current-settings and configuring it statically for testing (just to remove udhcpc from the equation temporarily).
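For the static-IP test, the Python sketch below turns the JSON printed by uqmi --get-current-settings into the ip commands you would run by hand. The key names ("ipv4", "ip", "subnet", "gateway") are an assumption based on the output uqmi commonly produces, so check them against what your version actually prints; the function only builds the commands and does not execute them.

```python
import ipaddress
import json


def static_ip_commands(settings_json: str, ifname: str = "wwan0") -> list:
    """Build ip(8) command lines that apply the modem-reported IPv4 settings."""
    ipv4 = json.loads(settings_json)["ipv4"]  # assumed key layout, verify locally
    # Convert a dotted netmask such as 255.255.255.252 into a prefix length.
    prefix = ipaddress.IPv4Network(f"0.0.0.0/{ipv4['subnet']}").prefixlen
    return [
        ["ip", "addr", "add", f"{ipv4['ip']}/{prefix}", "dev", ifname],
        ["ip", "link", "set", ifname, "up"],
        ["ip", "route", "add", "default", "via", ipv4["gateway"], "dev", ifname],
    ]
```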
Ensure that you are not running more than one uqmi at the same time. If you need to get signal info in a loop, do it with a single instance or with a sufficient delay between calls. Each uqmi invocation by default acquires a new client ID for the service and should release it, but if it hangs, it might not release properly. Over time (or rapidly), you could run out of available QMI client IDs and the modem refuses new commands, leading to a hung state that might look like a disconnect.
If you suspect this, you can use uqmi --get-client-id wds (and for nas) to explicitly manage a single client. Or switch to AT-based polling for things like signal to lighten the QMI load. Turn off any “auto doctor” services (GL health monitor, etc.) so they don't reset the modem out from under you. Stop netifd from fighting your manual control. Ensure power is solid. Then see if the problem replicates in this clean state. If it doesn't drop anymore, one of those services might have been causing it (for example, maybe the health monitor erroneously decided the connection was bad and reset it every 30 seconds). If it still drops, we know the cause is likely either on the modem/network side or a fundamental driver issue, not just an external script.
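One simple way to guarantee that only one uqmi runs at a time is to wrap every invocation in a file lock. The Python sketch below uses flock for this; the lock path is an arbitrary choice, and it only helps if every script that calls uqmi goes through the same lock.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def uqmi_lock(path="/tmp/uqmi.lock", blocking=True):
    """Hold an exclusive lock for the duration of one uqmi call."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
        fcntl.flock(fd, flags)  # raises BlockingIOError if busy and non-blocking
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

In a polling loop you would run the uqmi subprocess inside a "with uqmi_lock():" block, so a slow or hung call delays the next one instead of overlapping with it.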
This step is about removing variables: we want the modem <-> QMI <-> your script to be the only things in play, so we can trust the observations.
Quectel AT commands for diagnostics: There are a few vendor-specific AT commands that might help:
1. AT+QCFG="usbnet" - just to verify the modem is in the expected USB network mode (it should return 0 for RMNET/QMI; 1 means ECM and 2 means MBIM). If it were accidentally in a different mode, that might cause issues, but since you can connect, it's likely fine.
2. AT+QCRMERROR - enables extended error reporting for some failures. Not sure if it reports PDP drops, but you could try setting AT+QCRMERROR=1.
3. AT+CEER (extended error report) - after a call or drop, this sometimes reports the last call termination reason. On LTE data calls it might not always give useful info (it is often more useful for voice calls), but it's worth a try right after a drop to see if it outputs something like “SM internal error” or “NW cause code XYZ”.
4. AT+QIDBG - Quectel has some undocumented debug commands. Unless you have documentation or support from Quectel, it might not be useful to go there.
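For the first item, a scripted check might look like the Python sketch below. It assumes the query response has the form +QCFG: "usbnet",<n> and uses the mode numbers as I understand them from Quectel's documentation (0 RMNET/QMI, 1 ECM, 2 MBIM); confirm them against the AT manual for your firmware.

```python
import re

# Mode numbers as commonly documented by Quectel; verify for your module.
USBNET_MODES = {0: "RMNET (QMI)", 1: "ECM", 2: "MBIM"}


def parse_usbnet(response: str):
    """Return the usbnet mode number from an AT+QCFG="usbnet" response, or None."""
    match = re.search(r'\+QCFG:\s*"usbnet"\s*,\s*(\d+)', response)
    return None if match is None else int(match.group(1))


def usbnet_ok_for_qmi(response: str) -> bool:
    """True when the modem reports the RMNET mode that QMI/uqmi expects."""
    return parse_usbnet(response) == 0
```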
uqmi tool limitations: As mentioned, uqmi is a minimalist tool and has known issues: