This repository has been archived by the owner on Dec 16, 2020. It is now read-only.

MME sometimes drops communication(?) and triggers partial reset from eNB #4

Open
spencersevilla opened this issue Dec 15, 2018 · 0 comments

Problem: Okay, this one is a doozy! Every so often (I'm not sure why), the MME fails to respond to a UE's request. The unanswered request is usually, but not always, an Attach or Detach request. Exactly 5 seconds later, the eNB decides that something is amiss and sends a Partial Reset message. That message used to crash the MME, roughly once every 15 minutes. I have since fixed Partial Reset handling in the MME: it now receives the Partial Reset, tears down any state tied to the affected connections, and answers the eNB with a Reset Acknowledge. After that, the UE tries to rejoin the network, usually successfully.
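For readers unfamiliar with the exchange, the Partial Reset handling described above can be sketched roughly as follows. This is a minimal illustration and not the actual fix in the MME: the `ResetItem` type, the `handle_partial_reset` function, and the dictionary of UE contexts are all hypothetical names. It assumes each item in a partial Reset carries an MME-side ID, an eNB-side ID, or both, and that the acknowledgement echoes back every item it received, including ones the MME does not recognize.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ResetItem:
    """One UE-associated S1 connection named in a partial Reset (hypothetical type)."""
    mme_ue_s1ap_id: Optional[int] = None
    enb_ue_s1ap_id: Optional[int] = None


def handle_partial_reset(
    contexts: Dict[Tuple[int, int], dict],
    items: List[ResetItem],
) -> List[ResetItem]:
    """Drop every UE context matched by the Reset and return the ack list.

    `contexts` is keyed by (mme_ue_s1ap_id, enb_ue_s1ap_id) and is mutated in place.
    """
    for item in items:
        # An item with neither ID identifies nothing, so it matches nothing.
        if item.mme_ue_s1ap_id is None and item.enb_ue_s1ap_id is None:
            continue
        matched = [
            key for key in contexts
            if (item.mme_ue_s1ap_id is None or key[0] == item.mme_ue_s1ap_id)
            and (item.enb_ue_s1ap_id is None or key[1] == item.enb_ue_s1ap_id)
        ]
        for key in matched:
            del contexts[key]
    # Echo the items back in the order received, known or not, so the eNB
    # always gets an answer instead of timing out again.
    return list(items)
```

The important property is that the function never raises on an unknown connection: a Reset naming a UE the MME has already forgotten should still be acknowledged.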

Expected Behavior: S1AP RESET messages are reserved for abnormal conditions. All messages should be handled normally, and the eNB should rarely, if ever, need to send a RESET.

Logs/Data: Ask Spencer for them.

Hints: In Wireshark, search for "s1ap.Reset_element" to find the Reset message, then scroll up EXACTLY five seconds to find the offending packet (typically a request from the eNB that was never answered). Because several phones are usually attaching/detaching at the same time, the best way to follow each exchange and work out which packet got ignored is to inspect the packets and search for the eNB-side and MME-side S1AP IDs (eNB_S1AP_ID and MME_S1AP_ID), which together uniquely identify the specific UE transaction.
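The manual search above can also be scripted once the capture has been exported to some structured form. The sketch below is only an illustration under stated assumptions: the packet records are plain dictionaries with hypothetical keys (`time` in seconds, `src` set to `"enb"` or `"mme"`, and `enb_id` for the eNB-side S1AP ID), and `unanswered_before_reset` is a made-up helper, not part of any tool. It looks back from the Reset timestamp and lists eNB messages that the MME never answered on the same eNB-side ID, with the ones closest to five seconds before the Reset first.

```python
from typing import Dict, List

Packet = Dict[str, object]


def unanswered_before_reset(
    packets: List[Packet],
    reset_time: float,
    window: float = 5.0,
    slack: float = 0.5,
) -> List[Packet]:
    """Return eNB packets in the look-back window that got no MME reply."""
    start = reset_time - window - slack
    candidates = []
    for p in packets:
        # Only eNB-originated messages inside the look-back window qualify.
        if p["src"] != "enb" or not (start <= p["time"] < reset_time):
            continue
        # A message counts as answered if the MME sent anything for the
        # same eNB-side ID after it and before the Reset.
        answered = any(
            q["src"] == "mme"
            and q["enb_id"] == p["enb_id"]
            and p["time"] < q["time"] < reset_time
            for q in packets
        )
        if not answered:
            candidates.append(p)
    # Rank by how close the gap to the Reset is to the expected window.
    return sorted(candidates, key=lambda p: abs((reset_time - p["time"]) - window))


if __name__ == "__main__":
    capture = [
        {"time": 10.0, "src": "enb", "enb_id": 7},
        {"time": 10.2, "src": "enb", "enb_id": 8},
        {"time": 10.3, "src": "mme", "enb_id": 8},
        {"time": 15.0, "src": "enb", "enb_id": 7},  # the Reset itself
    ]
    for pkt in unanswered_before_reset(capture, reset_time=15.0):
        print(pkt)
```

Matching on the eNB-side ID alone is a simplification chosen because the first message of an exchange may not carry an MME-side ID yet; a stricter version could compare both IDs once they are present.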

This could be caused by a WIDE range of problems. Sometimes I can find the offending trace in the MME's log; other times I can't find it at all. When I can find it, it usually has something to do with an MME ID mismatch, which shouldn't ever happen. Other possible culprits: this seems to happen when the system gets swamped with requests, so it may be an issue with the SCTP buffer size, with the number of simultaneous packets/threads the system can handle, or with something else that leads to a drop. Not sure.
