Cell Phone Networks are Just Microservices

Written by: Ben Iofel, Software Engineer at Cape

If you’ve ever wondered how a cell phone carrier works, and looked up a 4G/5G network architecture diagram, you may have found a confusing mess of acronyms and arrows, felt your eyes glaze over, and given up like I did. Here's what I wish someone had told me: ignore the acronyms. It's just microservices.

It turns out, if you can understand a typical distributed microservices backend, you’re only two obscure protocols away from understanding how to build a cell phone network. The two protocols you should know about are Diameter and GTP.

When your phone first connects to our network, it listens for any cell towers yelling around, and tries to authenticate. If Cape has rented capacity on a given tower, the packets will flow from the tower operator’s network, through an IP exchange company that we’ve paid to do BGP router magic and to plug in a couple of fiber optic cables, into an AWS router. In our AWS VPC, we configure a to source the packets into our infrastructure (BGP routers into AWS is roughly how your home internet works too).

In a typical microservices setup, you would likely see a lot of JSON or gRPC traffic. Inside a mobile core, however, all of the signaling & auth messages are encoded in the Diameter protocol. Diameter is kind of like protobuf if it was designed 30 years ago, and all of the RPC methods were pre-defined in the spec. The 4G standards are built on top of this protocol – we don’t get a choice, particularly if we want to communicate with our tower operators and roaming partners. There’s a package we use in all of our code that needs to speak Diameter, namely some proxies and firewalls we wrote.
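To make the "protobuf from 30 years ago" comparison concrete, here is a rough sketch of what a Diameter message looks like on the wire, following the header and AVP layout in RFC 6733. It is not the package mentioned above; the function names and the example IMSI are made up for illustration, and a real Authentication-Information-Request carries several more mandatory AVPs (Session-Id, Origin-Host, and so on) that are left out here.

```python
import struct

# Diameter header (RFC 6733): version (1 byte), message length (3),
# command flags (1), command code (3), application ID (4),
# hop-by-hop ID (4), end-to-end ID (4) = 20 bytes in total.
HEADER_FORMAT = "!B3sB3sIII"
HEADER_LEN = struct.calcsize(HEADER_FORMAT)

FLAG_REQUEST = 0x80
FLAG_PROXIABLE = 0x40
AVP_FLAG_MANDATORY = 0x40

S6A_APPLICATION_ID = 16777251          # 3GPP S6a
CMD_AUTHENTICATION_INFORMATION = 318   # Authentication-Information-Request/Answer
AVP_USER_NAME = 1                      # carries the IMSI on S6a


def encode_avp(code, data, flags=AVP_FLAG_MANDATORY):
    """Encode one AVP without a Vendor-ID: code, flags, length, data, padding."""
    length = 8 + len(data)             # the length field excludes the padding
    padding = b"\x00" * (-length % 4)  # AVPs are aligned to 4 bytes
    header = struct.pack("!IB3s", code, flags, length.to_bytes(3, "big"))
    return header + data + padding


def encode_message(flags, command_code, app_id, hop_by_hop, end_to_end, avps):
    """Prefix the already-encoded AVPs with a 20-byte Diameter header."""
    body = b"".join(avps)
    total = HEADER_LEN + len(body)     # the message length includes the header
    header = struct.pack(
        HEADER_FORMAT, 1, total.to_bytes(3, "big"), flags,
        command_code.to_bytes(3, "big"), app_id, hop_by_hop, end_to_end,
    )
    return header + body


def decode_header(data):
    """Parse the fixed header so a proxy can decide what to do with a message."""
    version, length, flags, code, app_id, hbh, e2e = struct.unpack(
        HEADER_FORMAT, data[:HEADER_LEN]
    )
    return {
        "version": version,
        "length": int.from_bytes(length, "big"),
        "is_request": bool(flags & FLAG_REQUEST),
        "command_code": int.from_bytes(code, "big"),
        "application_id": app_id,
        "hop_by_hop": hbh,
        "end_to_end": e2e,
    }


# Hypothetical, incomplete request that only carries a test IMSI.
EXAMPLE_AIR = encode_message(
    FLAG_REQUEST | FLAG_PROXIABLE, CMD_AUTHENTICATION_INFORMATION,
    S6A_APPLICATION_ID, hop_by_hop=1, end_to_end=1,
    avps=[encode_avp(AVP_USER_NAME, b"001010123456789")],
)
```

The command code plays the role of the RPC method name, and each AVP is roughly a tagged field.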

Diameter itself runs on top of SCTP, a better version of TCP that never really made it outside the cell carrier world, though Linux does support it. It was designed to solve many of the same problems that QUIC (used by HTTP/3) does, such as one stalled stream holding up every other stream on the connection.

The tower operator’s network sends us an Authentication-Information-Request Diameter message, which we handle much like any other RPC method or API call: by doing some SQL queries and sending a response. We look up the IMSI and SIM credentials in our database and reply with an auth challenge in an Authentication-Information-Answer response. If your phone successfully completes the auth challenge, we get an Update-Location-Request message. Yes, there's a message called Update-Location-Request. No, we're not tracking you. It’s just telling us which server, on the tower’s side, you’re connected to. The 3GPP naming committee was not thinking about optics. To that message we reply with an Update-Location-Answer Diameter response.
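The "SQL query, then a response" shape of that handler might look roughly like the sketch below. Everything in it is an assumption made for illustration: the subscribers table and its columns, the function names, and the dict standing in for a real Diameter answer. The actual vector derivation (Milenage in most deployments, including sequence number handling) is deliberately not implemented; it is passed in as a hypothetical derive_vector callable.

```python
import os
import sqlite3

DIAMETER_SUCCESS = 2001
# On S6a this value is sent as an Experimental-Result-Code.
DIAMETER_ERROR_USER_UNKNOWN = 5001


def handle_auth_info_request(db, imsi, derive_vector):
    """Look up the SIM credentials for an IMSI and build an auth challenge.

    derive_vector is a hypothetical callable (ki, opc, rand) -> dict that
    stands in for the real key derivation, which is not shown here.
    """
    row = db.execute(
        "SELECT ki, opc FROM subscribers WHERE imsi = ?", (imsi,)
    ).fetchone()
    if row is None:
        return {"result_code": DIAMETER_ERROR_USER_UNKNOWN}

    ki, opc = row
    rand = os.urandom(16)  # fresh random challenge for this attempt
    return {
        "result_code": DIAMETER_SUCCESS,
        "vector": derive_vector(ki, opc, rand),
    }


def open_example_db():
    """Create an in-memory database with one made-up subscriber."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE subscribers (imsi TEXT PRIMARY KEY, ki BLOB, opc BLOB)")
    db.execute(
        "INSERT INTO subscribers VALUES (?, ?, ?)",
        ("001010123456789", bytes(16), bytes(16)),
    )
    return db
```

A real answer would carry the challenge as RAND, XRES, AUTN, and KASME fields; the point here is only that the control flow is the same as any other lookup-and-respond API endpoint.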

Here’s what a real Authentication-Information-Answer response looks like, captured from our lab environment. You can see the authentication challenge vectors at the bottom.

Ok, your phone is now attached to the network, and you’ve got some bars. But there’s one last step needed before your phone can reach IPs on the internet. When your phone finishes authenticating to the network, it then tries to create several pipes, or “bearers” in telco terms, from the phone all the way to the internet. On our end we get a couple of Create-Session-Request messages, which include your IMSI (to identify your eSIM), the network it wants to connect to (found in your phone’s APN settings, usually preconfigured), and the priority of data being sent (regular internet traffic, real time voice or video, etc…). We allocate an IP address, configure our routers and then reply with a Create-Session-Response, and ta-da. You have internet.
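As a sketch of the bookkeeping behind that step, the snippet below hands out an IP address per (IMSI, APN) pair from a per-APN pool. The class names, the pool ranges, and the use of a plain dict are all assumptions for illustration; a real core also has to program its routers and tear sessions down again, which is not shown.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Session:
    imsi: str
    apn: str
    qci: int  # QoS class of the bearer, e.g. 9 for default internet traffic
    ip: ipaddress.IPv4Address


class SessionManager:
    """Toy session store: one address pool per APN, one session per (IMSI, APN)."""

    def __init__(self, pools):
        # pools maps an APN name to a CIDR string; the ranges are made up.
        self._free = {
            apn: iter(ipaddress.ip_network(cidr).hosts())
            for apn, cidr in pools.items()
        }
        self.sessions = {}

    def create_session(self, imsi, apn, qci):
        if apn not in self._free:
            raise KeyError(f"unknown APN: {apn}")
        key = (imsi, apn)
        if key not in self.sessions:
            # Raises StopIteration once the pool is exhausted; a real
            # implementation would reject the request instead.
            ip = next(self._free[apn])
            self.sessions[key] = Session(imsi, apn, qci, ip)
        return self.sessions[key]


manager = SessionManager({"internet": "10.45.0.0/16", "ims": "10.46.0.0/16"})
data_session = manager.create_session("001010123456789", "internet", qci=9)
ims_session = manager.create_session("001010123456789", "ims", qci=5)
```

The same phone ends up with two sessions here, one per APN, each with its own address.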

When your phone sends TCP/IP packets to the internet over this established pipe, the tower operator’s routers wrap it in a protocol called GTP, or GPRS Tunneling Protocol. GTP itself runs over UDP. So when you open google.com what we see coming in is actually an HTTP on TLS on TCP on IP on GTP on UDP on IP packet. Yes, really. (The extra layers can get redundant so often the network will do some header compression.) There’s a package we use to write proxies and debugging tools. Otherwise, it’s just a series of routers doing router stuff. GTP comes into our network, we unwrap it, and send it out the other end.
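The wrapping and unwrapping is simpler than it sounds. Here is a minimal sketch of GTP-U encapsulation using only the mandatory 8-byte header: flags, message type, payload length, and a tunnel ID (TEID) that says which pipe the packet belongs to. It is not the package mentioned above, the function names are made up, and it refuses packets that use the optional header fields (sequence numbers, extension headers) rather than handling them.

```python
import struct

GTPU_PORT = 2152          # UDP port used for GTP-U
GTPU_FLAGS = 0x30         # version 1, protocol type GTP, no optional fields
GTPU_MSG_GPDU = 0xFF      # G-PDU: the payload is a tunneled user packet
GTPU_HEADER = "!BBHI"     # flags, message type, length, TEID


def gtpu_encapsulate(teid, inner_packet):
    """Wrap a user IP packet for sending over UDP to the other tunnel end."""
    header = struct.pack(
        GTPU_HEADER, GTPU_FLAGS, GTPU_MSG_GPDU, len(inner_packet), teid
    )
    return header + inner_packet


def gtpu_decapsulate(packet):
    """Return (teid, inner_packet) from a G-PDU with no optional fields."""
    if len(packet) < 8:
        raise ValueError("packet shorter than the GTP-U header")
    flags, msg_type, length, teid = struct.unpack(GTPU_HEADER, packet[:8])
    if flags >> 5 != 1:
        raise ValueError("not GTP version 1")
    if flags & 0x07:
        raise ValueError("optional header fields are not handled in this sketch")
    if msg_type != GTPU_MSG_GPDU:
        raise ValueError("not a G-PDU")
    return teid, packet[8:8 + length]
```

In practice the encapsulated bytes are the payload of a UDP datagram, which is where the "on GTP on UDP on IP" part of the stack comes from.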

You’ll notice we haven’t mentioned phone calls & SMS yet. That’s because, by the time the 3GPP committee started designing 4G, it made sense to start running it all on top of the existing data network. Calls and SMS are treated as just another type of data application - essentially high-priority VoIP - running through a dedicated pipe.

Your phone actually creates two default pipes when it connects to a network. One is for your general internet needs, but the second one is to the network called “ims”. IMS, or IP multimedia subsystem, refers to the components in a mobile core that handle voice and SMS. This IMS pipe stays open 24/7 just so your phone can listen for incoming calls or SMS’s, even when you aren't using your phone. When your phone specifies “ims” in the Create-Session-Request message, we know to route those connections to our IMS microservices.

Once this is set up, actually making a phone call or sending an SMS uses a protocol called SIP (the same protocol that VoIP services like Google Voice and Twilio use). It looks suspiciously similar to HTTP. For instance, here’s how you’d ring someone’s number:

INVITE sip:1118675309@ims.carrier.net SIP/2.0
Via: SIP/2.0/UDP ims.carrier.net:5060;branch=z9hG4bK776asdhds
To: <sip:+11118675309@ims.carrier.net>
From: <sip:+16789998212@ims.carrier.net>;tag=f1326c03
Call-ID: 5b699d25dbb6297d@10.200.126.66

If we get back status code 180 (Ringing), we know their phone is ringing. If we get back a 200 (you know this one), they’ve picked up and your phone can start sending real time voice packets.
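Because SIP borrows HTTP's text format, parsing it takes about as much code as a toy HTTP parser. The sketch below splits a message into its start line, headers, and body, and maps a response status line to a rough call state. The function names and state labels are made up for illustration, and it ignores real-world details such as repeated headers and compact header names.

```python
def parse_sip_message(raw):
    """Split a SIP message into (start_line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def call_state(status_line):
    """Map a response status line like 'SIP/2.0 180 Ringing' to a call state."""
    _version, code, _reason = status_line.split(" ", 2)
    code = int(code)
    if code == 180:
        return "ringing"
    if code < 200:
        return "in progress"
    if code < 300:
        return "answered"
    return "not answered"


EXAMPLE_INVITE = "\r\n".join([
    "INVITE sip:1118675309@ims.carrier.net SIP/2.0",
    "Via: SIP/2.0/UDP ims.carrier.net:5060;branch=z9hG4bK776asdhds",
    "To: <sip:+11118675309@ims.carrier.net>",
    "From: <sip:+16789998212@ims.carrier.net>;tag=f1326c03",
    "Call-ID: 5b699d25dbb6297d@10.200.126.66",
    "",
    "",
])

start_line, headers, body = parse_sip_message(EXAMPLE_INVITE)
method = start_line.split(" ")[0]
```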

And here’s a simplified SIP message showing how your phone sends an SMS:

MESSAGE tel:9999999999;phone-context=ims.carrier.net SIP/2.0
Via: SIP/2.0/UDP ims.carrier.net:5080;branch=z9hG4bK18c066926
From: sip:+12028675309@ims.carrier.net;tag=d61e023
To: <tel:18002274669;phone-context=ims.carrier.net>
Call-ID: 45d619668d1c549b@10.200.126.66
CSeq: 1 MESSAGE
Content-Type: application/vnd.3gpp.sms
Content-Length: 63

Hi, this is your CEO. Please send me $500 in iTunes gift cards.

There’s a bunch of extra headers I’ve left out, and the SMS content is actually encoded in GSM 7-bit encoding, but the idea is that it’s modeled after HTTP.
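For the curious, the GSM 7-bit trick is to squeeze each character into 7 bits and pack them back to back, so 8 characters fit in 7 bytes. Here is a minimal sketch of just the packing step. It only supports characters whose GSM code matches ASCII, plus "$" and "@", which sit at different positions in the GSM alphabet; the full alphabet, the extension table, and the rest of the SMS payload format are left out.

```python
import string

# Characters whose GSM 7-bit code equals their ASCII code (a subset).
SAME_AS_ASCII = set(string.ascii_letters + string.digits + " !\"#%&'()*+,-./:;<=>?")
# Two common characters that live elsewhere in the GSM alphabet.
OVERRIDES = {"@": 0x00, "$": 0x02}
REVERSE_OVERRIDES = {code: char for char, code in OVERRIDES.items()}


def to_septet(char):
    if char in OVERRIDES:
        return OVERRIDES[char]
    if char in SAME_AS_ASCII:
        return ord(char)
    raise ValueError(f"character not supported in this sketch: {char!r}")


def gsm7_pack(text):
    """Pack text into octets, 7 bits per character, least significant bit first."""
    out = bytearray()
    acc = 0   # bit accumulator
    bits = 0  # number of valid bits currently in acc
    for char in text:
        acc |= to_septet(char) << bits
        bits += 7
        while bits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            bits -= 8
    if bits:
        out.append(acc & 0xFF)  # final partial octet, zero-padded
    return bytes(out)


def gsm7_unpack(data, count):
    """Unpack `count` characters; the count is needed to ignore padding bits."""
    septets = []
    acc = 0
    bits = 0
    for byte in data:
        acc |= byte << bits
        bits += 8
        while bits >= 7 and len(septets) < count:
            septets.append(acc & 0x7F)
            acc >>= 7
            bits -= 7
    return "".join(REVERSE_OVERRIDES.get(s, chr(s)) for s in septets)


SMS_TEXT = "Hi, this is your CEO. Please send me $500 in iTunes gift cards."
PACKED = gsm7_pack(SMS_TEXT)
```

With this packing, the 63-character message above fits in 56 bytes.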

MMS, as opposed to SMS, actually happens over real HTTP (check the MMSC URL in your APN settings). Each MMS you get is first received as an SMS, which notifies your phone to download the MMS content (which itself contains an HTML-like markup language to lay out message content and images!).

I’m skipping a ton of details here, but hopefully you get the point. Also, the stuff I just described is pretty specific to 4G mobile cores. Just wait til you learn about 5G, where the spec committee replaced Diameter with JSON and made everything Cloud Native™️.

Building a cell carrier may be difficult and thorny, but it’s built on the same building blocks that you would use for any other moderately complicated system (except maybe the routing stuff, unless you work at a cloud provider). The telco industry has been solving distributed systems problems like the rest of the web. They just did it in a parallel universe with different acronyms.

If working on stuff like this sounds like fun, .
