Simulating AWS SNAT Exhaustion on a Raspberry Pi (OpenWrt)
________________________________________________________________________________

One of the most frustrating bottlenecks in cloud infrastructure is 'NAT Gateway 
Source Port Exhaustion'. It’s invisible, expensive to replicate in a real cloud 
environment, and when it hits, it looks like a random network ghost.

AWS NAT Gateways have a hard limit of 55,000 concurrent connections to each
unique destination. If your application tries to open connection 55,001 to
that destination, packets are silently dropped.

Why 55,000? A NAT Gateway tells connections apart by the 16-bit TCP/UDP
source port it assigns to each one, which gives 65,535 possible values. The
well-known ports below 1024 are not used, leaving roughly 64,512 usable
source ports (1024-65535), and AWS caps this further at 55,000 per
destination to ensure system stability and manage connection state.
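
The application never sees an error when this happens, but the NAT Gateway
does surface it as the 'ErrorPortAllocation' CloudWatch metric (the number of
times it failed to allocate a source port). A sketch of pulling it with the
AWS CLI; the gateway ID and time window below are placeholders:

    ```bash
    # Sketch: count port-allocation failures on a NAT Gateway.
    # The gateway ID and the time window are placeholders.
    aws cloudwatch get-metric-statistics \
      --namespace AWS/NATGateway \
      --metric-name ErrorPortAllocation \
      --dimensions Name=NatGatewayId,Value=nat-0123456789abcdef0 \
      --start-time 2026-01-24T20:00:00Z \
      --end-time 2026-01-24T21:00:00Z \
      --period 300 \
      --statistics Sum
    ```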

I wanted to prove this is a hard resource constraint and verify the 
architectural fix (Secondary IPs). To do this without burning cloud budget, I 
built a scale model using a Raspberry Pi running OpenWrt.


The Setup
________________________________________________________________________________
  
Before breaking the network, I had to build a miniature version of the Cloud.

[1] The 'Gateway' (Raspberry Pi 4): Running OpenWrt. This acts as my NAT 
    Gateway. It sits between my laptop and the internet.
[2] LAN IP: 192.168.50.1 (Connected to my Laptop via Ethernet).
[3] WAN IP: 192.168.1.38 (Interface 'phy0-sta0', connected to my home WiFi).
[4] The 'Application' (Laptop): Connected to the Pi. All its traffic must pass 
    through the Pi to get to the internet.
[5] The Traffic Flow:
    Laptop -> OpenWrt Pi (NAT) -> Home Router -> Internet
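
Before breaking anything, it is worth confirming both legs of this topology
from the Pi itself. A quick sketch; 'br-lan' is OpenWrt's default LAN bridge
and is an assumption about this particular config:

    ```bash
    # Sketch: confirm the topology from the Pi.
    ip -4 addr show dev br-lan      # LAN side, expect 192.168.50.1
    ip -4 addr show dev phy0-sta0   # WAN side, expect 192.168.1.38
    ip route show default           # default route via the home router
    ```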

Just like in AWS, the Pi performs SNAT (Source NAT). It takes thousands of 
connections from my laptop and squeezes them through its single WAN IP address 
using "Source Ports" to keep track of them.

  
The Test
________________________________________________________________________________

Rather than generate 55,000 real connections in my lab, I simply lowered the
ceiling. I configured the OpenWrt kernel to cap the maximum number of tracked
connections at just 5,000 and lowered the established-TCP timeout to 60
seconds to ensure rapid recycling.
  
    ```bash
    root@OpenWrt:~# sysctl -w net.netfilter.nf_conntrack_max=5000
    net.netfilter.nf_conntrack_max = 5000
    
    root@OpenWrt:~# sysctl -w \
    > net.netfilter.nf_conntrack_tcp_timeout_established=60
    net.netfilter.nf_conntrack_tcp_timeout_established = 60
    ```
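
These sysctl changes do not survive a reboot. If you want the lab to come
back up in the same state, they can be persisted; a sketch, assuming the
stock OpenWrt sysctl init script, which reads /etc/sysctl.conf at boot:

    ```bash
    # Sketch: persist the lowered limits across reboots.
    echo "net.netfilter.nf_conntrack_max=5000" >> /etc/sysctl.conf
    echo "net.netfilter.nf_conntrack_tcp_timeout_established=60" \
      >> /etc/sysctl.conf
    /etc/init.d/sysctl restart
    ```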
  
I wrote a Python script (flood.py, listed in the appendix) to flood the
router with connection attempts to a "dead" IP ('192.168.1.253'). Nothing
ever answers the SYNs, so conntrack holds every attempt in the 'SYN_SENT'
state, filling the table almost instantly.

[1] Start Flood: The script ramped up immediately.
[2] Monitor: I watched the counter hit the limit instantly.
  
    ```bash
    root@OpenWrt:~# watch -n 0.5 "sysctl net.netfilter.nf_conntrack_count"
    Every 0.5s: sysctl net.netfilter.nf_conntrack_count       
    OpenWrt: Sat Jan 24 20:50:00 2026
    
    net.netfilter.nf_conntrack_count = 5000
    ```
  
[3] The Failure: While the table was full, I tried to browse the web from the 
    laptop.
  
    ```bash
    $ curl -I --connect-timeout 2 https://www.google.com
    curl: (28) Resolving timed out after 2000 milliseconds
    ```

The router refused to track a new flow even for the DNS lookup, because
there were "no seats left" in the conntrack table.
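
To confirm those 5,000 slots really were wasted on half-open flows to the
dead IP rather than on legitimate traffic, the states can be counted straight
out of the same /proc file:

    ```bash
    # Sketch: break the full table down. During the flood nearly every
    # entry should be a SYN_SENT flow aimed at the dead IP.
    grep -c SYN_SENT /proc/net/nf_conntrack
    grep -c 192.168.1.253 /proc/net/nf_conntrack
    ```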


The Fix
________________________________________________________________________________
  
The standard AWS fix for this is adding a Secondary IPv4 Address to the NAT 
Gateway.
              1 IP = 55k Ports. 2 IPs = 110k Ports.

To prove this architecture works, I added a secondary IP alias ('192.168.1.99') 
to the Pi's WAN interface ('phy0-sta0'). I then used 'nftables' to split the 
traffic 50/50 between the main IP and the new secondary IP.

    ```bash
    root@OpenWrt:~# ip addr add 192.168.1.99/24 dev phy0-sta0
    
    root@OpenWrt:~# nft add rule inet fw4 srcnat oifname "phy0-sta0" \
    > meta l4proto tcp numgen random mod 2 0 \
    > counter snat ip to 192.168.1.99:10000-10100
    
    root@OpenWrt:~# nft add rule inet fw4 srcnat oifname "phy0-sta0" \
    > meta l4proto tcp counter masquerade to :10000-10100
    
    root@OpenWrt:~# nft list chain inet fw4 srcnat
    table inet fw4 {
        chain srcnat {
            oifname "phy0-sta0" meta l4proto tcp numgen random mod 2 0 \
                counter packets 0 bytes 0 \
                snat ip to 192.168.1.99:10000-10100
            oifname "phy0-sta0" meta l4proto tcp \
                counter packets 0 bytes 0 \
                masquerade to :10000-10100
        }
    }
    ```
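
Both changes are deliberately throwaway: the alias disappears on reboot and
fw4 drops manually added rules whenever the firewall is reloaded. Before
re-running the flood it is worth sanity-checking that the alias is actually
usable; a sketch, where 192.168.1.1 is assumed to be the home router:

    ```bash
    # Sketch: confirm the secondary address is up and can reach the LAN.
    ip -4 addr show dev phy0-sta0          # expect both .38 and .99
    ping -c 1 -I 192.168.1.99 192.168.1.1  # source the ping from the alias
    ```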

I ran the flood script again. If the fix works, the traffic should be 
distributed across both IPs. I checked the internal 'nftables' counters.

    ```bash
    root@OpenWrt:~# nft list chain inet fw4 srcnat
    table inet fw4 {
        chain srcnat {
            oifname "phy0-sta0" meta l4proto tcp numgen random mod 2 0 \
                counter packets 3992 bytes 239520 \
                snat ip to 192.168.1.99:10000-10100
            oifname "phy0-sta0" meta l4proto tcp \
                counter packets 4143 bytes 248580 \
                masquerade to :10000-10100
        }
    }
    ```

The counters proved the split works (~49% vs ~51%). By adding a second IP, I
added a 'second lane' for traffic and bypassed the bottleneck. In AWS, the
same change doubles per-destination capacity from 55,000 to 110,000
connections.
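
For completeness, the split implied by those packet counters can be checked
with a one-liner:

    ```bash
    # 3992 and 4143 are the packet counters from the nft output above.
    awk 'BEGIN { a = 3992; b = 4143;
        printf "%.1f%% / %.1f%%\n", 100*a/(a+b), 100*b/(a+b) }'
    # prints 49.1% / 50.9%
    ```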


Appendix: flood.py
________________________________________________________________________________
  
Here is the 'flood.py' script I used to fill the table. It uses 
'multiprocessing' to open connections faster than the router can expire them.

    ```python
    import socket
    import multiprocessing
    import time
    import random
    
    # A "dead" address on the WAN subnet: nothing answers the SYNs, so every
    # attempt sits in the router's conntrack table in the SYN_SENT state.
    TARGET_IP = "192.168.1.253"
    
    def flooder(process_id):
        sock_list = []
        while True:
            try:
                s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                # Non-blocking: connect() fires the SYN and returns right
                # away, so connections open far faster than they expire.
                s.setblocking(False)
                
                # Randomise the destination port so each attempt is a unique
                # 5-tuple and therefore a fresh conntrack entry.
                dst_port = random.randint(1000, 65000)
                
                try:
                    s.connect((TARGET_IP, dst_port))
                except (BlockingIOError, OSError):
                    pass  # "in progress" is expected for a non-blocking connect
                
                # Keep a reference so the socket (and its conntrack entry)
                # stays alive.
                sock_list.append(s)
                
                # Cap open sockets per process to avoid exhausting local
                # file descriptors.
                if len(sock_list) > 10000:
                    old_s = sock_list.pop(0)
                    old_s.close()
                    
            except Exception:
                time.sleep(0.1)
    
    if __name__ == '__main__':
        print(f"Igniting flood against {TARGET_IP}...")
        # One flooder per logical CPU, times two, to outpace conntrack expiry.
        for i in range(multiprocessing.cpu_count() * 2):
            p = multiprocessing.Process(target=flooder, args=(i,))
            p.daemon = True
            p.start()
            
        while True:
            time.sleep(1)
    
    ```
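
To reproduce the run: save the script as flood.py on the laptop (the machine
behind the Pi), start it with Python 3, and watch the counter on the router
exactly as before:

    ```bash
    # On the laptop, behind the Pi:
    python3 flood.py
    
    # On the Pi, in a separate session:
    watch -n 0.5 "sysctl net.netfilter.nf_conntrack_count"
    ```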
________________________________________________________________________________