Sunday, 27 October 2024

SONiC: The Linux of Networking?

Alright, let’s talk about SONiC, which some folks in the industry are hyping as “the Linux of networking.” You know the drill: modular, open-source, scalable—basically, all the buzzwords that make execs perk up in meetings. But is SONiC really the networking equivalent of Linux, or is it just another shiny object vendors are waving around? Let’s unpack that without too much of the usual hand-waving nonsense.


What is SONiC?


For those not paying attention, SONiC (Software for Open Networking in the Cloud) is a network operating system that came out of Microsoft Azure’s need to manage massive scale, with the agility to tweak and optimize everything from routing to network monitoring. It runs on commodity hardware and supports the same kind of flexibility and customization that made Linux the beast it is in the server world. It’s got all the key ingredients to be a big deal: open-source, vendor-agnostic, and built on a foundation of solid Linux fundamentals.


SONiC’s Modular Mojo


Here’s where SONiC shines: modularity. You can mix and match components—BGP, SNMP, DHCP—like a networking Lego set. That’s not just cool; it’s necessary in a world where cookie-cutter network solutions don’t cut it. Networks have evolved, and vendors who still want to lock you into a stack of proprietary nonsense are looking more outdated than ever. With SONiC, you can take the parts you like and ditch the bloat. That’s Linux thinking, and it works.
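To make that tangible: under the hood, SONiC keeps its entire configuration in a Redis-backed CONFIG_DB, and each feature reads its own tables from there. Here’s a minimal sketch, assuming the swsssdk Python library that ships on SONiC images (the VLAN entry below is purely illustrative, not a recommended config):

import swsssdk
from swsssdk import ConfigDBConnector

# Connect to the central CONFIG_DB (Redis) on a SONiC box.
db = ConfigDBConnector()
db.connect()

# Features like VLANs, BGP, and SNMP are all driven by rows in plain
# Redis tables, which is a big part of why components can be swapped
# in and out independently.
db.set_entry("VLAN", "Vlan100", {"vlanid": "100"})
print(db.get_table("VLAN"))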


Linux-Level Freedom, but at What Cost?


The comparison to Linux isn’t just marketing fluff—it’s legit in a lot of ways. Like Linux, SONiC gives you the freedom to screw up your system just as much as you want. That’s the trade-off of customization: power, but also responsibility. If you’re going to run a large-scale network on SONiC, you need people who know their stuff. It’s not for network engineers who like to color inside the lines. But then again, the most interesting things happen when you go outside the lines, right?


Where SONiC Doesn’t Quite Measure Up


Let’s be real for a second: Linux is everywhere. Your phone, your router, your server—it’s the backbone of the modern internet. SONiC isn’t there yet. Sure, it’s growing in the cloud and data center worlds, but it’s far from ubiquitous in the enterprise. Vendors like Cisco and Juniper still dominate that space with their full-stack solutions. SONiC is the punk rock newcomer, while those guys are the stadium-filling legacy acts. But we’ve seen how that story plays out in tech: the scrappy underdog usually wins in the long run.


The Future of SONiC: Overhyped or Underrated?


SONiC’s trajectory mirrors Linux’s early days—exciting but rough around the edges. We’ll see more adoption as it matures and as more companies realize they don’t need to shell out for bloated solutions. However, it’s not quite ready for prime time across every use case. You’re not going to replace your entire network with SONiC tomorrow unless you enjoy living on the bleeding edge, which, let’s be honest, is a pain in the ass.


Still, if you’re building a cloud network or massive-scale infrastructure, you’d be foolish not to consider SONiC. It’s got the potential to disrupt the traditional networking landscape—just don’t expect it to do the heavy lifting for you. That’s still on you, and SONiC isn’t apologizing for it.


Bottom Line


Is SONiC the Linux of networking? Yeah, kind of. It’s open, flexible, and potentially game-changing, but also complex, demanding, and not without its rough spots. If you want control and freedom, SONiC delivers. If you want a turnkey solution, stick with your traditional vendor and keep paying the big bucks.


SONiC’s not here to hold your hand. It’s here to give you the tools, and what you do with them is up to you. That’s what makes it exciting—and, yeah, a little terrifying. 

Friday, 25 October 2024

The 101 on VXLAN EVPN Fabrics

This blog post aims to break down the architecture behind a VXLAN EVPN Data Centre Fabric. By the end of this, you’ll have a clear understanding of how the components fit together, both from an underlay and overlay perspective.


Why VXLAN EVPN?


Traditional data center networks, typically based on VLANs and the Spanning Tree Protocol (STP), face challenges in scalability, redundancy, and network segmentation. As applications become more dynamic, demanding better east-west traffic handling and more flexible workload mobility, traditional Layer 2 and Layer 3 constructs start to show their limitations.


Enter VXLAN (Virtual Extensible LAN) and EVPN (Ethernet VPN), which together solve these challenges. VXLAN extends Layer 2 networks across Layer 3 boundaries, while EVPN serves as the control plane, managing MAC address learning, reducing flooding, and enabling better network segmentation.


The VXLAN EVPN Architecture


At its core, a VXLAN EVPN fabric follows a Leaf-Spine architecture, which ensures consistent high-speed performance, predictable latency, and flexibility in scaling. The architecture is split into two major layers: the underlay and the overlay.


Underlay: The Physical Foundation


The underlay is the physical network on which the VXLAN overlay is built. In a VXLAN EVPN fabric, the underlay is based on Layer 3 IP routing rather than traditional Layer 2 switching. The typical underlay consists of:


Spine Switches: These are the backbone of the fabric. They provide interconnectivity between the leaf switches. Since all leaf switches connect to every spine, traffic can be routed efficiently with predictable latency. Spine switches don’t hold any intelligence about the endpoint devices—they just forward packets based on routing protocols.

Leaf Switches: These are the access layer switches where hosts, servers, and virtual machines connect. In a VXLAN fabric, leaf switches handle endpoint communication, serving as the primary building blocks for service connectivity. They terminate VXLAN tunnels and communicate with the spine switches to route traffic.

Routing Protocols: The underlay uses a routing protocol like BGP (Border Gateway Protocol) or OSPF (Open Shortest Path First) to provide reachability between the leaf and spine switches. This ensures that the IP-based underlay can forward VXLAN traffic reliably.


The underlay exists to ensure that all nodes can reach each other over Layer 3. IP addresses are assigned to each device, and routing protocols ensure there is a stable and redundant network with no loops or bottlenecks.
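To make the underlay concrete, here’s a minimal Python sketch (the device names and the 10.0.0.0/24 block are invented for illustration) of carving /31 point-to-point subnets for every leaf-to-spine link in a small fabric:

import ipaddress

# Illustrative underlay block; real deployments pick their own ranges.
underlay = ipaddress.ip_network("10.0.0.0/24")
links = list(underlay.subnets(new_prefix=31))  # /31s suit point-to-point links

spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3", "leaf4"]

# Full mesh: every leaf connects to every spine.
pairs = [(spine, leaf) for spine in spines for leaf in leaves]
for link, (spine, leaf) in zip(links, pairs):
    spine_ip, leaf_ip = link  # a /31 holds exactly two usable addresses
    print(f"{spine} <-> {leaf}: {spine_ip} - {leaf_ip}")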


Overlay: Extending Layer 2 over Layer 3


While the underlay provides IP reachability, the VXLAN overlay is responsible for building the Layer 2 networks over this IP infrastructure. VXLAN (RFC 7348) encapsulates Ethernet frames into UDP packets, allowing Layer 2 segments to span across Layer 3 boundaries.
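To show how simple the encapsulation actually is, here’s a short Python sketch that builds the 8-byte VXLAN header described in RFC 7348 and prepends it to an inner Ethernet frame (the VNI value and dummy frame are made up). In a real packet, the sending VTEP would then wrap this in UDP (destination port 4789) and IP:

import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN

def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    # RFC 7348 header: 8 bits of flags (I bit set), 24 reserved bits,
    # then a 24-bit VNI and a final reserved byte.
    flags_word = 0x08 << 24            # I flag: "VNI field is valid"
    vni_word = (vni & 0xFFFFFF) << 8   # VNI sits in the top 24 bits
    header = struct.pack("!II", flags_word, vni_word)
    return header + inner_frame

# Dummy 64-byte inner frame, illustrative VNI.
vxlan_payload = vxlan_encap(10100, b"\x00" * 64)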


VXLAN Tunnel Endpoints (VTEPs): Each leaf switch serves as a VTEP. VTEPs are responsible for encapsulating and decapsulating Layer 2 frames into/from VXLAN packets. When a host sends traffic to another host in a different Layer 2 domain, the VTEP on the source leaf encapsulates the traffic, forwards it across the Layer 3 underlay, and the destination VTEP decapsulates it and delivers it to the intended host.

VXLAN Network Identifier (VNI): In VXLAN, each Layer 2 domain (akin to a VLAN) is identified by a VXLAN Network Identifier (VNI). The VNI is a 24-bit field, so a fabric can carry around 16 million distinct segments versus the traditional 4,096 VLAN limit. This enables multi-tenancy and traffic separation within the data center fabric. Each VNI maps to a traditional VLAN, extending the Layer 2 segment across the entire fabric.
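Continuing the sketch above, a VTEP’s VLAN-to-VNI mapping can be pictured as a simple lookup (the values here are invented): frames arriving on a local VLAN get carried fabric-wide under the mapped VNI.

# Hypothetical per-leaf mapping; values are illustrative.
vlan_to_vni = {100: 10100, 200: 10200}

def encap_for_vlan(vlan_id: int, inner_frame: bytes) -> bytes:
    # Look up the fabric-wide VNI for a locally significant VLAN,
    # then reuse the encapsulation routine from the earlier sketch.
    return vxlan_encap(vlan_to_vni[vlan_id], inner_frame)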


EVPN Control Plane


The key to making VXLAN scalable and reducing Layer 2 broadcast storms is the EVPN control plane. Unlike older VXLAN deployments that relied on flood-and-learn mechanisms, EVPN brings in a more intelligent, BGP-based control plane.


MAC-to-IP Binding: EVPN allows the leaf switches (VTEPs) to advertise MAC and IP bindings through BGP, creating a distributed database of endpoint information. This reduces the need to flood ARP requests throughout the fabric, since switches can query the EVPN control plane for endpoint locations (see the sketch after this list).

Layer 2 and Layer 3 Connectivity: EVPN doesn’t just handle Layer 2 traffic (MAC addresses); it can also advertise Layer 3 information (IP addresses). This means the fabric can handle both Layer 2 switching and Layer 3 routing seamlessly, making it an ideal choice for modern data centers.
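As a rough mental model (field names and values below are invented for illustration), you can picture the BGP-learned endpoint database on a VTEP as a table of MAC/IP-to-VTEP bindings, consulted before any flooding happens:

# Illustrative EVPN Type-2 (MAC/IP advertisement) entries as a VTEP
# might hold them after BGP learning. All names and values are made up.
evpn_type2_routes = [
    {"mac": "00:25:96:ff:fe:01", "ip": "10.1.1.10",
     "vni": 10100, "next_hop_vtep": "192.0.2.11"},
    {"mac": "00:25:96:ff:fe:02", "ip": "10.1.2.20",
     "vni": 10200, "next_hop_vtep": "192.0.2.12"},
]

def lookup_remote_vtep(mac: str):
    # Instead of flooding unknown unicast, consult the BGP-learned table
    # to find which remote VTEP owns the destination MAC.
    for route in evpn_type2_routes:
        if route["mac"] == mac:
            return route["next_hop_vtep"]
    return None  # genuinely unknown: fall back to flooding/ingress replication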


Architectural Advantages


1. Scalability: The Leaf-Spine design provides non-blocking performance and predictable scaling. As you grow your network, you can simply add more leaf or spine switches without redesigning the entire network.

2. Workload Mobility: VXLAN allows workloads to be moved anywhere in the data center without changing IP addresses, thanks to the ability to extend Layer 2 domains over the Layer 3 underlay.

3. Multi-tenancy: VXLAN provides the ability to create isolated Layer 2 segments using VNIs, making it ideal for multi-tenant environments where each tenant needs its own virtualized network.

4. Efficient Traffic Flow: EVPN drastically reduces broadcast flooding by learning MAC and IP information via BGP advertisements. This creates a more efficient fabric and improves overall network performance.


Conclusion


To recap, we’ve laid out the foundational architecture of a VXLAN EVPN data centre fabric, focusing on the Leaf-Spine topology, underlay and overlay design, and the importance of EVPN as a control plane.

Wednesday, 23 October 2024

Big O Notation: The Necessary Evil Of Code Performance

Alright, let’s dive into Big O notation. You’ve probably heard of it, right? That cryptic mathematical mumbo-jumbo developers like to toss around to sound smart in code reviews. “Oh, yeah, that algorithm’s O(n^2), you might want to optimize it.” Sure, buddy. Meanwhile, most people are writing code that’s barely a step up from spaghetti and praying it doesn’t crash in production.

Let me break it down: Big O notation is how we measure the complexity of an algorithm. It’s supposed to help you figure out how well your code scales as the input size grows. In theory, it sounds like something you should definitely care about. In practice? Well, let’s just say if most developers even get their code working, they’re ready to call it a day.

Big O is essentially the ultimate excuse for developers to avoid dealing with real-world problems. It’s like, “Hey, I know the app is slow, but I did the math and it’s O(n log n), so it’s probably fine.” Meanwhile, the users are out here refreshing their browsers wondering why your beautifully efficient algorithm is taking forever to load. Congrats, your code is theoretically fast. Too bad it sucks in practice.

Here’s the thing: Big O is important, but not in the way people like to pretend. It’s not about flexing your brain muscles and tossing out impressive-sounding jargon. It’s about being practical. The next time you’re writing that for-loop nested inside another for-loop, just ask yourself, “Is this going to blow up when someone tries to use it with a dataset bigger than my local dev environment?” If the answer is yes, you’ve got a problem, no Big O analysis required.

Let’s talk real-world scenarios. You’re building an app that processes a list of user transactions. You start with a simple loop—cool, no sweat. Then, for every transaction, you decide to check it against a list of previous transactions with another loop. Boom! O(n^2) and your app’s on its way to lagging harder than your ancient family PC trying to run Crysis. It doesn’t take a computer science degree to realize that doubling the data means quadrupling the time. And now you’re stuck optimizing because you didn’t think about it earlier.
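If you want to see the difference on paper, here’s a quick sketch (function names invented for illustration) of that quadratic duplicate check next to the linear fix:

def has_duplicate_quadratic(transactions):
    # Nested loops: every transaction checked against every other. O(n^2).
    for i, tx in enumerate(transactions):
        for other in transactions[i + 1:]:
            if tx == other:
                return True
    return False

def has_duplicate_linear(transactions):
    # One pass with a set: doubling the data roughly doubles the time. O(n).
    seen = set()
    for tx in transactions:
        if tx in seen:
            return True
        seen.add(tx)
    return False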

But here’s the twist: do you always need to care? No! You’re not Google. You’re probably not even handling a fraction of their data. So, when you’re processing 1,000 rows, Big O can sit down and take a break. Just ship the code. But if you're dealing with 10 million rows? Yeah, maybe take a minute and actually think about it.

Big O is like flossing—everyone knows they should do it, but most people only care when something starts hurting. It’s not there to make you feel bad, it’s just a way of saying, “Hey, think ahead a bit.” Don’t overthink it, though—most of the time, scaling issues aren’t going to hit you until your app actually gets some users. If that day comes, congratulations, you’ve got bigger problems now, and sure, then you can break out the Big O cheat sheet.

So, to wrap this up: Big O notation is cool, I guess. It’s not magic, it’s not going to make you a coding god, and most of the time, nobody’s asking you to bust out a whiteboard and graph time complexity during a stand-up. But if you’re writing a nested loop or doubling data unnecessarily, Big O’s your friend. Just don’t get too excited. Half the time, you’re writing CRUD apps, not inventing the next big algorithm. Chill out.

Tuesday, 22 October 2024

The Big O Cheat Sheet For Code Performance

Following on from my previous blog post about what Big O is all about, let’s dive into Big O notation and break down what it really means. This cheat sheet will help you understand the most common Big O terms you might encounter in technical interviews or algorithm discussions. We’ll walk through each one, explaining what it represents, and I’ll include some Python examples to make the concepts clearer.

1. O(1): Constant Time

This is the holy grail. O(1) means no matter how big your input gets, the time to run the operation stays the same. It’s like looking up a value in a dictionary—no drama, no sweat, always the same time.

Example:


def get_first_element(items):
    return items[0]

# Input size doesn't matter, always O(1)
get_first_element([1, 2, 3, 4, 5])  # O(1)

2. O(n): Linear Time

O(n) means the time it takes to run your code increases directly with the size of the input. The bigger the list, the longer it takes. A simple for-loop over a list? Yep, that’s O(n). Welcome to average coding life.

Example:

def print_all_elements(items):
    for item in items:
        print(item)

# The more items, the more work. O(n)
print_all_elements([1, 2, 3, 4, 5])  # O(n)

3. O(n^2): Quadratic Time

O(n^2) is the “I’m about to make your life hell” of algorithms. This is what happens when you start nesting loops, and it doesn’t scale well. The time to execute grows with the square of the input size: double the input and you quadruple the work.

Example:

def print_all_pairs(items):
    for i in items:
        for j in items:
            print(i, j)

# If items has 5 elements, you're printing 25 pairs. O(n^2)
print_all_pairs([1, 2, 3, 4, 5])  # O(n^2)

4. O(log n): Logarithmic Time

O(log n) is your performance-friendly pal. Think binary search—cutting the problem size in half at every step. You barely do any work while still looking smart.

Example:

def binary_search(items, target):
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

# Searching a sorted list is O(log n)
binary_search([1, 2, 3, 4, 5, 6, 7, 8, 9], 7)  # O(log n)

5. O(n log n): Linearithmic Time

O(n log n) is your fancy-sounding way to describe algorithms like merge sort or quicksort. It’s better than O(n^2), but not as slick as O(n). Still, not bad for a sorting algorithm.

Example:

def merge_sort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    return merge(left, right)

def merge(left, right):
    sorted_list = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            sorted_list.append(left[i])
            i += 1
        else:
            sorted_list.append(right[j])
            j += 1
    sorted_list.extend(left[i:])
    sorted_list.extend(right[j:])
    return sorted_list

# Merge Sort is O(n log n)
merge_sort([5, 3, 8, 6, 2, 7, 4, 1])  # O(n log n)

6. O(2^n): Exponential Time

O(2^n) is when things start to spiral out of control. Your algorithm doubles in execution time with each additional element. Recursion-heavy algorithms, like the naïve Fibonacci calculation, live here—and it's not a fun neighborhood.

Example:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Every new 'n' means twice the work. O(2^n)
fibonacci(5)  # O(2^n)

7. O(n!): Factorial Time

O(n!) is the algorithmic equivalent of lighting your server on fire. This is what happens when you brute-force permutations. It gets ugly fast. Think of trying to compute all possible orderings of a list.

Example:

import itertools

def permutations(items):
    return list(itertools.permutations(items))

# Permutations are factorial. O(n!)
permutations([1, 2, 3])  # O(n!)

Wrapping It Up

There you go—a quick rundown of Big O notations with Python examples. Will you need to use O(n^2) or O(log n) in every codebase? Probably not. But when someone asks, now you can spit out a couple of Big O terms, drop a “yeah, this could be optimized,” and move on like a pro.

My Thoughts on Imposter Syndrome

Let’s talk about imposter syndrome. Everyone’s got their theories: it’s the silent career killer, the gnawing feeling that you’re one step away from being exposed as a fraud. And sure, in a world where every job is some idealized version of itself, maybe that’s true. But here’s the kicker: imposter syndrome can’t exist when the bar is set low.

You know what’s great about the world? Most people suck. Yeah, I said it. And I don’t mean that as some misanthropic, “the world is trash” kind of sentiment. I mean it in a real-world, walk-into-an-office-and-look-around way. Chances are, half the people you work with are just skating by, doing the bare minimum, collecting a paycheck, and fooling everyone into thinking they’re busy.

Here’s the dirty little secret: the bar is low. So low, in fact, that just giving half a damn about your work sets you apart. You feel like an imposter? Really? Look around. People are coasting on mediocrity while you’re sitting there thinking, “Maybe I don’t belong here.” Guess what? If you even care enough to have that thought, you’re already doing better than most.

Imposter syndrome assumes that there’s this towering standard to live up to. But most of the time, that standard’s a myth. We live in a world of minimum viable products, Agile sprints, and "good enough" solutions. We half-ass, ship it, and iterate. You know what that means? It means the expectations have been so thoroughly dumbed down that anyone who remotely tries isn’t an imposter—they’re the overachiever in the room.

Let’s put it this way: if you’re constantly looking over your shoulder, waiting to get caught, chances are no one’s even paying enough attention to notice. We’re in the age of inbox zero and KPIs nobody understands, not the Renaissance. The average workplace is just a bunch of people crossing off tasks like they’re going for a high score in a to-do list app, not rewriting the rules of human achievement.

So stop worrying about being an imposter. You’re not. The bar is so low you’d have to consciously try to trip over it. And if you do mess up, guess what? So does everyone else. Half your coworkers can’t remember the login to their own damn HR portal. You think they’re going to catch your little slip-up?

The real lesson here? When the bar’s set low, imposter syndrome becomes irrelevant. Just show up, do the work, and recognize that nobody else is playing on some higher plane. You’re not an imposter. The game’s rigged to make you think you are.
