This week’s system design refresher:
Reverse Proxy vs API Gateway vs Load Balancer (YouTube video)
Typical AWS Network Architecture in one diagram
15 Open-Source Projects That Changed the World
Top 6 Database Models
How do we detect node failures in distributed systems?
Reverse Proxy vs API Gateway vs Load Balancer
One picture is worth a thousand words - Typical AWS Network Architecture in one diagram
Amazon Web Services (AWS) offers a comprehensive suite of networking services designed to provide businesses with secure, scalable, and highly available network infrastructure. AWS's network architecture components enable seamless connectivity between the internet, remote workers, corporate data centers, and within the AWS ecosystem itself.
VPC (Virtual Private Cloud)
At the heart of AWS's networking services is the Amazon VPC, which allows users to provision a logically isolated section of the AWS Cloud. Within this isolated environment, users can launch AWS resources in a virtual network that they define.
AZ (Availability Zone)
An AZ in AWS refers to one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region.
Now let’s go through the network connectivity one by one:
Connect to the Internet - Internet Gateway (IGW)
An IGW serves as the doorway between your AWS VPC and the internet, facilitating bidirectional communication.
Remote Workers - Client VPN Endpoint
AWS offers a Client VPN service that enables remote workers to access AWS resources or an on-premises network securely over the internet. It provides a secure and easy-to-manage VPN solution.
Corporate Data Center Connection - Virtual Private Gateway (VGW)
A VGW is the VPN concentrator on the Amazon side of the Site-to-Site VPN connection between your network and your VPC.
VPC Peering
VPC Peering allows you to connect two VPCs, enabling you to route traffic between them using private IPv4 or IPv6 addresses.
Transit Gateway
AWS Transit Gateway acts as a network transit hub, enabling you to connect multiple VPCs (including VPCs in different AWS accounts) and on-premises networks through a single gateway instead of many point-to-point links.
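To see why a hub helps as the number of VPCs grows, here is a small sketch. It assumes VPC peering is used as a full mesh (peering is not transitive, so every pair of VPCs needs its own connection) and compares that with one Transit Gateway attachment per VPC. The function names are illustrative only and are not part of any AWS SDK.

```python
def peering_connections(vpc_count: int) -> int:
    """Full mesh: one peering connection for every pair of VPCs."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Hub and spoke: one attachment per VPC."""
    return vpc_count


for n in (3, 10, 50):
    print(n, peering_connections(n), transit_gateway_attachments(n))
```

With 10 VPCs, a full mesh needs 45 peering connections, while the hub needs 10 attachments.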
VPC Endpoint (Gateway)
A VPC Endpoint (Gateway type) allows you to privately connect your VPC to Amazon S3 and DynamoDB through entries in your route tables, without requiring an internet gateway, NAT device, or VPN connection. Unlike interface endpoints, gateway endpoints do not use AWS PrivateLink.
VPC Endpoint (Interface)
An Interface VPC Endpoint (powered by AWS PrivateLink) enables private connections between your VPC and supported AWS services, other VPCs, or AWS Marketplace services, without requiring an IGW, VGW, or NAT device.
SaaS Private Link Connection
AWS PrivateLink provides private connectivity between VPCs and services hosted on AWS or on-premises, ideal for accessing SaaS applications securely.
15 Open-Source Projects That Changed the World
To come up with the list, we tried to look at the overall impact these projects have created on the industry and related technologies. Also, we’ve focused on projects that have led to a big change in the day-to-day lives of many software developers across the world.
Web Development
Node.js: The cross-platform JavaScript runtime that brought JS to server-side development.
React: The library that became the foundation of many web development frameworks.
Apache HTTP Server: The highly versatile web server loved by enterprises and startups alike. Served as inspiration for many other web servers over the years.
Data Management
PostgreSQL: An open-source relational database management system that provided a high-quality alternative to costly systems.
Redis: The super versatile data store that can be used as a cache, message broker, and even general-purpose storage.
Elasticsearch: A scalable solution to search, analyze, and visualize large volumes of data.
Developer Tools
Git: Free and open-source version control tool that allows developer collaboration across the globe.
VSCode: One of the most popular source code editors in the world.
Jupyter Notebook: The web application that lets developers share live code, equations, visualizations and narrative text.
Machine Learning & Big Data
TensorFlow: A leading framework for building and deploying machine learning models.
Apache Spark: Standard tool for big data processing and analytics platforms.
Kafka: Standard platform for building real-time data pipelines and applications.
DevOps & Containerization
Docker: The open source solution that allows developers to package and deploy applications in a consistent and portable way.
Kubernetes: The heart of Cloud-Native architecture and a platform to manage multiple containers.
Linux: The OS that democratized the world of software development.
Over to you: Do you agree with the list? What did we miss?
Top 6 Database Models
The diagram below shows the top 6 data models.
Flat Model
The flat data model is one of the simplest forms of database models. It organizes data into a single table where each row represents a record and each column represents an attribute. This model is similar to a spreadsheet and is straightforward to understand and implement. However, it lacks the ability to efficiently handle complex relationships between data entities.
Hierarchical Model
The hierarchical data model organizes data into a tree-like structure, where each record has a single parent but can have multiple children. This model is efficient for scenarios with a clear "parent-child" relationship among data entities. However, it struggles with many-to-many relationships and can become complex and rigid.
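As a rough illustration of the single-parent rule, here is a minimal sketch of a tree of records. The `Record` class and its field names are made up for this example and do not come from any particular database product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    name: str
    # Each record has at most one parent; excluded from repr/eq to avoid cycles.
    parent: Optional["Record"] = field(default=None, repr=False, compare=False)
    children: list["Record"] = field(default_factory=list)

    def add_child(self, name: str) -> "Record":
        """Create a child record whose single parent is this record."""
        child = Record(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up the single-parent chain to build the path from the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


root = Record("company")
engineering = root.add_child("engineering")
backend = engineering.add_child("backend")
print(backend.path())  # company/engineering/backend
```

Because `parent` holds a single reference, a record that belongs to two groups at once cannot be expressed directly, which is the many-to-many limitation described above.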
Relational Model
Introduced by E.F. Codd in 1970, the relational model represents data in tables (relations), consisting of rows (tuples) and columns (attributes). It supports data integrity and avoids redundancy through the use of keys and normalization. The relational model's strength lies in its flexibility and the simplicity of its query language, SQL (Structured Query Language), making it the most widely used data model for traditional database systems. It efficiently handles many-to-many relationships and supports complex queries and transactions.
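The sketch below shows one common way a relational database expresses a many-to-many relationship: a third table that holds pairs of keys. It uses Python's built-in `sqlite3` module with an in-memory database, and the table and column names (`student`, `course`, `enrollment`) are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Junction table: each row links one student to one course.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(id),
        course_id  INTEGER NOT NULL REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# Join through the junction table to list who takes what.
enrollments = conn.execute("""
    SELECT s.name, c.title
    FROM enrollment e
    JOIN student s ON s.id = e.student_id
    JOIN course  c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(enrollments)
```

The composite primary key on `enrollment` prevents the same student from being enrolled in the same course twice, a small example of keys enforcing integrity.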
Star Schema
The star schema is a specialized data model used in data warehousing for OLAP (Online Analytical Processing) applications. It features a central fact table that contains measurable, quantitative data, surrounded by dimension tables that contain descriptive attributes related to the fact data. This model is optimized for query performance in analytical applications, offering simplicity and fast data retrieval by minimizing the number of joins needed for queries.
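Here is a small sketch of a star schema, again using `sqlite3` in memory. The fact and dimension tables (`fact_sales`, `dim_product`, `dim_date`) and their sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    -- The central fact table holds the measures plus keys into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        units      INTEGER,
        revenue    REAL
    );
    INSERT INTO dim_product VALUES (1, 'Laptop', 'Hardware'), (2, 'IDE license', 'Software');
    INSERT INTO dim_date    VALUES (20240101, 2024, 1), (20240201, 2024, 2);
    INSERT INTO fact_sales  VALUES
        (1, 20240101, 2, 2000.0),
        (2, 20240101, 5, 500.0),
        (1, 20240201, 1, 1000.0);
""")

# One join from the fact table reaches the category attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```

Keeping `category` directly on `dim_product` repeats the category name on every product row, but it means the report needs only a single join.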
Snowflake Model
The snowflake model is a variation of the star schema where the dimension tables are normalized into multiple related tables, reducing redundancy and improving data integrity. This results in a structure that resembles a snowflake. While the snowflake model can lead to more complex queries due to the increased number of joins, it offers benefits in terms of storage efficiency and can be advantageous in scenarios where dimension tables are large or frequently updated.
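To make the extra join concrete, the sketch below normalizes a product dimension by moving the category into its own table. It uses `sqlite3` in memory, and the table names and sample rows are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The category is stored once and referenced by key from the product dimension.
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        revenue    REAL
    );
    INSERT INTO dim_category VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO dim_product  VALUES (1, 'Laptop', 1), (2, 'Monitor', 1), (3, 'IDE license', 2);
    INSERT INTO fact_sales   VALUES (1, 2000.0), (2, 300.0), (3, 500.0);
""")

# Reaching the category name now takes two joins instead of one.
snowflake_revenue = conn.execute("""
    SELECT c.name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_id  = f.product_id
    JOIN dim_category c ON c.category_id = p.category_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(snowflake_revenue)
```

Renaming a category is now a one-row update in `dim_category`, at the cost of the longer query.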
Network Model
The network data model allows each record to have multiple parents and children, forming a graph structure that can represent complex relationships between data entities. This model overcomes some of the hierarchical model's limitations by efficiently handling many-to-many relationships.
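A minimal sketch of the idea, assuming records are identified by plain strings: each record keeps a set of parents as well as a set of children, so it can belong to several owners at once. The `NetworkStore` class and the supplier/part data are hypothetical.

```python
from collections import defaultdict


class NetworkStore:
    """Toy in-memory store where a record may have many parents and many children."""

    def __init__(self) -> None:
        self.children = defaultdict(set)
        self.parents = defaultdict(set)

    def link(self, owner: str, member: str) -> None:
        """Record the relationship in both directions."""
        self.children[owner].add(member)
        self.parents[member].add(owner)


store = NetworkStore()
store.link("Supplier A", "Bolt")
store.link("Supplier B", "Bolt")   # "Bolt" now has two parents
store.link("Supplier A", "Nut")

print(sorted(store.parents["Bolt"]))         # ['Supplier A', 'Supplier B']
print(sorted(store.children["Supplier A"]))  # ['Bolt', 'Nut']
```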
Over to you: Which database model have you used?
How do we detect node failures in distributed systems?
The diagram below shows the top 6 heartbeat detection mechanisms.
Heartbeat mechanisms are crucial in distributed systems for monitoring the health and status of various components. Here are several types of heartbeat detection mechanisms commonly used in distributed systems:
Push-Based Heartbeat
The most basic form of heartbeat involves a periodic signal sent from one node to another or to a monitoring service. If no heartbeat arrives within a specified interval, the system assumes that the node has failed. This is simple to implement, but network congestion can delay or drop heartbeats and lead to false positives, where a healthy node is declared failed.
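The monitoring side of a push-based heartbeat can be sketched in a few lines. This is a simplified, single-process illustration: the `HeartbeatMonitor` class, its method names, and the timeout value are assumptions, and the clock is injectable so the behavior can be shown without waiting in real time.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat time per node and flags nodes that go quiet."""

    def __init__(self, timeout_s: float, clock=time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = {}

    def record_heartbeat(self, node_id: str) -> None:
        """Called whenever a heartbeat from node_id arrives."""
        self.last_seen[node_id] = self.clock()

    def suspected_failed(self) -> list[str]:
        """Nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)


# Simulated clock so the example runs instantly.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])
monitor.record_heartbeat("node-a")
monitor.record_heartbeat("node-b")
now[0] = 2.0
monitor.record_heartbeat("node-a")   # node-b stays silent
now[0] = 4.0
print(monitor.suspected_failed())    # ['node-b']
```

Choosing the timeout is the main trade-off: a short value detects failures quickly but makes false positives under congestion more likely.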
Pull-Based Heartbeat
Instead of nodes sending heartbeats actively, a central monitor might periodically "pull" status information from nodes. Because the monitor controls how often it polls, this can reduce network traffic, but a failure is only noticed at the next poll, which might increase detection latency.
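A rough sketch of one polling round follows. Each node is represented by a probe callable that returns `True` when the node answers. In a real system the probe would be a network call such as an HTTP or TCP check; here the probes are stand-ins, and `poll_nodes` is a hypothetical helper name.

```python
from typing import Callable


def poll_nodes(probes: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Ask every node for its status once; any error counts as 'not alive'."""
    status = {}
    for node_id, probe in probes.items():
        try:
            status[node_id] = bool(probe())
        except Exception:
            status[node_id] = False
    return status


def unreachable() -> bool:
    # Stand-in for a probe whose network call fails.
    raise ConnectionError("no response")


poll_result = poll_nodes({"node-a": lambda: True, "node-b": unreachable})
print(poll_result)  # {'node-a': True, 'node-b': False}
```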
Heartbeat with Health Check
This includes diagnostic information about the node's health in the heartbeat signal. This information can include CPU usage, memory usage, or application-specific metrics. It provides more detailed information about the node, allowing for more nuanced decision-making. However, it increases complexity and can add network overhead because each heartbeat carries a larger payload.
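As an illustration, a heartbeat that carries health data might look like the sketch below. The payload fields and the 90% thresholds are assumptions picked for the example, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class HealthHeartbeat:
    node_id: str
    cpu_percent: float
    memory_percent: float


def classify(hb: HealthHeartbeat, cpu_limit: float = 90.0, memory_limit: float = 90.0) -> str:
    """The node is alive either way; the metrics tell us whether it is struggling."""
    if hb.cpu_percent > cpu_limit or hb.memory_percent > memory_limit:
        return "degraded"
    return "healthy"


print(classify(HealthHeartbeat("node-a", cpu_percent=35.0, memory_percent=60.0)))  # healthy
print(classify(HealthHeartbeat("node-b", cpu_percent=97.0, memory_percent=60.0)))  # degraded
```

A "degraded" result lets the system react before the node fails outright, for example by routing less traffic to it.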
Heartbeat with Timestamps
Heartbeats that include timestamps can help the receiving node or service determine not just if a node is alive, but also if there are network delays affecting communication.
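A minimal sketch of how a receiver might use the timestamp, assuming the sender's and receiver's clocks are reasonably synchronized (for example via NTP). The function names and the 0.5-second threshold are illustrative.

```python
def heartbeat_delay(sent_at: float, received_at: float) -> float:
    """Transit time in seconds, assuming both clocks are synchronized."""
    return received_at - sent_at


def is_delayed(sent_at: float, received_at: float, threshold_s: float = 0.5) -> bool:
    """True when the heartbeat arrived, but took longer than expected."""
    return heartbeat_delay(sent_at, received_at) > threshold_s


print(is_delayed(sent_at=100.0, received_at=100.1))  # False
print(is_delayed(sent_at=100.0, received_at=101.2))  # True
```

If the clocks drift apart, the computed delay is off by the same amount, so the result is only as trustworthy as the clock synchronization.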
Heartbeat with Acknowledgement
The receiver of the heartbeat message must send back an acknowledgment in this model. This ensures that not only is the sender alive, but the network path between the sender and receiver is also functional.
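The sender's side of this exchange can be sketched as follows. The `send` callable stands in for whatever transport is used and is assumed to return the acknowledged sequence number, or `None` if no acknowledgment arrived in time. The retry count is an arbitrary choice for the example.

```python
from typing import Callable, Optional


def send_with_ack(send: Callable[[int], Optional[int]], seq: int, retries: int = 2) -> bool:
    """Send heartbeat `seq` and wait for a matching ack, retrying a few times."""
    for _ in range(retries + 1):
        ack = send(seq)
        if ack == seq:
            return True   # the round trip worked: peer and path are both up
    return False          # no matching ack after all attempts


attempts = []


def flaky_send(seq: int) -> Optional[int]:
    # Hypothetical transport that loses the first message.
    attempts.append(seq)
    return seq if len(attempts) >= 2 else None


print(send_with_ack(flaky_send, seq=7))          # True
print(send_with_ack(lambda seq: None, seq=8))    # False
```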
Heartbeat with Quorum
In some distributed systems, especially those involving consensus protocols like Paxos or Raft, the concept of a quorum (a majority of nodes) is used. Heartbeats might be used to establish or maintain a quorum, ensuring that a sufficient number of nodes are operational for the system to make decisions. This brings complexity in implementation and managing quorum changes as nodes join or leave the system.
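The majority check itself is simple, as the sketch below shows. It assumes a fixed, known cluster size and a set of nodes whose heartbeats are currently fresh. It is not an implementation of Paxos or Raft, only the quorum test they rely on.

```python
def has_quorum(alive_nodes: set[str], cluster_size: int) -> bool:
    """True when strictly more than half of the cluster is alive."""
    return len(alive_nodes) > cluster_size // 2


print(has_quorum({"a", "b", "c"}, cluster_size=5))  # True
print(has_quorum({"a", "b"}, cluster_size=5))       # False
print(has_quorum({"a", "b"}, cluster_size=4))       # False: exactly half is not a majority
```

The hard part in practice is keeping `cluster_size` correct as nodes join or leave, which is the membership-change complexity mentioned above.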