Parin's musings

System Design Study Notes: Key Patterns and Concepts

1. Scalability and Load Balancing

Key Concepts:

Horizontal and Vertical Scaling

  • Horizontal Scaling: Adding more machines (servers) to handle increased load. Often more flexible for large-scale systems.
  • Vertical Scaling: Increasing the resources (CPU, RAM) of a single machine. Easier to implement, but capped by the largest machine available, and that machine remains a single point of failure.
  • Examples:
    • Horizontal Scaling: Adding more web servers behind a load balancer to handle more traffic.
    • Vertical Scaling: Increasing the memory of a database server to handle larger queries.

Load Balancing

  • Usage: Distribute incoming traffic across multiple servers to avoid overloading any single server.
  • Load Balancing Algorithms:
    • Round Robin: Requests are distributed sequentially among servers.
    • Least Connections: The server with the fewest active connections is chosen.
    • Weighted Round Robin: Servers with higher capabilities get more traffic.
  • Examples:
    • Web Server Load Balancing: Distributing HTTP requests across multiple web servers.
    • Database Load Balancing: Distributing read requests across replicas to optimize database performance.
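
The three algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements them; the `Server` class, the server names, and the weights are made up for the example.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1              # relative capacity, used by weighted round robin
    active_connections: int = 0  # used by least connections


def round_robin(servers):
    """Return an iterator that hands out servers in a fixed rotation."""
    return itertools.cycle(servers)


def weighted_round_robin(servers):
    """Return an iterator in which each server appears `weight` times per rotation."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return itertools.cycle(expanded)


def least_connections(servers):
    """Pick the server with the fewest active connections right now."""
    return min(servers, key=lambda s: s.active_connections)


servers = [Server("web-1", weight=3), Server("web-2", weight=1)]

rr = round_robin(servers)
rr_order = [next(rr).name for _ in range(4)]

wrr = weighted_round_robin(servers)
wrr_order = [next(wrr).name for _ in range(4)]

servers[0].active_connections = 5
servers[1].active_connections = 2
lc_choice = least_connections(servers).name
```

The weighted variant here sends consecutive requests to the heavier server; real balancers usually interleave the rotation more smoothly, but the traffic ratio is the same.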

2. Data Partitioning and Sharding

Key Concepts:

Sharding

  • Usage: Partition data across multiple machines to handle large datasets. Each shard contains a subset of the data.
  • Shard Keys: The key used to determine which shard stores the data.
  • Examples:
    • User-Based Sharding: Partition user data by user ID, where each shard contains users whose IDs fall within a specific range.
    • Geographic Sharding: Partition data based on geographic regions (e.g., users from North America in one shard and users from Europe in another).
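
As a rough sketch of how a shard key maps to a shard, the two examples above could look like this in Python. The ID boundaries, region names, and shard names are assumptions chosen for illustration.

```python
import bisect

# Hypothetical layout: shard i holds user IDs below RANGE_UPPER_BOUNDS[i]
# (and at or above the previous bound).
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]

# Hypothetical region-to-shard mapping.
REGION_TO_SHARD = {
    "north_america": "shard-na",
    "europe": "shard-eu",
}


def shard_by_range(user_id: int) -> int:
    """Return the index of the shard whose ID range contains user_id."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if user_id < 0 or index >= len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"user_id {user_id} is outside every shard range")
    return index


def shard_by_region(region: str) -> str:
    """Return the shard that stores data for the given region."""
    return REGION_TO_SHARD[region]
```

Range-based keys keep neighbouring IDs together, which helps range scans but can create hot shards if new users always land in the last range.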

Data Partitioning

  • Horizontal Partitioning: Splitting rows of data across different tables or databases.
  • Vertical Partitioning: Splitting columns of data across different tables or databases.
  • Examples:
    • Horizontal Partitioning: Splitting a large table of users into smaller tables based on regions.
    • Vertical Partitioning: Splitting user profiles, with frequently accessed data (e.g., username, email) in one table and less accessed data (e.g., address, preferences) in another.

3. Caching

Key Concepts:

Cache Placement

  • Client-Side Caching: Caching data locally on the user's machine or browser.
  • Server-Side Caching: Caching data on the server to reduce load and improve performance.
  • Distributed Caching: Caching data across a distributed system to serve multiple clients and servers efficiently.
  • Examples:
    • CDNs (Content Delivery Networks): Use caching to store static assets like images, CSS, and JavaScript files closer to users.
    • Memcached / Redis: In-memory caching systems used to cache database query results or frequently accessed data.
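
One common way to put a cache in front of a database is to check the cache first and fall back to the database on a miss (often called cache-aside). The sketch below uses a small in-process class as a stand-in for Redis or Memcached; the `TTLCache` class, the 60-second expiry, and `query_database` are assumptions for the example, not part of either product's API.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache such as Redis or Memcached."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_seconds, value)


db_calls = 0


def query_database(user_id):
    """Hypothetical slow query; counts calls so cache hits are visible."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)


def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                 # cache hit: no database work
    value = query_database(user_id)   # cache miss: load, then remember
    cache.set(key, value)
    return value


first = get_user(7)
second = get_user(7)
```

After the two calls, `db_calls` is 1 because the second lookup is served from the cache.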

Cache Eviction Policies

  • Least Recently Used (LRU): Discards the least recently accessed items first.
  • First In First Out (FIFO): Removes the oldest items first.
  • Least Frequently Used (LFU): Removes items that are accessed the least frequently.
  • Examples:
    • Caching User Sessions: Using LRU to remove old user sessions.
    • Caching Product Pages: Using LFU to remove products that are less popular.
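
LRU is the policy most often asked about, so here is a minimal sketch using `collections.OrderedDict` from the standard library. The class name and the session values are illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


sessions = LRUCache(capacity=2)
sessions.put("alice", "session-1")
sessions.put("bob", "session-2")
sessions.get("alice")               # alice is now more recent than bob
sessions.put("carol", "session-3")  # cache is full, so bob is evicted
```

Both `get` and `put` run in constant time, which is the usual interview requirement for an LRU cache.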

4. Database Design

Key Concepts:

SQL vs NoSQL

  • SQL Databases (Relational): Use structured schemas and are best for complex queries and ACID transactions.
    • Examples: MySQL, PostgreSQL, Oracle.
  • NoSQL Databases: Schema-less and optimized for scalability and flexibility. Use when you need to handle large volumes of unstructured or semi-structured data.
    • Examples: MongoDB, Cassandra, DynamoDB.

CAP Theorem

  • Consistency: Every read receives the most recent write.
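  • Availability: Every request to a working node receives a non-error response, though it may not reflect the most recent write.
  • Partition Tolerance: The system continues to function even when network failures drop or delay messages between nodes. Because partitions cannot be ruled out in a distributed system, the practical choice during a partition is between consistency and availability.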
  • Examples:
    • NoSQL Databases (like Cassandra): Prioritize availability and partition tolerance over consistency.
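    • SQL Databases (like MySQL): A single-node deployment is often described as prioritizing consistency and availability, since there is no network between nodes to partition. Once the database is replicated across machines, partitions can occur and it faces the same consistency vs. availability trade-off as any other distributed store.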

Replication

  • Master-Slave Replication: Writes go to the master, and reads can be distributed among slaves.
  • Multi-Master Replication: Multiple nodes accept writes, so conflicting writes to the same data must be detected and reconciled to keep the nodes consistent.
  • Examples:
    • Read Replicas in MySQL: Set up multiple read replicas to distribute read requests and reduce load on the master database.
    • Multi-Master Replication in Cassandra: Cassandra is masterless (every node is a peer), so any replica can accept writes, which keeps the system available and fault tolerant when individual nodes fail.
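
A toy model of the master-slave setup shows why reads from a replica can be stale. This is a sketch only: the class is hypothetical, dictionaries stand in for database nodes, the master is called `primary` in the code, and replication is triggered by hand, whereas real systems replicate continuously in the background.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one primary takes writes, replicas serve reads."""

    def __init__(self, replica_count: int):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        self.primary[key] = value  # writes go only to the primary

    def replicate(self):
        """Copy the primary's current data to every replica."""
        for replica in self.replicas:
            replica.update(self.primary)

    def read(self, key):
        # Reads rotate across replicas to spread the load.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("balance", 100)
stale = db.read("balance")  # replication has not run yet
db.replicate()
fresh = db.read("balance")
```

Here `stale` is `None` and `fresh` is `100`: the window between the write and the replication step is the replication lag that read replicas introduce.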

ACID Properties

  • Atomicity, Consistency, Isolation, Durability (ACID): Ensure reliable transactions in databases.
  • Examples:
    • Banking Systems: Ensuring transactional integrity during financial operations like transfers.
    • Order Management: Ensuring that once an order is placed, the inventory is reduced and confirmed atomically.
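
The order management example can be sketched with Python's built-in `sqlite3` module. The table layout and the `place_order` helper are assumptions for illustration; the point is that the order row and the stock update either both commit or both roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "sku TEXT PRIMARY KEY, "
    "quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, sku TEXT NOT NULL, quantity INTEGER NOT NULL)"
)
conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
conn.commit()


def place_order(conn, sku, quantity):
    """Record the order and reduce stock in one transaction; True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO orders (sku, quantity) VALUES (?, ?)",
                (sku, quantity),
            )
            cursor = conn.execute(
                "UPDATE inventory SET quantity = quantity - ? WHERE sku = ?",
                (quantity, sku),
            )
            if cursor.rowcount == 0:
                raise LookupError(f"unknown sku: {sku}")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


ok = place_order(conn, "widget", 3)        # stock goes from 5 to 2
rejected = place_order(conn, "widget", 3)  # would go negative, so it rolls back

stock = conn.execute(
    "SELECT quantity FROM inventory WHERE sku = 'widget'"
).fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second order fails the `CHECK` constraint on the stock update, and the rollback also removes the order row that was inserted a moment earlier. That is atomicity in practice.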

5. Consistency and Availability

Key Concepts:

Eventual Consistency

  • Usage: Data will eventually become consistent across all nodes in a distributed system, but not immediately.
  • Examples:
    • DNS Systems: DNS records are eventually consistent across global servers.
    • Distributed NoSQL Databases: Systems like Cassandra and DynamoDB often use eventual consistency to ensure high availability.

Strong Consistency

  • Usage: Every read receives the most recent write. Ideal for critical systems where consistency is paramount.
  • Examples:
    • Banking Systems: Ensuring that account balances are always consistent.
    • Real-Time Bidding Systems: Ensuring that all users see the latest auction results.

6. Message Queuing and Asynchronous Processing

Key Concepts:

Message Queues

  • Usage: Decouple systems by allowing them to communicate asynchronously via message passing.
  • Examples:
    • Task Queues: Send tasks to background workers using queues like RabbitMQ or AWS SQS.
    • Log Processing: Use message queues to process log data asynchronously.
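
To make the decoupling concrete, here is a minimal in-process sketch using `queue.Queue` and a worker thread from the standard library. It stands in for a broker such as RabbitMQ or SQS, which run as separate services; the task names and the `None` stop signal are choices made for this example.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    """Consume tasks until the stop signal (None) arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        results.append(task.upper())  # stand-in for real background work
        tasks.task_done()


thread = threading.Thread(target=worker)
thread.start()

# The producer only enqueues; it does not wait for the work to finish.
for task in ["resize image", "send email"]:
    tasks.put(task)
tasks.put(None)

tasks.join()   # block until every queued item has been processed
thread.join()
```

With a single worker the tasks are processed in the order they were enqueued, so `results` ends up as `["RESIZE IMAGE", "SEND EMAIL"]`.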

Publish-Subscribe Model (Pub/Sub)

  • Usage: A messaging pattern where senders (publishers) publish messages to topics, and every subscriber to a topic receives a copy. Publishers do not need to know who the subscribers are.
  • Examples:
    • Real-Time Notifications: Send notifications to users when certain events occur.
    • Event Streaming: Kafka is used to stream large amounts of event data for real-time analytics.
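
A minimal in-memory sketch of the pattern is below. The `Broker` class and topic names are hypothetical; systems like Kafka add persistence, partitioning, and delivery guarantees on top of this basic idea.

```python
from collections import defaultdict


class Broker:
    """In-memory topic broker: publishers and subscribers only share topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
email_inbox, push_inbox = [], []
broker.subscribe("order.shipped", email_inbox.append)
broker.subscribe("order.shipped", push_inbox.append)

delivered = broker.publish("order.shipped", {"order_id": 42})
ignored = broker.publish("order.cancelled", {"order_id": 43})
```

Both subscribers receive the `order.shipped` message, while the `order.cancelled` message reaches no one because nothing subscribed to that topic.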

Polling vs Push Systems

  • Polling: The client repeatedly asks the server for updates, usually at a fixed interval (pull).
  • Push: The server sends updates directly to the client as they occur.
  • Examples:
    • Email Notification Systems: Use polling to check for new emails at regular intervals.
    • Push Notifications: Push updates to mobile devices when new messages arrive.

7. Rate Limiting

Key Concepts:

Usage

  • Control the rate at which users or services can make requests to a system, preventing abuse and overload.
  • Examples:
    • API Rate Limiting: Limit the number of requests a user can make to an API in a given time frame.
    • Login Attempts: Limit the number of login attempts a user can make to prevent brute-force attacks.

Techniques for Rate Limiting

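  • Token Bucket Algorithm: A bucket of tokens is replenished at a fixed rate up to a maximum size; each request consumes a token and is rejected or delayed when the bucket is empty. Short bursts up to the bucket size are allowed.
  • Leaky Bucket Algorithm: Incoming requests fill a bucket (a queue) that drains at a constant rate; requests that arrive when the bucket is full are dropped. Unlike token bucket, the output rate stays steady even when arrivals are bursty.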
  • Examples:
    • Token Bucket: Allowing 100 API requests per minute.
    • Leaky Bucket: Limiting download speed by controlling the flow of data packets.
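
Both algorithms can be sketched in Python. This is one possible reading, not a reference implementation: the leaky bucket is modeled as a bounded queue that releases items at a fixed rate, and the class names, the `FakeClock` helper, and the rates are assumptions for the example.

```python
import time
from collections import deque


class TokenBucket:
    """Admit a request only if a token is available; tokens refill over time."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full, so an initial burst passes
        self._last_refill = clock()

    def allow(self):
        now = self._clock()
        refill = (now - self._last_refill) * self.refill_per_second
        self._tokens = min(self.capacity, self._tokens + refill)
        self._last_refill = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class LeakyBucket:
    """Queue items up to a capacity and release them at a fixed rate."""

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_interval = 1.0 / leak_per_second
        self._clock = clock
        self._queue = deque()
        self._next_release = clock()

    def offer(self, item):
        """Enqueue an item; return False (drop it) if the bucket is full."""
        if len(self._queue) >= self.capacity:
            return False
        if not self._queue:
            # Idle time must not accumulate into a later burst.
            self._next_release = max(self._next_release, self._clock())
        self._queue.append(item)
        return True

    def drain(self):
        """Return the queued items whose release time has arrived."""
        now = self._clock()
        released = []
        while self._queue and now >= self._next_release:
            released.append(self._queue.popleft())
            self._next_release += self.leak_interval
        return released


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


# Token bucket: 100 requests per minute, with bursts of up to 100.
token_clock = FakeClock()
bucket = TokenBucket(capacity=100, refill_per_second=100 / 60, clock=token_clock)
burst_allowed = sum(bucket.allow() for _ in range(150))
token_clock.now += 6.3  # refills roughly 10.5 tokens
later_allowed = sum(bucket.allow() for _ in range(150))

# Leaky bucket: release at most 2 items per second, holding at most 3.
leaky_clock = FakeClock()
leaky = LeakyBucket(capacity=3, leak_per_second=2, clock=leaky_clock)
accepted = [leaky.offer(item) for item in "abcd"]
first_drain = leaky.drain()
leaky_clock.now += 1.0
second_drain = leaky.drain()
```

The token bucket lets the first 100 of 150 back-to-back requests through and then only what has refilled, while the leaky bucket drops the fourth item and releases the rest at a steady pace regardless of how they arrived.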

8. Microservices Architecture

Key Concepts:

Microservices vs Monolithic Architecture

  • Microservices: Small, independent services that communicate via APIs. Easier to scale and manage independently.
  • Monolithic: A single, large codebase that handles all business logic and services in one place.
  • Examples:
    • Microservices: Breaking a large e-commerce application into services like user management, product catalog, and payment processing.
    • Monolithic: A traditional LAMP stack application where the entire business logic is part of a single server codebase.

Service Discovery

  • Usage: Automatically detecting and managing how services communicate with each other in dynamic environments.
  • Examples:
    • Consul: A tool for discovering services in a microservice architecture.
    • Kubernetes Service Discovery: Automates the detection and management of services within a Kubernetes cluster.

API Gateway

  • Usage: An API gateway acts as a reverse proxy, routing requests to appropriate services.
  • Examples:
    • Zuul or Kong: Used as API gateways to manage traffic between microservices.
    • Netflix API Gateway: Used to handle requests from different client devices (mobile, TV, desktop) and direct them to microservices.
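
The routing part of a gateway boils down to a lookup from request path to backend service. The sketch below shows only that step; the route table and internal host names are made up, and real gateways such as Kong or Zuul also handle authentication, rate limiting, and retries.

```python
# Hypothetical route table: path prefix -> upstream service base URL.
ROUTES = {
    "/users": "http://user-service.internal",
    "/products": "http://catalog-service.internal",
    "/payments": "http://payment-service.internal",
}


def resolve_upstream(path: str):
    """Return the upstream URL for a request path, or None if no route matches."""
    # Try longer prefixes first so a more specific route would win.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix] + path
    return None
```

Matching on `prefix + "/"` rather than the bare prefix keeps a path like `/userspace` from being routed to the user service by accident.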

9. Security in System Design

Key Concepts:

Authentication and Authorization

  • Authentication: Verifying the identity of a user or service.
  • Authorization: Verifying whether the user has access to perform an action.
  • Examples:
    • OAuth2: An open standard for access delegation, commonly used for granting websites or applications limited access to a user's information.
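    • JWT (JSON Web Tokens): A compact, self-contained token format for transmitting claims between parties as a JSON object. The signature lets the receiver detect tampering, but the payload is only encoded, not encrypted, so it should not carry secrets.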
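
To show what "self-contained" means, here is a sketch of signing and verifying an HS256 token with only the standard library. It is for study purposes: the helper names and the secret are made up, and a production system should use a maintained JWT library that also validates the header algorithm and expiry claims.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def sign_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return f"{signing_input}.{_b64url(signature.digest())}"


def verify_jwt(token: str, secret: bytes):
    """Return the claims if the signature is valid, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        return json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None


secret = b"demo-secret-not-for-production"
token = sign_jwt({"sub": "user-123", "role": "reader"}, secret)
claims = verify_jwt(token, secret)

# Swap in a payload with a different role but keep the old signature.
header_b64, _, signature_b64 = token.split(".")
forged_payload = _b64url(
    json.dumps({"sub": "user-123", "role": "admin"}).encode("utf-8")
)
forged = verify_jwt(f"{header_b64}.{forged_payload}.{signature_b64}", secret)
```

The forged token is rejected because the signature no longer matches the payload, which is how a service can trust the claims without calling back to the issuer.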

Encryption

  • Usage: Encrypt data both in transit and at rest to protect sensitive information.
  • Examples:
    • TLS (Transport Layer Security): Encrypts data during communication over networks.
    • AES (Advanced Encryption Standard): Encrypts sensitive data stored in databases.

Rate Limiting and Throttling for Security

  • Help mitigate denial-of-service (DoS) and brute-force attacks by limiting the rate at which requests can be made.
  • Examples:
    • API Rate Limiting: Protect public APIs by limiting the number of requests a user can make within a certain time frame.

10. Monitoring and Logging

Key Concepts:

Monitoring Systems

  • Usage: Keep track of system health and performance through monitoring tools.
  • Examples:
    • Prometheus: A monitoring tool that collects and stores metrics as time series data.
    • Grafana: A tool used to create dashboards and visualize data from monitoring tools like Prometheus.

Logging Systems

  • Usage: Store logs to help diagnose issues, track performance, and monitor user behavior.
  • Examples:
    • ELK Stack (Elasticsearch, Logstash, Kibana): Used for searching, analyzing, and visualizing log data in real-time.
    • Splunk: A platform for analyzing machine-generated data and providing real-time insights.
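
Log pipelines like the ones above are easier to search when each log line is structured. A small sketch with Python's `logging` module that emits one JSON object per line; the field names and the `checkout` logger name are choices made for this example, not a required schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


stream = io.StringIO()  # stand-in for stdout or a file read by a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output in one place

logger.info("order placed: id=%s", 42)
logger.debug("cart contents dumped")  # below INFO, so it is not emitted

lines = stream.getvalue().splitlines()
entry = json.loads(lines[0])
```

Only the INFO record is written, and it parses back into a dictionary with `level`, `logger`, and `message` keys that a tool like Elasticsearch can index directly.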

Preparation Tips for System Design:

  1. Understand Scalability: Know when to use horizontal vs. vertical scaling, and how to design systems for high availability.
  2. Think in Trade-offs: Balancing consistency, availability, and performance is key in system design.
  3. Caching: Understand how caching can improve system performance and reduce load.
  4. Design for Failure: Always assume your system will fail and plan redundancy, failover, and backup systems accordingly.
  5. Use Load Balancers and Rate Limiting: These are crucial to ensuring your system can handle large amounts of traffic and prevent overloads.
  6. Security First: Always incorporate authentication, authorization, and encryption into your design.

These system design patterns and concepts cover key areas that you will encounter in interviews. Each concept is illustrated with examples to help solidify your understanding. Focus on mastering these and you'll be well-prepared for system design interviews!

System Design Published on 2024-09-28