Notes: Distributed Systems

Other topics

CAP

Consistency: Consistency means that all clients see the same data at the same time, regardless of which node they connect to.

Availability: Availability means that any client requesting data receives a response, even if some of the nodes are down.

Partition Tolerance: A partition indicates a communication break between two nodes. Partition tolerance means that the system continues to operate despite network partitions.
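The three definitions above interact when a partition actually occurs: a system that must keep operating has to either refuse some requests (favoring consistency) or answer from possibly stale replicas (favoring availability). The following is a toy sketch of that choice, not a real replication protocol. The class name `TwoReplicaStore`, the mode labels "CP" and "AP", and the replica names "a" and "b" are all assumptions made for illustration.

```python
class TwoReplicaStore:
    """Toy model of two replicas that can be cut off from each other."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": None, "b": None}  # hypothetical replica names
        self.partitioned = False

    def write(self, node: str, value) -> None:
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for name in self.replicas:
                self.replicas[name] = value
            return
        if self.mode == "CP":
            # Favor consistency: refuse the write rather than let replicas diverge.
            raise RuntimeError("write rejected: replicas cannot communicate")
        # Favor availability: accept the write locally; replicas may now disagree.
        self.replicas[node] = value

    def read(self, node: str):
        return self.replicas[node]


if __name__ == "__main__":
    ap = TwoReplicaStore("AP")
    ap.write("a", 1)
    ap.partitioned = True
    ap.write("a", 2)
    print(ap.read("a"), ap.read("b"))  # the two replicas now return different values
```

In "CP" mode the same sequence raises an error on the second write, so both replicas keep returning the last value they agreed on.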

Availability

Availability Percentages versus Service Downtime

Availability %	Downtime per Year	Downtime per Month	Downtime per Week
90% (1 nine)	36.5 days	72 hours	16.8 hours
99% (2 nines)	3.65 days	7.20 hours	1.68 hours
99.5% (2.5 nines)	1.83 days	3.60 hours	50.4 minutes
99.9% (3 nines)	8.76 hours	43.2 minutes	10.1 minutes
99.99% (4 nines)	52.56 minutes	4.32 minutes	1.01 minutes
99.999% (5 nines)	5.26 minutes	25.9 seconds	6.05 seconds
99.9999% (6 nines)	31.5 seconds	2.59 seconds	0.605 seconds
99.99999% (7 nines)	3.15 seconds	0.259 seconds	0.0605 seconds
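The figures in the table can be reproduced with a short calculation: downtime is the unavailable fraction multiplied by the length of the period. The sketch below assumes a 365-day year, a 30-day month, and a 7-day week, which are the conventions most rows of the table appear to follow. The function name `allowed_downtime_seconds` is hypothetical.

```python
# Assumed period lengths: 365-day year, 30-day month, 7-day week.
SECONDS_PER_PERIOD = {
    "year": 365 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "week": 7 * 24 * 3600,
}


def allowed_downtime_seconds(availability_percent: float, period: str) -> float:
    """Return the maximum downtime, in seconds, permitted in one period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * SECONDS_PER_PERIOD[period]


if __name__ == "__main__":
    for percent in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_seconds(percent, "year") / 60
        print(f"{percent}% -> {minutes:.2f} minutes of downtime per year")
```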

Reliability

Reliability is commonly quantified with two metrics: mean time between failures (MTBF) and mean time to repair (MTTR).
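As a rough sketch of how the two metrics are usually computed, the code below uses the common textbook definitions: MTBF as operating time divided by the number of failures, MTTR as total repair time divided by the number of repairs, and steady-state availability as MTBF / (MTBF + MTTR). The function names and the sample figures are assumptions for illustration, not measurements from any real system.

```python
def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """Mean time between failures: operating time per failure."""
    return total_operating_hours / failure_count


def mttr(total_repair_hours: float, repair_count: int) -> float:
    """Mean time to repair: repair time per repair."""
    return total_repair_hours / repair_count


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, given MTBF and MTTR."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical figures: 1000 operating hours, 4 failures, 8 hours of repair in total.
    between_failures = mtbf(1000, 4)
    to_repair = mttr(8, 4)
    availability = steady_state_availability(between_failures, to_repair)
    print(f"MTBF={between_failures} h, MTTR={to_repair} h, availability={availability:.4f}")
```

A higher MTBF and a lower MTTR both push availability toward 1, which is why the two metrics are tracked together.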

Scalability

Scalability is the ability of a system to handle an increasing workload without compromising performance.

Dimension	Definition	Example
Size scalability	Ability to add resources to handle more workload	Adding more CPUs to handle more requests
Administrative scalability	Capacity for multiple users to share a single distributed system	Multiple companies sharing a cloud-based system
Geographical scalability	Ability to cater to a broad geographical region	A search engine serving users in multiple countries

Approach	Definition	Example
Vertical scalability	Scaling up by providing additional capabilities to an existing device	Adding more RAM to a server
Horizontal scalability	Scaling out by increasing the number of machines in the network	Adding more nodes to a distributed system

Maintainability

Aspect	Definition	Importance
Operability	Ease of keeping the system running smoothly under normal circumstances and of restoring normal conditions after a fault	Important for maintaining system availability and reducing downtime
Lucidity	Simplicity of the code base, making it easy to understand and maintain	Important for reducing maintenance time and costs
Modifiability	Capability of the system to integrate modified, new, and unforeseen features without difficulty	Important for adapting to changing requirements and improving system functionality

Metric	Definition	Formula	Goal
Maintainability (M)	Probability that the service will restore its functions within a specified time of fault occurrence	M = probability of restoring the component to its fully active form within a specified time	High M value
Mean Time To Repair (MTTR)	Average amount of time required to repair and restore a failed component	MTTR = total maintenance time / total number of repairs	Low MTTR value

Fault tolerance

Technique	Definition	Trade-offs
Replication	Replicating both services and data to swap out failed nodes or data stores with healthy ones	Consistency vs. availability trade-off, synchronous vs. asynchronous updates
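Checkpointing	Saving the system's state in stable storage when the system state is consistent	Checkpoint frequency vs. runtime overhead: frequent checkpoints cost more during normal operation but lose less work on recovery; a checkpoint taken in an inconsistent state cannot be safely restored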
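Checkpointing can be illustrated with a single-process sketch that writes state to disk and reloads it after a restart. This is a minimal example under stated assumptions: the state is a JSON-serializable dictionary, "stable storage" is the local file system, and the helper names `save_checkpoint` and `load_checkpoint` are hypothetical. A distributed checkpoint would additionally need the processes to agree on a consistent global state, which this sketch does not attempt.

```python
import json
import os
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write state to disk atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            json.dump(state, tmp_file)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())  # force the bytes to the storage device
        os.replace(tmp_path, path)  # atomic swap: readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str, default: dict) -> dict:
    """Return the last saved state, or a copy of default if none exists yet."""
    try:
        with open(path) as checkpoint_file:
            return json.load(checkpoint_file)
    except FileNotFoundError:
        return dict(default)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        checkpoint_path = os.path.join(workdir, "state.json")
        state = load_checkpoint(checkpoint_path, {"processed": 0})
        state["processed"] += 5
        save_checkpoint(state, checkpoint_path)
        print(load_checkpoint(checkpoint_path, {"processed": 0}))
```

Writing to a temporary file and renaming it is one common way to make sure only complete checkpoints are ever visible to the recovery path.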

Types of Data Center Servers

Server Type	Purpose	Example Resources
Web Servers	Handle API calls from clients behind the load-balancer (mostly serve static content)	Medium memory and storage resources, good computational resources
Application Servers	Run core application software and business logic (servers primarily provide dynamic content)	Extensive computational and storage resources, volatile and non-volatile storage, up to 256 GB RAM and 6.5 TB storage
Storage Servers	Store and manage structured and unstructured data	Structured (SQL) and unstructured (NoSQL) data management systems; up to 120 TB of storage and 32 GB RAM per server, adding up to exabytes across a fleet

Load balancing

Aspect	Local Load Balancing	Global Load Balancing
Definition	Balancing within a data center.	Balancing traffic across multiple geographical regions.
Purpose	Improving efficiency and better resource utilization within a data center.	Distributing traffic intelligently across multiple geographical regions.
Focus	Within a data center.	Across multiple geographical regions.
Technology used	Reverse proxy.	Load Balancing as a Service (LBaaS).
Installation location	Within the data center.	Can be installed on-premises or obtained through LBaaS.
Load balancing technique	Divides incoming requests among the pool of available servers.	Uses techniques such as DNS and round-robin to perform load balancing.
Limitations	Scope is limited to a single data center.	DNS-based balancing gives limited control over the client's behavior, the small DNS packet size (512 bytes) cannot carry every server IP address, and clients can't determine the closest address to connect to. Round-robin can load end servers unevenly and keeps handing out the IP addresses of crashed servers until the TTL of the cached entries expires.
Use of ADCs	Used as an additional layer of load balancing.	ADCs can implement GSLB.
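The "divides incoming requests among the pool of available servers" behavior can be sketched as a round-robin picker that skips servers marked unhealthy. This is an illustrative in-memory model only: the class `RoundRobinBalancer` and its methods are hypothetical names, and the health state is set by hand rather than by real health checks.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any that are marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self) -> str:
        # Try each server at most once per call so an all-down pool fails fast.
        for _ in range(len(self._servers)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical server names
    print([balancer.next_server() for _ in range(4)])
    balancer.mark_down("app-2")
    print([balancer.next_server() for _ in range(3)])
```

Because the balancer tracks health itself, it stops routing to a failed server immediately, in contrast to DNS round-robin, where cached entries keep pointing at a crashed server until their TTL expires.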

Databases

Feature	Relational Database	Non-Relational (NoSQL) Database
Data structure	Organized in one or more tables/relations	Can be structured, semi-structured, or unstructured data
Query language	Structured Query Language (SQL)	Various languages depending on the type of NoSQL database
Schema	Follows strict schema and requires data to conform to it	Dynamic schema allows for flexible data
Scalability	Primarily scales vertically; can scale horizontally by sharding (partitioning tables across nodes)	Can scale horizontally by adding more nodes
ACID properties	Provides full ACID compliance	Often sacrifices some level of consistency for availability and partition tolerance
Use case	Best for structured data with complex relationships and strict integrity constraints	Best for unstructured or semi-structured data and high scalability needs
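Horizontal scaling in either kind of database needs a rule for deciding which node holds a given record. A minimal sketch of one such rule, hash-based sharding, is shown below. The function name `shard_for_key` and the sample keys are hypothetical, and real systems often use consistent hashing or range partitioning instead.

```python
import hashlib


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a record key to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash: Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):  # hypothetical keys
        print(user_id, "-> shard", shard_for_key(user_id, 4))
```

The same key always maps to the same shard, so reads know where to look. One drawback of plain modulo hashing is that changing `shard_count` remaps most keys, which is a common reason to prefer consistent hashing when nodes are added or removed often.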

Type of NoSQL Database	Description
Key-value Database	Stores data as key-value pairs using hash tables, with the key serving as a unique or primary key, and values being anything from simple scalar values to complex objects. Efficient for session-oriented applications, such as web applications. Example databases include Amazon DynamoDB, Redis, and Memcached DB.
Document Database	Designed to store and retrieve documents in formats like XML, JSON, BSON, etc. Documents are composed of a hierarchical tree data structure that can include maps, collections, and scalar values. Suitable for unstructured catalog data and content management applications. Example databases include MongoDB and Google Cloud Firestore.
Graph Database	Uses the graph data structure to store data, where nodes represent entities and edges show relationships between entities. Allows storing data once and interpreting it differently based on relationships. Suitable for social applications, data regulation and privacy, machine learning research, and financial services-based applications. Example databases include Neo4j, OrientDB, and InfiniteGraph.
Columnar Database	Stores data in columns instead of rows, enabling quick and efficient access to all entries in a database column. Suitable for workloads with many aggregation and data analytics queries. Example databases include Cassandra, HBase, Hypertable, and Amazon Redshift.

Terms:

Cloud computing is running applications on computing resources managed by cloud providers. When using cloud computing, we do not have to purchase or manage hardware ourselves.

Serverless computing builds on the convenience of cloud computing with even more automation. It enables developers to build and run applications without having to provision cloud servers. The serverless provider handles the infrastructure and automatically scales the computing resources up or down as needed. This provides a great developer experience since developers can focus on the application code itself, without having to worry about scaling.

Best practices in cloud applications - Azure Architecture Center

Learn best practices for building reliable, scalable, and secure applications in the cloud. See resources on caching, partitioning, monitoring, and other areas.

https://learn.microsoft.com/en-us/azure/architecture/best-practices/index-best-practices