📋
space
  • SDE Interview
  • Multi-threading
    • Mutex vs Semorphore
    • Thread vs process
  • Design Pattern
  • Java
    • Polymorphism
    • Encapsulation
    • Inheritance
    • Override vs overload
  • MySQL
    • DB transaction
  • Data Structure
    • Design hashset
    • AVL / red black tree
    • LinkedList vs arrayList
    • HashMap vs HashTable
    • Binary Tree
    • Heap
  • System Design
    • Session Cookie
    • GFS / BigTable / MapReduce
    • Zookeeper
    • gRPC vs thrift
    • Amazon RDS vs Oracle
    • Microservices
    • REST vs RPC
    • Database design
    • idempotent in HTTP
    • Optimistic locking / Pessimistic locking
    • Partitioning / Sharding data
    • Consistent Hashing
    • Case Study
      • Design Delay Task scheduler
      • Design View Count
      • Design Twitter
      • Design Web Crawler
      • Design Uber
      • Design Netflix
      • Design Google Doc
      • Design Monitoring System
      • Design Dropbox
      • Distributed Lock
      • Design Instagram
      • Design Yelp
      • Design Amazon
      • Design Google Search
      • Distributed Database System Key Value Store
      • Design Facebook message / Whatsapp
      • Design Logging Systems
      • Design Movie booking system
      • Design Google Autocomplete Feature
      • Design Twitter Search
    • Message Broker
      • Kafka
    • Design Data Intensive Application
      • Chapter 8
    • SQL vs NoSQL
      • Cassandra
      • MongoDB vs Cassandra vs MySQL vs HBase
    • TCP vs UDP
    • Load Balancer
    • Cache
      • Memcached
      • Redis
    • DNS
    • CDN
    • Strong consistency vs eventual consistency
    • Scalability
Powered by GitBook
On this page
  • Single Node
  • Scientific Computing: a giant compter
  • Commercial Data Center:
  • Unreliable network
  • network delays are chosen by design
  • Unreliable Clock
  • Two type of clock
  • Why assumption based on clock is very dangerous
  • Caveat of last write win (Cassandra, can not used for bank transaction) / DynamoDB can write client config resolution
  • Process pause
  • Redlock: fault tolerant distributed locks
  • A real case in HBase (Redis)
  • Truths
  • Lies: Byzantine failures - unreliable node / network
  • Different models in distributed system

Was this helpful?

  1. System Design
  2. Design Data Intensive Application

Chapter 8

Truth knowledge and Lies, after this, you should know which design does not work fundamentally

PreviousDesign Data Intensive ApplicationNextSQL vs NoSQL

Last updated 4 years ago

Was this helpful?

Single Node

  • Single computer system, result is deterministic, hardware problem - > blue screen, this is a good choice(deliberate), we prefer to a computer to crash completely rather than returning a wrong result

Scientific Computing: a giant compter

Commercial Data Center:

  • Cost efficiency matters, use cheap commodity hardwares

  • something is always broken

Unreliable network

  • request may have been lost

  • request may be waiting in a queue

  • the remote node may have failed

  • the remote node may have temporarily stopped

  • no absolute reliable fault detector

  • widely acceptable indicator is timeout, could not tell if it's node failure or network failure, retry few times when possible

  • a request could be called several times and results in disastrous outcomes. Idempotent. call 多次,返回相同结果

  • 如何让一个system 变得idempotent?设计API的时候

  • network partitions: 写到A, 读B

  • without testing, the network failure could break the fundamental assumptions

  • network partition naturally happens, we can only resolve it when it happens, we can not prevent it from happening. CAP theory -> CP or AP (consistency, availability and partition)

network delays are chosen by design

  • traditionally telephones lines are almost guaranteed no delays, focus on latency

  • network use patterns are burst mode, system design is focusing on the maximum throughput / resource utilization

  • similar thing happens with retail time OS

Unreliable Clock

  1. has this request time out yet?

  2. how many queries per second did this service handle on average in the last five minutes?

  3. how long did the user spend on our site?

  4. when was this article published?

  5. at what date and time should the reminder email be sent?

  6. when does this cache entry expire?

  7. what is the timestamp on this error message in the log file?

Most commonly used mechanism is the Network Time Protocol (NTP) Individual NTP is not reliable.

Two type of clock

  1. time-of-day. It returns the current date and time according to some calendar. Java System.currentTimeMillis() and Linux: clock_gettime(CLOCK_REALTIME)

    1. highly unreliable, could jump forward, backward.

    2. in google spanner, time drift assumption is 30 secs per 24 hrs

  2. . monotonic clock is suitable for measuring a duration (time interval).

    1. clock_gettime(CLOCK_MONOTONIC) on linux and System.nanoTime() in Java

    2. duration could be measured relatively accurate

Why assumption based on clock is very dangerous

  • most of the components in distributed system provides accurate deterministic result of crash. Not clock

Caveat of last write win (Cassandra, can not used for bank transaction) / DynamoDB can write client config resolution

  • 后来的request的time clock比其他的慢,request会被覆盖

  • distributed id generator (UUID) -> very difficult, mac id + machine local time, different machine has different time clock

Process pause

  • Java virtual machine has a garbage collector which occasionally needs to stop all running threads. "Stop-the-world" GC pauses have sometimes been know to last for several minutes.

  • When the OS context-switches to another thread, or when the hypervisor switches to a different virtual machine, the current running thread can be paused at any arbitrary point in the code.

  • if the application performs synchronous disk access, a thread may be paused waiting for a slot disk I/O operation to complete, Java class loader lazy load first time.

  • Thrashing: swapping to disk (paging)

  • real time OS: every library call has a worst time guaranteed.

Redlock: fault tolerant distributed locks

how to do distributed locking: martin kleppmann

A real case in HBase (Redis)

ZooKeeper transaction ID

Truths

Quorum

Lies: Byzantine failures - unreliable node / network

Different models in distributed system

what's the 99th percentile response time of this service?

https://www.i3geek.com/archives/841
https://engineering.linkedin.com/performance/who-moved-my-99th-percentile-latency