Sunday, September 4, 2016

Locking : Mutex vs Spinlocks

http://algorithmsandme.in/2014/01/locking-mutex-vs-spinlocks/



Difference between mutex vs spinlocks

I would start with a very basic technical interview question, that is : What is the difference between mutex and spinlock?
First of all, let’s understand what is a mutex?
Mutex is kind of a key to use any resources in the system. If you have mutex, you can use the resource, if you don’t wait till the mutex is released. The process goes to wait queue for that particular resource. 
What is spin-lock? Spin lock is a mechanism in which the process needing a resource poll the lock on resource till it gets it. It is also called as busy-waiting. Process will be busy in a loop till it gets the resource.
So we get the first difference there itself, the way both wait for the resource lock. In case of mutex, process goes in wait state and does not use processor while in spin-lock, process keeps on looping and hence uses the processor even while waiting.
Second difference comes from : when should we use spin-lock or mutex?
Spin-locks are best used when a piece of code cannot go to sleep state. Best example of such code would be interrupts request handlers.
Mutexes are best used in user space program where sleeping of process does not affect the system in catastrophic way.
Spin locks make sense when we have multi-processor systems or we have uni-processor system with preemption enabled, spin locks used in uni-processor systems without preemption enabled, may lead to deadlock where process goes on spinning forever. Following table summarizes the above points.

Mutex Vs Spinlock

Criteria
Mutex
Spinlock
Mechanism
Test for lock.
If available, use the resource
If not, go to the wait queue
Test for lock.
If available, use the resource.
If not, loop again and test the lock till you get the lock
When to use
Used when putting process is not harmful like user space programs.
Use when there will be considerable time before the process gets the lock.
Used when the process should not be put to sleep like Interrupt service routines.
Use when lock will be granted in reasonably short time.
Drawbacks
Incur process context switch and scheduling cost.
Processor is busy doing nothing till lock is granted, wasting CPU cycles.
In next post we would list some the APIs which are used for using mutex and spinlocks in Linux kernel.

Saturday, September 3, 2016

System Design: BigTable

1.4S analysis

scenario: read / write
service: client - BigTable - GFS
store: NoSQL
scale:

2.A workable solution

1.Read:

2.Write:


3.A better and detailed solution

1.Read:

[Explain for details]
  1. Distributed lock server
    1. provide lock service
      (Chubby in Google: http://bit.ly/1Pukiyt
      Zookeeper in Apache: http://bit.ly/1TOWIsR)
    2. manage metadata (the consistent hashmap)
  2. Master, lock server and user
    1. Master: in this framework, master does not maintain the consistent hashmap anymore, it 1) manage the servers health and 2) shard the files
    2. Lock Server: 1) maintain the metadata 2)lock the key
    3. Communication:
      Communication (user<->lock server): lock, slave server id
      Communication (lock server<->master): when a slave server fail, the master needs to assign a new one, copy the data and update the consistent table in the lock server.
      No communication (user<->master)
  3. Tablet
  4. SSTable
  5. Optimize read from GFS
    1. index
    2. bloom filter


2.Write:


  1. The writing process in tablet server:
    1. write-ahead log: http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
    2. write memory
    3. asynchronously write the SSTable in memory to GFS
      1. sort
      2. index
      3. write bloom filter