Wednesday, October 12, 2016

Basic Idea about Jackson’s classes and other important things:
1. class ObjectMapper:
  This mapper (or, data binder, or codec) provides functionality for converting between Java objects (instances of JDK provided core classes, beans), and matching JSON constructs. It will use instances of JsonParser and JsonGenerator for implementing actual reading/writing of JSON.
  <e.g.> mapper.writeValueAsString(object)
  mapper.readValue(json, Object.class)
2. class JsonGenerator and JsonParser:
  JsonGenerator is responsible to write json text, while JsonParser is responsible to parse the 就json text.
3. class StdSerializer and StdDeserializer:
  When JsonGenerator wants to generate json text, it needs to call all kinds of serializers to convert objects to corresponding json text. StdSerializer is the base class used by all standard serializers, and can also be used for custom serializers.
  StdDeserializer plays similar role in deserializing process.
4. class JsonNode:
  Jackson has a built-in tree model which can be used to represent a JSON object. Jackson's tree model is useful if you don't know how the JSON you will receive looks, or if you for some reason cannot (or just don't want to) create a class to represent it.
  JSON tree model is represented by JsonNode.
5. Jackson Annotations:
  The Jackson JSON toolkit contains a set of Java annotations which you can use to influence how JSON is read into objects, or what JSON is generated from the objects.
  <e.g.,> @JsonProperty, @JsonCreator
Serialization: convert an object to a Json text:
———————————————————————————————————
ObjectMapper mapper = new ObjectMapper();
// write json to file: mapper.writeValue(File resultFile, Object value);
// generate a string: String json = mapper.writeValueAsString(value);
// write json to stream: mapper.writeValue(OutputStream out, Object value);
———————————————————————————————————
Let first take a look at a simplified process to serialize a standard java class.

For a user-defined class, how can JsonGenerator knows which field should be included in json text, and what should be its external property name (the key), there are four ways:
1. Use annotation @JsonProperty(“key_name”):@JsonPropery (also indicates that property is to be included) is used to indicate external property name, name used in data format (JSON or one of other supported data formats)
  The annotation should be put in front of fields.e.g.,
  public class PersonValue {
  @JsonProperty(“id”)
  public long personId = 0;
  
  @JsonProperty(“name”)
  public String name = null;
  }
  //The Json text will be {“id”:0, “name”:null}
2. Use annotation @JsonGetter(“key_name”):
  @JsonGetter is used to tell Jackson that a certain field value should be obtained from calling a getter method instead of via direct field access.
  The annotation put in front of a method. The method’s return value is the value of json.e.g.,
  public class PersonValue {
  @JsonProperty(“id”)
  public long personId = 0;
  public String name = null;
  
  @JsonGetter(“name”)
  public String getName(){
  return this.name;
  }
  }
  //The Json text will be {“id”:0, “name”:null}
  [Attention] @JsonProperty and @JsonGetter can be used together.
3. Use annotation @JsonValue:
  The Jackson annotation @JsonValue tells Jackson that Jackson should not attempt to serialize the object itself, but rather call a method on the object which serializes the object to a JSON string.Deserialization: convert an Json text to an object:
  e.g.,
  public class PersonValue {
  
  public long personId = 0;
  public String name = null;
  
  @JsonValue
  public String toJson(){
  return “{\”id\”:” + this.personId + "," + “\”name\”:” +this.name +”}";
  }
  }
4. Self-define serializer
  e.g.,
  @JsonSerialize(using = PersonValueSerializer.class)
  public class PersonValue {
  public long personId = 0;
  public String name = null;
  }
  
  class PersonValueSerializer extends StdSerializer<PersonValue>{
  public PersonValueSerializer(){
  super(PersonValue.class, true);
  }
  
  @Override
  public void serialize(PersonValue person, JsonGenerator jsonGenerator, SerializerProvider serializerProvider) throws IOException, JsonGenerationException {
  //method-1: write person as an json array
  jsonGenerator.writeStartArray();
  jsonGenerator.writeNumber(this.personId);
  jsonGenerator.writeString(this.name);
  jsonGenerator.writeEndArray();
  
  //method-2: write person as an json object jsonGenerator.writeStartObject();
  jsonGenerator.writeNumberField(“id”, this.personId);
  jsonGenerator.writeStringField(“name”, this.name);
  jsonGenerator.writeEndObject();
  }
  }
  When ObjectMapper calls the function “writeValueAsString(person)”, it knows it should use the PersonValueSerializer to serialize the object.
  If we use the method-1, the json text will be like “[271, “Hongkai Wu"]”.
  If we use the method-2, the json text will be like “{“id”:271, “name”:”Hongkai Wu"}
Deserialization: convert a Json text to an object:
Similarly, let's take a look at simplified process of deserialization

There are three ways to construct a custom defined class' object from Json text
1. Use annotation @JsonCreator and @JsonProperty in a constructor
  @JsonCreator: is used to help Jackson construct the Java object from Json text.
  The annotation should be put in front of a class constructor.e.g.,
  public class PersonValue {
  
  public long personId = 0;
  public String name = null;
  
  @JsonCreator
  public PersonValue(@JsonProperty(“id”) id,
  @JsonProperty(“name”) name){
  this.id = id;
  this.name = name;
  }
  }
  When deserializing the json text, ObjectMapper knows it should call the constructor with the annotation @JsonCreator. It will map the @JsonProperty with the field name and read the field value.
2. Use annotation @JsonCreator and @JsonSetter(“key_name”)
  @JsonSettoer is used to tell Jackson that is should match the name of this setter method to a property name in the JSON data, when reading JSON into objects.
  The annotation should be put in front of a method
  e.g.,
  public class PersonValue {
  
  public long personId = 0;
  public String name = null;
  
  @JsonCreator
  public PersonValue( //@JsonProperty(“id”) id,
  @JsonProperty(“name”) name){
  this.id = id;
  this.name = name;
  }
  
  @JsonSetter(“id”)
  public void setID(long id){
  this.id = 0;
  }
  }
  When deserializing the json text, ObjectMapper knows it should call the constructor and the function with the annotation @JsonSetter to construct the object. One thing worthy attention here is that when @JsonProperty(“id") in the constructor and @JsonSetter(“id") exist simultaneously, as shown in the above example, the ObjectMapper will ignore the @JsonSetter and use @JsonProperty to construct the field value.
3. Custom Defined Deserializer
  For a custom defined deserializer, the deserialization process is shown in the graph below:
  
  important API:
  jsonParser.nextToken(): return the next JsonToken object
  jsonParser.getCurrentName(): get the field name
  jsonParser.readValueAs(ObjClass.class): read the value
  
  jsonToken.equals(JsonToken.FIELD_NAME): return true if the token is a field
  Let's read the following code example to understand the process:
  e.g.,
  //@JsonSerialize(using = PersonValueSerializer.class)
  @JsonDeserialize(using = PersonValueDeserializer.class)
  public class PersonValue {
  @JsonProperty(“id”) public long personId = 0;
  @JsonProperty(“name”) public String name = null;
  @JsonProperty(“friends”) public List<String> friends = new ArrayList<>();
  }
  
  class PersonValueDeserializer extends Deserializer<PersonValue>{
  public PersonValueDeserializer(){
  super(PersonValue.class);
  }
  
  @Override
  public PersonValue deserialize(JsonParser jsonParser, DeserializationContext deserializationContext) throws IOException, JsonProcessingException {
  PersonValue p = new PersonValue();
  //build an empty object, fill the fields later using JsonParser
  
  while(!jsonParser.isClosed()){
  JsonToken token = jsonParser.nextToken();
  //JsonParser is like an iterator. The deserializing process is to read the tokens one by one.
  
  if(token.equals(JsonToken.FIELD_NAME)){
  String fieldName = jsonParser.getCurrentName();
  //now we know the field name, next we need to know the field value
  switch(fieldName){
  case “id”:
  jsonParser.nextToken(); //now we move the iterator to the field value
  p.id = jsonParser.readValueAs(Long.class);
  break;
  case “name”:
  jsonParser.nextToken();
  p.name = jsonParser.readValueAs(String.class);
  break;
  case “friends”:
  //the value of this field is an array, so the first token is JsonToken.START_ARRAY "["
  jsonParser.nextToken();
  while(jsonParser.nextToken() != JsonToken.END_ARRAY){
  //the last token for the field “friends” should be JsonToken.END_ARRAY ]
  p.friends.add(jsonParser.readValueAs(String.class);
  }
  //read all friends name until jsonParser’s token is END_ARRAY
  break;
  }
  }
  return p;
  }
  }
Other important annotations:
1. @JsonInclude:
  tells Jackson only to include properties under certain circumstances.
  put in front of class defn.e.g.,
  @JsonInclude(JsonInclude.Include.NON_EMPTY)
  public class PersonValue {
  …...
  }
2. @JsonIgnore:
  is used to tell Jackson to ignore a certain property (field) of a Java object.
  put in front of fields.e.g.,
  public class PersonValue {
  @JsonIgnore
  public long personId = 0;
  
  public String name = null;
  }
Jackson TreeModel

Sunday, September 4, 2016

Locking : Mutex vs Spinlocks

http://algorithmsandme.in/2014/01/locking-mutex-vs-spinlocks/

Difference between mutex vs spinlocks

I would start with a very basic technical interview question, that is : What is the difference between mutex and spinlock?

First of all, let’s understand what is a mutex?

Mutex is kind of a key to use any resources in the system. If you have mutex, you can use the resource, if you don’t wait till the mutex is released. The process goes to wait queue for that particular resource.

What is spin-lock? Spin lock is a mechanism in which the process needing a resource poll the lock on resource till it gets it. It is also called as busy-waiting. Process will be busy in a loop till it gets the resource.

So we get the first difference there itself, the way both wait for the resource lock. In case of mutex, process goes in wait state and does not use processor while in spin-lock, process keeps on looping and hence uses the processor even while waiting.

Second difference comes from : when should we use spin-lock or mutex?

Spin-locks are best used when a piece of code cannot go to sleep state. Best example of such code would be interrupts request handlers.

Mutexes are best used in user space program where sleeping of process does not affect the system in catastrophic way.

Spin locks make sense when we have multi-processor systems or we have uni-processor system with preemption enabled, spin locks used in uni-processor systems without preemption enabled, may lead to deadlock where process goes on spinning forever. Following table summarizes the above points.

Mutex Vs Spinlock

Criteria	Mutex	Spinlock
Mechanism	Test for lock. If available, use the resource If not, go to the wait queue	Test for lock. If available, use the resource. If not, loop again and test the lock till you get the lock
When to use	Used when putting process is not harmful like user space programs. Use when there will be considerable time before the process gets the lock.	Used when the process should not be put to sleep like Interrupt service routines. Use when lock will be granted in reasonably short time.
Drawbacks	Incur process context switch and scheduling cost.	Processor is busy doing nothing till lock is granted, wasting CPU cycles.

In next post we would list some the APIs which are used for using mutex and spinlocks in Linux kernel.

Mutex Vs Semaphore

Saturday, September 3, 2016

System Design: BigTable

1.4S analysis

scenario: read / write
service: client - BigTable - GFS
store: NoSQL
scale:

2.A workable solution

1.Read:

2.Write:

3.A better and detailed solution

1.Read:

[Explain for details]

Distributed lock server

provide lock service
(Chubby in Google: http://bit.ly/1Pukiyt
Zookeeper in Apache: http://bit.ly/1TOWIsR)
manage metadata (the consistent hashmap)

Master, lock server and user

Master: in this framework, master does not maintain the consistent hashmap anymore, it 1) manage the servers health and 2) shard the files
Lock Server: 1) maintain the metadata 2)lock the key
Communication:
Communication (user<->lock server): lock, slave server id
Communication (lock server<->master): when a slave server fail, the master needs to assign a new one, copy the data and update the consistent table in the lock server.
No communication (user<->master)

Tablet
SSTable
Optimize read from GFS

index
bloom filter

2.Write:

The writing process in tablet server:

write-ahead log: http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
write memory
asynchronously write the SSTable in memory to GFS

sort
index
write bloom filter

Thursday, August 25, 2016

System Design: Twitter

Standard for evaluation

可行解 Work Solution 25%

特定问题 Special Case 20%

分析能力 Analysis 25%

权衡 Tradeoff 15%

知识储备 Knowledge Base 15%

4S analysis

Scenario

Features: list all the features and sort them by priority
DAU -> QPS: based on daily active user (DAU), calculate the query per second (QPS, read QPS, write QPS), including avg QPS and peak QPS (usually about 2 or 3 times of avg QPS)
- DAU: assuming ~100M
- QPS: DAU * query estimation per day / 86400 (Note:86400s = 24h)
  QPS = 100M*5/86400
Interfaces

Service: design the service based on QPS

Split the system into several subsystems

Storage

How to store/access data: SQL / NoSQL / File System
- SQL: transaction, complex table with multiple columns
- NoSQL: graph relationship (e.g., friendship)
- File System: picture, video, etc.
Schema: details the table structure

Scale

Sharding / Optimize / Special Case

Design Twitter

Scenario:
- Register / Login
- User Profile Display / Edit
- Upload Image / Video
- Search
- Post / Share a tweet*
- Timeline / News Feed*
- Follow / Unfollow a user*
Service
- User service: register/login/user profile display/edit
- Search service: search
- Media service: upload image/video
- Tweet service: post/share/timeline/newsfeed
- Friendship service: follow/unfollow
Storage:

Decide where to store:

User service: SQL
Search service: index -> File System
Media service: File System
Tweet service: NoSQL (tweet content -> NoSQL)
Friendship service: NoSQL (it is also OK to put it in SQL, but NoSQL is better)

Details the schema of the table

Wednesday, August 24, 2016

Advance Tree Algorithm 1 Summary - Leetcode

1. Serialization

[Method 1] pre-order+in-order or in-order+post-order
[Problem] same nodes will cause problems in deserialization.
e.g., pre-oder = [5,5,5,5], in-order = [5,5,5,5] ->no way to figure out what exactly the tree looks like.

[Method 2] level-order serialization

[1,2,3,4,null,5,6,null,null,null,null]

(You can also trim all the "null" in the end)

2. Binary Search Tree

1. Predecessor and Successor

java.util.TreeMap support API ceiling(value) and floor(value)
[Question]
270. Closest Binary Search Tree Value

2. other BST questions

222. Count Complete Tree Nodes
230. Kth Smallest Element in a BST

Tuesday, August 23, 2016

Compare DFS and BFS

BFS Summary - Leetcode

0.Basic Concepts

Breadth First Search (BFS) is a method for traversing or searching tree or graph data structures.

1.When do we apply BFS?

It is frequently used to
1) explore some features/characteristics in the tree/graph.
[typical examples]
[101] Symmetric Tree
[261] Graph Valid Tree

2) search for nodes/level/connected components
[typical examples] search for nodes/connected components
  [133] Clone Graph
  [200] Number of Islands
  [323] Number of Connected Components in an Undirected Graph
  [286] Walls and Gates

[typical examples] level related problem
[102] Binary Tree Level Order Traversal
[103] Binary Tree Zigzag Level Order Traversal
[107] Binary Tree Level Order Traversal II
[199] Binary Tree Right Side View

3) shortest steps(path)

Utilizing BFS to find the shortest path involves a lot of details, so let's use an example to discuss it.

Example:

The solution needs to:

1)Utilize a matrix (int[][] distance) to record the shortest distance from node A to node (i, j). If we cannot reach the node (i, j), marked it as -1.

2)Apply BFS to calc distance[][]. All distance[i][j] except the the start point should be initialized as -1; the start point distance is initialized as 0.

3)When updating the distance matrix using BFS, for each dequeued node, we will enqueue its unvisited grass neighbor (which means distance of an obstacle will always be -1. This also makes sense because -1 means unreachable):

Here we show the pseudo codes for this BFS process (which is also classical BFS codes):

Follow-up 1:

What if we can move in 8 directions? (up, upright, right, ...)

If you run the above code, you will meet a bug like this:

Why?

In the round 0, you dequeue A, and enqueue all nodes around A, which are (0,0), (0,1), (1,1), (2,1), (2,0)

In the round 1, you first dequeue (0,0), as we can see, for (0,0), its unvisited neighbors are (0,1), (1,1), so you will enqueue these two nodes.

This means in the round 2, you will pop (0,1) and (1,1) again, and set their distances value as 2.

Now we see our problem:

If the nodes in the queue can be each other’s neighbors, the neighbor can be added to the queue again, which will cause wrong distance.

How to solve it?

distance[i][j] == -1 means this node is unvisited. So before we update node (i,j)’s distance value, check whether distance[i][j] == -1

So we new code is:

Follow-up 2:

If we want to find the minimum path, rather than only find the minimum steps to from one node to another, how to achieve this?
There are two ways:

1. We can store Tuple<Node, Path> into the queue. -> O(n^2)

2. We can use BFS to find how long is the shortest path (assume k). Then we apply DFS to find the path (because we know path length = k, for a path longer than k, we can stop earlier). -> O(n), but DFS has stack overflow problem.

[typical questions]
  [111] Minimum Depth of Binary Tree
  [126] Word Ladder II
  [310] Minimum Height Trees

  [317] Shortest Distance from All Buildings

Hongkai's Blog