Tutorial: Data Structure

Serverless Database Data Structure

The [[service]] real-time database service manages an application's data as a tree, similar to a JSON object.

Each data node (sometimes also referred to as location) in this tree has:

  • a name (or key), which corresponds to the key of a property in a JSON object,
  • a path, which corresponds to the list of the names of the nodes you must follow to reach this node starting from the root node.

Every piece of data inside your database is attached to a node referred to by its path. Here is an example:

  • ⇐ root node
    • title: "My super chat !" ⇐ node at path /title
    • users
      • robert ⇐ node at path /users/robert
        • name
          • first: "Robert"
          • last: "Martin"
      • john
        • name
          • first: "John"
          • last: "Doe" ⇐ node at path /users/john/name/last

Nodes can store either primitive data (strings, numbers or booleans) or nested child nodes. Note that a node cannot contain both primitive data and children: only leaf nodes in the tree can contain primitive data. This tree model requires you to structure your data hierarchically.

In the example above, the node at path /users represents a list of users. The data for users "robert" and "john" is stored at paths /users/robert and /users/john, respectively. The nodes robert and john are said to be children of the node users, and the node users is said to be the parent of the nodes robert and john. Nodes can nest as deeply as needed (within the depth limit given under Limitations). For example, the string representing the last name of user "robert" is located at path /users/robert/name/last.
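To make the notion of path concrete, here is a minimal sketch that models the example tree as nested Python dictionaries and resolves a path segment by segment. It is only a local illustration: the helper names `split_path` and `get_node` are hypothetical and are not part of the [[service]] SDK.

```python
# The example tree from this page, modelled as nested dicts (local illustration only).
tree = {
    "title": "My super chat !",
    "users": {
        "robert": {"name": {"first": "Robert", "last": "Martin"}},
        "john": {"name": {"first": "John", "last": "Doe"}},
    },
}

def split_path(path):
    """Split '/users/robert' into ['users', 'robert']; '/' gives []."""
    return [segment for segment in path.split("/") if segment]

def get_node(root, path):
    """Follow the node names one by one; return None if a node is missing."""
    node = root
    for name in split_path(path):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node

print(get_node(tree, "/users/robert/name/last"))  # Martin
print(get_node(tree, "/title"))                   # My super chat !
```

The empty path ("/") resolves to the root node itself, which matches the unnamed root in the example.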

When you create a new nested node, any parent node that does not yet exist is created automatically. For example, to add another user, you can set the string "Mike" at path /users/mike/name/first; the parent nodes /users/mike and /users/mike/name are then created automatically, resulting in the following database:

  • ⇐ root node
    • title: "My super chat !"
    • users
      • robert
        • name
          • first: "Robert"
          • last: "Martin"
      • john
        • name
          • first: "John"
          • last: "Doe"
      • mike
        • name
          • first: "Mike"

If you then remove the newly created node (at path /users/mike/name/first), the parent nodes at /users/mike/name and /users/mike, which no longer contain any children with data, are automatically removed, and the database reverts to its state before the new user was added.
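The automatic creation and removal of parent nodes can be sketched on a local model made of nested Python dictionaries. This is an illustration of the behaviour described above, not the service's implementation; `set_value` and `remove_value` are hypothetical names, and overwriting a primitive parent with a child container is an assumption of this sketch.

```python
def split_path(path):
    return [segment for segment in path.split("/") if segment]

def set_value(root, path, value):
    """Write a primitive at `path`, creating missing parent nodes on the way."""
    *parents, leaf = split_path(path)
    node = root
    for name in parents:
        child = node.get(name)
        if not isinstance(child, dict):
            # Assumption: a missing (or primitive) parent becomes a container.
            child = {}
            node[name] = child
        node = child
    node[leaf] = value

def remove_value(root, path):
    """Remove the node at `path`, then prune ancestors left without children."""
    names = split_path(path)
    if not names:
        return
    chain = [root]  # chain[i] is the node that contains names[i]
    for name in names[:-1]:
        child = chain[-1].get(name)
        if not isinstance(child, dict):
            return  # nothing stored at this path
        chain.append(child)
    chain[-1].pop(names[-1], None)
    # Walk back up and drop every parent that became empty (never the root).
    for depth in range(len(names) - 1, 0, -1):
        if chain[depth]:
            break
        del chain[depth - 1][names[depth - 1]]

db = {"title": "My super chat !", "users": {"robert": {"name": {"first": "Robert"}}}}
set_value(db, "/users/mike/name/first", "Mike")     # creates /users/mike and /users/mike/name
remove_value(db, "/users/mike/name/first")          # prunes both again
```

After the two calls, `db` holds the same content as before the write, mirroring the revert described above.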

API

The [[service]] SDK provides dedicated methods to read and write data (see the « read data » and « write data » sections).

In addition, this section provides some recommendations to properly design data models for [[service]] applications.

Key ordering

Each non-leaf node in the database keeps its children in a defined order. During read operations, the children of a data node are fetched in this order.

The underlying key order is defined as follows:

  • Keys that are parsable as integers are ordered before all other keys.
  • Integer keys are ordered by their numeric value, following the natural order on integers.
  • Integer keys representing the same number are ordered from the one with the fewest leading zeros to the one with the most.
  • All other keys are considered as strings and ordered in lexicographical order.

Example: 0 < 1 < 01 < 001 < 7 < 09 < 72 < 521 < 1000 < aa < bb
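The ordering rules can be approximated locally with a sort key, which may help when you need to predict the order in which children will be returned. This is a sketch under assumptions: `order_key` is a hypothetical helper, "parsable as integers" is taken to mean a non-empty run of ASCII digits, and "lexicographical order" is taken to be Python's code point comparison of strings.

```python
import re

INTEGER_KEY = re.compile(r"[0-9]+")  # assumption: digits only, no sign

def order_key(key):
    """Sort key approximating the documented child order."""
    if INTEGER_KEY.fullmatch(key):
        # Integer keys come first, by numeric value; for equal values the
        # shorter spelling (fewer leading zeros) comes first.
        return (0, int(key), len(key), "")
    # All other keys come after, compared as plain strings.
    return (1, 0, 0, key)

keys = ["bb", "521", "09", "001", "aa", "1", "72", "0", "1000", "01", "7"]
print(sorted(keys, key=order_key))
# ['0', '1', '01', '001', '7', '09', '72', '521', '1000', 'aa', 'bb']
```

The printed list matches the example given above.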

Limitations

Charset limitations

A data node name (or key) may contain any Unicode character except:

  • . (period)
  • $ (dollar sign)
  • [ (left square bracket)
  • ] (right square bracket)
  • # (hash or pound sign)
  • / (forward slash)
  • ASCII Control Characters (0-31 and 127)
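A client-side check of these character rules could look like the following sketch. `is_valid_key` is a hypothetical helper, not an SDK function, and rejecting the empty string is an assumption that the list above does not state.

```python
FORBIDDEN_CHARS = set(".$[]#/")

def is_valid_key(key):
    """Return True if `key` contains none of the forbidden characters."""
    if not key:
        return False  # assumption: an empty name cannot form a path segment
    for ch in key:
        code = ord(ch)
        # ASCII control characters are 0-31 and 127.
        if ch in FORBIDDEN_CHARS or code < 32 or code == 127:
            return False
    return True

print(is_valid_key("robert"))      # True
print(is_valid_key("john.doe"))    # False (contains a period)
print(is_valid_key("tab\there"))   # False (contains a control character)
```

Running such a check before a write lets an application report a bad key early instead of relying on the write being rejected.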

Depth limitations

The maximum depth of the database tree is 32.
In other words, a node cannot have more than 31 ancestor nodes; equivalently, the path of a node is limited to 32 segments.

Width limitations

The maximum name length of a node is 256 characters.

The maximum number of children of a node is 50,000.

The maximum key-set size of a node (sum of the lengths of the names of all its children) is 10 MiB.

If you need a node with more children, you have to add intermediary nodes between the parent node and the desired child nodes.
A common technique is to generate these intermediary nodes by splitting a hash of the target child node's name (typically SHA-1) into several parts.
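The hash-splitting technique can be sketched as follows. The function name `sharded_path` and the choice of two intermediary levels of two hexadecimal characters each are illustrative assumptions; the right number and width of levels depends on how many children you expect.

```python
import hashlib

def sharded_path(parent, child, levels=2, width=2):
    """Build a path with intermediary nodes taken from the SHA-1 of the child name."""
    digest = hashlib.sha1(child.encode("utf-8")).hexdigest()
    # Cut the first `levels * width` hex characters into `levels` parts.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join([parent.rstrip("/"), *parts, child])

# The child ends up at /users/<2 hex chars>/<2 hex chars>/robert
print(sharded_path("/users", "robert"))
```

With a width of two hexadecimal characters, each intermediary level has at most 256 children, so the children of the original parent are spread over up to 65,536 buckets. Because the hash is deterministic, the same name always maps to the same path, so a reader can recompute the path from the name alone. Note that each intermediary level also counts toward the depth limit.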

Weight limitations

The maximum value size of a leaf node is 10 MiB.
In practice, this limit only affects string values, since the representations of number and boolean values are always far below this threshold.

The maximum size of data that can be written at a node in a single operation (see « write data » section) is 10 MiB.

The maximum size of data that can be read at a node in a single operation (see « read data » section) is 10 MiB.
In particular, if the whole data attached to a node exceeds this limit (typically because of many children and sub-children), it cannot be fetched in a single read; it has to be fetched in several smaller reads (typically by reading each child node separately).
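The idea of splitting one oversized read into several smaller ones can be illustrated on an in-memory model of the tree. This is only a sketch: `plan_reads` is a hypothetical helper, and measuring size as the length of the UTF-8 encoded JSON serialization is an assumption, since this page does not define how the service computes data size.

```python
import json

TEN_MIB = 10 * 1024 * 1024

def plan_reads(node, path="", limit=TEN_MIB):
    """Return the paths to read so that each read stays within `limit` bytes."""
    size = len(json.dumps(node).encode("utf-8"))  # assumed size measure
    if size <= limit or not isinstance(node, dict):
        # Small enough (or a leaf, which cannot be split further).
        return [path or "/"]
    paths = []
    for name, child in node.items():
        # Too large: descend and read each child subtree separately.
        paths.extend(plan_reads(child, f"{path}/{name}", limit))
    return paths

sample = {"a": {"x": "1234567890", "y": "1234567890"}, "b": "hi"}
print(plan_reads(sample, limit=30))  # ['/a/x', '/a/y', '/b']
print(plan_reads(sample))            # ['/']
```

In a real application the data is not available locally before reading it, so the split is usually decided from what you know about the data model (for example, always reading a large list one child at a time).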