Data Models And Query Languages
Relational VS. Document Databases
Compare in Data Model
| Compare | Relational DB | Document DB |
|---|---|---|
| Document-like Data structure | ❌Shredding(splitting doc to multiple tables), cumbersome schemas, | ✅ Good |
| Use many-to-many relatinship | ✅Good with join | ❌poor support for join, need to write complex code in application, also poor performance compare to join in DB |
| Highly connected data | ✅Acceptable | ❌Awkward, maybe Graph-like models |

Schema Flexibility
Document DB: implicit-Schema(Schema-less), Schema on read
Relational DB: Explicit-Schema, Schema on write
Performance On Change Data schema:
Document DB: fast, suitable for heterogeneous data(composed of parts of different kinds)
Relational DB: slow when update and alter table, especially with large tables
Data Locality For Queries
If app often accesses the entire document, there is a performance advantage to this storage locality.
For relational DB, data is splitted into multiple tables.
Locality advantage only applies if need large parts of document at the same time. Or it will be wastefull for Document DB.
Rewrite performance is easy when updates to document don't change the encoded size of doc.
So keep documents fairly small and avoid writes that increase the size of document
Query language fo Data
| feature | Declarative | Imperative |
|---|---|---|
| optimalization | Give DB more room for automatic optimizations | Hard to optimize |
| parallel | parallel execution, only specify the pattern of result, not the algorithm used. Free to use parallel implementation | hard to parallelize, instruction most perform in particular order |
Graph-like Data Model
Poperty of Graph Model: vertices, edges
vertex:
- A unique identifier
- A set of outgoing edges
- A set of incoming edges
- A collections of properties (key-value pairs)
Edge:
- A unique identifier
- The vertex at which edge starts(the tail vertex)
- The vertex at which the edge ends(the head vertex)
- A label to describe the kind of relationship between two vertices
- A collection of properties(Key-value pairs)
A graph store as consisting of two relational tables: vertices, edges
Features:
- Any vertex can have a edge connecting with any other vertex, no strict schema
- Given any vertex, you can efficiently find both its incoming and outcoming edges, thus traverse the graph, forward or backward
- By using different labels for different kinds of relationship, Graph can store several different kinds of information in a single graph, while still maintaining a clean data model
A great deal of flexibility for data model, good for evoluability
Graph Query Language
Cypher is created for Neo4J, which is a graph Database.
Cypher is a declarative language. You don't need to specify such execution details.
Use SQL in Graph Database?
- You need to know in advance which join you need in your query
- But the number of joins is not fixed in advance.
Triple-stores and SPARQL
Triple-store model is mostly equivalent to the property graph model.
All information stores in form of three-part statements. (Subject, predicate, Object)
Eg.(John, likes, banana)
SPARQL is a query language for triple-stores using RDF data model.