Artificial Intelligence

Relational Databases vs Non-Relational Databases

A data model decides how an application stores, retrieves and manages the data. Picking the right data model could make a huge impact on the security, scalability, performance and availability of an application. In this article, we will review two main players in this arena : relational databases and non-relational databases.

Differences

Structure

In relational databases, data is organized into relations (tables) and each relation is an unordered collection of tuples(rows). The table schema is explicit and the database ensures all written data conforms to it (schema-on-write). It provides better support for joins, many-to-one and many-to-many relationships.

Non-relational databases do not require tabular relationships. The schema is implicit and interpreted when data is read (schema-on-read). There are four categories of such databases: Key-Value Store, Wide-Columns Databases, Document Databases and Graph Databases.

Storage

Normalization is a database technique to eliminate data redundancy. Relational databases mostly support normalization while non-relational databases usually don’t.

With normalization, data is defined once. It is easy to insert or update, but reading data could be slower since you may need to retrieve data from multiple collections by join or perform multiple queries. Denormalization, on the other hand, stores data in multiple places. It performs better reads but is slower for inserts and updates. This method of data storage takes up more memory and you may see inconsistent data if insert/update is not done properly.

For above reasons, non-relational databases are less storage efficient than their relational counterparts, but it’s not a concern. As storage price has dropped dramatically since relational database was designed and it makes sense to optimize for compute over storage.

Transaction Guarantees

Relational databases are ACID compliant:

  • Atomicity — guarantees that each transaction is treated as a single unit so the transaction either succeed or fail completely.
  • Consistency — ensures that a transaction can only bring the database from one consistent state to another, preserving database invariants. This prevents database corruption by an illegal transaction.
  • Isolation — ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially.
  • Durability — guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure.

Non-relational databases do not necessarily support ACID properties, instead most of them are BASE compliant:

  • Basically Available — ensure the availability of data by spreading and replicating it across the nodes of the database cluster.
  • Soft-state — replicas are not consistent all the time; so the state may only be partially correct as it may not yet have converged.
  • Eventually consistent — data will become consistent at some point in the future, but no guarantee when.

Difference in Performance

  • Scalability and Fault Tolerance

There are two commonly used database horizontal scaling techniques: replication and horizontal partitioning (sharding). Replication refers to creating copies of a database. With replication, requests can be spread across the nodes in the cluster which increases the throughput and adds fault-tolerance to the system. Partitioning (sharding) distributes the database across multiple nodes so each node stores a portion of the data.

It’s easy to scale non-relational databases horizontally across multiple servers or nodes via sharding because the partitions are more self-contained and less related (no “join” needed across nodes). Non-relational databases offer horizontal scaling at various CAP Theorem tradeoffs. However, achieving horizontal scalability is a huge challenge for relational databases. Relational databases usually scale vertically by adding more resources to a single server. They do know the concept of sharding but it comes with certain restrictions and is typically hard to implement due to the schema requirement, ACID guarantees and query flexibility.

  • Consistency VS. Availability

Relational databases prioritize strong consistency over everything else while non-relational databases typically prioritize availability and partition tolerance and offer only eventual consistency. Note: the consistency here means the state of all nodes being consistent with each other at any given time, which is different from the one in ACID consistency which is about data integrity (data is consistent w.r.t. relations and constraints after every transaction).

Examples:

Relational Databases:

Amazon Relational Database Service (RDS), SQL Server, Oracle, MySQL, PostgreSQL, Google BigQuery, Snowflake, Databricks

Non-relational Databases:

Key-Value Store:

DynamoDB, CosmosDB, Redis, ElastiCache, Memcached

Wide-Columns Databases:

Keyspaces, CosmosDB, HBase, Cassandra

Document Databases:

DynamoDB, DocumentDB, CosmosDB, MongoDB, Elasticsearch

Graph Databases:

Neptune, CosmosDB, Neo4J

More content at PlainEnglish.io.

Sign up for our free weekly newsletter. Follow us on Twitter, LinkedIn, YouTube, and Discord.


Relational Databases vs Non-Relational Databases was originally published in Artificial Intelligence in Plain English on Medium, where people are continuing the conversation by highlighting and responding to this story.

https://ai.plainenglish.io/relational-databases-vs-non-relational-databases-20656ab95b16?source=rss—-78d064101951—4
By: MachineLovesLearning
Title: Relational Databases vs Non-Relational Databases
Sourced From: ai.plainenglish.io/relational-databases-vs-non-relational-databases-20656ab95b16?source=rss—-78d064101951—4
Published Date: Wed, 14 Jun 2023 11:03:11 GMT

Did you miss our previous article…
https://e-bookreadercomparison.com/pointers-in-computing-and-its-applications/

Leave a Reply

Your email address will not be published. Required fields are marked *

Exit mobile version