Relational Databases

Originally, databases were flat. This means that the information was stored in one long text file, called a tab delimited file. Each entry in the tab delimited file is separated by a special character, such as a vertical bar (|). Each entry contains multiple pieces of information (fields) about a particular object or person grouped together as a record. The text file makes it difficult to search for specific information or to create reports that include only certain fields from each record. With a relational database, you can quickly compare information because of the arrangement of data in columns. The relational database model takes advantage of this uniformity to build completely new tables out of required information from existing tables. In other words, it uses the relationship of similar data to increase the speed and versatility of the database. The "relational" part of the name comes into play because of mathmatical relations. A typical relational database has anywhere from 10 to more than 1,000 tables. Each table contains a column or columns that other tables can key on to gather information from that table. By storing this information in another table, the database can create a single small table with the locations that can then be used for a variety of purposes by other tables in the database. A typical large database, like the one a big Web site, such as Amazon would have, will contain hundreds or thousands of tables like this all used together to quickly find the exact information needed at any given time. Relational databases are created using a special computer language, structured query language (SQL), that is the standard for database interoperability. SQL is the foundation for all of the popular database applications available today, from Access to Oracle.

1. Relational Database A relational database is one which employs the relational model, in which the raw data is organized into sets of tuples, and the tuples organized into relations. This relational model imposes structure on its contents, in contrast to unstructured or semi-structured data of the various NoSQL architectures. 2. Database Management System (DBMS) A database management system is a software system which facilitates the organization of housed data into a particular database architecture, be it relational (Relational Database Management System, or RDBMS), document store, key-value store, column-oriented, graph, or other. Popular DBMSs include MongoDB, Cassandra, Redis, MySQL, Microsoft SQL Server, SQLite, and Oracle, among many, many, many others. 3. Primary Key In the relational model, a primary key is a single attribute, or combination of attributes, which can be used to uniquely identify a row of data in a givn table. Common primary keys include vendor ID, user ID, email address, or combination of attributes considered together such as first name, last name, and city of residence, all considered together as a single entity. It should be noted that what is an acceptable primary key in one situation may not uniquely identify data instances in another. 4. Foreign Key Again in the relational model, a foreign key is an attribute or collection of attributes from one relational table whose values must match another relational table's primary key. A common use for such an organizational scheme would be to link a street address in one table to a city in another, and perhaps to a country in a third. This eliminates repetitive data input, and reduces the possibility of error, increasing data accuracy.

HTML5 Icon

Both columns (customers.customer_id and orders.customer_id) should be the same exact data structure. If one is INT, the other one should not be BIGINT for example. Please note that in MySQL only the InnoDB engine has full support for Foreign Keys. But other storage engines will still allow you to specify them without giving any errors. Also the Foreign Key column is indexed automatically, unless you specify another index for it.

Join Queries

To retrieve data from a database that has relationships, we often need to use JOIN queries.

HTML5 Icon

We have 4 customers. One customer has two orders, two customers have one order each, and one customer has no order. Now let's see the different kinds of JOIN queries we can run on these tables. Cross Join This is the default type of JOIN query when no condition is specified.

Natural Join With this kind of JOIN query, the tables need to have a matching column name. In our case, both the tables have the customer_id column. So, MySQL will join the records only when the value of this column is matching on two records.

HTML5 Icon

As you can see the customer_id column is only displayed once this time, because the database engine treats this as the common column. We can see the two orders placed by Adam, and the other two orders by Joe and Sandy. Finally we are getting some useful information.

The result is a so called "Cartesian product" of the tables. It means that each row from the first table is matched with each row of the second table. Since each table had 4 rows, we ended up getting a result of 16 rows.

HTML5 Icon

Inner Join When a join condition is specified, an Inner Join is performed. In this case, it would be a good idea to have the customer_id field match on both tables. The results should be similar to the Natural Join.
The results are the same except a small difference. The customer_id column is repeated twice, once for each table. The reason is, we merely asked the database to match the values on these two columns. But it is actually unaware that they represent the same information.

Left Join
The MySQL LEFT JOIN clause allows you to query data from two or more database tables. The LEFT JOIN clause is an optional part of the SELECT statement, which appears after the FROM clause. Let’s assume that you are going to query data from two tables t1 and t2. The following statement illustrates the syntax of LEFT JOIN clause that joins the two tables:

HTML5 Icon

When you join the t1 table to the t2 table using the LEFT JOIN clause, if a row from the left table t1 matches a row from the right table t2 based on the join condition ( t1.c1 = t2.c1 ), this row will be included in the result set. In case the row in the left table does not match with the row in the right table, the row in the left table is also selected and combined with a “fake” row from the right table. The fake row contains NULL for all corresponding columns in the SELECT clause. In other words, the LEFT JOIN clause allows you to select rows from the both left and right tables that are matched, plus all rows from the left table ( t1 ) even with no matching rows found in the right table ( t2 ).

ON Clause Before moving on to other join types, we need to look at the ON clause. This is useful for putting the JOIN conditions in a separate clause.

HTML5 Icon

Now we can distinguish the JOIN condition from the WHERE clause conditions. But there is also a slight difference in functionality.