The silent syntax difference in foreign keys between Postgres and MySQL
Review your schema definitions when migrating databases
Foreign keys are a concept embedded in almost every relational database. If you are normalizing your database, you will expect to use foreign keys throughout. And they seem simple. But simplicity often belies the subtle differences that can trip you up when switching between database systems.
One significant difference is how Postgres and MySQL handle the syntax for defining foreign keys. Both databases support foreign key constraints, but the way you define them can lead to unexpected behavior if you’re unaware of the nuances.
A refresher on foreign keys
Foreign keys are a crucial concept in relational database design and play a significant role in maintaining data integrity and establishing relationships between tables. A foreign key is a column or a set of columns in one table that refers to the primary key or a unique key in another table. It establishes a link between two tables, defining a parent-child relationship.
Foreign keys enforce referential integrity, ensuring the relationships between tables remain consistent and valid. They prevent orphaned records and maintain data consistency across related tables.
How foreign keys work:
- The table containing the foreign key is called the child table or referencing table.
- The table referenced by the foreign key is called the parent table or referenced table.
- The foreign key in the child table must reference a primary or unique key in the parent table.
- The values in the foreign key column of the child table must match the values in the referenced column of the parent table or be null if the foreign key allows null values.
Consider two tables: orders
and customers
. The orders table has a foreign key customer_id
that references the customer_id
primary key in the customers
table. Let’s build these in Postgres:
The foreign key assignment is done in this line:
customer_id INT
defines thecustomer_id
column in theorders
table as an integer data type.REFERENCES customers (customer_id)
establishes a foreign key constraint on thecustomer_id
column in theorders
table. It specifies that thecustomer_id
column references thecustomer_id
column in thecustomers
table. This means that the values in thecustomer_id
column of theorders
table must exist in thecustomer_id
column of thecustomers
table.ON DELETE CASCADE
specifies the action to be taken when a referenced row in thecustomers
table is deleted. In this case,CASCADE
means that if a customer is deleted from thecustomers
table, all the corresponding orders in theorders
table that reference that customer will also be automatically deleted. This ensures data consistency and prevents orphaned records in the orders table.ON UPDATE CASCADE
specifies the action to be taken when thecustomer_id
value in thecustomers
table is updated. In this case,CASCADE
means that if thecustomer_id
of a customer is updated in thecustomers
table, all the correspondingcustomer_id
values in the orders table will also be automatically updated to match the new value. This ensures data consistency and maintains the integrity of the foreign key relationship.
Foreign keys define the relationships between tables, making it easier to understand and query the data based on those relationships. What happens if we try to add an order with an invalid customer_id
?
We get an error telling us that this customer_id
isn’t in customers and thus violates foreign key constraint
.
The subtle difference in defining foreign keys in MySQL and Postgres
Let’s use the same SQL to create those tables in MySQL:
We’ll again populate data and try to add an order with a customer_id
that doesn’t exist.
Wait, what? Let’s take a look at our orders
table:
Hmm, we have an order from a customer that doesn’t exist. Not great. Why? Well, check out this excerpt from the MySQL docs:
MySQL parses but ignores “inline REFERENCES
specifications” (as defined in the SQL standard) where the references are defined as part of the column specification. MySQL accepts REFERENCES
clauses only when specified as part of a separate FOREIGN KEY
specification.
Defining a column to use a REFERENCES tbl_name(col_name)
clause has no actual effect and serves only as a memo or comment to you that the column which you are currently defining is intended to refer to a column in another table. It is important to realize when using this syntax that:
- MySQL does not perform any sort of check to make sure that
col_name
actually exists intbl_name
(or even thattbl_name
itself exists). - MySQL does not perform any sort of action on
tbl_name
such as deleting rows in response to actions taken on rows in the table which you are defining; in other words, this syntax induces noON DELETE
orON UPDATE
behavior whatsoever. (Although you can write anON DELETE
orON UPDATE
clause as part of theREFERENCES
clause, it is also ignored.) - This syntax creates a column; it does not create any sort of index or key.
What does all that mean? The key takeaway is that MySQL ignores foreign key constraints when they are defined inline with the column definition. Instead, MySQL only recognizes foreign key constraints when specified as a separate clause using the FOREIGN KEY
keyword.
Even though the SQL code defines the foreign key inline, MySQL treats it as a comment and does not enforce the constraint or perform referential actions. As a result, it allows inserting records that violate the foreign key relationship, leading to inconsistencies in the data. Not great, and unfortunately not something evident if you are porting your code over from Postgres to MySQL.
Here’s what MySQL wants you to do:
Here, the foreign key constraint is defined as a separate clause using the FOREIGN KEY
keyword, followed by the column name and the REFERENCES
keyword to specify the referenced table and column. The ON DELETE
and ON UPDATE
clauses can be added to define the referential actions.
By defining the foreign key constraint in this manner, MySQL properly enforces the constraint and maintains the referential integrity between the tables. This is simply good practice in both Postgres and MySQL. Explicitly naming and defining the FOREIGN KEY helps self-document the code better. The inline version is a bit of a footgun for any future changes.
There are a few other differences between the SQL standard for foreign keys and MySQL foreign keys. This example highlights something most database developers know but don’t always grok–Postgres != MySQL != SQL. All look the same but have different implementations. If you expect clear errors as you switch between different versions, you will be disappointed when your database doesn’t act as expected.
Conclusion
If you are planning to move from MySQL to Postgres:
- Review your schema definitions: ensure that all foreign key constraints are explicitly defined using the FOREIGN KEY clause, especially if you are porting over schemas from MySQL.
- Test rigorously!
If you’re looking for a developer-friendly Postgres with database branching, check out Neon. It’s free to get started!