Nice piece. Solid Channel. Article here
1. What are DDL and DML languages? Give an example.
2. What is the difference between DELETE and TRUNCATE?
3. Why do we use CASE statements in SQL? Give an example.
4. What is the difference between LEFT, RIGHT, FULL outer join and INNER join?
5. What is the difference between DISTINCT and GROUP BY?
6. What are the rules to follow when using UNION operator?
7. What are aggregate functions? Name and explain different types of aggregate functions in SQL?
8. What is the difference between RANK, DENSE_RANK and ROW_NUMBER window functions?
9. Can we use aggregate functions as a window function? If yes, then how do we do it?
10. How can you convert a text date into date format? Consider the given text as “31-01-2021.”
11. Imagine there is a FULL_NAME column in a table which has values like “Elon Musk”, “Bill Gates”, “Jeff Bezos”, etc. So each full name has a first name, a space, and a last name. Which functions would you use to fetch only the first name from this FULL_NAME column? Give an example.
12. What are subqueries? Where can we use them?
13. Is it good to have the same subquery multiple times in your query? If not, then how can you solve this?
14. Difference between WHERE and HAVING clause?
15. What are indexes and why do we use it?
16. What are steps you would take to tune an SQL query?
17. What is the difference between primary key, unique key, and foreign key?
18. What is the difference between a view and a synonym?
19. When can a function NOT be called from SELECT query?
20. What is a trigger?
21. What is the difference between a views and a materialized views?
22. What is a MERGE statement?
23. Which function can be used to fetch yesterday’s date? Provide an example.
24. What is the difference between a function and a procedure?
25. What is PRAGMA AUTONOMOUS TRANSACTION?
1. What are DDL and DML languages? Give an example.
DDL stands for Data Definition Language. They include CREATE, DROP, ALTER, and TRUNCATE statements. You do not need to commit the changes after running DDL commands.
DML stands for Data Manipulation Language. DML includes INSERT, UPDATE, DELETE, and MERGE statements.
2. What is the difference between DELETE and TRUNCATE?
DELETE can be used to remove either a few or all records from a table whereas TRUNCATE will always remove all the records from a table. TRUNCATE cannot have a WHERE condition.
3. Why do we use CASE statements in SQL? Give an example.
The CASE statement is similar in IF ELSE statement from any other programming languages. We can use it to fetch or show a particular value based on a certain condition.
4. What is the difference between LEFT, RIGHT, FULL outer join and INNER join?
INNER JOIN will fetch only those records which are present in both the joined tables. LEFT JOIN will fetch all records from the left table even if those records are not present in the right table. RIGHT JOIN will fetch all records from the right table even if those records are not in the left table. FULL JOIN will fetch all records from both the left and right table. It’s kind of a combination of INNER, LEFT, and RIGHT join.
A sub question here is what is SELF join, NATURAL join, and CROSS join? SELF JOIN is when you join a table to itself. NATURAL JOIN is similar to INNER join but we do not need to use the ON clause during the join. Meaning in a natural join we just specify the tables. We do not specify the columns based on which this join should work. CROSS JOIN will join all the records from left table with all the records from the right table. Meaning the cross join is not based on matching any column.
5. What is the difference between DISTINCT and GROUP BY?
DISTINCT clause will return unique column values. Depending on the list of columns you provide, the DISTINCT clause will fetch the unique combination of values for all those combined columns. If you provide just a single column in DISTINCT then it fetches just the unique values in that specific column.
GROUP BY clause will group together the data based on the columns specified in group by. Which will then return just one record for each unique value in the column specified in group by. In other words, GROUP BY can also be used to fetch unique records from a table but this is not why group by clause is used for. The main purpose of the group by clause is to perform some aggregation (using the aggregate functions like MIN, MAX, COUNT, SUM, AVG) based on the group by column values.
6. What are the rules to follow when using UNION operator?
UNION operator can be used to combine two different SQL Queries. The output would be the result combined from both these queries. Duplicate records would not be returned. You can combine two queries using UNION operator if they follow the following rules:
7. What are aggregate functions? Name and explain different types of aggregate functions in SQL?
Aggregate functions can be used to perform calculations on a set of values, which will then return a single value. We can use aggregate functions either with the GROUP BY clause or without it.
SUM
AVG
MIN
MAX
COUNT
8. What is the difference between RANK, DENSE_RANK and ROW_NUMBER window functions?
RANK() will assign a rank to each row within each partitioned result set. If multiple rows have the same value then each of these rows will share the same rank. However the rank of the following (next) rows will get skipped. Meaning for each duplicate row, one rank value gets skipped.
The DENSE_RANK() window function will assign a rank to each row within each partitioned result set. If multiple rows have the same value then each of these rows will share the same rank. However the dense_rank of the following (next) rows will NOT get skipped. This is the only difference between rank and dense_rank.
RANK() function skips a rank if there are duplicate rows whereas the DENSE_RANK() function will never skip a rank.
The ROW_NUMBER window function will assign a unique row number to every row within each partitioned result set. It does not matter if the rows are duplicate or not.
9. Can we use aggregate functions as a window function? If yes, then how do we do it?
Yes we can use aggregate functions as a window function by using the OVER clause. Aggregate functions will reduce the number of rows or records since they perform calculation of a set of row values to return a single value. Whereas window functions do not reduce the number or records.
10. How can you convert a text date into date format? Consider the given text as “31-01-2021.”
Different RDBMS have different date functions to convert a text to date format. Let’s see the date functions to be used in the four most popular RDBMS, (Oracle, MySQL, SQL Server, PostgresQL):
## Oracle
SELECT TO_DATE('31-01-2021', 'DD-MM-YYYY') as date_value FROM DUAL;
## MySQL
SELECT DATE_FORMAT('31-01-2021', '%d-%m-%Y') as date_value;
## Microsoft SQL Server (MSSQL)
SELECT CAST('31-01-2021' as DATE) as date_value;
## PostgreSQL
SELECT TO_DATE('31-01-2021', 'DD-MM-YYYY') as date_value;
11. Imagine there is a FULL_NAME column in a table which has values like “Elon Musk”, “Bill Gates”, “Jeff Bezos”, etc. So each full name has a first name, a space, and a last name. Which functions would you use to fetch only the first name from this FULL_NAME column? Give an example.
Again, there will be different ways to solve this in the different RDBMS.
## MYSQL
SELECT SUBSTRING(full_name, 1, INSTR(full_name, ' ', 1, 1) - 1) as first_name FROM dual;
## MS SQL SERVER
SELECT SUBSTRING(full_name, 1, CHARINDEX(' ', full_name) - 1) as first_name;
12. What are subqueries? Where can we use them?
A SELECT query statement which is placed inside another SELECT query is termed as a subquery. Subquery can also be termed as inner query.
The SELECT query which holds the subquery can be termed as main query or outer query. Subquery can also reside within a INSERT, UPDATE, DELETE statement or inside another subquery.
In a SELECT statement, subquery may occur in the SELECT clause, FROM clause or the WHERE clause.
When a query statement containing subquery execute, the inner query or the subquery will execute first. The outer query will then use the result from the subquery while processing the outer query.
13. Is it good to have the same subquery multiple times in your query? If not, then how can you solve this?
It’s not a good practice to use the same subquery multiple times in your query. Repeating the same subquery multiple times in your query can impact the query performance (since the same query would execute multiple times) and also becomes difficult to maintain (since any change to the subquery will need to be made in multiple different places).
We can avoid this by using the WITH clause. We can place the subquery just once inside the WITH clause and then use this multiple times in our query. This way SQL will execute the subquery just once (which is at the start of the query execution).
14. Difference between WHERE and HAVING clause?
WHERE clause is used to filter records from the table. We can also specify join conditions between two tables in the WHERE clause. If a SQL query has both WHERE and GROUP BY clause then the records will first get filtered based on the conditions mentioned in WHERE clause before the data gets grouped as per the GROUP BY clause.
Whereas HAVING clause is used to filter records returned from the GROUP BY clause. So if a SQL query has WHERE, GROUP BY and HAVING clause then first the data gets filtered based on WHERE condition, only after this grouping of data takes place. Finally based on the conditions in HAVING clause the grouped data again gets filtered.
15. What are indexes and why do we use them?
Index is a database object which is applied on one or more columns of a table. When a column (or list of columns) from the table is Indexed, database creates a pointer to each value stored in that column. This significantly improves the query execution time since the database will have a more efficient way to find a particular value from the column based on its index.
Imagine you have a table with one million records and there is an ID column in it along with many other columns. Let’s say you are given a task to write a SQL query which is expected to return just 100 records from this table.
Imagine if you did not create an index on this ID column then the SQL Query which you write will have to scan through all the one million records to find the desired 100 records. This is going to be a very slow and the performance of the query would take a blow.
Now, let’s imagine you did create an index on the ID column, then SQL will have a pointer to every value stored in this column hence the SQL Query you write will be much faster since the database will know where to find the 100 required records by referring to the index of this ID column.
This in a nutshell is what index is used for. Basically, Index creates a pointer to each value in the column which in turn helps in finding any specific value from this column in a much faster way.
Please note, different indexes have different functionalities and will behave differently from each other but in a nutshell index is used to identify any value faster from the table column(s).
16. What are steps you would take to tune an SQL query?
See the article for a more in depth description of each of these.
17. What is the difference between primary key, unique key, and foreign key?
These are all constraints we can create on a table.
When you make a column in the table as primary key then this column will always have unique or distinct values. Duplicate values and NULL value will not be allowed in a primary key column. A table can only have one primary key. Primary key can be created either on one single column or a group of columns.
When you make a column in the table as unique key then this column will always have unique or distinct values. Duplicate values will not be allowed. However, NULL values are allowed in a column which has unique key constraint. This is the major difference between primary and unique key.
Foreign key is used to create a master/parent child kind of relationship between two tables. When we make a column in a table as foreign key, this column will then have to be referenced from another column from some other table.
18. What is the difference between a view and a synonym?
View is a database object which is created based on a SQL Query. It’s like giving a name to the results returned from a SQL Query and storing it in the database as a view.
If the query result changes then the data in view also changes. View is directly linked to the SQL Query over which it was created.
Synonym on the other hand is just an alias or an alternate name that you can provide to any database objects such as tables, views, sequences, procedures etc.
Synonym is created for a single database object whereas view can be created on a query where the query may have been formed by multiple tables.
19. When can a function NOT be called from SELECT query?
If the function includes DML operations like INSERT, UPDATE, DELETE etc then it cannot be called from a SELECT query. Because SELECT statement cannot change the state of the database.
20. What is a trigger?
Trigger is a database object which is similar to a stored procedure which will automatically get invoked or executed (or run) when the specified event occurs in the database.
21. What is the difference between views and materialized views?
Similar to views, materialized views are also database objects which are formed based on a SQL Query however unlike views, the contents or data of the materialized views are periodically refreshed based on its configuration.
The contents of view will get updated automatically when the underlying table (forming the query) data gets changed. However, materialised views can be configured to refresh its contents periodically or can be manually refreshed when needed.
Creating materialized views can be a very good approach for performance tuning especially when dealing with remote tables.
22. What is a MERGE statement?
Merge is part of the DML commands in SQL which can be used either perform INSERT or UPDATE based on the data in the respective table.
If the desired data is present then merge will update the records. If desired data is not present then merge will insert the records.
23. Which function can be used to fetch yesterday’s date? Provide an example.
Different RDBMS would have different date functions to add or subtract a day value from the current date. Let’s see the date functions to be used in the four most popular RDBMS.
## ORACLE
SELECT SYSDATE - 1 as previous_day FROM DUAL;
## MYSQL
SELECT DATE_SUB(SYSDATE(), INTERVAL 1 DAY) as previous_day;
## MSSQL
SELECT DATEADD(DAY, -1, GETDATE());
## PostgreSQL
SELECT CURRENT_DATE - 1 as previous_day FROM DUAL;
24. What is the difference between a function and a procedure?
A function should always return a value whereas for a procedure it’s not mandatory to return a value.
Function can be called from a SELECT query whereas procedure cannot be called from a SELECT query.
Function is generally used to perform some calculation and return a result. Whereas procedure is generally used to implement some business logic.
25. What is the PRAGMA AUTONOMOUS TRANSACTION?
We can declare the stored program like a procedure as a PRAGMA AUTONOMOUS TRANSACTION which means that any transaction committed or rolled back in this procedure will not impact any open transactions in the program from where this procedure was called from.
Said another way, it changes the way a subprogram works within a transaction. A subprogram marked with this pragma can do SQL operations and commit or roll back those operations, without committing or rolling back the data in the main transaction.