Aggregate functions in SQL evaluates the set of tuple values and returns a relation with a single attribute carrying single tuple value as a result. SQL has five aggregate functions count, average, maximum, minimum, sum. Among these aggregate functions, only ‘sum’ and ‘average’ function operate only on numeric values only while count, maximum, minimum can also operate on non-numeric data such as string.
In this section, we will discuss the steps to evaluate aggregate functions, type of aggregate functions, aggregate function along with ‘group by’ clause and ‘having’ clause.
Content: Aggregate Function in DBMS
- How to Aggregate Data?
- Aggregate Functions
- Aggregation with Group by Clause
- Aggregation with Having Clause
- Key Takeaways
How to aggregate data in SQL?
- The first thing evaluated in query is the form clause which evaluates the relation to be operated.
- Next to form clause is where clause, the predicate in where clause is evaluated on the outcome of ‘form’ clause.
- The tuples filtered by ‘where’ clause are then grouped using the group by clause if present else all the tuples filtered by where clause are considered as one group.
- The having clause is then applied to each group formed by ‘group by’ clause and the groups satisfying the having clause are then forwarded to the ‘select’ clause.
- Then the select clause applies the aggregate function on the groups and generates a single result tuple for each group.
Aggregate Function in DBMS
The aggregate functions are applied to the set of tuples. And as a result, they return a relation with a single attribute consisting of a single tuple value. It can also be applied to the group of sets of tuples, resulting in a relation with a single attribute and one tuple for each group.
All aggregate function avoids ‘null values’ excluding the count(*) function. Count function has variations regarding Null values.
Below is the list of all aggregate functions in SQL:
This aggregate function counts the number of tuples in the relation. It has certain variations:
count(*) – This count function returns the number of tuples present in a relation including the tuples holding ‘Null values’.
count(attribute_name) – This function counts all the tuples with ‘not-null values’.
count(distinct attribute_name) – This function eliminates tuples with ‘null values’ and ‘duplicate values’. It only counts the tuples with not-null values and distinct values.
Note: SQL standards do not legalize the use of distinct keyword with count(*) function.
Now let us consider an example:
If you want to count the number of instructors in the teaches relation then the query for this would be:
The query above will return 13 as a result. Here the count function avoids Null values. Next, if we add distinct keyword in count function.
select count (distinct inst-id)
This query returns 9 as ‘distinct’ keyword is used along with the attribute name in count function. So, it has eliminated the duplicates along with null values.
The average function returns the average of all the tuple values from a single attribute of a relation. Remember the duplicate tuples must retain while computing an average else it will show an incorrect answer. The average aggregate function avoids the ‘Null value’ in its input as it complicates the aggregation.
Consider the following query that evaluates the “average salary of all the instructors teaching in Comp.Sci. department”.
The result of this query will be a relation with a single attribute carrying a single record. This record has a numerical value that shows the average salary of all instructors teaching in Comp.Sci. department (result 73,333.333).
You can also provide a meaningful name to the attribute of result relation, using as clause else, the database system will give an arbitrary name to the attribute of result relation. The query below shows how can we provide a name to the attribute of result relation?
select avg(salary) as avg_salary
where dept_name =‘Comp.Sci.’
The max() aggregate function evaluates the maximum value among all the tuple values, of the attribute specified in max () function. Like other aggregate functions, the max() avoid the’ null values’ in its input.
The query below will result in a relation containing a single tuple indicating the maximum salary of all the instructors.
select max(salary) as max_salary
Contrary to max aggregate function min evaluates the minimum value among all the tuple values, of the attribute specified to the min aggregate function. The min() aggregate function avoid ‘null values’ in its input.
To find the minimum salary of all instructors in the instructor relation we will write the query:
select min(salary) as min_salary
The sum aggregate function evaluates the sum of all the tuple values, of the attribute specified to the sum function. Like average function, sum also drives on numerical values only. The sum function avoids the null values. The query below will find the total salary of all the instructors.
select sum(salary) as sum_salary
Aggregation with ‘Group by’ clause
The aggregate functions can also be applied to a group consisting of sets of tuples. The tuples consisting the same value for the attributes specified in ‘group by’ clause are kept in one group.
So, if we have to calculate the total salary of instructors each department then the query would be:
select dept_name sum(salary) as sum_salary
group by dept_name;
Aggregation with ‘Having’ Clause
Having clause is the condition that is applied to each group formed by group by clause. The groups formed by ‘group by’ clause if do not satisfy the condition of ‘having’ clause are removed from the result relation.
Suppose we want the department names whose total salary of instructors is greater the 72000. Then the query would be:
select dept_name sum(salary) as sum_salary
group by dept_name
- Aggregate functions are applied to the set of tuple values and they return a relation with a single attribute consisting of a single tuple.
- We have five aggregate functions count, average, maximum, minimum and sum.
- The average and sum function operate only on numerical values; the other aggregate function can also operate on non-numerical values.
- All aggregate functions avoid null values while operating except count(*).
- Aggregate functions can also be used along with, ‘group by‘ clause and ‘having‘ clause.
So this was all about the aggregate functions in SQL. Ensure that you apply the sum and average function to the attribute having numerical tuple values.