• Skip to main content
  • Skip to primary sidebar
  • Computer Architecture
  • Computer Networks
  • DBMS
  • OS
  • Software Engineering
  • Security
  • OOT
binary-terms-logo

Binary Terms

The Computer Science & IT Guide

Aggregate Functions in SQL

5th December 2019 by Neha T Leave a Comment

Aggregate functions in SQL evaluates the set of tuple values and returns a relation with a single attribute carrying single tuple value as a result. SQL has five aggregate functions count, average, maximum, minimum, sum. Among these aggregate functions, only ‘sum’ and ‘average’ function operate only on numeric values only while count, maximum, minimum can also operate on non-numeric data such as string.

In this section, we will discuss the steps to evaluate aggregate functions, type of aggregate functions, aggregate function along with ‘group by’ clause and ‘having’ clause.

Content: Aggregate Function in DBMS

  1. How to Aggregate Data?
  2. Aggregate Functions
  3. Aggregation with Group by Clause
  4. Aggregation with Having Clause
  5. Key Takeaways

How to aggregate data in SQL?

  1. The first thing evaluated in query is the form clause which evaluates the relation to be operated.
  2. Next to form clause is where clause, the predicate in where clause is evaluated on the outcome of ‘form’ clause.
  3. The tuples filtered by ‘where’ clause are then grouped using the group by clause if present else all the tuples filtered by where clause are considered as one group.
  4. The having clause is then applied to each group formed by ‘group by’ clause and the groups satisfying the having clause are then forwarded to the ‘select’ clause.
  5. Then the select clause applies the aggregate function on the groups and generates a single result tuple for each group.

Aggregate Function in DBMS

The aggregate functions are applied to the set of tuples. And as a result, they return a relation with a single attribute consisting of a  single tuple value. It can also be applied to the group of sets of tuples, resulting in a relation with a single attribute and one tuple for each group.

All aggregate function avoids ‘null values’ excluding the count(*) function. Count function has variations regarding Null values.

Below is the list of all aggregate functions in SQL:

  1. Count (count)
  2. Average (avg)
  3. Maximum (max)
  4. Minimum (min)
  5. Total (sum)

We can name the attribute of the result relation obtained by aggregation using ‘as’ clause. We can use ‘as’ clause for all aggregate function.

Count (count)

This aggregate function counts the number of tuples in the relation. It has certain variations:

count(*) – This count function returns the number of tuples present in a relation including the tuples holding ‘Null values’.
count(attribute_name) – This function counts all the tuples with ‘not-null values’.
count(distinct attribute_name) – This function eliminates tuples with ‘null values’ and ‘duplicate values’. It only counts the tuples with not-null values and distinct values.

Note: SQL standards do not legalize the use of distinct keyword with count(*) function.

Now let us consider an example:

teaches relation

If you want to count the number of instructors in the teaches relation then the query for this would be:

select count(inst-id)
from teaches;

The query above will return 13 as a result. Here the count function avoids Null values. Next, if we add distinct keyword in count function.

select count (distinct inst-id)
from teaches;

This query returns 9 as ‘distinct’ keyword is used along with the attribute name in count function. So, it has eliminated the duplicates along with null values.

select count(*)
from teaches;

This query will return 15 as it has counted all the tuples including the Null values.

Average (avg)

The average function returns the average of all the tuple values from a single attribute of a relation. Remember the duplicate tuples must retain while computing an average else it will show an incorrect answer. The average aggregate function avoids the ‘Null value’ in its input as it complicates the aggregation.
instructor relation
Consider the following query that evaluates the “average salary of all the instructors teaching in Comp.Sci. department”.

Select avg(salary)
from instructor
where dept_name=‘Comp.Sci.’;

The result of this query will be a relation with a single attribute carrying a single record. This record has a numerical value that shows the average salary of all instructors teaching in Comp.Sci. department (result  73,333.333).

You can also provide a meaningful name to the attribute of result relation, using as clause else, the database system will give an arbitrary name to the attribute of result relation. The query below shows how can we provide a name to the attribute of result relation?

select avg(salary) as avg_salary
from instructor
where dept_name =‘Comp.Sci.’

average aggregate function new 3

Maximum (max)

The max() aggregate function evaluates the maximum value among all the tuple values, of the attribute specified in max () function. Like other aggregate functions, the max() avoid the’ null values’ in its input.

The query below will result in a relation containing a single tuple indicating the maximum salary of all the instructors.

select max(salary) as max_salary
from instructor;

maximum aggregate function 1

Minimum (min)

Contrary to max aggregate function min evaluates the minimum value among all the tuple values, of the attribute specified to the min aggregate function. The min() aggregate function avoid ‘null values’ in its input.

To find the minimum salary of all instructors in the instructor relation we will write the query:

select min(salary) as min_salary
from instructor;

min aggregate function 1

Total (sum)

The sum aggregate function evaluates the sum of all the tuple values, of the attribute specified to the sum function. Like average function, sum also drives on numerical values only. The sum function avoids the null values. The query below will find the total salary of all the instructors.

select sum(salary) as sum_salary
from instructor;

sum aggregate function 1

Aggregation with ‘Group by’ clause

The aggregate functions can also be applied to a group consisting of sets of tuples. The tuples consisting the same value for the attributes specified in ‘group by’ clause are kept in one group.

So, if we have to calculate the total salary of instructors each department then the query would be:

select dept_name sum(salary) as sum_salary
from instructor
group by dept_name;

group by clause 1

Aggregation with ‘Having’ Clause

Having clause is the condition that is applied to each group formed by group by clause. The groups formed by ‘group by’ clause if do not satisfy the condition of ‘having’ clause are removed from the result relation.

Suppose we want the department names whose total salary of instructors is greater the 72000. Then the query would be:

select dept_name sum(salary) as sum_salary
from instructor
group by dept_name
having sum(salary)>72000;

having by clause 1

Key Takeaways:

  • Aggregate functions are applied to the set of tuple values and they return a relation with a single attribute consisting of a single tuple.
  • We have five aggregate functions count, average, maximum, minimum and sum.
  • The average and sum function operate only on numerical values; the other aggregate function can also operate on non-numerical values.
  • All aggregate functions avoid null values while operating except count(*).
  • Aggregate functions can also be used along with, ‘group by‘ clause and ‘having‘ clause.

So this was all about the aggregate functions in SQL. Ensure that you apply the sum and average function to the attribute having numerical tuple values.

Related Terms:

  1. Integrity Constraints in DBMS
  2. Relational Data Model
  3. SQL Data Definition Language
  4. Null Value in SQL
  5. Set Operations in SQL

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Primary Sidebar

Most Searched Terms

  • Directory Structure
  • Contiguous Memory Allocation
  • Addressing Mode and its Types
  • Pipelining
  • Vector Processing
  • Register Organization
  • Direct Memory Access (DMA)
  • Process Control Block (PCB)
  • Swapping in Operating System
  • Relational Data Model

Recent Additions

  • Interrupts in Computer Architecture
  • Memory Locations and Addresses
  • Instruction Pipelining
  • Cache Memory
  • Instruction Cycle
  • Optical Memory
  • Synchronization Hardware
  • Difference Between fork() and exec() System Call
  • Multiple Processor Scheduling
  • Threading Issues in OS

Categories

  • Cloud Computing
  • Computer Architecture
  • Computer Networks
  • Data Warehouse and Mining
  • DBMS
  • Object Oriented Technology
  • Operating System
  • Security
  • Software Engineering

Copyright © 2021 · Binary Terms · Contact Us · About Us · Privacy