**Aggregate functions** in SQL evaluates the set of tuple values and returns a relation with a single attribute carrying single tuple value as a result. SQL has five aggregate functions **count**, **average**, **maximum, minimum, sum.** Among these aggregate functions, only ‘sum’ and ‘average’ function operate only on numeric values only while count, maximum, minimum can also operate on non-numeric data such as string.

In this section, we will discuss the steps to evaluate aggregate functions, type of aggregate functions, aggregate function along with ‘group by’ clause and ‘having’ clause.

## Content: Aggregate Function in DBMS

- How to Aggregate Data?
- Aggregate Functions
- Aggregation with Group by Clause
- Aggregation with Having Clause
- Key Takeaways

### How to aggregate data in SQL?

- The first thing evaluated in query is the
**form**clause which evaluates the relation to be operated. - Next to form clause is where clause, the predicate in
**where**clause is evaluated on the outcome of ‘form’ clause. - The tuples filtered by ‘where’ clause are then grouped using the
**group by**clause if present else all the tuples filtered by where clause are considered as one group. - The
**having**clause is then applied to each group formed by ‘group by’ clause and the groups satisfying the having clause are then forwarded to the ‘select’ clause. - Then the
**select**clause applies the**aggregate**function on the groups and generates a**single result tuple**for each group.

### Aggregate Function in DBMS

The aggregate functions are applied to the **set of tuples**. And as a result, they return a **relation** with a **single attribute** consisting of a ** single tuple** value. It can also be applied to the **group of sets of tuples**, resulting in a **relation** with a **single attribute** and **one tuple for each group**.

All aggregate function avoids ‘null values’ excluding the count(*) function. Count function has variations regarding Null values.

Below is the list of all aggregate functions in SQL:

We can **name** the attribute of the result relation obtained by aggregation using **‘as’** clause. We can use ‘as’ clause for all aggregate function.

#### Count (count)

This aggregate function counts the number of tuples in the relation. It has certain variations:

**count(*)** – This count function returns the number of tuples present in a relation including the tuples holding ‘Null values’.

**count(attribute_name)** – This function counts all the tuples with ‘not-null values’.

**count(distinct attribute_name)** – This function eliminates tuples with ‘null values’ and ‘duplicate values’. It only counts the tuples with not-null values and distinct values.

**Note**: SQL standards do not legalize the use of **distinct** keyword with **count(*)** function.

Now let us consider an example:

If you want to count the number of instructors in the teaches relation then the query for this would be:

**select** **count**(inst-id)

**from** teaches;

The query above will return **13** as a result. Here the count function avoids Null values. Next, if we add distinct keyword in count function.

**select** **count** (distinct inst-id)

**from** teaches;

This query returns **9** as ‘distinct’ keyword is used along with the attribute name in count function. So, it has eliminated the duplicates along with null values.

**select** **count**(*****)

**from** teaches;

This query will return **15** as it has counted all the tuples including the Null values.

#### Average (avg)

The average function returns the **average** of all the tuple values from a single attribute of a relation. Remember the **duplicate tuples** must **retain** while computing an average else it will show an incorrect answer. The average aggregate function avoids the ‘Null value’ in its input as it complicates the aggregation.

Consider the following query that evaluates the “average salary of all the instructors teaching in Comp.Sci. department”.

**Select** **avg**(salary)

**from** instructor

**where** dept_name=‘Comp.Sci.’;

The result of this query will be a relation with a single attribute carrying a single record. This record has a numerical value that shows the average salary of all instructors teaching in Comp.Sci. department (result 73,333.333).

You can also provide a meaningful **name** to the attribute of result relation, using **as** clause else, the database system will give an arbitrary name to the attribute of result relation. The query below shows how can we provide a name to the attribute of result relation?

**select** **avg**(salary) **as** avg_salary

**from** instructor

**where** dept_name =‘Comp.Sci.’

#### Maximum (max)

The **max()** aggregate function evaluates the **maximum value** among all the tuple values, of the **attribute** specified in max () function. Like other aggregate functions, the max() avoid the’ null values’ in its input.

The query below will result in a relation containing a single tuple indicating the maximum salary of all the instructors.

**select** **max**(salary) **as** max_salary

**from** instructor;

#### Minimum (min)

Contrary to max aggregate function min evaluates the **minimum value** among all the tuple values, of the **attribute** specified to the min aggregate function. The min() aggregate function avoid ‘null values’ in its input.

To find the minimum salary of all instructors in the instructor relation we will write the query:

**select** **min**(salary) **as** min_salary

**from** instructor;

#### Total (sum)

The sum aggregate function evaluates the sum of all the tuple values, of the attribute specified to the sum function. Like average function, sum also drives on numerical values only. The sum function avoids the null values. The query below will find the total salary of all the instructors.

**select** **sum**(salary) **as** sum_salary

**from** instructor;

### Aggregation with ‘Group by’ clause

The aggregate functions can also be applied to a **group** consisting of **sets of tuples**. The tuples consisting the same value for the attributes specified in ‘group by’ clause are kept in one group.

So, if we have to calculate the total salary of instructors each department then the query would be:

**select** dept_name **sum**(salary) **as** sum_salary

**from** instructor

**group by** dept_name;

### Aggregation with ‘Having’ Clause

Having clause is the **condition** that is applied to each group formed by group by clause. The groups formed by ‘group by’ clause if do not satisfy the condition of ‘having’ clause are **removed** from the result relation.

Suppose we want the department names whose total salary of instructors is greater the 72000. Then the query would be:

**select** dept_name **sum**(salary) **as** sum_salary

**from** instructor

**group by** dept_name

**having** **sum**(salary)>72000;

### Key Takeaways:

- Aggregate functions are applied to the
**set of tuple values**and they return a**relation**with a**single**attribute consisting of a**single**tuple. - We have five aggregate functions
**count**,**average**,**maximum**,**minimum**and**sum**. - The
**average**and**sum**function operate only on**numerical values**; the other aggregate function can also operate on non-numerical values. - All aggregate functions avoid
**null values**while operating except**count(*)**. - Aggregate functions can also be used along with, ‘
**group by**‘ clause and ‘**having**‘ clause.

So this was all about the aggregate functions in SQL. Ensure that you apply the sum and average function to the attribute having numerical tuple values.

## Leave a Reply