Aggregating Data
Using Group Functions
After completing this lesson, you should be able to
do the following:
• Identify the available group functions
• Describe the use of group functions
• Group data using the GROUP BY clause
• Include or exclude grouped rows by using the
HAVING clause
What Are Group Functions?
Group functions operate on sets of rows to give one
result per group.
Group Functions
Unlike single-row functions, group functions operate on sets of rows to give one result per group.
These sets may be the whole table or the table split into groups.
Types of Group Functions
• AVG
• COUNT
• MAX
• MIN
• STDDEV
• SUM
• VARIANCE
Group Functions Syntax
SELECT [column,] group_function(column), ...
FROM table
[WHERE condition]
[GROUP BY column]
[ORDER BY column];
Guidelines for Using Group Functions
• DISTINCT makes the function consider only nonduplicate values; ALL makes it consider every
value including duplicates. The default is ALL and therefore does not need to be specified.
• The data types for the functions with an expr argument may be CHAR, VARCHAR2, NUMBER,
or DATE.
• All group functions ignore null values. To substitute a value for null values, use the NVL, NVL2,
or COALESCE functions.
• The Oracle server implicitly sorts the result set in ascending order when using a GROUP BY
clause. To override this default ordering, DESC can be used in an ORDER BY clause.
Using the AVG and SUM Functions
You can use AVG and SUM for numeric data.
SELECT AVG(salary), MAX(salary),
MIN(salary), SUM(salary)
FROM employees
WHERE job_id LIKE ’%REP%’;
Using the MIN and MAX Functions
You can use MIN and MAX for any data type.
SELECT MIN(hire_date), MAX(hire_date)
FROM employees;
Group Functions (continued)
You can use the MAX and MIN functions for any data type. The slide example displays the most junior
and most senior employee.
The following example displays the employee last name that is first and the employee last name that is
the last in an alphabetized list of all employees.
SELECT MIN(last_name), MAX(last_name)
FROM employees;
Using the COUNT Function
COUNT(*) returns the number of rows in a table.
SELECT COUNT(*)
FROM employees
WHERE department_id = 50;
The COUNT Function
The COUNT function has three formats:
• COUNT(*)
• COUNT(expr)
• COUNT(DISTINCT expr)
COUNT(*) returns the number of rows in a table that satisfy the criteria of the SELECT statement,
including duplicate rows and rows containing null values in any of the columns. If a WHERE clause is
included in the SELECT statement, COUNT(*) returns the number of rows that satisfies the condition
in the WHERE clause.
In contrast, COUNT(expr) returns the number of non-null values in the column identified by expr.
COUNT(DISTINCT expr) returns the number of unique, non-null values in the column identified
by expr.
The slide example displays the number of employees in department 50.
Using the COUNT Function
• COUNT(expr) returns the number of rows with
non-null values for the expr.
• Display the number of department values in the
EMPLOYEES table, excluding the null values.
SELECT COUNT(commission_pct)
FROM employees
WHERE department_id = 80;
The COUNT Function (continued)
The slide example displays the number of employees in department 80 who can earn a commission.
Example
Display the number of department values in the EMPLOYEES table.
SELECT COUNT(department_id)
FROM employees;
Using the DISTINCT Keyword
• COUNT(DISTINCT expr) returns the number of
distinct non-null values of the expr.
• Display the number of distinct department values
in the EMPLOYEES table.
SELECT COUNT(DISTINCT department_id)
FROM employees;
The DISTINCT Keyword
Use the DISTINCT keyword to suppress the counting of any duplicate values within a column.
The example on the slide displays the number of distinct department values in the EMPLOYEES table.
oyees;
Group Functions and Null Values
Group functions ignore null values in the column.
SELECT AVG(commission_pct)
FROM employees;
Using the NVL Function
with Group Functions
The NVL function forces group functions to include
null values.
SELECT AVG(NVL(commission_pct, 0))
FROM employees;
Using the GROUP BY Clause
All columns in the SELECT list that are not in group
functions must be in the GROUP BY clause.
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id ;
The GROUP BY Clause (continued)
When using the GROUP BY clause, make sure that all columns in the SELECT list that are not group
functions are included in the GROUP BY clause. The example on the slide displays the department
number and the average salary for each department. Here is how this SELECT statement, containing a
GROUP BY clause, is evaluated:
• The SELECT clause specifies the columns to be retrieved:
– Department number column in the EMPLOYEES table
– The average of all the salaries in the group you specified in the GROUP BY clause
• The FROM clause specifies the tables that the database must access: the EMPLOYEES table.
• The WHERE clause specifies the rows to be retrieved. Since there is no WHERE clause, all rows
are retrieved by default.
• The GROUP BY clause specifies how the rows should be grouped. The rows are being grouped
by department number, so the AVG function that is being applied to the salary column will
calculate the average salary for each department.
Using the GROUP BY Clause
The GROUP BY column does not have to be in the
SELECT list.
SELECT AVG(salary)
FROM employees
GROUP BY department_id ;
The GROUP BY Clause (continued)
The GROUP BY column does not have to be in the SELECT clause. For example, the SELECT
statement on the slide displays the average salaries for each department without displaying the
respective department numbers. Without the department numbers, however, the results do not look
meaningful.
You can use the group function in the ORDER BY clause.
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id
Using the GROUP BY Clause
on Multiple Columns
SELECT department_id dept_id, job_id, SUM(salary)
FROM employees
GROUP BY department_id, job_id ;
Groups within Groups (continued)
You can return summary results for groups and subgroups by listing more than one GROUP BY
column. You can determine the default sort order of the results by the order of the columns in the
GROUP BY clause. Here is how the SELECT statement on the slide, containing a GROUP BY clause,
is evaluated:
• The SELECT clause specifies the column to be retrieved:
– Department number in the EMPLOYEES table
– Job ID in the EMPLOYEES table
– The sum of all the salaries in the group that you specified in the GROUP BY clause
• The FROM clause specifies the tables that the database must access: the EMPLOYEES table.
• The GROUP BY clause specifies how you must group the rows:
– First, the rows are grouped by department number.
– Second, within the department number groups, the rows are grouped by job ID.
So the SUM function is being applied to the salary column for all job IDs within each department
number group.
Illegal Queries
Using Group Functions
Any column or expression in the SELECT list that is
not an aggregate function must be in the GROUP BY
clause.
SELECT department_id, COUNT(last_name)
FROM employees;
SELECT department_id, COUNT(last_name)
*
ERROR at line 1:
ORA-00937: not a single-group group function
Excluding Group Results: The HAVING
Clause
Use the HAVING clause to restrict groups:
1. Rows are grouped.
2. The group function is applied.
3. Groups matching the HAVING clause are
displayed.
SELECT column, group_function
FROM table
[WHERE condition]
[GROUP BY group_by_expression]
[HAVING group_condition]
[ORDER BY column];
The HAVING Clause
You use the HAVING clause to specify which groups are to be displayed, and thus, you further restrict
the groups on the basis of aggregate information.
In the syntax:
group_condition restricts the groups of rows returned to those groups for which
the specified condition is true
The Oracle server performs the following steps when you use the HAVING clause:
1. Rows are grouped.
2. The group function is applied to the group.
3. The groups that match the criteria in the HAVING clause are displayed.
The HAVING clause can precede the GROUP BY clause, but it is recommended that you place the
GROUP BY clause first because that is more logical. Groups are formed and group functions are
calculated before the HAVING clause is applied to the groups in the SELECT list.
Using the HAVING Clause
SELECT department_id, MAX(salary)
FROM employees
GROUP BY department_id
HAVING MAX(salary)>10000 ;
Using the HAVING Clause
SELECT job_id, SUM(salary) PAYROLL
FROM employees
WHERE job_id NOT LIKE ’%REP%’
GROUP BY job_id
HAVING SUM(salary) > 13000
ORDER BY SUM(salary);
Nesting Group Functions
Display the maximum average salary.
SELECT MAX(AVG(salary))
FROM employees
GROUP BY department_id;
Summary
In this lesson, you should have learned how to:
• Use the group functions COUNT, MAX, MIN, AVG
• Write queries that use the GROUP BY clause
• Write queries that use the HAVING clause
No comments:
Post a Comment