
Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL

1st Edition

Author: Joe Celko
Paperback ISBN: 9780123741370
eBook ISBN: 9780080557526
Imprint: Morgan Kaufmann
Published Date: 22nd January 2008
Page Count: 384

Table of Contents

Preface
1 SQL Is Declarative, Not Procedural
1.1 Different Programming Models
1.2 Different Data Models
1.2.1 Columns Are Not Fields
1.2.2 Rows Are Not Records
1.2.3 Tables Are Not Files
1.2.4 Relational Keys Are Not Record Locators
1.2.5 Kinds of Keys
1.2.6 Desirable Properties of Relational Keys
1.2.7 Unique But Not Invariant
1.3 Tables as Entities
1.4 Tables as Relationships
1.5 Statements Are Not Procedures
1.6 Molecular, Atomic, and Subatomic Data Elements
1.6.1 Table Splitting
1.6.2 Column Splitting
1.6.3 Temporal Splitting
1.6.4 Faking Non-1NF Data
1.6.5 Molecular Data Elements
1.6.6 Isomer Data Elements
1.6.7 Validating a Molecule

2 Hardware, Data Volume, and Maintaining Databases
2.1 Parallelism
2.2 Cheap Main Storage
2.3 Solid-State Disk
2.4 Cheaper Secondary and Tertiary Storage
2.5 The Data Changed
2.6 The Mindset Has Not Changed

3 Data Access and Records
3.1 Sequential Access
3.1.1 Tape-Searching Algorithms
3.2 Indexes
3.2.1 Single-Table Indexes
3.2.2 Multiple-Table Indexes
3.2.3 Type of Indexes
3.3 Hashing
3.3.1 Digit Selection
3.3.2 Division Hashing
3.3.3 Multiplication Hashing
3.3.4 Folding
3.3.5 Table Lookups
3.3.6 Collisions
3.4 Bit Vector Indexes
3.5 Parallel Access
3.6 Row and Column Storage
3.6.1 Row-Based Storage
3.6.2 Column-Based Storage
3.7 JOIN Algorithms
3.7.1 Nested-Loop Join Algorithm
3.7.2 Sort-Merge Join Method
3.7.3 Hash Join Method
3.7.4 Shin’s Algorithm

4 Lookup Tables
4.1 Data Element Names
4.2 Multiparameter Lookup Tables
4.3 Constants Table
4.4 OTLT or MUCK Table Problems
4.5 Definition of a Proper Table

5 Auxiliary Tables
5.1 Sequence Table
5.1.1 Creating a Sequence Table
5.1.2 Sequence Constructor
5.1.3 Replacing an Iterative Loop
5.2 Permutations
5.2.1 Permutations via Recursion
5.2.2 Permutations via CROSS JOIN
5.3 Functions
5.3.1 Functions without a Simple Formula
5.4 Encryption via Tables
5.5 Random Numbers
5.6 Interpolation

6 Views
6.1 Mullins VIEW Usage Rules
6.1.1 Efficient Access and Computations
6.1.2 Column Renaming
6.1.3 Proliferation Avoidance
6.1.4 The VIEW Synchronization Rule
6.2 Updatable and Read-Only VIEWs
6.3 Types of VIEWs
6.3.1 Single-Table Projection and Restriction
6.3.2 Calculated Columns
6.3.3 Translated Columns
6.3.4 Grouped VIEWs
6.3.5 UNIONed VIEWs
6.3.6 JOINs in VIEWs
6.3.7 Nested VIEWs
6.4 Modeling Classes with Tables
6.4.1 Class Hierarchies in SQL
6.4.2 Subclasses via ASSERTIONs and TRIGGERs
6.5 How VIEWs Are Handled in the Database System
6.5.1 VIEW Column List
6.5.2 VIEW Materialization
6.6 In-Line Text Expansion
6.7.1 WITH CHECK OPTION as CHECK( ) Clause
6.8 Dropping VIEWs
6.9 Outdated Uses for VIEWs
6.9.1 Domain Support
6.9.2 Table Expression VIEWs
6.9.3 VIEWs for Table Level CHECK( ) Constraints
6.9.4 One VIEW per Base Table

7 Virtual Tables
7.1 Derived Tables
7.1.1 Column Naming Rules
7.1.2 Scoping Rules
7.1.3 Exposed Table Names
7.1.4 LATERAL() Clause
7.2 Common Table Expressions
7.2.1 Nonrecursive CTEs
7.2.2 Recursive CTEs
7.3 Temporary Tables
7.3.1 ANSI/ISO Standards
7.3.2 Vendors Models
7.4 The Information Schema
7.4.1 The INFORMATION_SCHEMA Declarations
7.4.2 A Quick List of VIEWS and Their Purposes
7.4.3 DOMAIN Declarations
7.4.4 Definition Schema

8 Complicated Functions via Tables
8.1 Functions without a Simple Formula
8.1.1 Encryption via Tables
8.2 Check Digits via Tables
8.2.1 Check Digits Defined
8.2.2 Error Detection versus Error Correction
8.3 Classes of Algorithms
8.3.1 Weighted-Sum Algorithms
8.3.2 Power-Sum Check Digits
8.3.3 Luhn Algorithm
8.3.4 Dihedral Five Check Digit
8.4 Declarations, Not Functions, Not Procedures
8.5 Data Mining for Auxiliary Tables

9 Temporal Tables
9.1 The Nature of Time
9.1.1 Durations, Not Chronons
9.1.2 Granularity
9.2 The ISO Half-Open Interval Model
9.2.1 Use of NULL for “Eternity”
9.2.2 Single Timestamp Tables
9.2.3 Overlapping Intervals
9.3 State Transition Tables
9.4 Consolidating Intervals
9.4.1 Cursors and Triggers
9.4.2 OLAP Function Solution
9.4.3 CTE Solution
9.5 Calendar Tables
9.5.1 Day of Week via Tables
9.5.2 Holiday Lists
9.5.3 Report Periods
9.5.4 Self-Updating Views
9.6 History Tables
9.6.1 Audit Trails

10 Scrubbing Data with Non-1NF Tables
10.1 Repeated Groups
10.1.1 Sorting within a Repeated Group
10.2 Designing Scrubbing Tables
10.3 Scrubbing Constraints
10.4 Calendar Scrubs
10.4.1 Special Dates
10.5 String Scrubbing
10.6 Sharing SQL Data
10.6.1 A Look at Data Evolution
10.6.2 Databases
10.7 Extract, Transform, and Load Products
10.7.1 Loading Data Warehouses
10.7.2 Doing It All in SQL
10.7.3 Extract, Load, and then Transform

11 Thinking in SQL
11.1 Warm-up Exercises
11.1.1 The Whole and Not the Parts
11.1.2 Characteristic Functions
11.1.3 Locking into a Solution Early
11.2 Heuristics
11.2.1 Put the Specification into a Clear Statement
11.2.2 Add the Words “Set of All…” in Front of the Nouns
11.2.3 Remove Active Verbs from the Problem Statement
11.2.4 You Can Still Use Stubs
11.2.5 Do Not Worry about Displaying the Data
11.2.6 Your First Attempts Need Special Handling
11.2.7 Do Not Be Afraid to Throw Away Your First Attempts at DDL
11.2.8 Save Your First Attempts at DML
11.2.9 Do Not Think with Boxes and Arrows
11.2.10 Draw Circles and Set Diagrams
11.2.11 Learn Your Dialect
11.2.12 Imagine that Your WHERE Clause Is “Super Amoeba”
11.2.13 Use the Newsgroups, Blogs, and Internet
11.3 Do Not Use BIT or BOOLEAN Flags in SQL
11.3.1 Flags Are at the Wrong Level
11.3.2 Flags Confuse Proper Attributes

12 Group Characteristics
12.1 Grouping Is Not Equality
12.2 Using Groups without Looking Inside
12.2.1 Semiset-Oriented Approach
12.2.2 Grouped Solutions
12.2.3 Aggregated Solutions
12.3 Grouping over Time
12.3.1 Piece-by-Piece Solution
12.3.2 Data as a Whole Solution
12.4 Other Tricks with HAVING Clauses
12.5 Groupings, Rollups, and Cubes
12.5.1 GROUPING SET Clause
12.5.2 The ROLLUP Clause
12.5.3 The CUBE Clause
12.5.4 A Footnote about Super Grouping
12.6 The WINDOW Clause
12.6.1 The PARTITION BY Clause
12.6.2 The ORDER BY Clause
12.6.3 The RANGE Clause
12.6.4 Programming Tricks

13 Turning Specifications into Code
13.1 Signs of Bad SQL
13.1.1 Is the Code Formatted Like Another Language?
13.1.2 Assuming Sequential Access
13.1.3 Cursors
13.1.4 Poor Cohesion
13.1.5 Table-Valued Functions
13.1.6 Multiple Names for the Same Data Element
13.1.7 Formatting in the Database
13.1.8 Keeping Dates in Strings
13.1.9 BIT Flags, BOOLEAN, and Other Computed Columns
13.1.10 Attribute Splitting Across Columns
13.1.11 Attribute Splitting Across Rows
13.1.12 Attribute Splitting Across Tables
13.2 Methods of Attack
13.2.1 Cursor-Based Solution
13.2.2 Semiset-Oriented Approach
13.2.3 Pure Set-Oriented Approach
13.2.4 Advantages of Set-Oriented Code
13.3 Translating Vague Specifications
13.3.1 Go Back to the DDL
13.3.2 Changing Specifications

14 Using Procedure and Function Calls
14.1 Clearing out Spaces in a String
14.1.1 Procedural Solution #1
14.1.2 Functional Solution #1
14.1.3 Functional Solution #2
14.2 The PRD( ) Aggregate Function
14.3 Long Parameter Lists in Procedures and Functions
14.3.1 The IN( ) Predicate Parameter Lists

15 Numbering Rows
15.1 Procedural Solutions
15.1.1 Reordering on a Numbering Column
15.2 OLAP Functions
15.2.1 Simple Row Numbering
15.2.2 RANK( ) and DENSE_RANK( )
15.3 Sections

16 Keeping Computed Data
16.1 Procedural Solution
16.2 Relational Solution
16.3 Other Kinds of Computed Data

17 Triggers for Constraints
17.1 Triggers for Computations
17.2 Complex Constraints via CHECK( ) and CASE Constraints
17.3 Complex Constraints via VIEWs
17.3.1 Set-Oriented Solutions
17.4 Operations on VIEWs as Constraints
17.4.1 The Basic Three Operations
17.4.3 WITH CHECK OPTION as CHECK( ) clause
17.4.4 How VIEWs Behave
17.4.5 UNIONed VIEWs
17.4.6 Simple INSTEAD OF Triggers
17.4.7 Warnings about INSTEAD OF Triggers

18 Procedural and Data Driven Solutions
18.1 Removing Letters in a String
18.1.1 The Procedural Solution
18.1.2 Pure SQL Solution
18.1.3 Impure SQL Solution
18.2 Two Approaches to Sudoku
18.2.1 Procedural Approach
18.2.2 Data-Driven Approach
18.2.3 Handling the Given Digits
18.3 Data Constraint Approach
18.4 Bin Packing Problems
18.4.1 The Procedural Approach
18.4.2 The SQL Approach
18.5 Inventory Costs over Time
18.5.1 Inventory UPDATE Statements
18.5.2 Bin Packing Returns



Perfectly intelligent programmers often struggle when forced to work with SQL. Why? Joe Celko believes the problem lies with their procedural programming mindset, which keeps them from taking full advantage of the power of declarative languages. The result is overly complex and inefficient code, not to mention lost productivity.

This book will change the way you think about the problems you solve with SQL programs. Focusing on three key table-based techniques, Celko reveals their power through detailed examples and clear explanations. As you master these techniques, you’ll find you are able to conceptualize problems as rooted in sets and solvable through declarative programming. Before long, you’ll be coding more quickly, writing more efficient code, and applying the full power of SQL.

Key Features

• Filled with the insights of one of the world’s leading SQL authorities - noted for his knowledge and his ability to teach what he knows.

• Focuses on auxiliary tables (for computing functions and other values by joins), temporal tables (for temporal queries, historical data, and audit information), and virtual tables (for improved performance).

• Presents clear guidance for selecting and correctly applying the right table technique.


Data analysts and database developers from all backgrounds, regardless of which database technology they use. This is the same audience as for Joe Celko's other books and for data modeling books in general. It includes database developers working on transactional (OLTP) systems as well as those working on data warehouse design (OLAP systems).

Unlike most of Joe Celko's other books, which are for very experienced SQL programmers who want to become gurus, this book has the widest possible audience, from programmers new to SQL to those who are very experienced.


© Morgan Kaufmann 2008


About the Author

Joe Celko


Joe Celko served 10 years on the ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.

Mr. Celko is the author of a series of books on SQL and RDBMS for Elsevier/MKP. He is an independent consultant based in Austin, Texas.

He has written over 1200 columns in the computer trade and academic press, mostly dealing with data and databases.

Affiliations and Expertise

Independent Consultant, Austin, Texas