COVID-19 Update: We are currently shipping orders daily. However, due to transit disruptions in some geographies, deliveries may be delayed. To provide all customers with timely access to content, we are offering 50% off Science and Technology Print & eBook bundle options. Terms & conditions.
Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL - 1st Edition - ISBN: 9780123741370, 9780080557526

Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL

1st Edition

Author: Joe Celko
Paperback ISBN: 9780123741370
eBook ISBN: 9780080557526
Imprint: Morgan Kaufmann
Published Date: 22nd January 2008
Page Count: 384
Sales tax will be calculated at check-out Price includes VAT/GST
Price includes VAT/GST

Institutional Subscription

Secure Checkout

Personal information is secured with SSL technology.

Free Shipping

Free global shipping
No minimum order.

Table of Contents

Table of Contents Preface xvii 1 SQL Is Declarative, Not Procedural 1.1 Different Programming Models 1.2 Different Data Models 1.2.1 Columns Are Not Fields 1.2.2 Rows Are Not Records 1.2.3 Tables Are Not Files 1.2.4 Relational Keys Are Not Record Locators 1.2.5 Kinds of Keys 1.2.6 Desirable Properties of Relational Keys 1.2.7 Unique But Not Invariant 1.3 Tables as Entities 1.4 Tables as Relationships 1.5 Statements Are Not Procedures 1.6 Molecular, Atomic, and Subatomic Data Elements 1.6.1 Table Splitting 1.6.2 Column Splitting 1.6.3 Temporal Splitting 1.6.4 Faking Non-1NF Data 1.6.5 Molecular Data Elements 1.6.6 Isomer Data Elements 1.6.7 Validating a Molecule

2 Hardware, Data Volume, and Maintaining Databases 2.1 Parallelism 2.2 Cheap Main Storage 2.3 Solid-State Disk 2.4 Cheaper Secondary and Tertiary Storage 2.5 The Data Changed 2.6 The Mindset Has Not Changed

3 Data Access and Records 3.1 Sequential Access 3.1.1 Tape-Searching Algorithms 3.2 Indexes 3.2.1 Single-Table Indexes 3.2.2 Multiple-Table Indexes 3.2.3 Type of Indexes 3.3 Hashing 3.3.1 Digit Selection 3.3.2 Division Hashing 3.3.3 Multiplication Hashing 3.3.4 Folding 3.3.5 Table Lookups 3.3.6 Collisions 3.4 Bit Vector Indexes 3.5 Parallel Access 3.6 Row and Column Storage 3.6.1 Row-Based Storage 3.6.2 Column-Based Storage 3.7 JOIN Algorithms 3.7.1 Nested-Loop Join Algorithm 3.7.2 Sort-Merge Join Method 3.7.3 Hash Join Method 3.7.4 Shin’s Algorithm

4 Lookup Tables 4.1 Data Element Names 4.2 Multiparameter Lookup Tables 4.3 Constants Table 4.4 OTLT or MUCK Table Problems 4.5 Defi nition of a Proper Table

5 Auxiliary Tables 5.1 Sequence Table 5.1.1 Creating a Sequence Table 5.1.2 Sequence Constructor 5.1.3 Replacing an Iterative Loop 5.2 Permutations 5.2.1 Permutations via Recursion 5.2.2 Permutations via CROSS JOIN 5.3 Functions 5.3.1 Functions without a Simple Formula 5.4 Encryption via Tables 5.5 Random Numbers 5.6 Interpolation

6 Views 6.1 Mullins VIEW Usage Rules 6.1.1 Effi cient Access and Computations 6.1.2 Column Renaming 6.1.3 Proliferation Avoidance 6.1.4 The VIEW Synchronization Rule 6.2 Updatable and Read-Only VIEWs 6.3 Types of VIEWs 6.3.1 Single-Table Projection and Restriction 6.3.2 Calculated Columns 6.3.3 Translated Columns 6.3.4 Grouped VIEWs 6.3.5 UNIONed VIEWs 6.3.6 JOINs in VIEWs 6.3.7 Nested VIEWs 6.4 Modeling Classes with Tables 6.4.1 Class Hierarchies in SQL 6.4.2 Subclasses via ASSERTIONs and TRIGGERs 6.5 How VIEWs Are Handled in the Database System 6.5.1 VIEW Column List 6.5.2 VIEW Materialization 6.6 In-Line Text Expansion 6.7 WITH CHECK OPTION Clause 6.7.1 WITH CHECK OPTION as CHECK( ) Clause 6.8 Dropping VIEWs 6.9 Outdated Uses for VIEWs 6.9.1 Domain Support 6.9.2 Table Expression VIEWs 6.9.3 VIEWs for Table Level CHECK( ) Constraints 6.9.4 One VIEW per Base Table

7 Virtual Tables 7.1 Derived Tables 7.1.1 Column Naming Rules 7.1.2 Scoping Rules 7.1.3 Exposed Table Names 7.1.4 LATERAL() Clause 7.2 Common Table Expressions 7.2.1 Nonrecursive CTEs 7.2.2 Recursive CTEs 7.3 Temporary Tables 7.3.1 ANSI/ISO Standards 7.3.2 Vendors Models 7.4 The Information Schema 7.4.1 The INFORMATION_SCHEMA Declarations 7.4.2 A Quick List of VIEWS and Their Purposes 7.4.3 DOMAIN Declarations 7.4.4 Defi nition Schema 7.4.5 INFORMATION_SCHEMA Assertions

8 Complicated Functions via Tables 8.1 Functions without a Simple Formula 8.1.1 Encryption via Tables 8.2 Check Digits via Tables 8.2.1 Check Digits Defi ned 8.2.2 Error Detection versus Error Correction 8.3 Classes of Algorithms 8.3.1 Weighted-Sum Algorithms 8.3.2 Power-Sum Check Digits 8.3.3 Luhn Algorithm 8.3.4 Dihedral Five Check Digit 8.4 Declarations, Not Functions, Not Procedures 8.5 Data Mining for Auxiliary Tables

9 Temporal Tables 9.1 The Nature of Time 9.1.1 Durations, Not Chronons 9.1.2 Granularity 9.2 The ISO Half-Open Interval Model 9.2.1 Use of NULL for “Eternity” 9.2.2 Single Timestamp Tables 9.2.3 Overlapping Intervals 9.3 State Transition Tables 9.4 Consolidating Intervals 9.4.1 Cursors and Triggers 9.4.2 OLAP Function Solution 9.4.3 CTE Solution 9.5 Calendar Tables 9.5.1 Day of Week via Tables 9.5.2 Holiday Lists 9.5.3 Report Periods 9.5.4 Self-Updating Views 9.6 History Tables 9.6.1 Audit Trails

10 Scrubbing Data with Non-1NF Tables 10.1 Repeated Groups 10.1.1 Sorting within a Repeated Group 10.2 Designing Scrubbing Tables 10.3 Scrubbing Constraints 10.4 Calendar Scrubs 10.4.1 Special Dates 10.5 String Scrubbing 10.6 Sharing SQL Data 10.6.1 A Look at Data Evolution 10.6.2 Databases 10.7 Extract, Transform, and Load Products 10.7.1 Loading Data Warehouses 10.7.2 Doing It All in SQL 10.7.3 Extract, Load, and then Transform

11 Thinking in SQL 11.1 Warm-up Exercises 11.1.1 The Whole and Not the Parts 11.1.2 Characteristic Functions 11.1.3 Locking into a Solution Early 11.2 Heuristics 11.2.1 Put the Specification into a Clear Statement 11.2.2 Add the Words “Set of All…” in Front of the Nouns 11.2.3 Remove Active Verbs from the Problem Statement 11.2.4 You Can Still Use Stubs 11.2.5 Do Not Worry about Displaying the Data 11.2.6 Your First Attempts Need Special Handling 11.2.7 Do Not Be Afraid to Throw Away Your First Attempts at DDL 11.2.8 Save Your First Attempts at DML 11.2.9 Do Not Think with Boxes and Arrows 11.2.10 Draw Circles and Set Diagrams 11.2.11 Learn Your Dialect 11.2.12 Imagine that Your WHERE Clause Is “Super Amoeba” 11.2.13 Use the Newsgroups, Blogs, and Internet 11.3 Do Not Use BIT or BOOLEAN Flags in SQL 11.3.1 Flags Are at the Wrong Level 11.3.2 Flags Confuse Proper Attributes

12 Group Characteristics 12.1 Grouping Is Not Equality 12.2 Using Groups without Looking Inside 12.2.1 Semiset-Oriented Approach 12.2.2 Grouped Solutions 12.2.3 Aggregated Solutions 12.3 Grouping over Time 12.3.1 Piece-by-Piece Solution 12.3.2 Data as a Whole Solution 12.4 Other Tricks with HAVING Clauses 12.5 Groupings, Rollups, and Cubes 12.5.1 GROUPING SET Clause 12.5.2 The ROLLUP Clause 12.5.3 The CUBE Clause 12.5.4 A Footnote about Super Grouping 12.6 The WINDOW Clause 12.6.1 The PARTITION BY Clause 12.6.2 The ORDER BY Clause 12.6.3 The RANGE Clause 12.6.4 Programming Tricks

13 Turning Specifications into Code 13.1 Signs of Bad SQL 13.1.1 Is the Code Formatted Like Another Language? 13.1.2 Assuming Sequential Access 13.1.3 Cursors 13.1.4 Poor Cohesion 13.1.5 Table-Valued Functions 13.1.6 Multiple Names for the Same Data Element 13.1.7 Formatting in the Database 13.1.8 Keeping Dates in Strings 13.1.9 BIT Flags, BOOLEAN, and Other Computed Columns 13.1.10 Attribute Splitting Across Columns 13.1.11 Attribute Splitting Across Rows 13.1.12 Attribute Splitting Across Tables 13.2 Methods of Attack 13.2.1 Cursor-Based Solution 13.2.2 Semiset-Oriented Approach 13.2.3 Pure Set-Oriented Approach 13.2.4 Advantages of Set-Oriented Code 13.3 Translating Vague Specifications 13.3.1 Go Back to the DDL 13.3.2 Changing Specifications

14 Using Procedure and Function Calls 14.1 Clearing out Spaces in a String 14.1.1 Procedural Solution #1 14.1.2 Functional Solution #1 14.1.3 Functional Solution #2 14.2 The PRD( ) Aggregate Function 14.3 Long Parameter Lists in Procedures and Functions 14.3.1 The IN( ) Predicate Parameter Lists

15 Numbering Rows 15.1 Procedural Solutions 15.1.1 Reordering on a Numbering Column 15.2 OLAP Functions 15.2.1 Simple Row Numbering 15.2.2 RANK( ) and DENSE_RANK( ) 15.3 Sections

16 Keeping Computed Data 16.1 Procedural Solution 16.2 Relational Solution 16.3 Other Kinds of Computed Data

17 Triggers for Constraints 17.1 Triggers for Computations 17.2 Complex Constraints via CHECK( ) and CASE Constraints 17.3 Complex Constraints via VIEWs 17.3.1 Set-Oriented Solutions 17.4 Operations on VIEWs as Constraints 17.4.1 The Basic Three Operations 17.4.2 WITH CHECK OPTION Clause 17.4.3 WITH CHECK OPTION as CHECK( ) clause 17.4.4 How VIEWs Behave 17.4.5 UNIONed VIEWs 17.4.6 Simple INSTEAD OF Triggers 17.4.7 Warnings about INSTEAD OF Triggers

18 Procedural and Data Driven Solutions 18.1 Removing Letters in a String 18.1.1 The Procedural Solution 18.1.2 Pure SQL Solution 18.1.3 Impure SQL Solution 18.2 Two Approaches to Sudoku 18.2.1 Procedural Approach 18.2.2 Data-Driven Approach 18.2.3 Handling the Given Digits 18.3 Data Constraint Approach 18.4 Bin Packing Problems 18.4.1 The Procedural Approach 18.4.2 The SQL Approach 18.5 Inventory Costs over Time 18.5.1 Inventory UPDATE Statements 18.5.2 Bin Packing Returns



Perfectly intelligent programmers often struggle when forced to work with SQL. Why? Joe Celko believes the problem lies with their procedural programming mindset, which keeps them from taking full advantage of the power of declarative languages. The result is overly complex and inefficient code, not to mention lost productivity.

This book will change the way you think about the problems you solve with SQL programs.. Focusing on three key table-based techniques, Celko reveals their power through detailed examples and clear explanations. As you master these techniques, you’ll find you are able to conceptualize problems as rooted in sets and solvable through declarative programming. Before long, you’ll be coding more quickly, writing more efficient code, and applying the full power of SQL

Key Features

• Filled with the insights of one of the world’s leading SQL authorities - noted for his knowledge and his ability to teach what he knows.

• Focuses on auxiliary tables (for computing functions and other values by joins), temporal tables (for temporal queries, historical data, and audit information), and virtual tables (for improved performance).

• Presents clear guidance for selecting and correctly applying the right table technique.


Data analysts and database developers from all backgrounds, regardless of which database technology they use; this is the market for all of Joe's other books, all our data modeling books in general as well. This includes database developers working on transactional (OLTP) systems as well as data warehouse design (OLAP systems).

Unlike most of Joe's other books which are for very experienced SQL programmers who want to become gurus, this book has the widest possible audience of programmers new to SQL as well as those who are very experienced.


No. of pages:
© Morgan Kaufmann 2008
22nd January 2008
Morgan Kaufmann
Paperback ISBN:
eBook ISBN:

Ratings and Reviews

About the Author

Joe Celko

Joe Celko

Joe Celko served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.

Mr. Celko is author a series of books on SQL and RDBMS for Elsevier/MKP. He is an independent consultant based in Austin, Texas.

He has written over 1200 columns in the computer trade and academic press, mostly dealing with data and databases.

Affiliations and Expertise

Independent Consultant, Austin, Texas