Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL

Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL

1st Edition - January 22, 2008

Write a review

  • Author: Joe Celko
  • Paperback ISBN: 9780123741370
  • eBook ISBN: 9780080557526

Purchase options

Purchase options
Available
DRM-free (Mobi, PDF, EPub)
Sales tax will be calculated at check-out

Institutional Subscription

Free Global Shipping
No minimum order

Description

Perfectly intelligent programmers often struggle when forced to work with SQL. Why? Joe Celko believes the problem lies with their procedural programming mindset, which keeps them from taking full advantage of the power of declarative languages. The result is overly complex and inefficient code, not to mention lost productivity.This book will change the way you think about the problems you solve with SQL programs.. Focusing on three key table-based techniques, Celko reveals their power through detailed examples and clear explanations. As you master these techniques, you’ll find you are able to conceptualize problems as rooted in sets and solvable through declarative programming. Before long, you’ll be coding more quickly, writing more efficient code, and applying the full power of SQL

Key Features

• Filled with the insights of one of the world’s leading SQL authorities - noted for his knowledge and his ability to teach what he knows.

• Focuses on auxiliary tables (for computing functions and other values by joins), temporal tables (for temporal queries, historical data, and audit information), and virtual tables (for improved performance).

• Presents clear guidance for selecting and correctly applying the right table technique.

Readership

Data analysts and database developers from all backgrounds, regardless of which database technology they use; this is the market for all of Joe's other books, all our data modeling books in general as well. This includes database developers working on transactional (OLTP) systems as well as data warehouse design (OLAP systems).

Unlike most of Joe's other books which are for very experienced SQL programmers who want to become gurus, this book has the widest possible audience of programmers new to SQL as well as those who are very experienced.

Table of Contents

  • Table of Contents
    Preface xvii
    1 SQL Is Declarative, Not Procedural
    1.1 Different Programming Models
    1.2 Different Data Models
    1.2.1 Columns Are Not Fields
    1.2.2 Rows Are Not Records
    1.2.3 Tables Are Not Files
    1.2.4 Relational Keys Are Not Record Locators
    1.2.5 Kinds of Keys
    1.2.6 Desirable Properties of Relational Keys
    1.2.7 Unique But Not Invariant
    1.3 Tables as Entities
    1.4 Tables as Relationships
    1.5 Statements Are Not Procedures
    1.6 Molecular, Atomic, and Subatomic Data Elements
    1.6.1 Table Splitting
    1.6.2 Column Splitting
    1.6.3 Temporal Splitting
    1.6.4 Faking Non-1NF Data
    1.6.5 Molecular Data Elements
    1.6.6 Isomer Data Elements
    1.6.7 Validating a Molecule

    2 Hardware, Data Volume, and Maintaining Databases
    2.1 Parallelism
    2.2 Cheap Main Storage
    2.3 Solid-State Disk
    2.4 Cheaper Secondary and Tertiary Storage
    2.5 The Data Changed
    2.6 The Mindset Has Not Changed

    3 Data Access and Records
    3.1 Sequential Access
    3.1.1 Tape-Searching Algorithms
    3.2 Indexes
    3.2.1 Single-Table Indexes
    3.2.2 Multiple-Table Indexes
    3.2.3 Type of Indexes
    3.3 Hashing
    3.3.1 Digit Selection
    3.3.2 Division Hashing
    3.3.3 Multiplication Hashing
    3.3.4 Folding
    3.3.5 Table Lookups
    3.3.6 Collisions
    3.4 Bit Vector Indexes
    3.5 Parallel Access
    3.6 Row and Column Storage
    3.6.1 Row-Based Storage
    3.6.2 Column-Based Storage
    3.7 JOIN Algorithms
    3.7.1 Nested-Loop Join Algorithm
    3.7.2 Sort-Merge Join Method
    3.7.3 Hash Join Method
    3.7.4 Shin’s Algorithm

    4 Lookup Tables
    4.1 Data Element Names
    4.2 Multiparameter Lookup Tables
    4.3 Constants Table
    4.4 OTLT or MUCK Table Problems
    4.5 Defi nition of a Proper Table

    5 Auxiliary Tables
    5.1 Sequence Table
    5.1.1 Creating a Sequence Table
    5.1.2 Sequence Constructor
    5.1.3 Replacing an Iterative Loop
    5.2 Permutations
    5.2.1 Permutations via Recursion
    5.2.2 Permutations via CROSS JOIN
    5.3 Functions
    5.3.1 Functions without a Simple Formula
    5.4 Encryption via Tables
    5.5 Random Numbers
    5.6 Interpolation

    6 Views
    6.1 Mullins VIEW Usage Rules
    6.1.1 Effi cient Access and Computations
    6.1.2 Column Renaming
    6.1.3 Proliferation Avoidance
    6.1.4 The VIEW Synchronization Rule
    6.2 Updatable and Read-Only VIEWs
    6.3 Types of VIEWs
    6.3.1 Single-Table Projection and Restriction
    6.3.2 Calculated Columns
    6.3.3 Translated Columns
    6.3.4 Grouped VIEWs
    6.3.5 UNIONed VIEWs
    6.3.6 JOINs in VIEWs
    6.3.7 Nested VIEWs
    6.4 Modeling Classes with Tables
    6.4.1 Class Hierarchies in SQL
    6.4.2 Subclasses via ASSERTIONs and TRIGGERs
    6.5 How VIEWs Are Handled in the Database System
    6.5.1 VIEW Column List
    6.5.2 VIEW Materialization
    6.6 In-Line Text Expansion
    6.7 WITH CHECK OPTION Clause
    6.7.1 WITH CHECK OPTION as CHECK( ) Clause
    6.8 Dropping VIEWs
    6.9 Outdated Uses for VIEWs
    6.9.1 Domain Support
    6.9.2 Table Expression VIEWs
    6.9.3 VIEWs for Table Level CHECK( ) Constraints
    6.9.4 One VIEW per Base Table

    7 Virtual Tables
    7.1 Derived Tables
    7.1.1 Column Naming Rules
    7.1.2 Scoping Rules
    7.1.3 Exposed Table Names
    7.1.4 LATERAL() Clause
    7.2 Common Table Expressions
    7.2.1 Nonrecursive CTEs
    7.2.2 Recursive CTEs
    7.3 Temporary Tables
    7.3.1 ANSI/ISO Standards
    7.3.2 Vendors Models
    7.4 The Information Schema
    7.4.1 The INFORMATION_SCHEMA Declarations
    7.4.2 A Quick List of VIEWS and Their Purposes
    7.4.3 DOMAIN Declarations
    7.4.4 Defi nition Schema
    7.4.5 INFORMATION_SCHEMA Assertions

    8 Complicated Functions via Tables
    8.1 Functions without a Simple Formula
    8.1.1 Encryption via Tables
    8.2 Check Digits via Tables
    8.2.1 Check Digits Defi ned
    8.2.2 Error Detection versus Error Correction
    8.3 Classes of Algorithms
    8.3.1 Weighted-Sum Algorithms
    8.3.2 Power-Sum Check Digits
    8.3.3 Luhn Algorithm
    8.3.4 Dihedral Five Check Digit
    8.4 Declarations, Not Functions, Not Procedures
    8.5 Data Mining for Auxiliary Tables

    9 Temporal Tables
    9.1 The Nature of Time
    9.1.1 Durations, Not Chronons
    9.1.2 Granularity
    9.2 The ISO Half-Open Interval Model
    9.2.1 Use of NULL for “Eternity”
    9.2.2 Single Timestamp Tables
    9.2.3 Overlapping Intervals
    9.3 State Transition Tables
    9.4 Consolidating Intervals
    9.4.1 Cursors and Triggers
    9.4.2 OLAP Function Solution
    9.4.3 CTE Solution
    9.5 Calendar Tables
    9.5.1 Day of Week via Tables
    9.5.2 Holiday Lists
    9.5.3 Report Periods
    9.5.4 Self-Updating Views
    9.6 History Tables
    9.6.1 Audit Trails

    10 Scrubbing Data with Non-1NF Tables
    10.1 Repeated Groups
    10.1.1 Sorting within a Repeated Group
    10.2 Designing Scrubbing Tables
    10.3 Scrubbing Constraints
    10.4 Calendar Scrubs
    10.4.1 Special Dates
    10.5 String Scrubbing
    10.6 Sharing SQL Data
    10.6.1 A Look at Data Evolution
    10.6.2 Databases
    10.7 Extract, Transform, and Load Products
    10.7.1 Loading Data Warehouses
    10.7.2 Doing It All in SQL
    10.7.3 Extract, Load, and then Transform

    11 Thinking in SQL
    11.1 Warm-up Exercises
    11.1.1 The Whole and Not the Parts
    11.1.2 Characteristic Functions
    11.1.3 Locking into a Solution Early
    11.2 Heuristics
    11.2.1 Put the Specification into a Clear Statement
    11.2.2 Add the Words “Set of All…” in Front of the Nouns
    11.2.3 Remove Active Verbs from the Problem Statement
    11.2.4 You Can Still Use Stubs
    11.2.5 Do Not Worry about Displaying the Data
    11.2.6 Your First Attempts Need Special Handling
    11.2.7 Do Not Be Afraid to Throw Away Your First Attempts at DDL
    11.2.8 Save Your First Attempts at DML
    11.2.9 Do Not Think with Boxes and Arrows
    11.2.10 Draw Circles and Set Diagrams
    11.2.11 Learn Your Dialect
    11.2.12 Imagine that Your WHERE Clause Is “Super Amoeba”
    11.2.13 Use the Newsgroups, Blogs, and Internet
    11.3 Do Not Use BIT or BOOLEAN Flags in SQL
    11.3.1 Flags Are at the Wrong Level
    11.3.2 Flags Confuse Proper Attributes

    12 Group Characteristics
    12.1 Grouping Is Not Equality
    12.2 Using Groups without Looking Inside
    12.2.1 Semiset-Oriented Approach
    12.2.2 Grouped Solutions
    12.2.3 Aggregated Solutions
    12.3 Grouping over Time
    12.3.1 Piece-by-Piece Solution
    12.3.2 Data as a Whole Solution
    12.4 Other Tricks with HAVING Clauses
    12.5 Groupings, Rollups, and Cubes
    12.5.1 GROUPING SET Clause
    12.5.2 The ROLLUP Clause
    12.5.3 The CUBE Clause
    12.5.4 A Footnote about Super Grouping
    12.6 The WINDOW Clause
    12.6.1 The PARTITION BY Clause
    12.6.2 The ORDER BY Clause
    12.6.3 The RANGE Clause
    12.6.4 Programming Tricks

    13 Turning Specifications into Code
    13.1 Signs of Bad SQL
    13.1.1 Is the Code Formatted Like Another Language?
    13.1.2 Assuming Sequential Access
    13.1.3 Cursors
    13.1.4 Poor Cohesion
    13.1.5 Table-Valued Functions
    13.1.6 Multiple Names for the Same Data Element
    13.1.7 Formatting in the Database
    13.1.8 Keeping Dates in Strings
    13.1.9 BIT Flags, BOOLEAN, and Other Computed Columns
    13.1.10 Attribute Splitting Across Columns
    13.1.11 Attribute Splitting Across Rows
    13.1.12 Attribute Splitting Across Tables
    13.2 Methods of Attack
    13.2.1 Cursor-Based Solution
    13.2.2 Semiset-Oriented Approach
    13.2.3 Pure Set-Oriented Approach
    13.2.4 Advantages of Set-Oriented Code
    13.3 Translating Vague Specifications
    13.3.1 Go Back to the DDL
    13.3.2 Changing Specifications

    14 Using Procedure and Function Calls
    14.1 Clearing out Spaces in a String
    14.1.1 Procedural Solution #1
    14.1.2 Functional Solution #1
    14.1.3 Functional Solution #2
    14.2 The PRD( ) Aggregate Function
    14.3 Long Parameter Lists in Procedures and Functions
    14.3.1 The IN( ) Predicate Parameter Lists

    15 Numbering Rows
    15.1 Procedural Solutions
    15.1.1 Reordering on a Numbering Column
    15.2 OLAP Functions
    15.2.1 Simple Row Numbering
    15.2.2 RANK( ) and DENSE_RANK( )
    15.3 Sections

    16 Keeping Computed Data
    16.1 Procedural Solution
    16.2 Relational Solution
    16.3 Other Kinds of Computed Data

    17 Triggers for Constraints
    17.1 Triggers for Computations
    17.2 Complex Constraints via CHECK( ) and CASE Constraints
    17.3 Complex Constraints via VIEWs
    17.3.1 Set-Oriented Solutions
    17.4 Operations on VIEWs as Constraints
    17.4.1 The Basic Three Operations
    17.4.2 WITH CHECK OPTION Clause
    17.4.3 WITH CHECK OPTION as CHECK( ) clause
    17.4.4 How VIEWs Behave
    17.4.5 UNIONed VIEWs
    17.4.6 Simple INSTEAD OF Triggers
    17.4.7 Warnings about INSTEAD OF Triggers

    18 Procedural and Data Driven Solutions
    18.1 Removing Letters in a String
    18.1.1 The Procedural Solution
    18.1.2 Pure SQL Solution
    18.1.3 Impure SQL Solution
    18.2 Two Approaches to Sudoku
    18.2.1 Procedural Approach
    18.2.2 Data-Driven Approach
    18.2.3 Handling the Given Digits
    18.3 Data Constraint Approach
    18.4 Bin Packing Problems
    18.4.1 The Procedural Approach
    18.4.2 The SQL Approach
    18.5 Inventory Costs over Time
    18.5.1 Inventory UPDATE Statements
    18.5.2 Bin Packing Returns

    Index

Product details

  • No. of pages: 384
  • Language: English
  • Copyright: © Morgan Kaufmann 2008
  • Published: January 22, 2008
  • Imprint: Morgan Kaufmann
  • Paperback ISBN: 9780123741370
  • eBook ISBN: 9780080557526

About the Author

Joe Celko

Joe Celko
Joe Celko served 10 years on ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.

Mr. Celko is author a series of books on SQL and RDBMS for Elsevier/MKP. He is an independent consultant based in Austin, Texas.

He has written over 1200 columns in the computer trade and academic press, mostly dealing with data and databases.

Affiliations and Expertise

Independent Consultant, Austin, Texas

Ratings and Reviews

Write a review

There are currently no reviews for "Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL"