Naming Convention: All constraints named with xpk_ prefix
8.3.3 Key Examples
person_id in person table
visit_occurrence_id in visit_occurrence table
condition_occurrence_id in condition_occurrence table
concept_id in concept table
8.4 3. Indices File (OMOPCDM_postgresql_5.4_indices.sql)
8.4.1 Purpose
Creates database indexes to dramatically improve query performance, especially for large datasets.
8.4.2 Key SQL Syntax
8.4.2.1 Standard Index Creation
CREATEINDEX idx_person_id ON @cdmDatabaseSchema.person (person_id ASC);
8.4.2.2 PostgreSQL Clustering (Physical Data Organization)
CLUSTER @cdmDatabaseSchema.person USING idx_person_id;
8.4.3 Index Strategy by Table Type
8.4.3.1 Patient-Centric Tables
-- Clustered by person_id for patient-level queriesCREATEINDEX idx_condition_person_id_1 ON condition_occurrence (person_id ASC);CLUSTER condition_occurrence USING idx_condition_person_id_1;-- Additional indexes for common queriesCREATEINDEX idx_condition_concept_id_1 ON condition_occurrence (condition_concept_id ASC);CREATEINDEX idx_condition_visit_id_1 ON condition_occurrence (visit_occurrence_id ASC);
8.4.3.2 Vocabulary Tables
-- Clustered by primary concept lookupsCREATEINDEX idx_concept_concept_id ON concept (concept_id ASC);CLUSTER concept USING idx_concept_concept_id;-- Additional indexes for vocabulary searchesCREATEINDEX idx_concept_code ON concept (concept_code ASC);CREATEINDEX idx_concept_vocabulary_id ON concept (vocabulary_id ASC);
8.4.4 Performance Benefits
Clustered Indexes: Physically reorder table data for optimal access patterns
Foreign Key Indexes: Speed up joins between tables
Query-Specific Indexes: Support common analytical queries
8.5.3.1 1. Everything Links to CONCEPT (Vocabulary Standardization)
-- All concept_id fields reference the central vocabularyALTERTABLE person ADDCONSTRAINT fpk_person_gender_concept_idFOREIGNKEY (gender_concept_id) REFERENCES CONCEPT (CONCEPT_ID);ALTERTABLE condition_occurrence ADDCONSTRAINT fpk_condition_occurrence_condition_concept_idFOREIGNKEY (condition_concept_id) REFERENCES CONCEPT (CONCEPT_ID);
8.5.3.2 2. Patient-Centric Design
-- All clinical events link back to patientsALTERTABLE visit_occurrence ADDCONSTRAINT fpk_visit_occurrence_person_idFOREIGNKEY (person_id) REFERENCES PERSON (PERSON_ID);ALTERTABLE condition_occurrence ADDCONSTRAINT fpk_condition_occurrence_person_idFOREIGNKEY (person_id) REFERENCES PERSON (PERSON_ID);
8.5.3.3 3. Visit Hierarchy
-- Clinical events can be linked to visitsALTERTABLE condition_occurrence ADDCONSTRAINT fpk_condition_occurrence_visit_occurrence_idFOREIGNKEY (visit_occurrence_id) REFERENCES VISIT_OCCURRENCE (VISIT_OCCURRENCE_ID);-- Visit details link to main visitsALTERTABLE visit_detail ADDCONSTRAINT fpk_visit_detail_visit_occurrence_idFOREIGNKEY (visit_occurrence_id) REFERENCES VISIT_OCCURRENCE (VISIT_OCCURRENCE_ID);
8.5.3.4 4. Provider and Care Site Tracking
-- Events can be linked to providers and care sitesALTERTABLE condition_occurrence ADDCONSTRAINT fpk_condition_occurrence_provider_idFOREIGNKEY (provider_id) REFERENCES PROVIDER (PROVIDER_ID);ALTERTABLE visit_occurrence ADDCONSTRAINT fpk_visit_occurrence_care_site_idFOREIGNKEY (care_site_id) REFERENCES CARE_SITE (CARE_SITE_ID);
8.6 Execution Order
These files must be executed in the correct sequence:
DDL � Create all table structures
Primary Keys � Add unique constraints and primary indexes
Indices � Add performance optimization indexes
Constraints � Enforce referential integrity
-- Example execution in R (from your cdm-build.qmd)CommonDataModel::executeDdl( connectionDetails = cd, cdmVersion ="5.4", cdmDatabaseSchema ="ohdsi_demo")
8.7 PostgreSQL-Specific Features
8.7.1 Data Types
TIMESTAMP for precise datetime storage
NUMERIC for exact decimal arithmetic
TEXT for unlimited text length
varchar(n) for limited text with specific length
8.7.2 Performance Optimizations
CLUSTER commands physically reorder data
Strategic indexing on person_id for patient-centric queries
Vocabulary indexes for concept lookups
Distribution hints for parallel processing systems
8.7.3 Constraints and Integrity
Comprehensive foreign key relationships
Named constraints for easy identification
Cascade options for data management
8.8 OMOP CDM Design Principles
8.8.1 1. Standardization
All clinical concepts mapped to standard vocabularies
Consistent data types and naming conventions
Uniform structure across institutions
8.8.2 2. Patient-Centricity
Person table as the core entity
All clinical events linked to patients
Longitudinal view of patient data
8.8.3 3. Flexibility
Source value fields preserve original data
Multiple concept fields (standard + source)
Extensible vocabulary system
8.8.4 4. Performance
Optimized for analytical queries
Strategic indexing and clustering
Efficient joins and aggregations
8.9 Common Query Patterns
8.9.1 Patient Demographics
SELECT p.person_id, c_gender.concept_name AS gender, c_race.concept_name AS race, p.year_of_birthFROM person pJOIN concept c_gender ON p.gender_concept_id = c_gender.concept_idJOIN concept c_race ON p.race_concept_id = c_race.concept_id;
8.9.2 Patient Conditions
SELECT p.person_id, c.concept_name AS condition_name, co.condition_start_dateFROM person pJOIN condition_occurrence co ON p.person_id = co.person_idJOIN concept c ON co.condition_concept_id = c.concept_idWHERE c.domain_id ='Condition';
8.9.3 Drug Exposures
SELECT p.person_id, c.concept_name AS drug_name, de.drug_exposure_start_date, de.days_supplyFROM person pJOIN drug_exposure de ON p.person_id = de.person_idJOIN concept c ON de.drug_concept_id = c.concept_idWHERE c.domain_id ='Drug';
This structure provides a robust foundation for healthcare data analysis, enabling standardized research across different healthcare systems and institutions.