Online DDL: switch unique key and column logic to declarative `schemadiff` analysis #16229

shlomi-noach · 2024-06-19T10:51:21Z

Vitess migrations require some analysis of the participating columns (type, nullability, etc.) and of the table's unique keys (finding an appropriate iteration key for the migration).

That analysis takes place today via information_schema:

vitess/go/vt/vttablet/onlineddl/schema.go

Lines 456 to 465 in 1cc3e14

    
    sqlSelectColumnTypes = `
        select
                *,
                COLUMN_DEFAULT IS NULL AS is_default_null
            from
                information_schema.columns
            where
                table_schema=%a
                and table_name=%a
        `

vitess/go/vt/vttablet/onlineddl/schema.go

Lines 482 to 545 in 1cc3e14

    
    sqlSelectUniqueKeys = `
    SELECT
        COLUMNS.TABLE_SCHEMA as table_schema,
        COLUMNS.TABLE_NAME as table_name,
        COLUMNS.COLUMN_NAME as column_name,
        UNIQUES.INDEX_NAME as index_name,
        UNIQUES.COLUMN_NAMES as column_names,
        UNIQUES.COUNT_COLUMN_IN_INDEX as count_column_in_index,
        COLUMNS.DATA_TYPE as data_type,
        COLUMNS.CHARACTER_SET_NAME as character_set_name,
        LOCATE('auto_increment', EXTRA) > 0 as is_auto_increment,
        (DATA_TYPE='float' OR DATA_TYPE='double') AS is_float,
        has_subpart,
        has_nullable
    FROM INFORMATION_SCHEMA.COLUMNS INNER JOIN (
        SELECT
            TABLE_SCHEMA,
            TABLE_NAME,
            INDEX_NAME,
            COUNT(*) AS COUNT_COLUMN_IN_INDEX,
            GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC) AS COLUMN_NAMES,
            SUBSTRING_INDEX(GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC), ',', 1) AS FIRST_COLUMN_NAME,
            SUM(SUB_PART IS NOT NULL) > 0 AS has_subpart,
            SUM(NULLABLE='YES') > 0 AS has_nullable
        FROM INFORMATION_SCHEMA.STATISTICS
        WHERE
            NON_UNIQUE=0
            AND TABLE_SCHEMA=%a
            AND TABLE_NAME=%a
        GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
    ) AS UNIQUES
    ON (
        COLUMNS.COLUMN_NAME = UNIQUES.FIRST_COLUMN_NAME
    )
    WHERE
        COLUMNS.TABLE_SCHEMA=%a
        AND COLUMNS.TABLE_NAME=%a
    ORDER BY
        COLUMNS.TABLE_SCHEMA, COLUMNS.TABLE_NAME,
        CASE UNIQUES.INDEX_NAME
            WHEN 'PRIMARY' THEN 0
            ELSE 1
        END,
        CASE has_nullable
            WHEN 0 THEN 0
            ELSE 1
        END,
        CASE has_subpart
            WHEN 0 THEN 0
            ELSE 1
        END,
        CASE IFNULL(CHARACTER_SET_NAME, '')
                WHEN '' THEN 0
                ELSE 1
        END,
        CASE DATA_TYPE
            WHEN 'tinyint' THEN 0
            WHEN 'smallint' THEN 1
            WHEN 'int' THEN 2
            WHEN 'bigint' THEN 3
            ELSE 100
        END,
        COUNT_COLUMN_IN_INDEX
    `

vitess/go/vt/vttablet/onlineddl/schema.go

Line 550 in 1cc3e14

sqlShowColumnsFrom = "SHOW COLUMNS FROM `%a`"

vitess/go/vt/vttablet/onlineddl/schema.go

Lines 558 to 566 in 1cc3e14

    
    sqlGetAutoIncrement                    = `
        SELECT
            AUTO_INCREMENT
        FROM INFORMATION_SCHEMA.TABLES
        WHERE
            TABLES.TABLE_SCHEMA=%a
            AND TABLES.TABLE_NAME=%a
            AND AUTO_INCREMENT IS NOT NULL
        `

We want to move away from information_schema-based analysis and toward programmatic, declarative schemadiff analysis. We already ask schemadiff for instant-ddl capabilities, and we generally want it to own as much of the schema analysis as possible.

At the resolution of this issue, schemadiff should be able to tell, given a table's before and after definitions, which unique keys are best to use as iteration keys (if any) and what specific details we should know about the columns.
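
To illustrate what "declarative" could look like here, the following is a minimal sketch that reproduces the `ORDER BY` preference of `sqlSelectUniqueKeys` in plain Go, without querying information_schema. It is not the schemadiff API: the `UniqueKey` type, its fields, and `RankUniqueKeys` are hypothetical names invented for this example, and the sketch assumes the key metadata has already been extracted from a parsed table definition.

```go
package main

import (
	"fmt"
	"sort"
)

// UniqueKey is a hypothetical, simplified description of a unique key.
// It is NOT a schemadiff type; the fields mirror the values that the
// sqlSelectUniqueKeys query computes for each key.
type UniqueKey struct {
	Name                  string
	Columns               []string // columns in index order
	FirstColumnType       string   // lowercase data type of the first column
	FirstColumnHasCharset bool     // first column is a character type
	HasNullable           bool     // at least one column is nullable
	HasSubpart            bool     // at least one column uses a prefix length
}

// integerTypeRank mirrors the CASE DATA_TYPE expression of the query.
func integerTypeRank(dataType string) int {
	switch dataType {
	case "tinyint":
		return 0
	case "smallint":
		return 1
	case "int":
		return 2
	case "bigint":
		return 3
	}
	return 100
}

func boolRank(b bool) int {
	if b {
		return 1
	}
	return 0
}

// rank returns the sort tuple for a key, in the same order as the
// ORDER BY clause: primary key first, then non-nullable, no subpart,
// no character set, smaller integer type, fewer columns.
func rank(k UniqueKey) []int {
	return []int{
		boolRank(k.Name != "PRIMARY"),
		boolRank(k.HasNullable),
		boolRank(k.HasSubpart),
		boolRank(k.FirstColumnHasCharset),
		integerTypeRank(k.FirstColumnType),
		len(k.Columns),
	}
}

// RankUniqueKeys returns a copy of keys, most preferred first.
func RankUniqueKeys(keys []UniqueKey) []UniqueKey {
	sorted := append([]UniqueKey(nil), keys...)
	sort.SliceStable(sorted, func(i, j int) bool {
		ri, rj := rank(sorted[i]), rank(sorted[j])
		for n := range ri {
			if ri[n] != rj[n] {
				return ri[n] < rj[n]
			}
		}
		return false
	})
	return sorted
}

func main() {
	keys := []UniqueKey{
		{Name: "uk_email", Columns: []string{"email"}, FirstColumnType: "varchar", FirstColumnHasCharset: true},
		{Name: "uk_ref", Columns: []string{"ref_id", "kind"}, FirstColumnType: "bigint", HasNullable: true},
		{Name: "PRIMARY", Columns: []string{"id"}, FirstColumnType: "int"},
	}
	for _, k := range RankUniqueKeys(keys) {
		fmt.Println(k.Name)
	}
}
```

The sketch only orders the candidates. Deciding whether a key is eligible at all (for example, whether a nullable or floating-point key may serve as an iteration key), and matching keys between the before and after tables, would be separate steps that are not shown here.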


shlomi-noach added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: Online DDL Online DDL (vitess/native/gh-ost/pt-osc) labels Jun 19, 2024

shlomi-noach self-assigned this Jun 19, 2024
