So one of the problems I came across Database Administration that is hard to discover and even harder to avoid is Circular Reference.
A Circular Reference happens when you create foreign key references between objects, creating "circle references". This can happen on one table that references its own Primary Key, which is not unusual when you need to create a hierarchy type of relation.
But it can get weird, when you have a table1 that references table2 and another fk from table2 to table1. So you already have a reference between the two tables, why you need another? This is a messy mistake that generates redundant and innacurate information, creating huge mistakes on reports and audits.
A harder level of circular reference is when you create a circular reference between three or four tables. This will lead to so many levels of redundancy and creates infinite paths to retrieve information. Finally, no one knows the right path to the actual real data.
And don't bother asking the devs: chances are the original developer is working on another project in another company or he probably doesn't know what he did wrong or what is the right path to it.
On CodeProject.com, there is this great article on Circular Reference, writen by Pham Dinh Truong. Its full of pictures, quite detailed and simple enough so a baby can understand it.
But, let's move to the harder part: locating these circular references. I wrote a few scripts to help me out on that, but as it turns out it can be a hassle to collect this information from the system views. One script I came up is this:
SELECT Object_schema_name(fk1.parent_object_id)
+ '.' + Object_name(fk1.parent_object_id),
Object_schema_name(fk2.parent_object_id)
+ '.' + Object_name(fk2.parent_object_id)
--OBJECT_SCHEMA_NAME(fk3.parent_object_id)
-- + '.' + OBJECT_NAME(fk3.parent_object_id)
FROM sys.foreign_keys AS fk1
JOIN sys.foreign_keys AS fk2
ON fk1.parent_object_id = fk2.referenced_object_id
AND fk2.parent_object_id = fk1.referenced_object_id
+ '.' + Object_name(fk1.parent_object_id),
Object_schema_name(fk2.parent_object_id)
+ '.' + Object_name(fk2.parent_object_id)
--OBJECT_SCHEMA_NAME(fk3.parent_object_id)
-- + '.' + OBJECT_NAME(fk3.parent_object_id)
FROM sys.foreign_keys AS fk1
JOIN sys.foreign_keys AS fk2
ON fk1.parent_object_id = fk2.referenced_object_id
AND fk2.parent_object_id = fk1.referenced_object_id
It's accurate and it works only for two leves (table1, table2 stuff), but you can have some duplicates rows (table1>table2 table2>table1). I tried writing more scripts, but I just found another great reference. It was writen by Amir Mirkamali on his blog, this is the original post. I'm republishing the script here, hopefully the author won't mind =)
In case you are intending on confronting developers that built c.r. masterpieces, be prepared for war. Really.
SET nocount ON
-- WWB: Create a Temp Table Of All Relationship To Improve Overall Performance
CREATE TABLE #tablerelationships
(
fk_schema NVARCHAR(max),
fk_table NVARCHAR(max),
pk_schema NVARCHAR(max),
pk_table NVARCHAR(max)
)
-- WWB: Create a List Of All Tables To Check
CREATE TABLE #tablelist
(
[schema] NVARCHAR(max),
[table] NVARCHAR(max)
)
-- WWB: Fill the Table List
INSERT INTO #tablelist
([table],
[schema])
SELECT table_name,
table_schema
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
-- WWB: Fill the RelationShip Temp Table
INSERT INTO #tablerelationships
(fk_schema,
fk_table,
pk_schema,
pk_table)
SELECT FK.table_schema,
FK.table_name,
PK.table_schema,
PK.table_name
FROM information_schema.referential_constraints C
INNER JOIN information_schema.table_constraints FK
ON C.constraint_name = FK.constraint_name
INNER JOIN information_schema.table_constraints PK
ON C.unique_constraint_name = PK.constraint_name
INNER JOIN information_schema.key_column_usage CU
ON C.constraint_name = CU.constraint_name
INNER JOIN (SELECT i1.table_name,
i2.column_name
FROM information_schema.table_constraints i1
INNER JOIN information_schema.key_column_usage i2
ON i1.constraint_name = i2.constraint_name
WHERE i1.constraint_type = 'PRIMARY KEY') PT
ON PT.table_name = PK.table_name
CREATE TABLE #stack
(
[schema] NVARCHAR(max),
[table] NVARCHAR(max)
)
go
-- WWB: Drop SqlAzureRecursiveFind
IF EXISTS (SELECT *
FROM sys.objects
WHERE object_id = Object_id(N'[dbo].[SqlAzureRecursiveFind]')
AND type IN ( N'P', N'PC' ))
DROP PROCEDURE [dbo].[SqlAzureRecursiveFind]
go
-- WWB: Create a Stored Procedure that Recursively Calls Itself
CREATE PROC Sqlazurerecursivefind @BaseSchmea NVARCHAR(max),
@BaseTable NVARCHAR(max),
@Schmea NVARCHAR(max),
@Table NVARCHAR(max),
@Fail NVARCHAR(max) output
AS
SET nocount ON
-- WWB: Keep Track Of the Schema and Tables We Have Checked
-- Prevents Looping
INSERT INTO #stack
([schema],
[table])
VALUES (@Schmea,
@Table)
DECLARE @RelatedSchema NVARCHAR(max)
DECLARE @RelatedTable NVARCHAR(max)
-- WWB: Select all tables that the input table is dependent on
DECLARE table_cursor CURSOR local FOR
SELECT pk_schema,
pk_table
FROM #tablerelationships
WHERE fk_schema = @Schmea
AND fk_table = @Table
OPEN table_cursor;
-- Perform the first fetch.
FETCH next FROM table_cursor INTO @RelatedSchema, @RelatedTable;
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
-- WWB: If We have Recurred To Where We Start This
-- Is a Circular Reference
-- Begin failing out of the recursions
IF ( @BaseSchmea = @RelatedSchema
AND @BaseTable = @RelatedTable )
BEGIN
SET @Fail = @RelatedSchema + '.' + @RelatedTable
RETURN
END
ELSE
BEGIN
DECLARE @Count INT
-- WWB: Check to make sure that the dependencies are not in the stack
-- If they are we don't need to go down this branch
SELECT @Count = Count(1)
FROM #stack
WHERE #stack.[schema] = @RelatedSchema
AND #stack.[table] = @RelatedTable
IF ( @Count = 0 )
BEGIN
-- WWB: Recurse
EXECUTE Sqlazurerecursivefind
@BaseSchmea,
@BaseTable,
@RelatedSchema,
@RelatedTable,
@Fail output
IF ( Len(@Fail) > 0 )
BEGIN
-- WWB: If the Call Fails, Build the Output Up
SET @Fail = @RelatedSchema + '.' + @RelatedTable +
' -> '
+ @Fail
RETURN
END
END
END
-- This is executed as long as the previous fetch succeeds.
FETCH next FROM table_cursor INTO @RelatedSchema, @RelatedTable;
END
CLOSE table_cursor;
DEALLOCATE table_cursor;
go
SET nocount ON
DECLARE @Schema NVARCHAR(max)
DECLARE @Table NVARCHAR(max)
DECLARE @Fail NVARCHAR(max)
-- WWB: Loop Through All the Tables In the Database Checking Each One
DECLARE list_cursor CURSOR FOR
SELECT [schema],
[table]
FROM #tablelist
OPEN list_cursor;
-- Perform the first fetch.
FETCH next FROM list_cursor INTO @Schema, @Table;
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
-- WWB: Clear the Stack (Don't you love Global Variables)
DELETE #stack
-- WWB: Initialize the Input
SET @Fail = ''
-- WWB: Check the Table
EXECUTE Sqlazurerecursivefind
@Schema,
@Table,
@Schema,
@Table,
@Fail output
IF ( Len(@Fail) > 0 )
BEGIN
-- WWB: Failed, Output
SET @Fail = @Schema + '.' + @Table + ' -> ' + @Fail
PRINT @Fail
END
-- This is executed as long as the previous fetch succeeds.
FETCH next FROM list_cursor INTO @Schema, @Table;
END
-- WWB: Clean Up
CLOSE list_cursor;
DEALLOCATE list_cursor;
DROP TABLE #tablerelationships
DROP TABLE #stack
DROP TABLE #tablelist
DROP PROC sqlazurerecursivefind
Check out my new book about R Language http://www.amazon.com/dp/B00SX6WA06
Nenhum comentário:
Postar um comentário
Leave your comment here!