TSQLMaster: Fighting Database Circular References

So one of the problems I came across Database Administration that is hard to discover and even harder to avoid is Circular Reference.

A Circular Reference happens when you create foreign key references between objects, creating "circle references". This can happen on one table that references its own Primary Key, which is not unusual when you need to create a hierarchy type of relation.

But it can get weird, when you have a table1 that references table2 and another fk from table2 to table1. So you already have a reference between the two tables, why you need another? This is a messy mistake that generates redundant and innacurate information, creating huge mistakes on reports and audits.

A harder level of circular reference is when you create a circular reference between three or four tables. This will lead to so many levels of redundancy and creates infinite paths to retrieve information. Finally, no one knows the right path to the actual real data.

And don't bother asking the devs: chances are the original developer is working on another project in another company or he probably doesn't know what he did wrong or what is the right path to it.

On CodeProject.com, there is this great article on Circular Reference, writen by Pham Dinh Truong. Its full of pictures, quite detailed and simple enough so a baby can understand it.

But, let's move to the harder part: locating these circular references. I wrote a few scripts to help me out on that, but as it turns out it can be a hassle to collect this information from the system views. One script I came up is this:

SELECT Object_schema_name(fk1.parent_object_id)
       + '.' + Object_name(fk1.parent_object_id),
       Object_schema_name(fk2.parent_object_id)
       + '.' + Object_name(fk2.parent_object_id)
--OBJECT_SCHEMA_NAME(fk3.parent_object_id)
-- + '.' + OBJECT_NAME(fk3.parent_object_id)
FROM   sys.foreign_keys AS fk1
       JOIN sys.foreign_keys AS fk2
         ON fk1.parent_object_id = fk2.referenced_object_id
            AND fk2.parent_object_id = fk1.referenced_object_id

It's accurate and it works only for two leves (table1, table2 stuff), but you can have some duplicates rows (table1>table2 table2>table1). I tried writing more scripts, but I just found another great reference. It was writen by Amir Mirkamali on his blog, this is the original post. I'm republishing the script here, hopefully the author won't mind =)

In case you are intending on confronting developers that built c.r. masterpieces, be prepared for war. Really.

SET nocount ON

-- WWB: Create a Temp Table Of All Relationship To Improve Overall Performance
CREATE TABLE #tablerelationships
  (
     fk_schema NVARCHAR(max),
     fk_table  NVARCHAR(max),
     pk_schema NVARCHAR(max),
     pk_table  NVARCHAR(max)
  )

-- WWB: Create a List Of All Tables To Check
CREATE TABLE #tablelist
  (
     [schema] NVARCHAR(max),
     [table]  NVARCHAR(max)
  )

-- WWB: Fill the Table List
INSERT INTO #tablelist
            ([table],
             [schema])
SELECT table_name,
       table_schema
FROM   information_schema.tables
WHERE  table_type = 'BASE TABLE'

-- WWB: Fill the RelationShip Temp Table
INSERT INTO #tablerelationships
            (fk_schema,
             fk_table,
             pk_schema,
             pk_table)
SELECT FK.table_schema,
       FK.table_name,
       PK.table_schema,
       PK.table_name
FROM   information_schema.referential_constraints C
       INNER JOIN information_schema.table_constraints FK
               ON C.constraint_name = FK.constraint_name
       INNER JOIN information_schema.table_constraints PK
               ON C.unique_constraint_name = PK.constraint_name
       INNER JOIN information_schema.key_column_usage CU
               ON C.constraint_name = CU.constraint_name
       INNER JOIN (SELECT i1.table_name,
                          i2.column_name
                   FROM   information_schema.table_constraints i1
                          INNER JOIN information_schema.key_column_usage i2
                                  ON i1.constraint_name = i2.constraint_name
                   WHERE  i1.constraint_type = 'PRIMARY KEY') PT
               ON PT.table_name = PK.table_name

CREATE TABLE #stack
  (
     [schema] NVARCHAR(max),
     [table]  NVARCHAR(max)
  )

go

-- WWB: Drop SqlAzureRecursiveFind
IF EXISTS (SELECT *
           FROM   sys.objects
           WHERE  object_id = Object_id(N'[dbo].[SqlAzureRecursiveFind]')
                  AND type IN ( N'P', N'PC' ))
  DROP PROCEDURE [dbo].[SqlAzureRecursiveFind]

go

-- WWB: Create a Stored Procedure that Recursively Calls Itself
CREATE PROC Sqlazurerecursivefind @BaseSchmea NVARCHAR(max),
                                  @BaseTable  NVARCHAR(max),
                                  @Schmea     NVARCHAR(max),
                                  @Table      NVARCHAR(max),
                                  @Fail       NVARCHAR(max) output
AS
    SET nocount ON

    -- WWB: Keep Track Of the Schema and Tables We Have Checked
    -- Prevents Looping
    INSERT INTO #stack
                ([schema],
                 [table])
    VALUES      (@Schmea,
                 @Table)

    DECLARE @RelatedSchema NVARCHAR(max)
    DECLARE @RelatedTable NVARCHAR(max)
    -- WWB: Select all tables that the input table is dependent on
    DECLARE table_cursor CURSOR local FOR
      SELECT pk_schema,
             pk_table
      FROM   #tablerelationships
      WHERE  fk_schema = @Schmea
             AND fk_table = @Table

    OPEN table_cursor;

    -- Perform the first fetch.
    FETCH next FROM table_cursor INTO @RelatedSchema, @RelatedTable;

    -- Check @@FETCH_STATUS to see if there are any more rows to fetch.
    WHILE @@FETCH_STATUS = 0
      BEGIN
          -- WWB: If We have Recurred To Where We Start This
          -- Is a Circular Reference
          -- Begin failing out of the recursions
          IF ( @BaseSchmea = @RelatedSchema
               AND @BaseTable = @RelatedTable )
            BEGIN
                SET @Fail = @RelatedSchema + '.' + @RelatedTable

                RETURN
            END
          ELSE
            BEGIN
                DECLARE @Count INT

                -- WWB: Check to make sure that the dependencies are not in the stack
                -- If they are we don't need to go down this branch
                SELECT @Count = Count(1)
                FROM   #stack
                WHERE  #stack.[schema] = @RelatedSchema
                       AND #stack.[table] = @RelatedTable

                IF ( @Count = 0 )
                  BEGIN
                      -- WWB: Recurse
                      EXECUTE Sqlazurerecursivefind
                        @BaseSchmea,
                        @BaseTable,
                        @RelatedSchema,
                        @RelatedTable,
                        @Fail output

                      IF ( Len(@Fail) > 0 )
                        BEGIN
                            -- WWB: If the Call Fails, Build the Output Up
                            SET @Fail = @RelatedSchema + '.' + @RelatedTable +
                                        ' -> '
                                        + @Fail

                            RETURN
                        END
                  END
            END

          -- This is executed as long as the previous fetch succeeds.
          FETCH next FROM table_cursor INTO @RelatedSchema, @RelatedTable;
      END

    CLOSE table_cursor;

    DEALLOCATE table_cursor;

go

SET nocount ON

DECLARE @Schema NVARCHAR(max)
DECLARE @Table NVARCHAR(max)
DECLARE @Fail NVARCHAR(max)
-- WWB: Loop Through All the Tables In the Database Checking Each One
DECLARE list_cursor CURSOR FOR
  SELECT [schema],
         [table]
  FROM   #tablelist

OPEN list_cursor;

-- Perform the first fetch.
FETCH next FROM list_cursor INTO @Schema, @Table;

-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
  BEGIN
      -- WWB: Clear the Stack (Don't you love Global Variables)
      DELETE #stack

      -- WWB: Initialize the Input
      SET @Fail = ''

      -- WWB: Check the Table
      EXECUTE Sqlazurerecursivefind
        @Schema,
        @Table,
        @Schema,
        @Table,
        @Fail output

      IF ( Len(@Fail) > 0 )
        BEGIN
            -- WWB: Failed, Output
            SET @Fail = @Schema + '.' + @Table + ' -> ' + @Fail

            PRINT @Fail
        END

      -- This is executed as long as the previous fetch succeeds.
      FETCH next FROM list_cursor INTO @Schema, @Table;
  END

-- WWB: Clean Up
CLOSE list_cursor;

DEALLOCATE list_cursor;

DROP TABLE #tablerelationships
DROP TABLE #stack
DROP TABLE #tablelist
DROP PROC sqlazurerecursivefind

Check out my new book about R Language http://www.amazon.com/dp/B00SX6WA06

TSQLMaster

sexta-feira, 1 de abril de 2016

Fighting Database Circular References

Nenhum comentário:

Postar um comentário