- Don’t normalize that data
- Intro to DB Compression
- Normalization is not the answer
- Stop wasting your time with normalization
- Traffic tracking for the rest of us
- How I learned to stop worrying and learn to love data compression
- My cat’s data is not normal, and neither is yours
Resources
- https://github.com/jmohler1970/DB_Compression
- https://stackoverflow.com/questions/654921/how-big-can-a-user-agent-string-get
- https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/enable-compression-on-a-table-or-index?view=sql-server-2017#TsqlProcedure
- https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/data-compression?view=sql-server-2017
- https://docs.microsoft.com/en-us/sql/relational-databases/views/create-indexed-views?view=sql-server-2017
Good approach! One thing to note about using data compression in SQL Server, if your table does not have a clustered index on it, compression may not behave like you would expect.
When compressing a Heap (a table w/o clustered index), SQL Server will only compress the data that is currently there, and not compress any future incoming data. You will have to periodically rebuild the Heap by removing and reapplying compression. This is not the case with a clustered index table, once you apply compression it will continue to compress new incoming data.
Also, adding or removing compression to a Heap forces all nonclustered indexes on the table to also be rebuilt behind the scenes so that they have pointers to the new row locations in the Heap. However, on tables with clustered indexes the table and nonclustered indexes can be compressed and rebuilt independently from each other.
sp_estimate_data_compression_savings just provides an estimate of what the table size would be if you applied the specified compression level. (typically “NONE”,”PAGE”, or “ROW”)
You can run the below query to determine whether or not compression is enabled on a table or index, and whether the object is clustered, nonclustered, or a heap. How it “works” is dependent upon whether the table is clustered or a heap. That’s why I posted my original comment, because MS isn’t always clear on what’s going on and it’s hard to know that the same compression settings works differently on each.
SELECT DISTINCT SCHEMA_NAME(o.schema_id) + ‘.’ + OBJECT_NAME(o.object_id) AS TableName,
i.name AS IndexName, p.data_compression_desc AS CompressionType, i.type_desc AS StorageType
FROM sys.partitions p
INNER JOIN sys.objects o ON p.object_id = o.object_id
JOIN sys.indexes i ON p.object_id = i.object_id AND i.index_id = p.index_id
WHERE p.data_compression > 0
AND SCHEMA_NAME(o.schema_id) <> ‘SYS’
I think you are right on this one.
I have updated Github to have an index on dbo.Traffic.
https://github.com/jmohler1970/DB_Compression
Thanks for the feed back
You must be logged in to post a comment.