These partition notes have been very quickly published to help out someone with a query that they have (right now!) so if you happen to stumble on this page before I have had change to format, tidy up and add anything then apologies but it won’t be like this for long! These notes were taken from the training video here :-http://technet.microsoft.com/en-us/sqlserver/gg545008.aspx
Why partition? ..or even why should you use Partition Views?
- To eliminate or reduce resource contention
- To vary access patterns
- If you you maintenance restrictions
- If you have availability requirements
- To remove resource blocking or to minimize maintenance costs
Using Partition Views does not automatically mean to use Distributed Partition Views.
Distributed Partitioned Views are a are great form of scale out.
Partitioned Tables is more of a Scale-Up manageability enhancement, Distributed Partition Views are a form of Scale-Out partitioning.
Why should we use Partitioned Tables?
If you have a single large table (for instance 50,60,70 GB can be classed as large).
If your table is difficult to manage.
Index create builds or rebuilds
backup restore and recovery
May have different access patterns
some data new OLTP
some data old for historical lookups – small singleton for OLTP loops or larger for analysis.
DOES NOT HAVE TO BE ONE TABLE
smaller for insert update and delete activity
speed/ durability of highest importance
should be highly available
usually smaller portion of VLDB
Backups should be more frequent
Larger range queries, more indexes
Query access patterns may conflict with write activity
Less critical data, backups less frequent
(me-data might be static also if separated will reduce index maintenance activity (don’t know whether b-tree splits often).
SQL 2000 or non Enterprise editions of SQL database was either ONLINE or not. When recovery begins the database is offline until it has completed.
SQL 2005+ for Enterprise
provides partial database availability and online piecemeal restore which give ability to have Database online and available for any portion of database that is not damaged.
Files also have states which impact the entire filegroup, above is ONLY true of secondary files -they have state independent of db. The filegroup in which a damaged file resides can be offline and or recovering while remainder of Database is available.
If primary file damaged then this doesnt apply
State of the Database IS based on the state of the primary filegroup and the log portion of the database so takes a bit of time to architect a Database implementation.
smaller table takes less time
think about the maintenance costs and think about transaction log usage for indexes.
Way you partition
is to functionally parition, think about the way you will break down the data.
do this based on access patterns, index defrag patterns and sliding window control.
selective Indexes on partitions. Kimberly says there are limitations exist on filtered indexes.
Gives you more backup/ restore choices.
File/ filegroup backups and restores
Can restore PAGE, FILE, FILGROUP or FULL from FULL backup
Can restore PAGE, FILE, FILGROUP from FILEGROUP backup
Can restore PAGE, FILE from FILE backup
It all improves availability!
Horizontal Range Partitioning history
SQL 7.0 came proportional fill for filegroups with more than one file. Because multiple extention allocation can be contended, kimberely recommends that 2 or more files be allocated for ReadWrite portions of database.
SQL 7.0 Queryable Partition Views – used partition elimination during optimization only for queries
SQL 2000 Updateable partitioned views -there are limitations and Kimberly Tripp generally doesnt use the views for mods. One such is the partioning column must be the leading column of the primary key. Base table of Partition View cannot have identity columns. Kimberly Tripp says for updates she generally dynamically generates DML statement directly against the particular base table…
SQL 2005 Partitioned tables and indexes – is a single defined table (therefore things like creating indexes is easier). Lots of benefits but still some BIG negatives.
Creating partition tables you are effectively creating boundaries and they wont have any gaps
Functionally breaking down data
Very large table with large amount of data
you could have escalation problems ← need to look at SQL escalation even in SQL 2008.
Index maintenance problems and sort problems
Backups and recover problems
Could partition or create table on years for instance and put view on tables to join up. Part tables is only available on Enterprise edition, Partition View avaliable in any and a max of 255 tables in one view.
KT likes a hybrid solution and use HP using view on tables which use partitions. e.g view on years and partitioning on months.
Partition Views are on any edition but partition tables are only on enterprise.
Part tables prior to SP2 can have 1000 parts, but 2008 SP2 allows 10,000
@32m00s she starts doing a Partition View demo
KT has got a datetime White Paper
KT says that when she is loading data in she usually uses a staging area. She also like to check the data by putting check constraints on the staging table however also suggests that it can be faster to disable the constraints, load the data and then enable and check.
@38m30s Talks about best practice of what to use as your clustered key which will support not only partitioned views but good for range queries and because it will cause ever increasing pattern will reduce fragmentation.
@40m00s Kimberly Tripp changes table to use filegroups and does HP demo ←go over again
Cannot create index on a partitioned view.
If you create wrong index on one of the base tables then when partition elimination occurs this could really mess up the query plan.
SQL also has to trust your constraints on your base tables otherwise partition elimination could go to multiple tables. There is an objectproperty ‘cnstisnottrusted’ to check and if ‘1’s then you have a possible problem such as constraints could be turned off or wrong.
Also to do an update on a partitioned view the Primary key needs to be UNIONed otherwise you will reieve an error suggesting its not. So Kimberly Tripp is saying that this kind of situation can be dynamically handled so that updates are applied directly against the base table.
@50m00s Partitioned Table DEMO
Her staging area is cheaper and slower and not on redundant drives.
Partition function and partition scheme are seperate objects which can be reused for other tables.
@55m10s she discusses that most people like to have RANGE RIGHT and the first partition empty because otherwise if it is not and you slide out the partition at this stage you are not able to merge the boundary point until the next time you slide out the next one. If the first is empty then the situation is the same from the start.
The partition function simply defines the boundary points but doesnt physically map to anything
The partition scheme is the thing that binds the boundaries to their physical locations on disk.
←so I wonder whether you should really reused the schemes.
The scheme references the function
<need to fill in this bit>
This is all joined up by creating a primary clustered key on the partition scheme.
-is clustered key required?
-do the cluster key need to be related at all to the partition scheme column/s see that in Kimberly Tripp example that it is.
@1h01m37s If the table doesnt have any LOB columns then the partitioning operation can be performed online whilst users are updating, deleting and inserting data -very cool and there is a white paper detailing how ONLINE operations work.
@1h03m Can determine which partition is going to be queried by passing a value into $partition.<partitionfunction> and this will return the partition number. Basically you can do a query on all the data in a partition and aggregate counts and use this function to show how many rows in a partition. If a partition is empty then obviously this would not be displayed in the aggregation, the first partition though is ‘0’
In the properties of the query plan index access method if you do a properties you can see how many partitions access and if you do an F4 and select the object you can see what partition id was accessed.
So using table partitions is performing its own partition elimination.
Remember when you structurally change the database, ensure that you perform a FULL backup.
@1h06m28s Very good slide on comparing and contrasting Partition Views v PTs.
KT says that in order to really leverage Partition Views then you should be in enterprise edition since they are particulary useful for partial database availability or piecemeal restore.
KT says that Partition Views are great for table stats (since) the tables are smaller whereas PTs table stats can be less accurate since they can be very large and when there is a lot of skew to the data.
Another reason is that SQL server does not currently support ONLINE Index REBUILDs at the Partition level wheras in Partition View this is not going to be as big an issue to you! However partition level rebuilds are possible online for PTs as long as you rebuild the ENTIRE table!
@1h10m31s On Partition Views can do multiple constraints over one or more columns and essentially do part elimination over multiple values. ← need to try and understand this.
@1h11m15s When and why slide.
Essentially Kimberly Tripp says to use both together and for the Partition View table update decision she prefers application decision making.
NEED to go over this section again.
Saw a tweet saying that Partition merge is an offline operation and apparently is still the same in Denali. Must look into this.