MCM Training Video #17 – Partitioning

These partition notes have been very quickly published to help out someone with a query that they have (right now!) so if you happen to stumble on this page before I have had change to format, tidy up and add anything then apologies but it won’t be like this for long! These notes were taken from the training video here :-http://technet.microsoft.com/en-us/sqlserver/gg545008.aspx

Why partition? ..or even why should you use Partition Views?

  • To eliminate or reduce resource contention
  • To vary access patterns
  • If you you maintenance restrictions
  • If you have availability requirements
  • To remove resource blocking or to minimize maintenance costs

Using Partition Views does not automatically mean to use Distributed Partition Views.

@4m10s
Distributed Partitioned Views are a are great form of scale out.
Partitioned Tables is more of a Scale-Up manageability enhancement, Distributed Partition Views are a form of Scale-Out partitioning.

Why should we use Partitioned Tables?
If you have a single large table (for instance 50,60,70 GB can be classed as large).
If your table is difficult to manage.
Index create builds or rebuilds
backup restore and recovery
escalation
May have different access patterns
some data new OLTP
some data old for historical lookups – small singleton for OLTP loops or larger for analysis.
DOES NOT HAVE TO BE ONE TABLE

Transaction Processing
smaller for insert update and delete activity
speed/ durability of highest importance
should be highly available
usually smaller portion of VLDB
Backups should be more frequent
Decision Support
Larger range queries, more indexes
Query access patterns may conflict with write activity
Less critical data, backups less frequent
(me-data might be static also if separated will reduce index maintenance activity (don’t know whether b-tree splits often).

SQL 2000 or non Enterprise editions of SQL database was either ONLINE or not. When recovery begins the database is offline until it has completed.
SQL 2005+ for Enterprise
provides partial database availability and online piecemeal restore which give ability to have Database online and available for any portion of database that is not damaged.
Files also have states which impact the entire filegroup, above is ONLY true of secondary files -they have state independent of db. The filegroup in which a damaged file resides can be offline and or recovering while remainder of Database is available.
If primary file damaged then this doesnt apply
State of the Database IS based on the state of the primary filegroup and the log portion of the database so takes a bit of time to architect a Database implementation.

Index creation
smaller table takes less time
think about the maintenance costs and think about transaction log usage for indexes.

Way you partition
is to functionally parition, think about the way you will break down the data.
do this based on access patterns, index defrag patterns and sliding window control.
selective Indexes on partitions. Kimberly says there are limitations exist on filtered indexes.

Gives you more backup/ restore choices.
File/ filegroup backups and restores
Can restore PAGE, FILE, FILGROUP or FULL from FULL backup
Can restore PAGE, FILE, FILGROUP from FILEGROUP backup
Can restore PAGE, FILE from FILE backup

It all improves availability!

Horizontal Range Partitioning history
SQL 7.0 came proportional fill for filegroups with more than one file. Because multiple extention allocation can be contended, kimberely recommends that 2 or more files be allocated for ReadWrite portions of database.
SQL 7.0 Queryable Partition Views  – used partition elimination during optimization only for queries
SQL 2000 Updateable partitioned views -there are limitations and Kimberly Tripp generally doesnt use the views for mods. One such is the partioning column must be the leading column of the primary key. Base table of Partition View cannot have identity columns. Kimberly Tripp says for updates she generally dynamically generates DML statement directly against the particular base table…
SQL 2005 Partitioned tables and indexes – is a single defined table (therefore things like creating indexes is easier). Lots of benefits but still some BIG negatives.
@25m12s
Creating partition tables you are effectively creating boundaries and they wont have any gaps

Functionally breaking down data
Very large table with large amount of data

you could have escalation problems ← need to look at SQL escalation even in SQL 2008.

Index maintenance problems and sort problems

Backups and recover problems

Archiving?

Could partition or create table on years for instance and put view on tables to join up. Part tables is only available on Enterprise edition, Partition View avaliable in any and a max of 255 tables in one view.
KT likes a hybrid solution and use HP using view on tables which use partitions. e.g view on years and partitioning on months.
@31m00s
Partition Views are on any edition but partition tables are only on enterprise.
Part tables prior to SP2 can have 1000 parts, but 2008 SP2 allows 10,000
@32m00s she starts doing a Partition View demo
KT has got a datetime White Paper
KT says that when she is loading data in she usually uses a staging area. She also like to check the data by putting check constraints on the staging table however also suggests that it can be faster to disable the constraints, load the data and then enable and check.
@38m30s Talks about best practice of what to use as your clustered key which will support not only partitioned views but good for range queries and because it will cause ever increasing pattern will reduce fragmentation.
@40m00s Kimberly Tripp changes table to use filegroups and does HP demo ←go over again
Cannot create index on a partitioned view.
If you create wrong index on one of the base tables then when partition elimination occurs this could really mess up the query plan.
SQL also has to trust your constraints on your base tables otherwise partition elimination could go to multiple tables. There is an objectproperty ‘cnstisnottrusted’ to check and if ‘1’s then you have a possible problem such as constraints could be turned off or wrong.
Also to do an update on a partitioned view the Primary key needs to be UNIONed otherwise you will reieve an error suggesting its not. So Kimberly Tripp is saying that this kind of situation can be dynamically handled so that updates are applied directly against the base table.
@50m00s Partitioned Table DEMO
Her staging area is cheaper and slower and not on redundant drives.

Partition function and partition scheme are seperate objects which can be reused for other tables.
@55m10s she discusses that most people like to have RANGE RIGHT and the first partition empty because otherwise if it is not and you slide out the partition at this stage you are not able to merge the boundary point until the next time you slide out the next one. If the first is empty then the situation is the same from the start.

The partition function simply defines the boundary points but doesnt physically map to anything
The partition scheme is the thing that binds the boundaries to their physical locations on disk.
←so I wonder whether you should really reused the schemes.
The scheme references the function

@57m50s
<need to fill in this bit>
This is all joined up by creating a primary clustered key on the partition scheme.
-is clustered key required?
-do the cluster key need to be related at all to the partition scheme column/s see that in Kimberly Tripp example that it is.

@1h01m37s If the table doesnt have any LOB columns then the partitioning operation can be performed online whilst users are updating, deleting and inserting data -very cool and there is a white paper detailing how ONLINE operations work.

@1h03m Can determine which partition is going to be queried by passing a value into $partition.<partitionfunction> and this will return the partition number. Basically you can do a query on all the data in a partition and aggregate counts and use this function to show how many rows in a partition. If a partition is empty then obviously this would not be displayed in the aggregation, the first partition though is ‘0’
In the properties of the query plan index access method if you do a properties you can see how many partitions access and if you do an F4 and select the object you can see what partition id was accessed.

So using table partitions is performing its own partition elimination.

Remember when you structurally change the database, ensure that you perform a FULL backup.

@1h06m28s Very good slide on comparing and contrasting Partition Views v PTs.
KT says that in order to really leverage Partition Views then you should be in enterprise edition since they are particulary useful for partial database availability or piecemeal restore.
KT says that Partition Views are great for table stats (since) the tables are smaller whereas PTs table stats can be less accurate since they can be very large and when there is a lot of skew to the data.
Another reason is that SQL server does not currently support ONLINE Index REBUILDs at the Partition level wheras in Partition View this is not going to be as big an issue to you! However partition level rebuilds are possible online for PTs as long as you rebuild the ENTIRE table!

@1h10m31s On Partition Views can do multiple constraints over one or more columns and essentially do part elimination over multiple values. ← need to try and understand this.

@1h11m15s When and why slide.
Essentially Kimberly Tripp says to use both together and for the Partition View table update decision she prefers application decision making.
NEED to go over this section again.

Foot Notes
Saw a tweet saying that Partition merge is an offline operation and apparently is still the same in Denali. Must look into this.

4 Responses to MCM Training Video #17 – Partitioning

  1. Hello, and thanks for making this available.
    One point that I did not see noted in your notes: Partitioned views can direct Inserts and Updates on the Partitioned View to the relevant base table, however this will only work in SQL Server 2000 Enterprise: “Note You can modify data through a partitioned view only if you install Microsoft SQL Server 2000 Enterprise Edition or Microsoft. SQL Server 2000 Developer Edition.”

    http://msdn.microsoft.com/en-us/library/aa933141%28v=sql.80%29.aspx

    That is a little frustrating, but in my application I can target the base tables for inserts.

    As a developer, my view for my current application is that I should implement a partitioned view.
    1/ Easier licensing for my customers
    2/ The management overhead seems similar to me – PVs require creation of separate tables and indexes, and in my case this would be managed by my application code. PTs require definition of Partition Schema and Partition Function and management when you need to add new partitions to the schema. I’d expect to have to update the Partition Schema quite commonly in a deployment of our application.
    3/ I can code around the lack of data modification operations on the PV

    So, all else being equal, I’ll go with PVs to avoid a tie-in to Enterprise Edition. Let me know if I’ve missed the point of PTs!

    • retracement says:

      Hi Steven, firstly you are most welcome and glad it was of some help. Very good spot by you regarding PVs in SQL 2000, I would have known about this at one point but over the sands of time … :)
      The good news for you is that this restriction of edition for updatable partitioned views has been lifted http://msdn.microsoft.com/en-us/library/ms187067.aspx and only exists in distributed partitioned views. Yes you have really hit the nail on the head, although PTs can be very useful, their use is definitely restricting you customer base.

      Please be aware that if you do ever go for distributed partitioned views there are other considerations that need to be taken into account due to the way in which they work across linked servers.

  2. Neil Hambly (@neil_hambly) mentioned the Partitioned View option to me after his talk in Southampton.

    I just need to get some release work out of the way before I jump in and get going with this idea. I’m currently worrying that the composite PK on my table might get in the way. PK is 4 colomns, but I want to partition so that each unique value of one of the columns has it’s own base table. I’ll let you know how I get one :-)

    • retracement says:

      Thanks for dropping by again Steven, obviously I dont necessarily know the use case as the reason why you want partitioned view or partitioned tables in the first place, but depending upon what that is there might be an alternative? If you do want me to take a look at want you are trying to achieve and give my suggestion then feel free to drop me an email. Either send me one through the hire page (I wont charge :) – or send me a tweet and I’ll DM you my email.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s