Tuning the Buffer Size for the data flow task in SSIS
The buffer size used is the smaller of 2 figures.
1) defaultbuffersize
2) defaultmaxbufferrows * row width
defaultbuffersize has a default of 10 MB (10485760 B)
You can change this to amaximum of 100 MB (104857600 B), which I have done and is the figure I have used here.
Calculating Row Width
I'm using a simple example of a single target table here.
The row width calculation gets messy if you try it from the table definition i.e. adding the sizes of columns (and remembering 2 bytes per character for unicode columns etc)
I chose to use the table size data and work out the average -
sp_spaceused 'dbo.bigtable'
name | rows | reserved | data | index_size | unused |
bigtable | 144451461 | 63064384 KB | 44740776 KB | 18317304 KB | 6304 KB |
So, data size in KB 44740776
data in Bytes = 44740776 * 1024 = 45814554624 B
Divide by row count ,
= 45814554624 / 144451461
317.16 Bytes/row
Row Width = 317 Bytes.
From the row width figure and the desire to put as much in the buffer as possible (to match the 100MB defaultbuffersize) we can calculate defaultmaxbufferrows.
Calculating DefaultMaxBufferRows
100MB buffer = 104857600 bytes
Row Width = 317 Bytes.
defaultmaxbufferrows = BufferSize / Row Width
= 104857600 / 317
= 330781 rows
So to ensure maximum buffer usage here we set -
defaultbuffersize = 104857600
defaultmaxbufferrows = 330781 (though I'd most likely round down to a the nearest hundred).
5 comments:
Good one and the Topic covers is informative.
data in Bytes = 44740776 / 1024 = 45814554624 B
small correction in above formula, its not division its multiplication.
Thank you Siva, have updated the post. :)
Good info, but I have a question: I create columns in the dataset (through derived columns task), do I also need to account for them when performing these calculations?
I woulod have said yes.
If anyone knows different, feel free to post.
Nice!
Tnx for sharing,
Lee
Post a Comment