หน้าแรก | กระดานข่าว |  ปฏิทิน |  ค้นหา |  ติดต่อเรา VMWARE.IN.TH | Largest unofficial VMware Thailand User
หน้า: [1]   ลงล่าง
  พิมพ์  
ผู้เขียน หัวข้อ: MS SQL : Large Query Performance from SQL Server 2000 to 2008, 32 & 64-bit  (อ่าน 485 ครั้ง)
vmware.in.th
"วีเอ็มแวร์ อิน ไทย" .. สังคมคน virtualization ออน์ไลน์
Administrator
Hero Member
*****
ออฟไลน์ ออฟไลน์

กระทู้: 1,251


pumpithu@gmail.com
ดูรายละเอียด เว็บไซต์ อีเมล์
I had been meaning to do a somewhat comprehensive review of SQL Server performance from versions 2000 to 2008 for both 32 and 64-bit on Data Warehouse type queries, with in depth examination of scaling in parallel execution plans.
 
For now, I can provide a short summary. The test platform is a Dell PowerEdge 2900 with 2 quad-core Xeon E5330 2.66GHz processors, and 24GB memory. The operating system is Windows Server 2008 64-bit for both 32 and 64-bit SQL Server versions. Technically SQL Server 2000 is not supported, but this is just a performance comparison, not a production environment. The database is generated using the TPC-H dbgen kit for scale factor 10, meaning the Lineitem table is approximately 10GB, and the entire database is approximately 17GB, which fits entirely in memory. There was some tempdb activity, which is spread across 10 15K drives.
 
All tests are run twice to load data into memory and pre-compile the execution plan for the second run. All results shown are for the second run. For SQL Server 2008, the tables use the new Date data type in place of Date Time, and queries are modified to avoid conversion anomalies as noted below. Below is the total (sum) CPU time in milli-seconds to execute the 22 queries in sequence  for max degree of parallelism: 1, 2, 4, and 8.
 
อ้างถึง
                        DOP 1         DOP 2         DOP4          DOP 8
2000 RTM             534,912    663,848    656,232    697,794
2000 bld 2187      514,881    589,245    657,543    770,272
2005 RTM 32        463,526    444,479    456,567    498,623
2005 SP2 32        464,478    403,668    413,685    452,134
2005 RTM 64        379,363    377,570    394,962    474,200
2005 SP2 64        370,206    327,149    345,155    436,491
2008 RTM             375,136    324,264    343,250    410,220

Duration in milli-seconds to run 22 queries by max DOP.
 
อ้างถึง
                        DOP 1         DOP 2         DOP4          DOP 8
2000 RTM             553,900    293,411    191,552    149,568
2000 bld 2187      566,333    276,085    188,497    164,677
2005 RTM 32        480,839    237,933    134,644    84,721
2005 SP2 32        483,842    214,804    119,525    72,515
2005 RTM 64        379,563    194,199    107,409    65,094
2005 SP2 64        370,374    166,579    94,844      59,388
2008 RTM             375,135    171,390    94,028      56,795

On SQL Server 2000 build 2187, notice that CPU increases from 514.8 to 589.2 seconds going from Max DOP 1 to 2 and so on to Max DOP 8. This is expected because there is overhead to employing a parallel execution plan, and the overhead increases with the number of threads involved. Between SQL Server 2000 RTM and build 2187, there was a sharp jump in the CPU required at Max DOP 8. I will disregard this as there were significant changes and code fixes between the two builds concerning correctness of parallel execution plan results. Still, there is an overall performance gain from Max DOP 4 to 8. Several years ago, I mentioned that SQL Server 2000 performance is very problematic beyond Max DOP 4. That was before multi-core processors, and there were at most, 4 cores per NUMA node. So the more correct interpretation is that SQL Server 2000 is very problematic on NUMA systems. An earlier look at SQL Server 2005 RTM showed no such problems on NUMA.
 
In SQL Server 2005, and 2008, there is actually a decrease in CPU going from Max DOP 1 to 2. This is mostly attributed to the bitmap filter in hash operations. Some queries show a significant drop in CPU from DOP 1 to 2, others no change, and some an increase. From DOP 2 to 4 there is a slight increase in CPU and more significant in going from DOP 4 to 8. This might indicate that DOP 2 and 4 are very good for overall efficiency, benefitting from bitmap filters in hash join operations, yet without incurring excessive parallelism overhead. (This is unrelated to the recommendation of Max DOP 4 on Itanium systems based on cores per NUMA node). Unrestricted parallelism on the 8 core system yields the best single stream completion times, although this should really be tested on 16 or more cores before setting any rules.
 
In the transition from SQL Server 2000 to 2005 RTM, both 32-bit, the duration performance gain is a modest 15% for non-parallel plans and a very substantial 49% at Max DOP 8. From SQL Server 2005 32-bit to 64-bit, both RTM builds, the performance gain was a solid 20%. The CPU efficiency improvement was a little less, so the tempdb configuration affects the results. Even though the entire data fits in memory, a large query with intermediate results is more likely to spool to tempdb in 32-bit than 64-bit. From SQL Server 2005 64-bit RTM to Service Pack 2, an additional 10% was realized at DOP 2 and higher.
 
SQL Server 2008 RTM is marginally better than SQL 2005 SP2. There is significant variation from query to query, so improvements should be expected over time hopefully to correct the query plans that are slower while maintaining the performance advantage of plans that are better. One of the big disasters in 2008 parallel execution plans occurs on Query 5, Local Supplier Volume. The query is:
 
อ้างถึง
SELECT N_NAME, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE
FROM CUSTOMER, ORDERS, LINEITEM, SUPPLIER, NATION, REGION
WHERE C_CUSTKEY = O_CUSTKEY AND L_ORDERKEY = O_ORDERKEY
 AND L_SUPPKEY = S_SUPPKEY AND C_NATIONKEY = S_NATIONKEY
 AND S_NATIONKEY = N_NATIONKEY AND N_REGIONKEY = R_REGIONKEY
 AND R_NAME = 'ASIA'
 AND O_ORDERDATE >= '1994-01-01'
 AND O_ORDERDATE < CONVERT(DATE,DATEADD(YY, 1, '1994-01-01'))
GROUP BY N_NAME
ORDER BY REVENUE DESC
 
The MaxDOP 1 plan is essentially:
 
SELECT N_NAME, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE
FROM SUPPLIER
INNER JOIN (
 SELECT N_NATIONKEY, N_NAME, L_EXTENDEDPRICE, L_DISCOUNT, L_SUPPKEY
 FROM NATION
 INNER JOIN REGION ON N_REGIONKEY = R_REGIONKEY
 INNER JOIN CUSTOMER ON C_NATIONKEY = N_NATIONKEY
 INNER JOIN ORDERS ON C_CUSTKEY = O_CUSTKEY
 INNER JOIN LINEITEM ON L_ORDERKEY = O_ORDERKEY
 WHERE R_NAME = 'ASIA'
  AND O_ORDERDATE >= '1994-01-01'
  AND O_ORDERDATE < CONVERT(DATE,DATEADD(YY, 1, '1994-01-01'))
 ) x ON L_SUPPKEY = S_SUPPKEY AND S_NATIONKEY = N_NATIONKEY
GROUP BY N_NAME ORDER BY REVENUE DESC
OPTION (FORCE ORDER)

At MaxDOP 1, the actual CPU is 21,965 ms for the original query,
the MaxDOP 2, the CPU is 28,721ms for the original.
MaxDOP 2 CPU for the forced query is 13,323.
So this one query added 15.4 CPU-sec to the total 22 query 324.3 CPU-sec,
close to 5%, and about 8.0sec duration.
Query 8 was also bad news on the parallel plans, with about 5 CPU-sec lost on the MaxDOP 2 parallel plan compared with a forced parallel plan modeled on the non-parallel plan. One might think that MS should have caught these anomalies. I think the reason they do not is that MS does not look at SF1-30 TPC-H results. The minimum for publication is 100GB, and that will probably increase to 300GB soon, because 30GB is not a real data warehouse. I do think MS should look very carefully at SF1-30. The queries are at the onset of eligibility for parallelism. The really big queries in SF100 and higher are less likely to encounter plan problems. While not strictly a data warehouse, most transactional databases I have seen do not remotely resemble TPC-C or E. I would say most have TPC-H SF1-10 sized queries mixed in with smaller transactions. So a bad execution plan can be really bad news.
 
I am sufficiently satisfied that SQL Server 2008 has a very powerful engine, and a decent optimizer. However, I have complained in the past about the rigid assumptions that all query costs factor in IO time, the use a fixed random to sequential IO performance model (320 IOPS to 10.5MB/sec) and an out of balance IO-CPU ratios. If a proper calibration of the true cost formulas were to be done, there would probably be fewer silly mistakes resulting in goofy execution plans. Given that many people do not know how to diagnose this type of problem, a simple test of 2000 or 2005 and 2008 can encounter this matter, leading to a decision to stay with 2000/2005, when a few simple adjustments would have corrected the 2008 results.
 
SQL Server Settings
อ้างถึง
Generally I follow the HP TPC-H publications on optimization settings, particularly -E and -T834. Neither changed results by more than 1% either way. I had also looked at -T2301 in the past finding no apparent differences. I really would like MS to provide more details on T2301. Are there set points below which it has no effect?


SQL Server 2008 new Date data type changes
The 3 datatime columns in the LineItem table from 2005 become Date columns, for an apparent savings of 12 bytes. The 2005 tpch SF10 database was 13.77GB (rather million KB) data and 3.68G indexes for a total of 17.46G. In 2008, using the Date data type in place of datetime, the size is 12.77 data and 2.96G index for a total of 15.74G. The average bytes per row of LineItem drops from 169 to 153, because one of the DateTime/Date columns was the cluster key.
Nornally, a simple reduction in size on column width, not row count, does not improve performance unless it impacts fit in memory. I always try to exclude this factor because one can generate any difference in performance by adjusting the amount of disk IO.
The original TPC-H queries may have SARG of the form
  AND O_ORDERDATE >= '1994-01-01'
  AND O_ORDERDATE < DATEADD(YY, 1, '1994-01-01'))
Even before SQL 2008, the date functions would return a datetime or smalldatetime result as appropriate. In SQL 2008, the nature extension is to return a date type when the comparison is a date column. I made this request in connect and was told to bugger off. So SQL 2008 will convert the column to date time to equate with the function, losing the benefit of a proper SARG. Anyone upgrading to SQL 2008 with the date type and not changing code as below may get a nasty suprise.
  AND O_ORDERDATE >= '1994-01-01'
  AND O_ORDERDATE < CONVERT(DATE,DATEADD(YY, 1, '1994-01-01'))
Little things like this can cause people to refuse to budge from SQL 2000, which really needs to be retired.
 
Duration for SQL 2008 64-bit
1P D   2P D   4P D   8P D
Q1   50,013   26,317   12,591   7,159
Q2   504   268   150   107
Q3   16,296   5,186   3,158   1,902
Q4   19,232   5,288   3,452   2,340
Q5   21,648   16,150   8,371   5,120
Q6   1,845   929   496   312
Q7   17,397   4,369   2,388   1,376
Q8   5,734   6,765   3,628   1,849
Q9   48,361   22,034   11,335   6,372
Q10   15,281   5,822   3,595   2,425
Q11   4,423   1,238   657   600
Q12   9,363   4,828   4,356   2,365
Q13   21,699   11,310   5,751   2,967
Q14   2,146   1,033   547   334
Q15   1,368   970   521   249
Q16   6,599   3,615   2,018   1,848
Q17   1,243   521   294   213
Q18   50,909   27,945   15,439   9,365
Q19   2,096   1,093   607   378
Q20   841   430   255   165
Q21   69,191   22,064   12,826   8,337
Q22   8,946   3,213   1,592   1,010
375,135   171,390   94,028   56,795

my apologies, Linchi post SQL 2005 64-bit results, so my duration results for SQL 2005 64-bit, SP2 (no cu) below

1P D   2P D   4P D   8P D
Q1   64,761   32,553   16,428   8,344
Q2   504   295   158   106
Q3   14,733   4,782   3,003   1,989
Q4   17,506   5,338   3,747   2,519
Q5   19,716   7,376   4,654   3,159
Q6   1,609   893   471   309
Q7   15,855   5,472   3,306   2,403
Q8   5,225   2,391   1,333   2,147
Q9   44,611   23,291   12,213   7,222
Q10   13,989   6,384   3,934   2,754
Q11   4,093   1,192   669   495
Q12   8,166   4,497   4,022   1,714
Q13   25,830   13,566   7,521   4,260
Q14   2,060   1,020   526   352
Q15   1,358   1,931   1,139   235
Q16   6,476   3,476   2,429   1,215
Q17   1,012   524   291   199
Q18   46,954   26,156   13,896   9,209
Q19   2,133   1,172   623   450
Q20   830   446   253   172
Q21   64,231   20,850   12,536   9,087
Q22   8,722   2,972   1,692   1,049
370,374   166,579   94,844   59,388
บันทึกการเข้า

Freelance implementer Microsoft, Linux System, Data Storage and Virtualization Technology.

Twitter : http://twitter.com/pumpithu
Facebook : http://www.facebook.com/pumpithu
 
หน้า: [1]   ขึ้นบน
  พิมพ์  
 
กระโดดไป:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.11 | SMF © 2006-2008, Simple Machines LLC | Thai language by ThaiSMF Valid XHTML 1.0! Valid CSS!