Diamond Notes

Do You Need a SANdwich?? Part II

So, drumroll…here are the results of testing:

In the quantitative testing I tested against an internal hard drive (formatted with ext3 and then reiserfs), a partition on the Coraid formatted with ext3 and a partition on the Coraid formatted with reiserfs. I also tested with a failed hard drive in the Coraid (while it was rebuilding).

/dev/sda reiserfs (mirrored)

/testing (sdb) reiserfs write (Kbytes/s) /testing (sdb) reiserfs read (Kbytes/s)
17408 64646
17118 64368
17140 64339
16347 64451
17080 64510
17055 64464
17024.67 (AVG) 64463 (AVG)
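
Each (AVG) row in these tables is the plain arithmetic mean of the six runs listed above it. As a quick sanity check, here is a minimal Python sketch that recomputes the averages for the mirrored reiserfs table. The variable names are my own and not part of the test setup.

```python
from statistics import mean

# Six runs from the mirrored reiserfs table above, in Kbytes/s.
write_runs = [17408, 17118, 17140, 16347, 17080, 17055]
read_runs = [64646, 64368, 64339, 64451, 64510, 64464]

# Round to two decimals, the same precision used in the (AVG) rows.
write_avg = round(mean(write_runs), 2)
read_avg = round(mean(read_runs), 2)

print(f"write avg: {write_avg} Kbytes/s")
print(f"read avg:  {read_avg} Kbytes/s")
```

The same two lines of arithmetic apply to every throughput table in this post; only the run lists change.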

/dev/sdb1 ext3

/testing (sdb) ext3 write (Kbytes/s) /testing (sdb) ext3 read (Kbytes/s)
41196 54154
40455 54078
40612 54246
40620 54229
41053 54160
40198 54217
40689 (AVG) 54180.67 (AVG)

/dev/sdb1 reiserfs

/testing (sdb) reiserfs write (Kbytes/s) /testing (sdb) reiserfs read (Kbytes/s)
37041 34897
36559 34650
37201 34823
36867 34487
36420 34835
35064 34788
36525.33 (AVG) 34746.67 (AVG)

/dev/sdb1 reiserfs (noatime,notail)

/testing (sdb) reiserfs write (Kbytes/s) /testing (sdb) reiserfs read (Kbytes/s)
37116 35243
36916 35246
36587 35051
36467 35058
37102 34999
36690 35221
36813 (AVG) 35136.33 (AVG)

Coraid w/ext3 filesystem, switch and no jumbo frames

/data – ext3 write (Kbytes/s) /data – ext3 read (Kbytes/s)
50266 59311
46901 77020
46478 76883
48248 76896
49829 76969
49381 74925
48517.17 (AVG) 73667.33 (AVG)

Coraid w/reiserfs filesystem, switch and no jumbo frames

/data2 – reiserfs write (Kbytes/s) /data2 – reiserfs read (Kbytes/s)
59311 73519
58776 74742
52935 73526
59458 75665
56600 75375
60726 75814
57967.67 (AVG) 74773.5 (AVG)

Coraid w/reiserfs filesystem, no switch and jumbo frames

Jumbo Frames Direct Connect
/data2 – reiserfs write (Kbytes/s) /data2 – reiserfs read (Kbytes/s)
63894 99924
62547 96795

Coraid w/reiserfs filesystem, switch and jumbo frames

Jumbo Frames Switch Connect
/data2 – reiserfs write (Kbytes/s) /data2 – reiserfs read (Kbytes/s)
62328 95355
64189 98888
60961 96272
63776 95773
64258 97866
62483 98469
62999.17 (AVG) 97103.83 (AVG)

Coraid w/ext3 filesystem, switch and jumbo frames

Jumbo Frames Switch Connect
/data - ext3 write (Kbytes/s) /data - ext3 read (Kbytes/s)
58595 101549
58145 98725
58344 100478
59689 102201
59244 101434
63951 99432
59661.33 (AVG) 100636.5 (AVG)

Coraid w/ext3 filesystem, switch, jumbo frames and degraded raid

Jumbo Frames Switch Connect
/data - ext3 write (Kbytes/s) /data - ext3 read (Kbytes/s)
12102 23846
27402 24224
3916 28248
15560 45066
38592 53528
30365 30616
21322.83 (AVG) 34254.67 (AVG)

Coraid w/reiserfs filesystem, switch, jumbo frames and degraded raid

Jumbo Frames Switch Connect
/data2 - reiserfs write (Kbytes/s) /data2 - reiserfs read (Kbytes/s)
38317 42851
42463 43326
31546 51028
32177 24707
33981 26320
34643 27002
35521.17 (AVG) 35872.33 (AVG)

Coraid w/ocfs2 filesystem, switch and jumbo frames

Jumbo Frames Switch Connect
/data3 - ocfs2 write (Kbytes/s) /data3 - ocfs2 read (Kbytes/s)
36849 83958
40973 84855
38983 83853
40748 85188
39143 83710
40535 83146
39538.5 (AVG) 84118.33 (AVG)
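
To put a number on what jumbo frames did for reads, the switch-connected runs above can be compared directly. This is a rough sketch in Python using the read columns from the four switch-connected ext3 and reiserfs tables (with and without jumbo frames). The dictionary layout and the `jumbo_gain` helper are my own naming, not part of the test tooling.

```python
from statistics import mean

# Read throughput runs in Kbytes/s, copied from the switch-connected tables above.
reads = {
    "ext3": {
        "standard": [59311, 77020, 76883, 76896, 76969, 74925],
        "jumbo": [101549, 98725, 100478, 102201, 101434, 99432],
    },
    "reiserfs": {
        "standard": [73519, 74742, 73526, 75665, 75375, 75814],
        "jumbo": [95355, 98888, 96272, 95773, 97866, 98469],
    },
}


def jumbo_gain(runs):
    """Return (absolute gain in Kbytes/s, relative gain in percent)."""
    before = mean(runs["standard"])
    after = mean(runs["jumbo"])
    return after - before, 100 * (after - before) / before


gains = {fs: jumbo_gain(runs) for fs, runs in reads.items()}
for fs, (kb_per_s, pct) in gains.items():
    print(f"{fs}: +{kb_per_s:.0f} Kbytes/s ({pct:.1f}% faster reads)")
```

Swapping in the write columns gives the corresponding write-side comparison.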

Qualitative Testing Results

For the qualitative testing I loaded up a stock binary install of MySQL 5.0. The database (icengine3_2) was loaded from a dump of a production server. I used mybench to execute a query against the database, pulling random records from one of the tables. The number of connections used is included below with the summary data for each test run.

To establish a baseline I ran the test against the MySQL server pulling data off the internal drives (mirrored reiserfs) and off a single drive dedicated only to MySQL data. Then I ran the tests against data loaded onto the ext3 and reiserfs Coraid partitions, with nothing else happening on the Coraid. This would be similar to a single server attached to the Coraid.

Next I ran the test against both the ext3 and reiserfs partitions on the Coraid while a throughput test was running on the partition not holding the MySQL data. For example, if the ext3 partition was holding the data for the test, then the reiserfs partition of the Coraid had a throughput test (both read and write) running at the same time. This more closely simulates multiple servers attached to different partitions of the Coraid. Finally, these tests were repeated while one drive of the Coraid was being rebuilt.
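
For anyone unfamiliar with how a queries-per-second figure at N connections comes about, here is a small Python sketch of the general idea. It is not mybench and not what I ran; `measure_qps` and `run_query` are hypothetical names, and the dummy query at the bottom stands in for a real database call.

```python
import threading
import time


def measure_qps(run_query, connections, duration_s=1.0):
    """Call run_query from `connections` threads for roughly duration_s
    seconds and return the combined queries per second."""
    counts = [0] * connections  # one counter per thread, so no locking is needed
    deadline = time.monotonic() + duration_s

    def worker(slot):
        while time.monotonic() < deadline:
            run_query()
            counts[slot] += 1

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(connections)
    ]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.monotonic() - start
    return sum(counts) / elapsed


if __name__ == "__main__":
    # Dummy "query" that only sleeps; a real test would hit the database here.
    def fake_query():
        time.sleep(0.001)

    for n in (10, 20, 40):
        print(f"{n} connections: {measure_qps(fake_query, n, 0.2):.0f} QPS")
```

Each row in the tables below is one such measurement, and each column is a different connection count.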

/dev/sda drive w/reiserfs (mirrored)

10 connections 20 connections 40 connections 100 connections 400 connections
2429 2862 2736 2636 2417
2997 2874 2747 2631 2416
3081 2851 2758 2634 2414
3069 2841 2759 2646 2413
3104 2869 2750 2640 2417
3085 2862 2773 2631 2415
3097 2853 2771 2632 2410
3068 2855 2755 2629 2414
3078 2858 2767 2621 2412
3113 2842 2753 2641 2420
3012 (AVG QPS) 2856 (AVG QPS) 2756 (AVG QPS) 2634 (AVG QPS) 2414 (AVG QPS)

/dev/sdb1 drive w/reiserfs (noatime, notail)

10 connections 20 connections 40 connections 100 connections 400 connections
2236 2876 2773 2641 2416
2944 2886 2776 2640 2417
3097 2872 2764 2644 2428
3085 2874 2772 2639 2420
3110 2862 2755 2636 2422
3117 2883 2780 2647 2425
3092 2859 2773 2639 2422
3101 2880 2783 2644 2418
3143 2837 2755 2641 2427
3147 2861 2762 2640 2418
3007 (AVG QPS) 2869 (AVG QPS) 2769 (AVG QPS) 2641 (AVG QPS) 2421 (AVG QPS)

/dev/sdb1 drive w/ext3

10 connections 20 connections 40 connections 100 connections 400 connections
2294 2840 2758 2636 2422
2986 2853 2750 2633 2413
3059 2857 2742 2634 2415
3089 2867 2769 2629 2418
3064 2852 2747 2629 2416
3080 2854 2765 2633 2416
3083 2848 2751 2626 2417
3058 2852 2740 2617 2409
3043 2841 2754 2640 2416
3096 2894 2747 2633 2418
2985 (AVG QPS) 2855 (AVG QPS) 2752 (AVG QPS) 2631 (AVG QPS) 2416 (AVG QPS)

Coraid w/reiserfs filesystem, switch, jumbo frames

10 connections 20 connections 40 connections 100 connections 400 connections
3170 2873 2754 2625 2410
3183 2862 2769 2630 2413
3169 2870 2759 2608 2410
3143 2867 2751 2621 2410
3165 2881 2739 2622 2411
3172 2900 2746 2621 2420
3195 2908 2758 2628 2436
3176 2923 2760 2631 2438
3223 2899 2761 2630 2439
3189 2920 2754 2625 2436
3178 (AVG QPS) 2890 (AVG QPS) 2755 (AVG QPS) 2624 (AVG QPS) 2422 (AVG QPS)

Coraid w/ext3 filesystem, switch, jumbo frames

10 connections 20 connections 40 connections 100 connections 400 connections
3061 2849 2714 2572 2361
3070 2812 2715 2573 2364
3051 2824 2711 2577 2363
3060 2831 2695 2576 2367
3026 2819 2713 2561 2364
3072 2822 2707 2567 2362
3088 2813 2706 2564 2363
3031 2815 2702 2566 2363
2966 2845 2704 2581 2365
3092 2832 2699 2568 2355
3051 (AVG QPS) 2826 (AVG QPS) 2706 (AVG QPS) 2570 (AVG QPS) 2362 (AVG QPS)

Coraid w/ext3 filesystem, switch, jumbo frames, iozone running on the reiserfs partition

These tests were run while iozone was running against the reiserfs partition on the Coraid.

10 connections 20 connections 40 connections 100 connections 400 connections
2989 2837 2749 2637 2418
3077 2850 2747 2633 2421
3093 2857 2750 2629 2420
3137 2870 2746 2635 2417
3083 2880 2743 2631 2422
3015 2853 2745 2640 2420
2938 2857 2749 2640 2424
3087 2849 2747 2635 2418
3063 2890 2749 2646 2421
2748 2841 2764 2639 2423
3023 (AVG QPS) 2858 (AVG QPS) 2748 (AVG QPS) 2636 (AVG QPS) 2420 (AVG QPS)

Coraid w/ext3 filesystem, switch, jumbo frames, iozone running on the reiserfs partition, degraded raid

These tests were run while iozone was running against the reiserfs partition on the Coraid. In addition the Coraid had a degraded array.

10 connections 20 connections 40 connections 100 connections 400 connections
3053 2844 2726 2621 2446
3060 2845 2728 2624 2421
3061 2837 2746 2636 2420
3117 2828 2734 2640 2419
3018 2855 2740 2625 2421
3001 2934 2731 2623 2420
3177 2792 2753 2626 2425
3006 2889 2744 2619 2428
3001 2844 2752 2622 2418
3052 2861 2733 2614 2412
3054 (AVG QPS) 2852 (AVG QPS) 2738 (AVG QPS) 2625 (AVG QPS) 2423 (AVG QPS)
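
One way to see how close these configurations are is to look at the spread between the best and worst AVG QPS at each connection count. The sketch below is a rough Python illustration using the AVG QPS rows exactly as listed in the tables above. The short configuration labels and the `spread_percent` helper are mine.

```python
# AVG QPS rows copied from the tables above, one list per configuration,
# ordered by 10, 20, 40, 100 and 400 connections.
connections = [10, 20, 40, 100, 400]
avg_qps = {
    "internal reiserfs (mirrored)": [3012, 2856, 2756, 2634, 2414],
    "single drive reiserfs": [3007, 2869, 2769, 2641, 2421],
    "single drive ext3": [2985, 2855, 2752, 2631, 2416],
    "Coraid reiserfs": [3178, 2890, 2755, 2624, 2422],
    "Coraid ext3": [3051, 2826, 2706, 2570, 2362],
    "Coraid ext3 + iozone": [3023, 2858, 2748, 2636, 2420],
    "Coraid ext3 + iozone, degraded": [3054, 2852, 2738, 2625, 2423],
}


def spread_percent(column):
    """Gap between the best and worst value, as a percentage of the worst."""
    return 100 * (max(column) - min(column)) / min(column)


spreads = {}
for i, n in enumerate(connections):
    column = [row[i] for row in avg_qps.values()]
    spreads[n] = spread_percent(column)
    print(f"{n:>3} connections: best vs worst differ by {spreads[n]:.1f}%")
```

On these numbers the gap stays in the low single digits at every connection count, which is the basis for treating the read-side MySQL results as roughly equivalent.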

Conclusions

I think this conclusively proves that the Coraid is capable of handling multiple servers while maintaining reasonable throughput. Even with a degraded array the actual (read) performance of the MySQL server does not suffer. Heavy write performance to a database would suffer, as the write throughput during a rebuild of an array is roughly half of the normal throughput. As for the question of reiserfs versus ext3, the performance numbers are close enough that it would be wise to consider that reiserfs has better functionality under the LVM system that we use on our hard drives.

Currently there are six drives (five active) in the Coraid. Increasing the drive count in the Coraid will also improve the throughput. According to the numbers released by Coraid it should be a fairly dramatic increase (on the order of 100% faster write performance with a full complement of 14 drives vs. the current complement of six drives). Of course, increasing the drive count increases the number of platters and spindles, so this would be expected. I also tried bonding two Ethernet ports. This did not increase throughput.
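
The "roughly half" figure for write throughput during a rebuild can be checked against the averages in the jumbo-frame, switch-connected tables. This is a small Python sketch using the write (AVG) values from the healthy and degraded runs. The dictionary layout is my own.

```python
# Average write throughput in Kbytes/s from the jumbo-frame,
# switch-connected tables (healthy array vs. degraded array).
write_avg = {
    "ext3": {"healthy": 59661.33, "degraded": 21322.83},
    "reiserfs": {"healthy": 62999.17, "degraded": 35521.17},
}

# Fraction of normal write throughput that survives a rebuild.
ratios = {fs: v["degraded"] / v["healthy"] for fs, v in write_avg.items()}
for fs, ratio in ratios.items():
    print(f"{fs}: degraded writes run at {ratio:.0%} of healthy throughput")
```

On these averages ext3 comes in somewhat below half and reiserfs somewhat above, so "roughly half" is a fair summary across the two, with ext3 taking the bigger hit.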

With Ian’s help I did test the Oracle clustering filesystem (ocfs2). At this point it is really too fragile to consider using. In addition, the throughput testing that I did perform indicated that it was going to be significantly slower than both ext3 and reiserfs. While we have to partition off the Coraid and dedicate each specific partition to a server, I think that this is certainly justified for a more reliable filesystem that gives better performance.

Jumbo frames make a difference. I proved early in the testing that just configuring the switch and the Ethernet card for jumbo frames increases raw throughput by around 20 Mbytes a second (the read numbers above show this most clearly). NOT ALL ETHERNET CARDS SUPPORT MTUS ABOVE 1500!!! Check with the vendor before purchasing to see if this is supported.
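
If you want to confirm what MTU an interface is actually running with, Linux exposes it under /sys/class/net/<interface>/mtu. Here is a minimal Python sketch that reads it. The helper names and the `eth1` interface name are assumptions for illustration. Note that this only shows the currently configured MTU; whether a card will accept a larger value is still a question for the vendor.

```python
from pathlib import Path

STANDARD_MTU = 1500  # standard Ethernet payload size in bytes


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Return the currently configured MTU of a Linux network interface."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def uses_jumbo_frames(interface, sysfs_root="/sys/class/net"):
    """True if the interface is configured with an MTU above 1500."""
    return read_mtu(interface, sysfs_root) > STANDARD_MTU


if __name__ == "__main__":
    iface = "eth1"  # hypothetical interface name; substitute your own
    try:
        print(f"{iface}: MTU {read_mtu(iface)}, jumbo={uses_jumbo_frames(iface)}")
    except OSError as err:
        print(f"could not read MTU for {iface}: {err}")
```

The `sysfs_root` parameter is only there so the functions can be pointed at a test directory instead of the live system.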

Thanks to both Ian and Justin for their help with LVM, ocfs2 and general system crap.

2 Comments so far

  1. Stewart Smith July 18th, 2007 8:47 pm

    Would be interesting to see XFS results…

  2. bmurphy July 19th, 2007 6:54 am

    I did think about testing XFS. I have heard some horror stories about it corrupting data files though (if I recall..zeroing things out essentially). Now that SGI is no longer I am really not sure what level of support there is either.

    I would love to hear from other people about experiences with XFS though. If people are generally positive about it I can run the tests and see what happens. The equipment is still set up.
