Tuesday, March 20, 2012

Cheap and Safe File Storage on Linux

I needed to set up a Samba file server using off-the-shelf desktop components.
It will be used in a small office with 5-10 clients.

Here are the relevant hardware specs:
Processor: i7
Memory: 16GB
Drives: 4 x 1TB SATA drives
OS Installed: Ubuntu Server 10.04 LTS

The primary requirement is that the files be fairly safe: if the server crashes in the middle of saving a file, then on restart the file should return to the consistent state it had before the save operation. Performance is secondary to safety.

My initial plan was to use one drive for the OS and configure the other 3 drives as an mdraid level 5 array. After some research, I came across several references stating that mdraid level 10 works with an odd number of drives: it will work on 3 drives and even on 2. This looked like something I should try. I then had to decide what filesystem to use. I initially wanted to use xfs, as I've heard and read a lot of good things about it.

After studying how to set up both mdraid and xfs, I started a thread on the linux-raid mailing list asking for best-practice guidelines and guidance on using xfs with mdraid 10n2.

Several days of additional research, guided by the comments on the mailing list thread, taught me a lot:
  • raid10n2 really works on 3 drives. I had my doubts at first, but now I finally get how it can work with just three drives
  • drive sector size can be either 512 bytes or 4k, so a partition that is not aligned to the sector size can have a negative impact on performance
  • ext4 provides the data=journal mount option, which makes it safer than xfs at the cost of some performance
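
To make the three-drive case concrete, here is a small sketch of how the "near" layout places chunks, assuming the behaviour described in the md(4) man page: each chunk is written `copies` times into consecutive slots, and the slots fill the drives row by row. The function names are mine and purely illustrative; this models the layout and does not talk to md at all.

```python
def near_layout(n_devices, copies, n_chunks):
    """Return rows of chunk numbers, one column per device.

    Each chunk occupies `copies` consecutive slots; slots are filled
    left to right across the devices, then wrap to the next row.
    """
    n_slots = n_chunks * copies
    n_rows = -(-n_slots // n_devices)  # ceiling division
    rows = [[None] * n_devices for _ in range(n_rows)]
    for chunk in range(n_chunks):
        for copy in range(copies):
            slot = chunk * copies + copy
            rows[slot // n_devices][slot % n_devices] = chunk
    return rows


def usable_capacity(n_devices, device_size, copies):
    """Usable space when every chunk is stored `copies` times."""
    return n_devices * device_size / copies


if __name__ == "__main__":
    # 3 drives, 2 copies of each chunk (raid10n2), first 6 chunks
    for row in near_layout(3, 2, 6):
        print(" ".join("-" if c is None else chr(ord("A") + c) for c in row))
    print(usable_capacity(3, 1.0, 2), "TB usable from 3 x 1TB")
```

With 3 drives the rows come out as A A B, then B C C, and so on: the two copies of a chunk always land on different drives, so any single drive can fail without losing data, and 3 x 1TB gives 1.5TB of usable space.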

The table below shows the average of the benchmark results I got. Yes, I know bonnie++ only tests sequential write/rewrite/read and some file operations, and is not necessarily representative of the real workload, but IMHO it is better than nothing. This is how things were set up for the benchmark:
  • a 50GB primary partition was created on each device located 1GB from the start of the disk for partition alignment
  • the md raid device is created on top of the primary partition created on each device
  • the chunk was set to 64k
  • the results below are for raid5 and raid10n2 only, but I also tried raid10f2 and raid10o2
  • raid 10f2 had better read performance but slower write compared to raid 10n2
  • raid 10o2 performed quite closely to raid 10n2
  • the benchmark was run three times and the average was taken
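
The alignment point in the setup above can be checked with simple arithmetic. The sketch below assumes that "1GB from the start of the disk" means 1 GiB (2**30 bytes) and that the drive may have 4k physical sectors; the helper names are hypothetical and nothing here reads a real partition table.

```python
SECTOR_512 = 512   # logical sector size in bytes
SECTOR_4K = 4096   # physical sector size on 4k drives


def is_aligned(offset_bytes, boundary=SECTOR_4K):
    """True if a partition starting at offset_bytes sits on a boundary."""
    return offset_bytes % boundary == 0


def start_sector(offset_bytes, logical_sector=SECTOR_512):
    """Convert a byte offset to the start sector a partitioning tool expects."""
    if offset_bytes % logical_sector:
        raise ValueError("offset is not a whole number of logical sectors")
    return offset_bytes // logical_sector


if __name__ == "__main__":
    one_gib = 2 ** 30
    print(is_aligned(one_gib), start_sector(one_gib))  # 1 GiB offset
    print(is_aligned(63 * SECTOR_512))                 # old DOS-style start at sector 63
    print(is_aligned(10 ** 9))                         # 1 GB in decimal units
```

A 1 GiB offset is a multiple of 4096, so it is aligned on both 512-byte and 4k drives. The traditional start at sector 63 is not, and neither is an offset of exactly 10**9 bytes, which is why it matters which "GB" the partitioning tool uses.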

Bonnie++ 1.96, 64k chunk

        Sequential Output                   Sequential Input  Random Seeks
        Block             Rewrite           Block
Size    K/sec    % CPU    K/sec    % CPU    K/sec    % CPU    /sec     % CPU
32G     144430   11       68725    9.67     249364   21       280.33   9.33
32G     25077    4.67     21605    5        267283   22.33    293.67   6
32G     137365   8        69940    9        209180   16       365.03   11.67
32G     56249    6        37491    6        206548   16       389.00   8

For those interested, you can find/get the script I used here.

I modified this script to run my tests.

You can find the raw results here.

You can also find the debug info here.

I am considering re-running the tests but this time with iozone instead of bonnie++ so that I can get scores for random read and write.

Next, I plan to do pull-the-plug testing on ext4 with data=journal on top of raid10n2 while streaming a huge file to the server over Samba. I plan to use 2 different video files, each larger than 3GB. First, I will copy file1 to the Samba share and time it. I will then compute a checksum for file1 both locally and on the Samba server; the two checksums should be equal. Next, I will compute a checksum for file2 locally; it should be different from the value computed for file1. I will then move file1 to another directory and rename file2 to the same name as file1. I will then attempt to copy the new file1 to the Samba server and, halfway into the copy, pull the plug on the Samba server. Finally, I will restart the Samba server and recompute the checksum of file1 on the server. It should still be equal to the checksum computed for the original file1.
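
The checksum steps in this plan could be scripted roughly as follows. This is only a sketch: the choice of SHA-256 and the helper names are my assumptions (any strong checksum would do), and the files are read in blocks because they are several GB each.

```python
import hashlib


def file_sha256(path, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have the same checksum."""
    return file_sha256(path_a) == file_sha256(path_b)
```

The idea would be to record `file_sha256` of the original file1 before the test, run the same function on the server's copy after the power cut and restart, and compare the two values. If data=journal does what I expect, they should match.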

If the system behaves as expected, I will go with ext4,data=journal on top of raid10n2.

