...making Linux just a little more fun!
By Jim Dennis, Karl-Heinz Herrmann, Breen, Chris, and... (meet the Gang) ... the Editors of Linux Gazette... and You!
From Jimen Ching
Answered By: Jimmy O'Regan
Hi TAG,
I have a problem reading and writing very large (4MB) buffers to a disk via Fibre Channel. Fibre Channel works best when you send large amounts of data over the wire (light).
I've done google searches and found approaches like O_DIRECT and mmap. Mmap doesn't sound like what I'm looking for, because it still uses the buffer cache. And with 4MB of data, I don't want the extra copy. Also, I won't be reading the data back from the disk. So the buffer cache doesn't buy me anything...
The O_DIRECT approach sounds better. But it requires aligned buffers, and some people say the throughput is worse than non-O_DIRECT. My target throughput is 95MB/s. This shouldn't be a problem for the hardware since I'm using the CompactPCI bus and SCSI RAID over Fibre Channel with a theoretical throughput of 150MB/s. The aligned buffers issue is only a problem because of the file header that I must prefix to the 4MB raw data. It would be preferable if I didn't have to align my buffers. But I can work around it if it is absolutely necessary.
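One possible workaround for the header problem is to pad the header out to a full block, so that the raw data after it starts on an aligned offset. The following is only a rough sketch in Python, not a tested solution for this hardware. The 4096-byte alignment and the names round_up, build_record and write_direct are assumptions made for illustration; the real alignment requirement depends on the kernel, filesystem and device.

```python
import mmap
import os

ALIGN = 4096  # assumed alignment; check the device's logical block size


def round_up(n, align=ALIGN):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def build_record(header, payload, align=ALIGN):
    """Copy header and payload into one page-aligned buffer.

    The header is zero-padded to a multiple of align, so the payload starts
    on an aligned offset and the total length is a multiple of align.
    """
    payload_off = round_up(len(header), align)
    total = round_up(payload_off + len(payload), align)
    buf = mmap.mmap(-1, total)  # anonymous mapping: page-aligned, zero-filled
    buf[:len(header)] = header
    buf[payload_off:payload_off + len(payload)] = payload
    return buf, payload_off


def write_direct(path, buf):
    """Write the whole buffer with O_DIRECT (Linux only), bypassing the page cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
    try:
        view = memoryview(buf)
        try:
            done = 0
            while done < len(view):
                done += os.write(fd, view[done:])
        finally:
            view.release()
    finally:
        os.close(fd)
```

Note that build_record copies the payload once, which is exactly the extra copy the question wants to avoid. In a real program the data source would fill the aligned buffer directly at payload_off, and a reader of the file would need to know that the header occupies a whole padded block.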
I've done some basic benchmarks using regular fopen/fread/fwrite, and I'm only getting 50MB/s with ext3fs. This is about half the throughput I need and one third of the theoretical throughput of the hardware. So I was wondering if your team has come across any ideas on how to solve this problem. Note, I'm not setting any special options. So this benchmark is just the baseline. I'm looking for ways to optimize the reading and writing of this 4MB data buffer.
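For comparing options such as ext2 versus ext3 or buffered versus direct I/O, a small timing harness helps keep the baseline honest. This is a minimal sketch, not the benchmark used above; the function name write_throughput and its defaults are made up for illustration. It calls fsync before stopping the clock, so the figure includes the time to push the data out of the page cache.

```python
import os
import time


def write_throughput(path, block_size=4 * 1024 * 1024, blocks=64):
    """Write `blocks` buffers of `block_size` bytes; return MB/s (1 MB = 10**6 bytes)."""
    data = bytes(block_size)  # zero-filled stand-in for the raw data
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(data)
        f.flush()             # empty Python's own buffer
        os.fsync(f.fileno())  # wait until the kernel has written the data out
    elapsed = time.perf_counter() - start
    return block_size * blocks / elapsed / 1e6
```

A run over a few hundred megabytes or more gives a steadier number than a short one, since small tests are dominated by caching and startup effects.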
[Jimmy] If you have enough RAM, try using a ramdisk - create a filesystem as usual, but on one of the ramdisk devices - /dev/ram* or /dev/rd/*
The ramdisk will be 4M by default, but if you have it compiled as a module you can specify the size as an option to insmod:
insmod rd rd_size=20000
(which sets it to roughly 20M, since rd_size is given in kilobytes) or as an option in /etc/conf.modules
options rd rd_size=20000
[Thomas] Note that it used to be the case that /etc/conf.modules was symlinked to /etc/modules.conf. On many systems this is no longer the case, and so /etc/modules.conf is generally the preferred location.
[Jimmy] If your ramdisk support is compiled into the kernel, you'll need to set the size at boot. You can append the option (in LILO, or as a boot option) like this:
ramdisk_size=20000
I'm not sure I understand the answer. Or maybe I didn't explain my question clearly.
[Jimmy] No, the misunderstanding was on my part. I was simply answering this: "I'm looking for ways to optimize the reading and writing of this 4MB data buffer."
I want to write raw data to a disk via Fibre Channel. Each block of raw data is 4MB. I need to write 95MB/s of data for about half an hour or so. 95MB/s, at 60 seconds per minute, for 30 minutes comes to 171GB.
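The arithmetic behind that total can be checked in a few lines. This sketch assumes decimal units (1GB = 1000MB); the constant names are only for illustration.

```python
RATE_MB_S = 95          # target throughput in MB/s
DURATION_S = 30 * 60    # half an hour in seconds
BLOCK_MB = 4            # size of one raw data block in MB

total_mb = RATE_MB_S * DURATION_S    # 171000 MB
total_gb = total_mb / 1000           # 171.0 GB
blocks_per_s = RATE_MB_S / BLOCK_MB  # 23.75 blocks every second
print(total_mb, total_gb, blocks_per_s)
```

So the disk has to absorb just under 24 of the 4MB blocks every second, sustained for the whole run.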
I guess I could put one second worth of raw data into ramdisk, and do a copy to the Fibre Channel SCSI RAID. Then write the next second of raw data to another ramdisk and switch back-and-forth. But I'm not sure if a cp is any faster than a fwrite. Is this what you're suggesting?
[Jimmy] No, I was placing more importance on the step where you add a file header to the data in the buffer.
Going by this: http://linuxgazette.net/102/piszcz.html you'd be much better off accessing the disk as ext2 instead of ext3 - the journal is probably what's slowing you down.