
large file server/backup system: technical opinions?

Karl-Heinz Herrmann [kh1 at khherrmann.de]


Sun, 20 Jan 2008 18:09:08 +0100

Hi TAGs,

at work we are suffering from the ever-increasing amount of data. This is a Medical Physics group working with MRI (magnetic resonance imaging) data. In worst-case scenarios we can produce something like 20GB of data in an hour of scan time. Luckily we are not scanning all the time .-) Data access security is mostly taken care of by firewalls and access control outside our responsibility, but storing and backups are our responsibility.

Currently we have about 4-6 TB distributed over two "fileservers" (hardware RAID5 systems), and two systems are making daily backups of the most essential part of these data (home, original measurement data). The backups now take more than a full night, and the backup machines can't handle anything else while BackupPC is still sorting out the new data. The machine being backed up is fine by morning.

We will have a total of three number-crunching machines over the course of the year, and at least these should have speedy access to these data. Approximately 20 hosts are accessing the data as well.

Now we have 10k EUR (~15k USD) for new backup/file storage and are thinking about our options:

* RAID system with iSCSI, connected to the two (optimally all three)
  number crunchers, which then export the data to the other hosts via
  NFS. (Is eSATA any good?)

* an actual machine (2-4 cores, 2-4GB RAM) with hardware RAID (~24x 1TB)
  serving the files AND doing the backup (e.g. one RAID onto another
  RAID on these disks)

* a storage solution using Fibre Channel to the two number crunchers.
  But who does the backup then? The oldest number cruncher might be
  able to handle this nightly along with some computing all day, but it
  hasn't got the disk space right now.
The surrounding systems are all Ubuntu desktops, the number crunchers will run 64-bit Ubuntu, and the data sharing would be done via NFS -- mostly because I do not know of a better/faster production solution.
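
For reference, the NFS setup I have in mind is nothing fancy -- roughly a sketch like this (paths, hostname and subnet are made up):

  # /etc/exports on the file server
  /data    192.168.1.0/24(rw,async,no_subtree_check)

  # activate the export without restarting the NFS server
  exportfs -ra

  # on a client (or the equivalent line in /etc/fstab)
  mount -t nfs fileserver:/data /data -o rw,hard,intr,rsize=32768,wsize=32768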

The occasional Windows access can be provided via Samba re-exporting the NFS share on one of the machines (as it is done now).
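
The Samba side is just an ordinary share on top of the mounted filesystem, something like this (share name, path and group are made-up examples):

  # excerpt from /etc/samba/smb.conf
  [mrdata]
      path = /data
      read only = no
      valid users = @mrgroup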

Now, I have no experience with iSCSI or Fibre Channel under Linux. Will these work without too much trouble setting things up? Any specific controllers to get or to avoid? Would simultaneous iSCSI access from two machines to the same RAID actually work?
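
From what I gather, the initiator side with open-iscsi would be roughly like this -- a sketch only, with a made-up target name and IP:

  # discover the targets offered by the RAID box
  iscsiadm -m discovery -t sendtargets -p 192.168.1.50

  # log in to one of the reported targets
  iscsiadm -m node -T iqn.2008-01.com.example:raid1 -p 192.168.1.50 --login

  # the LUN then appears as an ordinary block device (e.g. /dev/sdc);
  # note: mounting the same ext3 LUN from two hosts at once will corrupt it,
  # concurrent access would need a cluster filesystem (GFS, OCFS2, ...)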

I also assume all of the boxes have 2x 1Gbit Ethernet, so we might be able to set up load balancing -- but the IP and load balancing would also have to be taught to our switches, I guess, and these are "outside our control", though we can talk to the people who run them. Is a new multi-core system (8-16 cores, plenty of RAM) able to saturate the 2x Gbit? Will something else max out first (HyperTransport, ...)?
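
From what I've read so far, the Linux side of this would be the bonding driver; on Ubuntu a sketch would look roughly like this (addresses and mode are made-up examples -- 802.3ad needs cooperation from the switches, balance-alb does not):

  # /etc/network/interfaces (with the ifenslave package installed)
  auto bond0
  iface bond0 inet static
      address 192.168.1.10
      netmask 255.255.255.0
      up   /sbin/ifenslave bond0 eth0 eth1
      down /sbin/ifenslave -d bond0 eth0 eth1

  # /etc/modprobe.d/bonding
  alias bond0 bonding
  options bonding mode=balance-alb miimon=100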

Any ideas -- especially ones I did not yet think of -- or experiences with any of this exotic hardware are very much welcome....

Karl-Heinz




Justin Piszcz [jpiszcz at lucidpixels.com]


Sun, 20 Jan 2008 16:26:19 -0500 (EST)

On Sun, 20 Jan 2008, Karl-Heinz Herrmann wrote:

> Hi Tags's,
>
> at work we are suffering from the ever increasing amount of data.
> This is a Medical Physics Group working with MRI (magnetic resonance
> imaging) data. In worst case scenarios we can produce something like
> 20GB of data in an hour scantime. Luckily we are not scanning all the
> time .-) Data access safety is mostly taken care of by firewalls and
> access control outside our responsibility. But storing and backups
> are our responsibility.
>
>
> Currently we have about 4-6 TB distributed over two
> "fileservers" (hardware raid5 systems) and two systems are making daily
> backups of the most essential part of these data (home, original
> measurement data). The backup machines are taking more than a full
> night by now and can't handle anything while backuppc is still sorting
> out the new data.  The machine the backup is from is fine by morning.
>
> We will have a total of three number crunching machines over the year
> and at least these should have speedy access to these data. Approx. 20
> hosts are accessing the data as well.
>
>
> Now we got 10k EU (~15k $US) for new backup/file storage and are
> thinking about our options:
>
> * Raid system with iSCSI connected to the two (optimally all three)
>  number crunchers which are exporting the data to the other hosts via
>  NFS. (eSATA any good?)
>
> * an actual machine (2-4 cores, 2-4GB RAM) with hardware raid (~24*1TB)
>  serving the files AND doing the backup (e.g. one raid onto another
>  raid on these disks)
>
> * A storage solution using fibre-channel to the two number crunchers.
>  But who does the backup then? The oldest number cruncher might be
>  able to handle this nightly along with some computing all day. But it hasn't
>  got the disk space right now.
>
>
> The surrounding systems are all ubuntu desktops, the number crunchers
> will run ubuntu 64bit and the data sharing would be done by NFS --
> mostly because I do not know of a better/faster production solution.
>
> The occasional Win-access can be provided via samba-over-nfs on one of
> the machines (like it does now).
>
>
> Now I've no experience with iSCSI or fibre channel under Linux. Will
> these work without too much of trouble setting things up? Any specific
> controllers to get/not to get? Would the simultaneous iSCSI access from
> two machines to the same raid actually work?
>
> I also assume all of the boxes have 2x 1Gbit ethernet so we might be
> able to set up load balancing -- but the IP and load balancing
> would also have been tought to our switches I guess -- And these are
> "outside our control", but we can talk to them. Is a new multi core
> system (8-16 cores, plenty RAM) able to saturate the 2xGbit? Will
> something else max out (hypertransport, ... )?
>
>
> Any ideas -- especially ones I did not yet think of --  or experiences
> with any of the exotic hardware is very much welcome....
>
>
>
> Karl-Heinz

Not sure about your budget, but if you got a tape library such as an SL500 plus some tape drives and used Veritas NetBackup, it would take care of that, no problem.

Although a tape library for 4-6TB is probably overkill -- if you had 100TB+ you may want tape :)

But if you want a real solution, I'd go with an SL500 and 2-4 LTO-3 or LTO-4 drives. LTO-3 tape is 400GB uncompressed, LTO-4 is 800GB, but LTO-3 is currently the sweet spot at $38-40/tape.

Justin.




Ramon van Alteren [ramon at forgottenland.net]


Mon, 21 Jan 2008 13:02:37 +0100

Karl-Heinz Herrmann wrote:

> Hi Tags's,
>
> at work we are suffering from the ever increasing amount of data. 
> This is a Medical Physics Group working with MRI (magnetic resonance
> imaging) data. In worst case scenarios we can produce something like
> 20GB of data in an hour scantime. Luckily we are not scanning all the
> time .-) Data access safety is mostly taken care of by firewalls and
> access control outside our responsibility. But storing and backups
> are our responsibility. 
>
>
> Currently we have about 4-6 TB distributed over two
> "fileservers" (hardware raid5 systems) and two systems are making daily
> backups of the most essential part of these data (home, original
> measurement data). The backup machines are taking more than a full
> night by now and can't handle anything while backuppc is still sorting
> out the new data.  The machine the backup is from is fine by morning. 
>
> We will have a total of three number crunching machines over the year
> and at least these should have speedy access to these data. Approx. 20
> hosts are accessing the data as well. 
>
>
> Now we got 10k EU (~15k $US) for new backup/file storage and are
> thinking about our options:
>
> * Raid system with iSCSI connected to the two (optimally all three)
>   number crunchers which are exporting the data to the other hosts via
>   NFS. (eSATA any good?)
>
> * an actual machine (2-4 cores, 2-4GB RAM) with hardware raid (~24*1TB)
>   serving the files AND doing the backup (e.g. one raid onto another
>   raid on these disks) 
>
> * A storage solution using fibre-channel to the two number crunchers.
>   But who does the backup then? The oldest number cruncher might be
>   able to handle this nightly along with some computing all day. But it hasn't
>   got the disk space right now. 
>   
Have a look at Coraid; they make very reasonably priced appliances with up to 15TB capacity, depending on the RAID config you create. It's AoE storage, but it has been working reasonably well for us. Don't expect stellar performance, but it should be sufficient for your backup needs.

I have them deployed in two different configs:

* GFS-clustered, with a 5-node cluster on top
* Standalone node with hot standby

The latter option provides good performance; GFS1 suffers from lock contention due to heavy writing and many, many files in our setup. I would definitely not recommend it if you need speedy access.

We currently buy Dells at a reasonable price point with 4TB storage each; maybe that would be interesting for the number crunchers? Fibre Channel storage plus backup is going to be a tight fit with your budget...

Additionally, I would want to separate the workloads:

* fast disk access for number crunching
* reliable but slower access for backup

I'd try to move the backup schedule to continuous if at all possible -- whether that is possible at all I can't tell from your problem description. That way you'd open up your backup window. It does assume separate architectures for backup and crunching. Depending on the time requirements for the data on the fileservers, you could move them to the backup?

Best Regards,

Ramon




Karl-Heinz Herrmann [kh1 at khherrmann.de]


Tue, 22 Jan 2008 13:35:31 +0100

Hi Justin,

thanks for your suggestion to look into the SUN Tape Solutions.

> Date: Sun, 20 Jan 2008 16:26:19 -0500 (EST)
> From: Justin Piszcz <[email protected]>
> To: The Answer Gang <[email protected]>
> Subject: Re: [TAG] large file server/backup system: technical
> opinions?

I've used an old SCSI tape drive way back when these had 4GB/tape -- and frankly the data handling was a pain in the ass (tar streams).

> Not sure on your budget but if you got a tape library and an SL500
> and some tape drives, use Veritas NetBackup it would take care of
> that no problem.

Well -- the SL500 would be outside our budget. Also, the specs are quite a bit more than what we will need in the coming few years.

The SL48 on the other hand might just about fall into budget range.

> Although a tape library for 4-6TB is probably over-kill, if you had
> 100TB+ you may want tape :)

Right now we have about 6TB on drives. This is growing, and we have to archive the old data -- we can't just throw it out at some point.

> But if you want a real solution, I'd go with an SL500 and 2-4 LTO-3
> or LTO-4 drives.  LTO-3 tape is 400GB uncompressed, LTO-4 is 800GB,
> but LTO-3 is currently the sweet spot for $38-40/tape.

One thing is not yet quite clear to me. I connect that SL500 (or SL48) via FC or SCSI to a computer. Does the whole SL500 then look like one giant tape? Or how is it represented to the outside? So for archive/retrieval I would definitely need additional software (like the Veritas NetBackup you mentioned)?

With the current budget of ~$15k we basically need both -- new disk space and a way to back up the new disk space. So we might have to stick with BackupPC as software and two RAIDs -- one for data, one for backup -- and plan for a tape archiving system next year.

Can the tape handle something like "RAID1"? I have no good feeling about putting data on tape and deleting all other copies. That's also the reason why I would try to expand the hard drive space first, so we can at least accommodate this year's growing data (including a second copy on different hard drives).

Is there also software around which would transparently pull old data off a disk array, store it on tape, and retrieve the files if you access them? Research Centre Juelich had that years ago when I was doing my PhD there. What price tags would we be talking about then?

Ah... I see: Sun's "BakBone NetVault" can do D2D2T..... I'll go read some more.... Thanks again for pointing these tape systems out to me.

K.-H.




Justin Piszcz [jpiszcz at lucidpixels.com]


Tue, 22 Jan 2008 13:20:45 -0500 (EST)

On Tue, 22 Jan 2008, Karl-Heinz Herrmann wrote:

> Hi Justin,
>
> thanks for your suggestion to look into the SUN Tape Solutions.
>
> I've used an old SCSI tape drive way back when these had 4GB/tape --
> and frankly the data handling was a pain in the ass (tar streams).

Without enterprise backup software such as NetBackup (the #1 choice), Legato, or others, it is very painful.

>> Not sure on your budget but if you got a tape library and an SL500
>> and some tape drives, use Veritas NetBackup it would take care of
>> that no problem.
>
>
> Well -- the SL500 would be outside our budget. Also the specs are quite
> a bit more than what we would need in the few coming years.
>
> The SL48 on the other hand might just about fall into budget range.

Ok.

>> Although a tape library for 4-6TB is probably over-kill, if you had
>> 100TB+ you may want tape :)
>
> Right now we have about these 6TB on drives. This is growing and we
> have to archive the old data, we can't just throw them out at some
> time.

That is where tape comes in; 6TB is nothing, and if it's compressible data you'll see great returns.

>> But if you want a real solution, I'd go with an SL500 and 2-4 LTO-3
>> or LTO-4 drives.  LTO-3 tape is 400GB uncompressed, LTO-4 is 800GB,
>> but LTO-3 is currently the sweet spot for $38-40/tape.
>
> One thing is not yet quite clear to me. I connect that SL500 (or SL48)
> via FC or SCSI to a computer. Then the whole SL500 looks like one giant
> tape?  Or how is this represented to the outside? So for
> archive/retrieval I would definitely need an additional software
> (like Veritas NetBack you mentioned)?

The SL500 connects via either Fibre Channel or SCSI -- that is the robotic controller, which is at the top of the unit.

The drives are connected separately, also via either Fibre Channel or SCSI.
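
Very roughly, on the Linux host you end up with one SCSI generic device for the robot and one tape device per drive; a sketch (device names and paths below are just examples, they will differ on your system):

  # talk to the robot / changer
  mtx -f /dev/sg3 status          # list slots and drives
  mtx -f /dev/sg3 load 5 0        # move the tape in slot 5 into drive 0

  # talk to the drive itself
  mt  -f /dev/nst0 status
  tar -cvf /dev/nst0 /data/project   # raw tar, only if you skip backup software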

> With the current budget of ~15k$ we need basically both -- new disk
> space and a way to back up the new disk space. So we might have to
> stick to backuppc as software and two raids -- one data, one backup and
> plan for a tape archiving system next year.

One nice thing about tape is that it does not require power, and it is also nice in the event of a disaster, or someone accidentally running rm -rf on the wrong directory, or ext3/filesystem corruption, etc.

> Can the tape handle something like "raid1"? I've no good feeling
> putting data on the tape and deleting all other copies. That's also the
> reason why I woul try to adapt the hard drive space first so we at
> least can accomodate this years growing data needs (including a second
> copy on different hard drives).

You could back up what you have on disk and then run incrementals on top of that. LTO-2/3/4 technology is quite good; as long as you clean your tape drives regularly, they're fairly reliable.
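
NetBackup handles the full/incremental scheduling for you; just to illustrate the idea with plain GNU tar (paths are examples):

  # level-0 (full) backup; the .snar file records what was dumped
  tar --create --listed-incremental=/var/backups/data.snar \
      --file=/dev/nst0 /data

  # later runs with the same .snar file only dump what has changed
  tar --create --listed-incremental=/var/backups/data.snar \
      --file=/dev/nst0 /data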

> Is there also software around which would transparently pull old data
> of a disk array and store them on tape? and retrieve the files if you
> access them? Research center Juelich had that years ago when I was
> doing my PhD there. What price tags would we be talking then?

Some companies actually do this for their web orders, etc. -- you would need to create scripts that pull the files off the tapes or push them to tape as needed. Either way, it is a single command in NetBackup (bpbackup or bprestore).

>
> Ah.. I see: Suns "BakBone NetVault" can do D2D2T..... I'll go read some
> more.... Thanks again for pointing these Tape systems out to me.

NetBackup can also do this.

I'm including the veritas-bu mailing list on this thread as well; they may also have some good insight into your problem.

Justin.




Karl-Heinz Herrmann [kh1 at khherrmann.de]


Wed, 23 Jan 2008 00:23:45 +0100

Hi Ramon,

On Mon, 21 Jan 2008 13:02:37 +0100 Ramon van Alteren <[email protected]> wrote:

> Have a look at coraid, they make very reasonably priced appliances

Hm... these Coraid systems look interesting as well.

> with upto 15Tb capacity depending on the raid-config you create.
> It's AoE storage but has been working reasonably well for us, don't 
> expect stellar performance but it should sufficient for your backup
> needs.

I had never heard of AoE before.... The kernel module works reliably, I understand from the above? When they say 2x 1Gbit Ethernet -- can this be easily load balanced? Or would that be useful only for connecting to two different hosts?

> I have them deployed in two different configs:
> * GFS-clustered with a 5-node cluster on top
> * Standalone node with hot-standby

Can you comment on GFS vs. NFS for a small number (~10) of hosts with mostly read access? Might GFS be something to consider as an NFS replacement?

> We currently buy dells at a reasonable pricepoint with 4Tb storage
> each, maybe that would be interesting for the number-crunchers?

We are shopping for an AMD 8x quad-core as soon as they exist in a bug-free stepping, and we want to put some 64GB of RAM in it. We thought quite a while about cluster vs. SMP multi-core system. Finally we decided that for regular image reconstruction and post-processing it doesn't matter, and some people in our workgroup do finite element grid calculations and inverse problems (EEG source localisations), and for those, LOTS of RAM to keep the grid data out of swap is a very good thing. Also, standard tools like Matlab and its toolboxes are able to make good use of multiple cores, and less so of distributed clusters, it seems. The other number cruncher will probably be Intel, with fewer cores but better performance per core for the less parallelisable stuff.

> Fibre storage and backup is going to be a tight fit with you budget...

We would have tried to put the two FC controllers into the budget for the two number crunchers. Otherwise, yes, that would be too big a chunk out of the $15k.

> Additionally I would want to seperate the workloads:
> * fast-diskaccess for numbercrunching
> * reliable but slow access for backup

Hm... yes. Right now we're planning to run some scratch drives (maybe even RAID0) in the crunchers for fast local access. Once done, the data can be pushed out to some storage via NFS.

> I'd try and move the backupschedule to contiunuously if at all
> possible, but if that's possible at all is impossible to extract from
> your problem description.

Not with the current software (BackupPC). We have one rather mediocre box which has handled a secondary backup without a hitch for quite some time now. But that's the offsite remote backup, which doesn't do much otherwise. The "primary backup" is simply a second RAID in the main fileserver, and while that is running, the fileserver is awfully slow. So we need to get the backup job away from the machine holding the data being backed up. Or maybe plenty of cores and two individual RAID controllers might help?

During the daytime lots of files will change, but basically there wouldn't be a serious problem with backing stuff up as soon as it changes. Could you recommend software that does that?

> That way you'd open up your backupwindow. It does assume seperate 
> architectures for backup and crunching.
> Depending on the time requirements for the data on the fileservers
> you could move them to the backup ?

That's under discussion here. We have plenty of DICOM files (i.e. medical images), which are basically sets of files in a directory; sizes vary from a few kB to maybe 3MB each. We don't use much of the older data, so these could be moved into some kind of long-term storage, and some time penalty to get them back wouldn't hurt much. But attached to these is a database keeping track of metadata, and we have to be careful not to break anything. The DICOM server ctn handles this database, accepts files from other DICOM nodes (like the MR scanner), and stores the files. Unfortunately, the guys writing ctn forgot the cleaning tools (move, remove, ...), and we are putting some effort into writing such tools right now.

Also, from analyzing the disk space usage, these DICOM images seem to grow steadily but rather slowly. The major mass of data recently is raw data from the scanner, which can easily be 4 to 6GB per set and represents a short experiment of 15 minutes. We will work more and more on these raw data for experimental image reconstruction. One of the number crunchers' jobs will be to read these 5GB chunks and spit out 50-300 single images, so reading large continuous data and writing many small files (no GFS for that, I presume).

Coming back to the Coraid AoE boxes...... Apart from the extensibility of plugging in another AoE device once the first is full, I can't really see a big difference from an actual computer (let's say 4x4 cores), 8-16GB RAM and 24x 1TB drives connected to two PCI Express (x8) RAID controllers. We got an offer for something similar at 12k EUR (a little bit too much right now, but drive costs should be dropping). But the Coraids had 5 to 7 k$ price tags without the drives, and we don't get the computer running the backup software. The hot-swap bays and redundant power were also there. Am I missing something the Coraid can do natively that a computer running Linux could not easily replicate?

Boy, this got long..... Anyway, any thoughts welcome .-)

K.-H.




Ramon van Alteren [ramon at forgottenland.net]


Wed, 23 Jan 2008 21:11:17 +0100

Hi Karl-Heinz,

It's been a long day, sorry for the late reply.

Karl-Heinz Herrmann wrote:

>> with upto 15Tb capacity depending on the raid-config you create.
>> It's AoE storage but has been working reasonably well for us, don't 
>> expect stellar performance but it should sufficient for your backup
>> needs.
>>     
>
> I had never heard of AoE before.... the kernel module works reliable I
> understand from the above? When they say 2x1GBethernet -- can this be
> easily load balanced? Or would that be useful for connecting to two
> different hosts only? 
>   
New versions support load balancing out of the box without any advanced trickery; make sure you get those if you buy them. The older ones had a hardware issue: the secondary NIC shared the PCI bus with something else (I forget what), which ate up so much PCI bandwidth that load balancing the network traffic would actually result in a reduction in performance :-(

If you use them, make sure you use the kernel module supplied by Coraid; the in-kernel one usually lags several versions behind, and the Coraid ones perform much better in general.
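
Client-side the whole thing is pleasantly boring; a sketch (shelf/slot numbers and the mount point are examples):

  # load the AoE initiator and find the shelves on the local segment
  modprobe aoe
  aoe-discover
  aoe-stat                      # lists targets such as e0.0, e0.1, ...

  # the LUN shows up as a block device under /dev/etherd/
  mkfs.ext3 /dev/etherd/e0.1    # first time only
  mount /dev/etherd/e0.1 /mnt/coraid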

> Can you comment on GFS vs. NFS for a small number (~10) of hosts with
> mostly read access? Might GFS be something to consider for NFS
> replacement? 
>   
Sure. GFS is a filesystem designed by Sistina and bought by Red Hat. Its primary goal is to allow several hosts in a cluster to share the same storage pool over the network and write to it concurrently; nodes in the cluster see each other's updates. It performs reasonably well in general, but very poorly under specific workloads. Red Hat is aware of this problem and has redesigned the cluster locking and filesystem semantics to counter it (GFS2), and this is integrated into the mainline kernel. Sadly, no one considers it production quality yet(!).

If you do straight read-only access with only ~10-20 hosts, NFS is definitely the way to go. Properly tuned, it scales nicely for read-only use and performs better than GFS. The downside is that you introduce a single point of failure with the NFS server, but the downside of a GFS cluster is the overhead of locking between nodes. Apart from that, GFS needs a cluster and thus a cluster architecture; the main requirement is that each node must be able to power down a non-communicating node to prevent run-away nodes from causing filesystem corruption. It's not really complicated or expensive, but it adds up :-(

We've been able to scale read-only NFS to roughly 250-400 hosts without any problems, though not with the data volumes you are talking about.

>> We currently buy dells at a reasonable pricepoint with 4Tb storage
>> each, maybe that would be interesting for the number-crunchers?
>>     
> we are shopping for an AMD 8x quad core as soon as they exist in bug
> free stepping and want to put some 64GB RAM in that. We were thinking
> quite a while about cluster vs. SMP multi core system. Finally we
> decided for regular image reconstruction and post processing it doesn't
> matter and some people in our workgroup do finite element grid
> caclulations and inverse problems (EEG source localisations) and for
> that LOTS of RAM to keep the grid data out of swap are a very good
> thing. Also standard tools like matlab and toolboxes are able to make
> good use of multiple cores and less so of distributed clusters it
> seems. The other number crunsher will probably be Intel with less cores
> but better performance per core for the less parallellisable stuff.
>   
I would need a much better understanding of the process and workloads involved to be able to say something meaningful in a technical sense. Wouldn't it be possible to split and parallelise the work so it can process chunks of data? That would allow you to use more, but lower-powered, machines.

In my experience, anything that you need to buy at the top of the performance spectrum is overpriced. If you can work out a way to do the same work with eight quad-core servers with 8GB RAM each, you might spend 25% of what you would spend on a top-of-the-line server.

>> Additionally I would want to seperate the workloads:
>> * fast-diskaccess for numbercrunching
>> * reliable but slow access for backup
>>     
>
> Hm.. yes. Right now planning to run some scratch drives (maybe even
> raid0) in the crunshers for fast local access. Once doen the data can
> be put out on some storage via NFS.
>   
Sounds like a plan, depending on the data security the company or organisation needs. If the data are hard or impossible to reproduce, some people are bound to get extremely pissed if one disk in the stripe set fails.

> Not with the current software (backuppc). We've one rather mediocre box
> which handles a secondary backup without a hitch for quite some time
> now. But that's the offsite remote backup which doesn't do much
> otherwise. The "primary backup" is simply a second raid in the main
> fileserver and while that is running the fileserver is awfully slow. So
> we need to get the backuper away from the backupped data. Or maybe
> plenty of cores and two individual RAID controllers might help? 
>   
Maybe the problem is in the current software you are using? I've never heard of BackupPC. For continuous backup I'd look at the usual suspects first: rsync, tar, etc.
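
With rsync you can get cheap snapshot-style backups via hard links, which also keeps older copies around for the "human error" case below; a sketch (paths are made up):

  # yesterday's snapshot serves as the hard-link reference;
  # unchanged files cost no extra disk space
  rsync -a --delete \
      --link-dest=/backup/data.1 \
      fileserver:/data/  /backup/data.0/

  # rotate data.1 -> data.2, data.0 -> data.1, etc. before the next run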

Continuous backup needs to be designed carefully: if someone actually deletes a file from primary storage and the backup is near-instantaneous, you will find that you have no way to restore "human errors".

> during daytime lots of files will change but basically there wouldn't
> be a serious problem with backing stuff up as soon as it changes. Could
> you recommend software doing that? 
>   
I'd look into DRBD; it keeps two disks in sync over the network.

http://www.drbd.org/

I'm hearing rumours that they are working on two-way and even three-way syncing, but I haven't had time to research that yet. Be aware that DRBD is near-instantaneous and thus suffers from the problem above.
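
To give you an idea of the setup effort, a minimal resource definition looks roughly like this (hostnames, disks and IPs are made up):

  # /etc/drbd.conf
  resource r0 {
    protocol C;                 # synchronous replication
    on filer1 {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      address   10.0.0.1:7788;
      meta-disk internal;
    }
    on filer2 {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      address   10.0.0.2:7788;
      meta-disk internal;
    }
  }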

> That's under discussion here. We've plenty of dicom files (i.e. medical
> images) which are basically sets of files in a dir, size varies from a
> few k to maybe 3MB each.  Now we don't use much of the older data, so
> these could be moved into some kind of long time storage and some time
> penalty to get them back wouldn't hurt much. But attached to these is a
> data base keeping track of meta data and we have to be careful not to
> break anything. The dicom server ctn handles this data base and accepts
> files from other dicom nodes (like MR scanner) and stores the files.
> Unfortunately the guys writing ctn forgot the cleaning tools (move,
> remove, ...) and we are putting some effort in writing tools right now. 
>   
Great :-) It's been a struggle to get that through here as well, but we currently have redistribution software written which does the following:

* calculate % fill-ratio compared to the other nodes in the storage pool
* redistribute (aka pull in case of a lower ratio, push in case of a higher ratio)
* update the meta-index
* delete the data after verification on the new location

We run that every time we expand the storage pool with extra machines. This is the distributed storage pool which has replaced the Coraids, by the way.
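
The fill-ratio part is nothing more than comparing used versus total space per node, roughly (path is an example):

  # fraction of the storage volume in use on this node
  df -P /storage | awk 'NR==2 { printf "%.2f\n", $3/$2 }'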

> Also from analyzing the disk space usage these dicom images seem to grow
> steadily but rather slowly. The major mass of data recently are raw
> data from the scanner which can easily become 4 to 6 GB each and
> represent a short expermient of 15 minutes. We will work more and more
> on these raw data for experimental image reconstruction. One of the
> number crunchers jobs will be to read these 5GB junks and spit out
> 50-300 of single images, so reading large continuous data and writing
> many small files (no GFS for that I presume).
>   
Mmm, interesting problem. So:

* you need to keep the originals (4-6GB) around to do processing on
* each processing job spits out 50-300 files between 5K and 3000K
* the crunchers are CPU-bound but need fast disk access to the originals
* everything needs to be accessible for at least 6-12 months
* you need backup

correct ?

> Coming back to the coraid AoE boes...... apart from that extensibility
> by plugging in another AoE device once the first is full I can't really
> see a big difference to an actual computer (lets say 4x4 cores), 8-16GB
> RAM and 24 1TB drives connectzed to 2 PCI express(x8) RAID controllers.
> We got an offer for something similar at 12kEU (a little bit too much
> right now, but drive cost should be dropping). But the coraid had some
> 5 to 7 k$ price tags without the drives.  And we don't get the
> computer running the backup software. The hot swap bays and redundant
> power was also there. Am I missing something the coraid can do natively
> that a computer running Linux could not easily replicate? 
>   
The difference is price. IIRC we've been buying them for 7k including a full set of 750GB SATA disks; they're listed on the Coraid site for $4k without drives. I've bought the large RAID systems as well, and the price you quote doesn't sound strange to me -- I would definitely not bet on getting a better offer for that configuration. Hard disk prices are dropping, but not that fast.

Coraids can be attached to a single host; depending on traffic, a single host could be the front end for 3-4 Coraids.

Apart from that, you're right -- internally it looks like the Coraids run a modified version of Plan 9, judging from the CLI.
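
And indeed a plain Linux box can play the same role: the vblade userspace AoE target can export any block device over AoE, e.g. (shelf/slot, interface and device are examples):

  # export the local RAID volume /dev/md0 as AoE shelf 1, slot 1 on eth0
  # (vblade runs in the foreground; put it under a daemon wrapper for real use)
  vblade 1 1 eth0 /dev/md0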

Hope some of this was/is useful

Ramon

