Copy a large number of small files

August 29th, 2008

I’ve recently run into the need to copy a huge number of small files from one system to another.  Basically we are backing up about 20 million files that total a little under 10 terabytes.  So what is the best way to do this?  I first tried installing an extra gigabit Ethernet adapter in each server, connecting the two with a crossover cable, and copying with Microsoft’s Robocopy program.  If you’re not familiar with it, Robocopy is available in the Windows 2003 Resource Kit and is a bit like xcopy with a lot more features.  My problem is that it seemed to be going very slowly, taking about 2.5 days to copy 1 terabyte.  Perhaps that is as fast as it should be, I don’t know, but it was certainly slower than I expected.
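
To put that figure in perspective, here is a rough back-of-the-envelope conversion of "2.5 days per terabyte" into MB/s, compared with the theoretical gigabit line rate. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes) and ignores protocol overhead, and the function and variable names are mine.

```python
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
SECONDS_PER_DAY = 24 * 60 * 60


def throughput_mb_per_s(bytes_copied, seconds):
    """Average throughput in MB/s for a copy of the given size and duration."""
    return bytes_copied / seconds / 1e6


# Observed with Robocopy over the crossover link: about 1 TB in 2.5 days.
robocopy_rate = throughput_mb_per_s(1e12, 2.5 * SECONDS_PER_DAY)

# Theoretical gigabit Ethernet ceiling: 10**9 bits/s = 125 MB/s,
# ignoring all protocol overhead (an idealized upper bound).
gigabit_limit = 1e9 / 8 / 1e6

utilization = robocopy_rate / gigabit_limit

print(f"{robocopy_rate:.2f} MB/s, {utilization:.1%} of gigabit line rate")
```

Under those assumptions the copy is running at only a few percent of what the link could carry in theory, which suggests the wire itself is probably not the limiting factor.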

I suppose I should explain the hardware involved to give the full picture.  The older server is a Pentium 4 machine running Windows 2003.  It has an external drive array with 16 drives, configured as four RAID 5 arrays of about 1.5 TB each.  This array is connected to the server via a SCSI adapter.

The new server is a quad-core Xeon also running Windows 2003 and has 24 drive bays internal to the server chassis.  The drives in the new server are configured as follows: the first 16 drives form a RAID 60 (two 8-drive RAID 6 arrays striped together for better performance), which provides about 11.2 TB; 6 drives form a RAID 5 (4.5 TB); and 2 drives form a RAID 1 for the O/S.  The reason for the odd split is that I used two RAID controllers to connect all of the drives, a 16-port and an 8-port.  All of the drives in both systems are SATA II.
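
For anyone checking the capacity numbers, the usable space of each array follows from how many drives' worth of capacity go to parity or mirroring. The sketch below is illustrative only: it assumes all drives within an array are the same size, and the helper name `data_drives` is hypothetical.

```python
def data_drives(level, drives):
    """Number of drives' worth of usable capacity in a single array.

    Assumes equal-sized drives. RAID 5 gives up one drive to parity,
    RAID 6 gives up two, and a RAID 1 mirror keeps one drive's capacity.
    """
    overhead = {"RAID 5": 1, "RAID 6": 2, "RAID 1": drives - 1}
    return drives - overhead[level]


# RAID 60 here is two 8-drive RAID 6 arrays striped together.
raid60_data = 2 * data_drives("RAID 6", 8)
raid5_data = data_drives("RAID 5", 6)
raid1_data = data_drives("RAID 1", 2)

# Per-drive capacity implied by the usable sizes quoted in the post.
implied_raid60_tb = 11.2 / raid60_data
implied_raid5_tb = 4.5 / raid5_data

print(f"RAID 60: {raid60_data} data drives, ~{implied_raid60_tb:.2f} TB each")
print(f"RAID 5:  {raid5_data} data drives, ~{implied_raid5_tb:.2f} TB each")
print(f"RAID 1:  {raid1_data} data drive")
```

Both arrays work out to a bit over 0.9 TB of usable space per data drive, so the quoted sizes are roughly consistent with each other.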

I decided to take the network cards out of the loop, assuming that this was where my bottleneck was, although I never saw network utilization on the cards in question go above 10%.  I moved the SCSI card from the old server to the new one and attached the external drive array directly to the new server.  I assumed this would give me a much faster transfer.  Unfortunately that has not been the case.  After 20 hours of copying, only 650 GB have transferred to the new system.
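
To see what that rate means for the whole job, here is a rough estimate of files per second and total copy time. It is a sketch under stated assumptions: decimal units, a constant copy rate, and the round figures of 10 TB and 20 million files from above (the real total is a little under 10 TB, so this is an upper bound). The variable names are mine.

```python
# Assumptions: decimal units, constant rate, and the round totals from the post.
TOTAL_BYTES = 10e12        # "a little under 10 terabytes", so an upper bound
TOTAL_FILES = 20_000_000   # "about 20 million files"

copied_bytes = 650e9       # 650 GB transferred so far
elapsed_hours = 20

rate_bytes_per_s = copied_bytes / (elapsed_hours * 3600)
avg_file_bytes = TOTAL_BYTES / TOTAL_FILES
files_per_s = rate_bytes_per_s / avg_file_bytes
eta_days = TOTAL_BYTES / rate_bytes_per_s / 3600 / 24

print(f"rate:      {rate_bytes_per_s / 1e6:.1f} MB/s")
print(f"avg file:  {avg_file_bytes / 1e3:.0f} KB")
print(f"files/sec: {files_per_s:.0f}")
print(f"full copy: {eta_days:.1f} days")
```

At this pace the full copy would take on the order of two weeks. With files averaging around half a megabyte, the low files-per-second figure hints that per-file overhead, rather than raw bandwidth, may be what is holding things back, though I have not confirmed that.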

I’ve started testing some utilities that claim to improve copy speed (TeraCopy, FastCopy, TotalCopy) and will post the results of these tests here.  In the meantime it’s slow going.  Comments or suggestions?

————–

Update:

Well, I tried a few of the copy programs I mentioned above and finally wound up using FastCopy.  There is one confusing part in that program: whether or not you include a trailing \ on the path affects what is copied.  This is documented as a feature, not a bug, and overall the program did an excellent job.  One very nice feature is that it can queue copy jobs so that they start one after the other.

  1. chichi

    Team the NICs and use RichCopy. Copy files in safe mode with network