Tip: a really fast file copying algorithm, don't miss it!

biramino (new member, joined Sep 3, 2013)
I have made a really fast copier. Here is a screenshot: adf.ly/V3tsH
and here is the source code: adf.ly/V3pou
 
I don't see this as being very fast. In fact, I see it as being very slow. Reading and writing all the bytes for a copy in managed code is nuts. The fastest way to copy a file is unequivocally "copy [SOURCE] [DESTINATION]", or even robocopy if you need to copy multiple files at once.
 
The File.Copy method (or the My.Computer.FileSystem.CopyFile method) should be the fastest file copy in .NET; it uses the native CopyFile/SHFileOperation functions.
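For reference, the two calls being compared look something like this (a minimal sketch; the paths are placeholders, and UIOption.OnlyErrorDialogs is what suppresses the shell progress dialog):

```vb
Imports Microsoft.VisualBasic.FileIO

Module CopyDemo
    Sub Main()
        ' File.Copy is a thin wrapper over the native copy routine.
        IO.File.Copy("C:\Source\file.bin", "C:\Dest\file.bin", overwrite:=True)

        ' With OnlyErrorDialogs there is no progress UI, so this behaves like File.Copy.
        My.Computer.FileSystem.CopyFile("C:\Source\file.bin", "C:\Dest\file.bin",
                                        UIOption.OnlyErrorDialogs)
    End Sub
End Module
```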
 
I think My.Computer.FileSystem.CopyFile will definitely be slower because of the UI, not sure if there is any difference between the two if you switch the UI off. For the same reason, Directory.Copy is slow because it counts everything upfront. I still largely prefer copy/xcopy/robocopy...
 
Herman said:
I think My.Computer.FileSystem.CopyFile will definitely be slower because of the UI, not sure if there is any difference between the two if you switch the UI off.
They are the same when run without the UI.
Herman said:
For the same reason, Directory.Copy is slow because it counts everything upfront.
What are you referring to here? The Directory class does not have a Copy method.
 
Dear Herman and JohnH,

I actually tested my method with a stopwatch, multiple times, against all the other ones, and I found it to be the fastest of them all.
The advantage my method has is that it can be tweaked to support very accurate progress reporting ;)
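The kind of copy loop that allows progress reporting would look roughly like this (a sketch only; CopyWithProgress and its callback are illustrative names, not the OP's actual code):

```vb
Imports System.IO

Module ProgressCopy
    ' Buffered copy that reports percent complete after each chunk.
    Sub CopyWithProgress(source As String, dest As String, report As Action(Of Double))
        Using input = File.OpenRead(source), output = File.Create(dest)
            Dim buffer(81919) As Byte ' 80 KB chunks
            Dim total As Long = input.Length
            Dim copied As Long = 0
            Dim read As Integer = input.Read(buffer, 0, buffer.Length)
            Do While read > 0
                output.Write(buffer, 0, read)
                copied += read
                report(100.0 * copied / Math.Max(total, 1))
                read = input.Read(buffer, 0, buffer.Length)
            Loop
        End Using
    End Sub
End Module
```

Each callback costs a little per chunk, which is part of why a plain File.Copy is hard to beat on raw speed.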
 
Sorry John, I was referring to My.Computer.FileSystem.CopyDirectory...

But in any case, biramino, you are most definitely mistaken. What was your test exactly? I don't see it as being very reliable unless you were copying a couple of hundred files, of varying sizes, in multiple directories... Your method might well be faster for single small file copies, but that is just because the overhead of other methods is larger than the workload.
 
Herman said:
Sorry John, I was referring to My.Computer.FileSystem.CopyDirectory...
That creates directories if necessary and otherwise works like the .NET file copy methods (native CopyFile function); I don't see it counting anything up front as you mentioned.
 
Hmmm... I just ran this:

Imports System.Threading.Tasks
Imports System.IO
Imports System.Diagnostics ' for Stopwatch

Public Class Form1
    Dim elapsedTime As New TimeSpan

    Private Const SourcePath As String = "C:\SourceFolder"
    Private Const DestinationPath As String = "C:\DestinationFolder"

    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        Dim methodTime As New TimeSpan

        ' Try the first method 10 times, average.
        For i = 1 To 10
            CopyFilesParallelFileCopyMethod()
            methodTime += elapsedTime
            CleanDestination()
        Next
        Dim ParallelFileCopyMethodAvgTimeMS As Double = methodTime.TotalMilliseconds / 10

        ' Try the second method 10 times, average.
        methodTime = TimeSpan.Zero
        For i = 1 To 10
            CopyFilesParallelFileReadAndWriteMethod()
            methodTime += elapsedTime
            CleanDestination()
        Next
        Dim ParallelFileReadAndWriteMethodAvgTimeMS As Double = methodTime.TotalMilliseconds / 10

        MessageBox.Show("Parallel File.Copy method, average over 10 runs: " & Math.Round(ParallelFileCopyMethodAvgTimeMS, 2) & "ms" & vbCrLf & _
                        "Parallel file read and write method, average over 10 runs: " & Math.Round(ParallelFileReadAndWriteMethodAvgTimeMS, 2) & "ms")
    End Sub


    Private Sub CleanDestination()
        Directory.Delete(DestinationPath, True)
        Directory.CreateDirectory(DestinationPath)
    End Sub

    Private Sub CopyFilesParallelFileCopyMethod()
        Dim chrono As Stopwatch = Stopwatch.StartNew

        Dim lstFiles As List(Of String) = SafeEnumerator.EnumerateFiles(SourcePath, "*", SearchOption.AllDirectories).ToList

        Parallel.ForEach(lstFiles, Sub(f)
                                       Try
                                           File.Copy(f, DestinationPath & "\" & Path.GetFileName(f), True)
                                       Catch
                                       End Try
                                   End Sub)

        chrono.Stop()

        elapsedTime = chrono.Elapsed
    End Sub

    Private Sub CopyFilesParallelFileReadAndWriteMethod()
        Dim chrono As Stopwatch = Stopwatch.StartNew

        Dim lstFiles As List(Of String) = SafeEnumerator.EnumerateFiles(SourcePath, "*", SearchOption.AllDirectories).ToList

        Parallel.ForEach(lstFiles, Sub(f)
                                       Try
                                           Dim FileContents() As Byte = File.ReadAllBytes(f)
                                           File.WriteAllBytes(DestinationPath & "\" & Path.GetFileName(f), FileContents)
                                       Catch
                                       End Try
                                   End Sub)
        chrono.Stop()

        elapsedTime = chrono.Elapsed
    End Sub

    Friend Class SafeEnumerator
        Public Shared Function EnumerateFiles(strPath As String, strFileSpec As String, soOptions As SearchOption) As IEnumerable(Of String)
            Try
                Dim DirEnum = Enumerable.Empty(Of String)()
                If soOptions = SearchOption.AllDirectories Then
                    DirEnum = Directory.EnumerateDirectories(strPath).SelectMany(Function(x) EnumerateFiles(x, strFileSpec, soOptions))
                End If
                Return DirEnum.Concat(Directory.EnumerateFiles(strPath, strFileSpec, SearchOption.TopDirectoryOnly))
            Catch ex As UnauthorizedAccessException
                Return Enumerable.Empty(Of String)()
            End Try
        End Function
    End Class

End Class


And ended up with 44627.44ms average over 10 runs for File.Copy vs 26072.52ms average for ReadAllBytes/WriteAllBytes... The source folder is 1GB in size, with files ranging from 1KB to 200MB... How can that be? It seems the OP was actually right!
 
Never mind the results above; apparently the enumerator was making too great a difference, most likely because some files in my original source were inaccessible. I made a safer copy and moved the Stopwatch.StartNew after the enumeration, and now File.Copy is CLEARLY on top with 3518.48ms on average, vs 21440.08ms on average for ReadAllBytes/WriteAllBytes.

That makes more sense.
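The fix described above, applied to the timing harness, would look something like this (sketch; it assumes the SafeEnumerator class and fields from the sample):

```vb
' Enumerate (and materialize) the file list first, then start timing,
' so directory traversal cost is excluded from the measurement.
Dim lstFiles As List(Of String) =
    SafeEnumerator.EnumerateFiles(SourcePath, "*", SearchOption.AllDirectories).ToList()

Dim chrono As Stopwatch = Stopwatch.StartNew()
Parallel.ForEach(lstFiles,
                 Sub(f) File.Copy(f, Path.Combine(DestinationPath, Path.GetFileName(f)), True))
chrono.Stop()

elapsedTime = chrono.Elapsed
```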
 
Herman said:
File.Copy is CLEARLY on top with 3518.48ms on average, VS 21440.08ms on average for ReadAllBytes/WriteAllBytes.
I get similar results (roughly a 1:10 ratio). There is also a better alternative to the "FileReadAndWrite" method using Stream.CopyTo; in the context of your samples it would be something like this:
                                       Using source = IO.File.OpenRead(f)
                                           Using target = IO.File.Create(DestinationPath & "\" & Path.GetFileName(f))
                                               source.CopyTo(target)
                                           End Using
                                       End Using

This is almost as fast as File.Copy, but it isn't surprising that it can't beat the native routine. I also improved its performance slightly by doubling the buffer:
source.CopyTo(target, 8192)

Btw, see "Avoid Executing Parallel Loops on the UI Thread" in Potential Pitfalls in Data and Task Parallelism.
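One way to follow that advice in the sample above is to push the whole parallel loop onto the thread pool (a sketch; it assumes the SourcePath/DestinationPath constants from Herman's form):

```vb
Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    ' Run the parallel copy off the UI thread so the form stays responsive.
    Await Task.Run(
        Sub()
            Parallel.ForEach(Directory.EnumerateFiles(SourcePath),
                             Sub(f) File.Copy(f, Path.Combine(DestinationPath,
                                                              Path.GetFileName(f)), True))
        End Sub)
    MessageBox.Show("Copy finished.")
End Sub
```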
 
Hi guys,
First, I want to thank you for your contributions and suggestions.

The reason I didn't use the native routine is to have more flexibility and control over my code.
And to be honest, I think the Microsoft developers designed these methods to be ready to use, not exactly for speed.

After all, we are here to learn from each other, and I will post a lot more threads in the near future :courage:
 