Question HELP! A complex threading / third-party DLL problem

irza

Hi everyone, I'm new to the forum and need some help from someone who knows more about threading than me (which, I presume, is EVERYONE :rolleyes:).

The problem I have is very complex and relates to a third-party DLL which requires hardware components and special software installs, so unfortunately I can't post any code that would let the problem be replicated. What I can do is provide a very detailed text description and ask for some opinions...

So, here goes...

I'm doing a project that uses an old DLL exposing functions that communicate with a specialized program, and my code calls these all the time. I've imported the functions using <DLLImport> attributes. Each function returns an int > 0 if it ran OK and <= 0 if it failed. Part of the problem is that only one function can be performed at a time, and if several commands are activated in a short amount of time, the connection to the DLL stops working and all function calls return failures. Initially I had problems using threading with this DLL until I removed it all (some functions would not perform at all, others returned failures, etc.).

My program communicates with the DLL with no problem... All calls return normally and everything works perfectly... UNTIL the user clicks a little too fast.

So, the user has a form on which he activates these function calls, and if he doesn't wait long enough for the first call to finish, the DLL connection 'fails', after which the program has to be restarted (re-instantiating the class containing the <DLLImports> doesn't help for some reason?).

After a week of testing and GoogleFixing, I've tried almost everything:
1) I've made the functions calling DLLs store a bool variable Working and then wait for that to be free - this didn't help as apparently the connection breaks before even coming to the code, which suggests a threading problem (explained further down)
2) I've tried adding <STAThread> to DLL function declarations
3) I've tried messing around with <DLLImport> attributes
4) I've added SyncLock to function calls
5) I've added a loop with a sleep to the calling code to make it 'wait' for the work to end

None of these helped; they just showed that the DLL connection fails even WITHOUT repeated calls to the DLL... So, the only thing that did help is disabling the controls that start the calling process on the form... for example:
Private Sub bttnDoWork_Click(sender As Object, e As EventArgs) Handles bttnDoWork.Click
    bttnDoWork.Enabled = False
    FunctionForWork()
    bttnDoWork.Enabled = True
End Sub

This is of course a stupid way to treat the problem's symptoms (the source eludes me) and impractical, since the calls are made from about 40 different places in the code and I'd have to disable ALL forms/controls starting these calls to be fully safe, which is nearly impossible in practice.

So, entering any code already causes a problem, which is why I've come to the conclusion that this is a threading problem...

I execute all the function calls from the Main Thread (like I said, all past threading was removed due to problems communicating with the DLL), yet for some reason the form keeps the focus and the user can click away (most likely because the DLL is working at the time and the form keeps the focus?), either messing around with the Main Thread or creating a new one, which seems to break the DLL.


So, assuming that no one comes up with a magical one-line solution :cool: at least answering these questions would be very helpful:

1) Is there a simple way to fully take control of the program away from the user while the code is executing (something like Application.enabled = false)?
2) Can the Main Thread be stopped in some way from reacting at other points in the program until a piece of code finishes?
3) How can <DLLImport> lines be restarted if reinstancing the class containing them doesn't help (I'm seeing this as a memory/cache problem of some sort, since after program restart, everything works fine)?


Please help, this problem is causing me to lose a year's worth of work, not to mention my mind :p Also, sorry for so much text, but like I said, code would be useless here :(
 
Here's what I would try first:
VB.NET:
Imports System.Runtime.InteropServices
Imports System.Threading

''' <summary>
''' Wrapper class for unmanaged library.
''' </summary>
Friend Class LibraryWrapper

    ''' <summary>
    ''' Enables marshalling method calls back to the thread on which this object was created.
    ''' </summary>
    Private context As SynchronizationContext = SynchronizationContext.Current

    ''' <summary>
    ''' Object to lock on to ensure only one unmanaged call is made at a time.
    ''' </summary>
    Private Shared syncRoot As New Object


    ''' <summary>
    ''' Indicates that the operation started with DoSomethingAsync has completed.
    ''' </summary>
    Public Event DoSomethingCompleted As EventHandler(Of AsyncResultEventArgs)


    ''' <summary>
    ''' Proxy method for unmanaged function.
    ''' </summary>
    <DllImport("somelibrary.dll")>
    Private Shared Function DoSomething(input As Integer) As Integer
    End Function

    ''' <summary>
    ''' This method is called by the application to make a call to the unmanaged function asynchronously.
    ''' </summary>
    Public Sub DoSomethingAsync(input As Integer)
        'Create a new thread on which to execute the unmanaged function.
        Dim t As New Thread(AddressOf DoSomethingInternal)

        t.Start(input)
    End Sub

    Private Sub DoSomethingInternal(input As Object)
        Dim data As Integer = CInt(input)
        Dim result As Integer

        'Only let one unmanaged call be made at a time.
        SyncLock syncRoot
            result = DoSomething(data)
        End SyncLock

        'Signal that the operation has completed.
        'Raise the event on the thread on which the current object was created.
        context.Post(AddressOf RaiseDoSomethingCompleted, result)
    End Sub

    ''' <summary>
    ''' This method is required because SynchronizationContext.Post can only call methods with a single Object parameter.
    ''' </summary>
    Private Sub RaiseDoSomethingCompleted(result As Object)
        OnDoSomethingCompleted(New AsyncResultEventArgs(CInt(result)))
    End Sub

    ''' <summary>
    ''' Raises the event to notify listeners that the operation is complete.
    ''' </summary>
    Protected Overridable Sub OnDoSomethingCompleted(e As AsyncResultEventArgs)
        RaiseEvent DoSomethingCompleted(Me, e)
    End Sub

End Class


Friend Class AsyncResultEventArgs
    Inherits EventArgs

    Private _result As Integer

    Public ReadOnly Property Result() As Integer
        Get
            Return _result
        End Get
    End Property

    Public Sub New(result As Integer)
        _result = result
    End Sub

End Class
Your application calls DoSomethingAsync and handles the DoSomethingCompleted event. The result of the actual unmanaged call can be obtained from the e.Result property in the event handler. If you have more unmanaged functions then you would need to add one event and five methods for each one. If you want to be able to have multiple pending calls to the same unmanaged function at the same time then you would need to pass in some sort of token that you can get back in the event handler to know which operation it is that completed.
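To illustrate the token idea, here is a minimal sketch. The names TokenWrapper, TokenResultEventArgs and UserState are made up for the example, and the unmanaged call is replaced by a stand-in function so that it runs without the library. Treat it as one possible shape rather than the way it has to be done.

```vbnet
Imports System
Imports System.Threading

''' Event data carrying the result plus the caller's token (hypothetical type).
Friend Class TokenResultEventArgs
    Inherits EventArgs

    Private ReadOnly _result As Integer
    Private ReadOnly _userState As Object

    Public Sub New(result As Integer, userState As Object)
        _result = result
        _userState = userState
    End Sub

    Public ReadOnly Property Result As Integer
        Get
            Return _result
        End Get
    End Property

    Public ReadOnly Property UserState As Object
        Get
            Return _userState
        End Get
    End Property
End Class

Friend Class TokenWrapper
    Private Shared ReadOnly syncRoot As New Object
    Private ReadOnly context As SynchronizationContext = SynchronizationContext.Current

    Public Event DoSomethingCompleted As EventHandler(Of TokenResultEventArgs)

    ' Stand-in for the <DllImport> proxy; an assumption so the sketch runs anywhere.
    Private Shared Function DoSomething(input As Integer) As Integer
        Return input * 2
    End Function

    ' The caller supplies any object as a token and gets it back in the event.
    Public Sub DoSomethingAsync(input As Integer, userState As Object)
        Dim t As New Thread(AddressOf DoSomethingInternal)
        t.IsBackground = True
        t.Start(New Object() {input, userState})
    End Sub

    Private Sub DoSomethingInternal(state As Object)
        Dim args As Object() = DirectCast(state, Object())
        Dim result As Integer

        ' Only let one call into the library at a time.
        SyncLock syncRoot
            result = DoSomething(CInt(args(0)))
        End SyncLock

        Dim e As New TokenResultEventArgs(result, args(1))
        If context Is Nothing Then
            ' No synchronization context (e.g. a console app): raise on this thread.
            RaiseCompleted(e)
        Else
            context.Post(AddressOf RaiseCompleted, e)
        End If
    End Sub

    Private Sub RaiseCompleted(state As Object)
        RaiseEvent DoSomethingCompleted(Me, DirectCast(state, TokenResultEventArgs))
    End Sub
End Class
```

The event handler can then compare e.UserState with whatever it passed in (a string, a GUID, the control that started the call) to tell the pending operations apart.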
 
Hi, thanks for the reply :)

I'll try playing with SynchronizationContext later today, but I've already tried SyncLock and it didn't help...

I've done some more testing and came up with an interesting conclusion which unfortunately confirms my suspicions.

The standard way of execution is:
- Form A calls function B which calls DLL function C, then after completion, returns to B, then to A.
or A>B>C<B<A

What I've concluded is that when the DLL function C is called, the Main Thread actually takes a BREAK ... or probably passes the execution to a different thread, one that STARTS within the DLL and is not related to my Main Thread... so while the DLL function is executing, the Main Thread is actually FREE, allowing the user to continue working. The problem is that some of these calls take more than a split second to run.

What actually happens at thread level is this:
A>B>enters C--thread freed, waits for finish, finished, thread busy again--exits C<B<A

When the code breaks, the following scenario happens:
A>B>enters C--thread free, user clicks away, starting the same process A>B>C--cannot continue as the DLL is still busy--breaks execution--exits C<B<A

Notice that the first process never actually ends (1st: ABC, then 2nd: ABCBA). The moment the second process started, the first stopped forever; possibly the DLL finished OK, but the Main Thread was no longer there to pick up the result and continue, so it got stuck in mid-work!

When I force an error message in the second process, I also get an error trace that confirms this... by writing roughly:
an error occurred at:
C at
B at
A at
C
--- notice that the first (and last) entry in the call stack is a C, which is impossible codewise, as it would suggest the DLL function called the other code that calls it. That would confirm my idea that the thread pauses for a while and somehow starts the next process while it is still in mid-work. I've repeated the test a hundred times over and it only breaks between the logs "entering C" and "exiting C", so this is surely the culprit!

Now, as for a solution to this:
1) Multithreading, which I had before (I actually removed it when starting with the DLL as I saw it caused a lot of problems, some of them possibly related to this). I think it wouldn't work. Plus, with 20+ DLL functions, all with parameters, it would cause a whole lot of excess code and other problems.
2) Blocking the user input until the code completes... this seems like the easiest way to treat the symptoms, if the solution eludes us. I'll try a function that blocks user input at the most critical positions (with 20+ forms, it's not going to be all-inclusive, but it'll give the user a fighting chance :D)
3) Finding some smart way to block the Main Thread while in DLL functions... probably not doable as I would have to pause it while it is inside the DLL but not block it before... i.e. I'd have to find a nifty command Thread.WaitForNextCommandToFinish :-\ I somehow doubt that exists, but I don't know enough about Monitor and Mutex to say for sure :-S
4) Separating the calling functions into an external application which would monitor changes and then work through them one by one. There I could at least block new callers effectively with a single line... Oh, another application, I already feel a headache :(

Any other ideas?
 
Your theory about the DLL function executing in a different thread and your calling thread continuing on its way can't possibly be correct. How could the DLL function return a value before it's actually finished executing? I could be wrong but I suspect that you used SyncLock incorrectly. By the way, SyncLock is just a basic interface for a Monitor.
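For reference, a small sketch of what that means. SyncLock behaves roughly like a Monitor.Enter / Monitor.Exit pair wrapped in Try...Finally, and a monitor is re-entrant: a thread that already holds the lock can take it again without waiting. So a lock only holds back calls that arrive on a different thread. The names MonitorDemo, gate and NestedDepth are invented for the illustration.

```vbnet
Imports System
Imports System.Threading

Module MonitorDemo
    Private ReadOnly gate As New Object

    ' Roughly what "SyncLock gate ... End SyncLock" does under the hood.
    Public Sub RunLocked(work As Action)
        Monitor.Enter(gate)
        Try
            work()
        Finally
            Monitor.Exit(gate)
        End Try
    End Sub

    ' Returns how many times the same thread got inside the lock at once.
    Public Function NestedDepth() As Integer
        Dim depth As Integer = 0
        SyncLock gate
            depth += 1
            ' Same thread, same lock object: this does not block.
            SyncLock gate
                depth += 1
            End SyncLock
        End SyncLock
        Return depth
    End Function
End Module
```

If a second call ever turned out to be arriving on the same thread as the first, which would need confirming in the debugger, a SyncLock around the call would let it straight through.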
 
Not a different thread in program, but a different thread inside the DLL.
So, the Main Thread calls an external DLL then waits for it to complete. Once it completes, the Main Thread regains control and continues. I can see no other explanation as to why the Main Thread is formally running and the user can still do actions in it... because that's what happens, the DLL is executing (from inside a method which returned ThreadID 8 in debugging) and the user regains control over the form, and continues working within the thread (it says its ID is 8) - even though it should be busy elsewhere. So the way I see it, the Main Thread is programmed with a "Do: Thread.sleep: Loop While ExternalDLLIsWorking" command or something to that effect...

And that's logical, isn't it? If you start 10 programs separately, all act in their own threads, which aren't connected to each other (even though the programs themselves can intercommunicate via DLLs, ports, services, etc. and wait for the other to respond). The only question here is whether the DLL acts just like any other program (starting its own thread) or attaches itself to the program's thread...? If I'm right, then it's the first option.

About SyncLock, I just used it in one place and the problem existed long before that (this was one attempt at fixing it). I'll try removing it, but I doubt anything will change.
 
I'll try it later today and then report. I need to get to my computer at home which is the only one that can work with this DLL.
 
hi, I've tried running the code against the DLL functions and no luck. I remember having multiple threads before, but removing them because they were problematic, now I remember the problem...

I don't know if the same problem would happen again, because the first DLL function to call is called roughly OpenConnection, and when called in a thread other than the main one, it ends with a negative return value and I can't do anything more if the connection isn't opened.

So, I'm constrained to the Main Thread :-S
 
You're constrained to the main thread for that first call or for all calls? Is it maybe that all your DLL calls must be done on the same thread, without it actually mattering what thread that is?
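If that turns out to be the case, one way to test it is to give the library its own thread and push every call through it. A minimal sketch, assuming .NET 4 for BlockingCollection; DllThread and Invoke are names made up here, and whether the library needs anything more from its thread (an apartment state, a message loop) is unknown.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Threading

''' Runs every queued call, one at a time, on a single dedicated thread.
Friend Class DllThread
    Private ReadOnly queue As New BlockingCollection(Of Action)
    Private ReadOnly worker As Thread

    Public Sub New()
        worker = New Thread(AddressOf Run)
        worker.IsBackground = True
        ' If the library turns out to need a single-threaded apartment (an assumption
        ' to verify), call worker.SetApartmentState(ApartmentState.STA) before Start.
        worker.Start()
    End Sub

    Private Sub Run()
        ' Blocks until work arrives, then runs items in the order they were queued.
        For Each work As Action In queue.GetConsumingEnumerable()
            work()
        Next
    End Sub

    ''' Queues the call and blocks the caller until the worker thread has run it.
    Public Function Invoke(func As Func(Of Integer)) As Integer
        Dim result As Integer = 0
        Dim failure As Exception = Nothing
        Dim done As New ManualResetEventSlim(False)

        queue.Add(Sub()
                      Try
                          result = func()
                      Catch ex As Exception
                          failure = ex
                      Finally
                          done.Set()
                      End Try
                  End Sub)

        done.Wait()
        done.Dispose()

        If failure IsNot Nothing Then
            Throw New InvalidOperationException("The queued call failed.", failure)
        End If
        Return result
    End Function
End Class
```

Every call, including the one that opens the connection, would then go through the same DllThread instance, e.g. dll.Invoke(Function() SomeImportedFunction(arg)), where SomeImportedFunction stands for one of your own imports.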
 
For all calls.
I've tried running the first call on Main, then others on other threads - first was ok, others returned errors
I've tried running all of them on other threads - the first returned errors (and I suspect all others would have)
Only when everything is run on Main Thread it works, so I guess I'm stuck there.

What's the difference - systemwise - between the Main Thread and other, manually created threads? Does the Main Thread have some sort of Windows handle allowing more functionality outside the program or should they - in theory at least - be totally the same?
 
As I see it, the only difference is that starting the functions from other threads is done without waiting for an immediate reply / async... maybe that's the problem? If the DLL wants to push a response and there's no one waiting to take it...? This also fits in with my problem, the Main Thread calls the DLL and then goes on working while the DLL is ready for a response while the thread no longer listens for a response so nothing happens... wow, this actually makes some logical sense, even if it's a little far-fetched... :-S
 
I'm afraid that that doesn't make any sense. It doesn't matter what thread you make a method call on, if it's a synchronous method then it's a synchronous method. The thread you make the call on will block until the operation is complete. Maybe you should show us some code that does work and then an example of what you tried that didn't.
 
Like I said, code is problematic, but I'll try to narrow it down to the main problem...

Dim Working As Boolean, Lock As New Object

<STAThread()> Public Function CallMethodToSaveFile(ByVal Param1 As FileClass, ByVal Param2 As Integer) As Boolean '--> the STAThread tag was one of the attempts to solve the problem, it acts the same with or without it

    Dim OK As Boolean = False
3:  If Param1 Is Nothing Then Exit Function

    Threading.Thread.BeginCriticalRegion()
    Threading.Thread.BeginThreadAffinity() '--> these 2 were also one of the attempts to solve the problem, it acts the same with or without it

    SyncLock Lock '--> again... more attempts
        Try

1:          CallMethodToSaveFile = False

            Cursor.Current = Cursors.WaitCursor

            'Fail safe to prevent full program lock-up due to error while one call has already been made

            Dim StartTimer As Date = Now
            Do While Working And DateDiff(DateInterval.Second, StartTimer, Now) <= 5
                Threading.Thread.Sleep(50)
            Loop
            If Working Then
                LogWrite("Waiting problem again. Gave up after 5 seconds!")
            End If

            Working = True

15:         LogWrite("Saving file")

16:         OK = CtrlDLL.SaveToFile(Param1, Param2)

17:         Working = False
        Catch ex As Exception
        End Try

        Threading.Thread.EndCriticalRegion()
        Threading.Thread.EndThreadAffinity()
    End SyncLock

    Return OK
End Function

----
CtrlDLL is a class containing <DLLImport> lines of which one is SaveToFile.

When done slowly, everything runs fine, but when it is called twice in quick succession, the first call to SaveToFile never returns an answer and the second attempt waits for 5 seconds (coded that way), then tries, but gets an error from the same function. So even if it looks that way on the surface, the problem really isn't in the second call to the function, it's in the fact that the first one never officially ended. Like I said, this problem only occurs when the user goes on clicking between lines 16 and 17 - which is the only time he gets to click, as during the other lines the Main Thread is blocked.
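One thing this behaviour suggests, as an unconfirmed guess: if the second click is handled on the same thread while the first call is still sitting inside SaveToFile, then waiting for Working to clear can never succeed, because the code that would clear it cannot continue until the second call returns. A guard that refuses the nested call instead of waiting would look something like this sketch (ReentrancyGuard and TryRun are hypothetical names, and it is only meant for calls made from a single thread):

```vbnet
Imports System

''' Refuses a nested call instead of waiting for the outer one to finish.
''' Not thread-safe: intended for calls that all come from one thread.
Friend Class ReentrancyGuard
    Private busy As Boolean = False

    ''' Runs work and returns True, or returns False at once if a call is already in progress.
    Public Function TryRun(work As Action) As Boolean
        If busy Then Return False

        busy = True
        Try
            work()
        Finally
            ' Always clear the flag, even if work throws.
            busy = False
        End Try
        Return True
    End Function
End Class
```

Line 16 would then be wrapped in guard.TryRun(...), with a log entry whenever it returns False, so that a too-fast second click is dropped and the first call gets the chance to finish.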
 