Question Simple substitution for a long string?

I have 180,000 strings, each 300-1,100 characters long. I want to do a simple substitution on every character of every string.

For X0 = 1 To 180000
    LineIn = Origin(X0)
    Y0 = 0
    Do
        Y0 = Y0 + 1
        If Asc(Mid(LineIn, Y0, 1)) < 126 Then Mid(LineIn, Y0, 1) = Chr(Asc(Mid(LineIn, Y0, 1)) + 126) Else Mid(LineIn, Y0, 1) = Chr(Asc(Mid(LineIn, Y0, 1)) - 126)
    Loop Until Y0 = Len(LineIn)
    Origin(X0) = LineIn
Next

The Do : Loop is a time killer. Is there a function that does simple substitution on the string without having to manually check every position?
 
This is an example of the sort of operation that gets a speed boost from using pointers, which are not supported by VB. You would benefit from creating a C# library that used unsafe code to perform that processing.
 
Well, when you say simple substitution and it turns out not to be so simple ....

What are you actually achieving here? As far as I can see, you'd be converting "z" to something that I probably can't show on this forum (or at least I'm too lazy to work out how) whilst a-umlaut becomes Backspace? If this is intended as some form of encryption there are far better ways to go about it, surely?

For a start on your code, I'm not sure why you didn't go for ...

For Y0 = 0 to LineIn.Length-1
....
Next

... which would cut out at least one function.

You'd also certainly benefit from creating a list from your strings rather than using the array (if not indeed replacing the array altogether) which would expose some useful methods. But it's hard to be specific until I have some idea of what you're actually trying to do (or indeed why!)
 
Here is the actual code I'm trying to optimize...

It's the character by character substitution for 'Ask' (180,000 records with variable lengths) that's slowing me down.

---

'Load Master Records
X0 = 0 : Z0 = 0 : FileOpen(1, Path + "Master_File.ilx", OpenMode.Input, , , )
Do
    X0 = X0 + 1 : Ask = LineInput(1)

    If X0 / Bar = Int(X0 / Bar) Then Form6.ProgressBar1.Increment(1) : Form6.Show()

    For Y0 = 1 To Len(Ask)
        If Asc(Mid(Ask, Y0, 1)) < 128 Then Mid(Ask, Y0, 1) = Chr(Asc(Mid(Ask, Y0, 1)) + 127) Else Mid(Ask, Y0, 1) = Chr(Asc(Mid(Ask, Y0, 1)) - 127)
    Next

    Items(X0) = Ask

    PC = Items(X0).Split("|")
    Dim exp As New Regex("-", RegexOptions.IgnoreCase)
    If exp.Matches(PC(22)).Count > 0 Then Z0 = Z0 + (26 - exp.Matches(PC(22)).Count)

Loop Until EOF(1) : FileClose(1)
 
Yes I know where the problem is. What I still don't know is why you need to do this at all. Why aren't the strings in the file in the form that you're actually going to use in the program? And it's difficult to suggest an alternative to a character by character substitution without some idea of what the substitution is actually meant to achieve. Does the new string absolutely have to have this very odd format? Why? And what do the original strings actually represent? The code doesn't really tell me anything when I have no real idea what all the variables actually represent.
 
Personally I would use a static table for this, and a byte array or StringBuilder. Get rid of the Mid(), Asc(), and conditional blocks.

Build a table in the form of an array with 256 elements (assuming a single-byte encoding such as extended ASCII). Each element holds the replacement for the byte value used as its index. Build it beforehand and keep it in memory.

Example:
...
arrLookupTable(126) = 253
arrLookupTable(127) = 254
arrLookupTable(128) = 1
arrLookupTable(129) = 2
...

Then:
For i As Integer = 0 to arrInputString.Length - 1
    arrInputString(i) =  arrLookupTable(arrInputString(i))
Next


That is really the same way you would do it using a lookup table and pointers in C, although in C I would likely use a bitwise left circular shift.
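
For anyone wanting to try this, here is a minimal sketch of the table approach, end to end. The module name SubstitutionSketch and the helpers BuildLookupTable, Substitute and SubstituteString are made up for illustration, and the table is filled with the +127 / -127 rule from the posted loop. Converting between String and Byte() with ISO-8859-1 is also an assumption: Asc() and Chr() work through the system ANSI code page, so characters 128-159 may not come out the same as in the original loop.

```vbnet
Imports System
Imports System.Text

Module SubstitutionSketch
    ' Built once and kept in memory: index = input byte, value = replacement byte.
    Private ReadOnly LookupTable As Byte() = BuildLookupTable()

    Private Function BuildLookupTable() As Byte()
        Dim table(255) As Byte
        For value As Integer = 0 To 255
            ' Same rule as the If/Else in the posted loop (assumed to be the intended mapping).
            If value < 128 Then
                table(value) = CByte(value + 127)
            Else
                table(value) = CByte(value - 127)
            End If
        Next
        Return table
    End Function

    ' Replaces every byte in place; no Mid(), Asc() or Chr() per character.
    Public Sub Substitute(data As Byte())
        For i As Integer = 0 To data.Length - 1
            data(i) = LookupTable(data(i))
        Next
    End Sub

    ' Hypothetical convenience wrapper for String input.
    ' ISO-8859-1 maps each byte 0-255 to the character with the same code (an assumption here).
    Public Function SubstituteString(text As String) As String
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim data As Byte() = latin1.GetBytes(text)
        Substitute(data)
        Return latin1.GetString(data)
    End Function

    Sub Main()
        Dim sample As Byte() = {126, 127, 128, 129}
        Substitute(sample)
        Console.WriteLine(String.Join(",", sample))   ' 253,254,1,2
    End Sub
End Module
```

If the file can be read as bytes in the first place (for example with File.ReadAllBytes), the String conversion step can be skipped and Substitute called on the raw data.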
 
I would likely use a bitwise left circular shift.

Oooh, fancy! :nightmare:

The array is obviously the way to go but I'm not sure it really merits a look-up table given that it's only a choice of + or -127 and it's yet to be confirmed that all 256 characters are used in the master strings anyway. It may well turn out to be possible to do this in a calculation (the equivalent of your shift) which is why I was angling after information as to exactly what the process is meant to achieve (being a bear of little brain, I honestly can't imagine any practical purpose it serves)!
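
As a rough sketch of the "calculation" alternative, the rule can be applied directly to each byte with no table at all. The names CalculatedSubstitution, SubstituteByte and SubstituteInPlace are hypothetical, and the 128 / 127 constants are taken from the posted loop on the assumption that they are what is intended.

```vbnet
Imports System

Module CalculatedSubstitution
    ' Applies the +127 / -127 rule from the posted loop to a single byte.
    Public Function SubstituteByte(value As Byte) As Byte
        Dim n As Integer = value          ' widen first so the arithmetic cannot overflow a Byte
        If n < 128 Then
            Return CByte(n + 127)
        Else
            Return CByte(n - 127)
        End If
    End Function

    ' Runs the calculation over a whole array in place.
    Public Sub SubstituteInPlace(data As Byte())
        For i As Integer = 0 To data.Length - 1
            data(i) = SubstituteByte(data(i))
        Next
    End Sub

    Sub Main()
        Dim sample As Byte() = {0, 127, 128, 255}
        SubstituteInPlace(sample)
        Console.WriteLine(String.Join(",", sample))   ' 127,254,1,128
    End Sub
End Module
```

One thing the arithmetic makes visible: with these constants 0 and 254 both become 127, and 1 and 255 both become 128, so the substitution can only be reversed if those input values never occur in the master strings.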
 
The purpose of the lookup table is to eliminate the CPU cycles it takes to repeat the same operations thousands of times. Instead you calculate your table once and you are done with it.
 
Fair 'enuff but shouldn't it be ...

For i As Integer = 0 to arrInputString.Length - 1
arrInputString(i) = arrLookupTable(Val(arrInputString(i)))
Next

 
No need to cast; both arrays are already of type Byte. The value of the input byte is used as the index into the lookup table, and the lookup table returns a replacement value of type Byte that you just stick back into the original array. Val() would serve no purpose here.
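
A small sketch of that point, with Option Strict On so the compiler would complain about any hidden narrowing conversion. The module name ByteIndexSketch and the Lookup function are made up for illustration, and only the two table entries from the earlier example are filled in.

```vbnet
Option Strict On
Imports System

Module ByteIndexSketch
    ' A Byte widens implicitly to the Integer that an array index expects,
    ' so it can be used as the index as-is.
    Public Function Lookup(table As Byte(), input As Byte) As Byte
        Return table(input)
    End Function

    Sub Main()
        Dim table(255) As Byte
        table(128) = 1
        table(129) = 2

        Dim data As Byte() = {128, 129}
        For i As Integer = 0 To data.Length - 1
            data(i) = Lookup(table, data(i))
        Next
        Console.WriteLine(data(0).ToString() & "," & data(1).ToString())   ' 1,2
    End Sub
End Module
```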
 