CharConv - Text transformation based on character set mapping

Windows and Microsoft Word GUI for Text transformation given character set mapping (from/to) and calculation of mapping given matching (portions of) source and target text.
Useful for decyphering copy-pasted text from PDF documents that ends-up with strange character swaps in it due to encoding reasons

Download: CharConv_20100930.zip

Instructions:
Either run the .exe file from the "Delphi" or "Lazarus" subfolders (note that the 2nd one is much bigger, so prefer the 1st one), or open the .doc file from "Microsoft Word" and when prompted about macro security select to allow the script in the document file to execute

Requirements: if you try the .DOC file you must allow Word's macro security to enable the script for that version to run (this is also needed to be able to see the script and edit it)

Source code:

Source code (Object Pascal) is in "Delphi" and "Lazarus" projects (same code) and in the "Microsoft Word" folder (Visual Basic for Applications [VBA]).

The Object Pascal (.PAS file) and the Visual Basic for Application (.DOC) file logic is similar.
Following is the VBA code (can use Developer/Visual Basic toolbar in Word to see/edit it and Design mode to edit the GUI properties):

'//Description: Text converter
'//Author: George Birbilis (birbilis@kagi.com)

'//Messages//

Sub msgDifferentLayout()
 MsgBox "Source and Target must have identical layout to resync the mapping"
End Sub

Sub msgNonReversibleMapping(cFrom As String, cTo As String, cOther As String)
 MsgBox "Char [" + cTo + "] is already in 'To' field, but mapped to char [" + cOther + "] at the 'From' field. Added again to map to [" + cFrom + "]. This is a non reversible (1-1) mapping"
End Sub

'//Actions//

Function pos(s2 As String, s1 As String) As Integer 'Pascal-style search
 pos = InStr(1, s1, s2, vbBinaryCompare)
End Function

Sub UpdateWideCharMapping(cFrom As String, cTo As String)
 Dim p As Integer
 
 If pos(cFrom, txtFrom.Text) = 0 Then
  txtFrom.Text = txtFrom.Text + cFrom
  p = pos(cTo, txtTo.Text)
  If (p <> 0) Then
   msgNonReversibleMapping cFrom, cTo, Mid(txtFrom.Text, p, 1)
  End If
  txtTo.Text = txtTo.Text + cTo
 End If
End Sub

Sub UpdateMapping()
 Dim i As Integer
 Dim s1 As String
 Dim s2 As String
 
 s1 = txtSource.Text
 s2 = txtTarget.Text

 If (txtSource.LineCount <> txtTarget.LineCount) Or _
    (Len(s1) <> Len(s2)) Then
  msgDifferentLayout
 Else
  For i = 1 To Len(s1)
   UpdateWideCharMapping Mid(s1, i, 1), Mid(s2, i, 1)
  Next i
 End If
End Sub

Function convertWideChars(s As String, fromChars As String, toChars As String) As String
 Dim i As Integer
 Dim p As Integer
 Dim result As String
 Dim c As String
 
 result = ""
 For i = 1 To Len(s)
  c = Mid(s, i, 1)
  p = pos(c, fromChars)
  If p = 0 Then
   result = result + c
  Else
   result = result + Mid(toChars, p, 1)
  End If
 Next i
 convertWideChars = result
End Function

Sub ConvertText()
 txtTarget.Text = convertWideChars(txtSource.Text, txtFrom.Text, txtTo.Text)
End Sub

'//Event handlers//

Private Sub btnConvert_Click()
 ConvertText
End Sub

Private Sub btnUpdateMapping_Click()
 UpdateMapping
End Sub

(C)opyright 2006-2010 - Zoomicon / George Birbilis
Free to use / give due credit