This sample batch-converts text files from different code pages (Unicode, UTF-8, windows-1250, and others) to one selected code page. It detects the code page of each source file from its byte order mark (BOM); when a file has no BOM, it falls back to the charset declared in the HTML source.
You can choose any destination charset. To save files with a BOM (Unicode little/big endian, UTF-8), see also the article "ByteArray - save unicode data (string) as utf-8 with BOM".
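The BOM check can be isolated as a small pure function. The following is only a sketch: `CharSetFromBom` is a hypothetical helper that is not part of the sample or of ScriptUtilities. It assumes the leading bytes arrive as a hex string without separators (the form the sample compares `ByteArray.HexString` against) and returns the charset names the sample assigns to `ByteArray.CharSet`.

```vbscript
' Hypothetical helper, not part of the original sample.
' HexPrefix: hex string of the first bytes of a file, e.g. "EFBBBF3C".
' Returns the charset name used by the sample, or "" when no BOM matches.
Function CharSetFromBom(HexPrefix)
    Dim h
    h = UCase(HexPrefix)
    CharSetFromBom = ""
    If Left(h, 6) = "EFBBBF" Then
        CharSetFromBom = "utf-8"            ' UTF-8 BOM, 3 bytes
    ElseIf Left(h, 4) = "FEFF" Then
        CharSetFromBom = "unicodebig"       ' UTF-16 big endian, 2 bytes
    ElseIf Left(h, 4) = "FFFE" Then
        CharSetFromBom = "unicodelittle"    ' UTF-16 little endian, 2 bytes
    End If
End Function

WScript.Echo CharSetFromBom("EFBBBF3C")   ' utf-8
WScript.Echo CharSetFromBom("FFFE3C00")   ' unicodelittle
```

Note that a UTF-32 little endian BOM (FF FE 00 00) also starts with FF FE; neither this sketch nor the sample distinguishes that case.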
Batch file conversion - character set and BOM detection of html files

```vbscript
Const DestCharSet = "utf-8"
'Const DestCharSet = "ascii"

Dim FS
Set FS = CreateObject("Scripting.FileSystemObject")

'Note: f:\1 lies inside f:\, so the recursion also visits the output folder.
ConvertFolder "f:\", "f:\1"

'Converts every .htm file in InputPath and its subfolders.
'All converted files are written to the single folder OutputPath.
Function ConvertFolder(ByVal InputPath, OutputPath)
  Dim InputFolder, File
  Set InputFolder = FS.GetFolder(InputPath)
  For Each File In InputFolder.Files
    If LCase(Right(File.Name, 4)) = ".htm" Then
      WScript.Echo File.Path
      'WScript.Echo OutputPath & "\" & Replace(File.Path, ":", "")
      ConvertFile File.Path, OutputPath & "\" & File.Name, DestCharSet
    End If
  Next

  'Process subfolders recursively
  Dim FilesFolder
  For Each FilesFolder In InputFolder.SubFolders
    ConvertFolder FilesFolder.Path, OutputPath
  Next
End Function

Sub ConvertFile(SourceFileName, DestFileName, DestCharSet)
  'Read the source file contents
  Dim FileContents
  Set FileContents = ReadOneFile(SourceFileName)

  'Convert to the destination charset
  Set FileContents = FileContents.CharSetConvert(DestCharSet)

  'Save to a destination file
  FileContents.SaveAs DestFileName
End Sub

'Reads a file into a ByteArray and sets its CharSet from the BOM
'or, when there is no BOM, from the charset declared in the data.
Function ReadOneFile(FileName)
  Dim ByteArray
  Set ByteArray = CreateObject("ScriptUtils.ByteArray")

  'Read the first two bytes from the file
  ByteArray.ReadFrom FileName,,2

  Select Case ByteArray.HexString
    'Unicode big endian
    Case "FEFF":
      ByteArray.CharSet = "unicodebig"
      'Read the file from the 3rd byte to the end
      ByteArray.ReadFrom FileName,3

    'Unicode little endian
    Case "FFFE":
      ByteArray.CharSet = "unicodelittle"
      'Read the file from the 3rd byte to the end
      ByteArray.ReadFrom FileName,3

    Case Else:
      'Read the first three bytes from the file
      ByteArray.ReadFrom FileName,,3
      If ByteArray.HexString = "EFBBBF" Then 'Unicode UTF-8
        'Read the file contents after the BOM header
        ByteArray.ReadFrom FileName,4
        ByteArray.CharSet = "utf-8"
      Else
        'Read the whole contents of the file in other cases
        ByteArray.ReadFrom FileName
        On Error Resume Next
        'Try to detect the charset from the data source
        ByteArray.CharSet = DetectCharSet(ByteArray.String)
        'Set some default charset (default is OEM)
        'If Err <> 0 Then ByteArray.CharSet = "windows-1250"
      End If
  End Select

  Set ReadOneFile = ByteArray
End Function

'Detects the charset from the source string data.
'Returns Empty when the data contains no "charset=".
Function DetectCharSet(Data)
  On Error Resume Next
  Dim charset
  'The charset tag usually looks like
  '<meta http-equiv="Content-Type" content="text/html; charset=windows-1250">
  charset = Split(Data, "charset=", 2, vbTextCompare)(1)
  If Len(charset) > 0 Then
    'Keep the text up to the closing quote of the content attribute
    charset = Split(charset, """", 2, vbTextCompare)(0)
  End If
  DetectCharSet = charset
End Function
```
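`DetectCharSet` expects the long `http-equiv` form of the meta tag. For the short form `<meta charset="utf-8">`, the text after `charset=` starts with a quote, so splitting at the first quote yields an empty string. A more tolerant variant might look like the sketch below. `DetectCharSetLoose` is a hypothetical name, not part of the original sample, and the set of delimiter characters is an assumption chosen for typical HTML.

```vbscript
' Hypothetical variant of DetectCharSet, not part of the original sample.
' Returns the charset named after the first "charset=" in Data,
' or "" when Data contains no "charset=".
Function DetectCharSetLoose(Data)
    Dim pos, rest, i, ch, result
    DetectCharSetLoose = ""
    pos = InStr(1, Data, "charset=", vbTextCompare)
    If pos = 0 Then Exit Function

    ' Text that follows "charset="
    rest = Mid(Data, pos + Len("charset="))

    ' Skip one optional opening quote, as in <meta charset="utf-8">
    If Left(rest, 1) = """" Or Left(rest, 1) = "'" Then rest = Mid(rest, 2)

    ' Collect characters up to the first delimiter (assumed set)
    result = ""
    For i = 1 To Len(rest)
        ch = Mid(rest, i, 1)
        If InStr(1, """'>;/ " & vbTab & vbCr & vbLf, ch) > 0 Then Exit For
        result = result & ch
    Next
    DetectCharSetLoose = result
End Function

WScript.Echo DetectCharSetLoose("<meta http-equiv=""Content-Type"" content=""text/html; charset=windows-1250"">")
WScript.Echo DetectCharSetLoose("<meta charset=""utf-8"">")
```

The two calls print `windows-1250` and `utf-8`. Like the original function, the sketch trusts the first `charset=` it finds anywhere in the data, so a match inside a script or comment would still be picked up.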
ByteArray works with SafeArray binary data: it saves binary data to and restores it from a disk, converts it to a string or hex string, performs code page/charset and Base64 conversions, etc.
ByteArray is a COM class specially designed to work with Microsoft Windows Scripting engines (VBScript and JScript) in Active Server Pages or WSH and in CHM or HTA applications. It also works with VB.NET, Visual Basic (VBA - VB 5, VB 6, Word, Excel, Access, …), C#, J#, C++, ASP, ASP.NET, Delphi, and with T-SQL OLE functions; see the "Use ByteArray object" article. You can also use the object in other programming environments with COM support, such as PowerBuilder.
Source code for ByteArray is available under the distribution license; please see the License page for ASP file upload and ScriptUtilities.
Huge ASP upload is an easy-to-use, high-performance ASP file upload component with a progress bar indicator. The component lets you upload multiple files with a size of up to 4 GB to a disk or a database along with other form fields. Huge ASP file upload is a feature-rich upload component with a competitive price and great performance. The software also has a free version of ASP upload with progress, called Pure asp upload, written in plain VBS without components (so you do not need to install anything on the server). The installation package also contains the ScriptUtilities library. Script Utilities lets you create high-performance log files, work with binary data, download multiple files with zip/arj compression, work with INI files, and much more.
© 1996 - 2011 Antonin Foller, Motobit Software | e-mail: info@pstruh.cz