Reading a UTF-16 CSV file into VB using a Recordset: how to do this?

    Question

  • Hi, I am trying to connect to a CSV file (tab-delimited) with a recordset; no problems with that.

    In schema.ini I set the format to TabDelimited.
    How can I specify that this CSV file is UTF-16? Does anyone have any experience with this?
      Format is UTF-16: little-endian, with BOM

    Ludo

      This is how I connect to the CSV file:
      
      
    Dim cn As ADODB.Connection
    Set cn = New ADODB.Connection

    cn.Provider = "MSDASQL"
    cn.ConnectionString = "Driver={Microsoft Text Driver (*.txt; *.csv)};" & _
                          "Dbq=D:\My Documents\;"
    cn.Open

    Dim rs As ADODB.Recordset
    Set rs = cn.Execute("SELECT * FROM output1.CSV")

    ........

     
    Friday, October 17, 2008 11:22 PM

Answers

  • Found the solution.
    In schema.ini I specified CharacterSet=UNICODE (the default is ANSI). Strange, because the MSDN page does not list UNICODE as a valid value (only ANSI or OEM).

    Schema.ini now looks like this (the section name must match the CSV file name):
    [OUTPUT1.CSV]
    Format=TabDelimited
    ColNameHeader=True
    MaxScanRows=0
    CharacterSet=UNICODE

    Thanx for the help.

    BR
    Ludo
    • Marked as answer by LudoS Sunday, October 19, 2008 6:26 PM
    Sunday, October 19, 2008 6:25 PM
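
The schema.ini from the answer above can also be generated from VB. What follows is only a minimal sketch, not part of the original answer: the procedure name WriteSchemaIni and its parameters are hypothetical, and it assumes that schema.ini belongs in the same folder the connection string names in Dbq and that overwriting an existing schema.ini there is acceptable.

```vb
' Hypothetical helper: writes a schema.ini describing one tab-delimited,
' UTF-16 CSV file. It OVERWRITES any schema.ini already in the folder.
Sub WriteSchemaIni(ByVal folder As String, ByVal csvName As String)
    Dim f As Integer

    ' Make sure the folder path ends with a backslash.
    If Right$(folder, 1) <> "\" Then folder = folder & "\"

    f = FreeFile
    Open folder & "schema.ini" For Output As #f
    Print #f, "[" & csvName & "]"          ' section name = CSV file name
    Print #f, "Format=TabDelimited"
    Print #f, "ColNameHeader=True"
    Print #f, "MaxScanRows=0"
    Print #f, "CharacterSet=UNICODE"       ' the setting found in the answer
    Close #f
End Sub
```

It could be called as WriteSchemaIni "D:\My Documents", "output1.CSV" before opening the connection.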

All replies

  • You may want to study the construction of the connection string a bit more closely.
    Saturday, October 18, 2008 4:33 PM
  • The leading FF FE indicates that this is a UTF-16 file. I couldn't find anything in the connection string to specify UTF-16.
     
    The CSV looks like this (in hex), tab-delimited:
    FF FE 45 00 6D 00 70 00  6C 00 6F 00 79 00 65 00
    65 00 09 00 43 00 6F 00  75 00 6E 00 74 00 72 00
    79 00 09 00 52 00 65 00  67 00 75 00 6C 00 61 00
    etc.

    Schema.ini
    [OUTPUT1.CSV]
    Format=TabDelimited
    ColNameHeader=True
    MaxScanRows=0
    Saturday, October 18, 2008 9:37 PM
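
The FF FE check mentioned in the reply above can be done in VB before the file is opened through the text driver. This is a sketch only: the function name HasUtf16LeBom is hypothetical, and it looks at nothing beyond the first two bytes, so a UTF-16 file saved without a BOM would not be recognized.

```vb
' Hypothetical helper: returns True when the file starts with the
' UTF-16 little-endian byte order mark (bytes FF FE).
Function HasUtf16LeBom(ByVal path As String) As Boolean
    Dim f As Integer
    Dim b(0 To 1) As Byte

    f = FreeFile
    Open path For Binary Access Read As #f
    If LOF(f) >= 2 Then
        Get #f, , b                        ' reads exactly two bytes
        HasUtf16LeBom = (b(0) = &HFF And b(1) = &HFE)
    End If
    Close #f
End Function
```

For the file shown in the hex dump, HasUtf16LeBom("D:\My Documents\output1.CSV") would be expected to return True.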
  • Anyway, you can detect the encoding by brute force. For that, see Support.lFile.rw.BruteEncodingTestR at http://quilt.ic.cz/tmp/upl/Soft.htm#support (it is source code, but you can use the DLL freely). Once you know the encoding of the CSV file, read it with that encoding and write a temporary copy in an encoding that ADODB understands, or in its default encoding.
    Sunday, October 19, 2008 4:43 PM
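
The "temporary copy" idea from the reply above could be sketched with ADODB.Stream, which can read text in one character set and save it in another. This is an illustration under assumptions, not the method of the linked library: the procedure name ConvertUtf16ToAnsi is hypothetical, the source is assumed to be UTF-16 already, and windows-1252 is assumed as the ANSI target. Characters that do not exist in the target code page will not survive the conversion.

```vb
' Assumes a project reference to Microsoft ActiveX Data Objects 2.5 or later.
' Hypothetical helper: re-saves a UTF-16 text file as windows-1252 (ANSI).
Sub ConvertUtf16ToAnsi(ByVal srcPath As String, ByVal dstPath As String)
    Dim inStream As ADODB.Stream
    Dim outStream As ADODB.Stream

    ' Read the source file as UTF-16 text.
    Set inStream = New ADODB.Stream
    inStream.Type = adTypeText
    inStream.Charset = "unicode"
    inStream.Open
    inStream.LoadFromFile srcPath

    ' Write the same text out in the ANSI code page.
    Set outStream = New ADODB.Stream
    outStream.Type = adTypeText
    outStream.Charset = "windows-1252"
    outStream.Open
    outStream.WriteText inStream.ReadText
    outStream.SaveToFile dstPath, adSaveCreateOverWrite

    outStream.Close
    inStream.Close
End Sub
```

The temporary copy can then be queried with the default CharacterSet=ANSI setting in schema.ini.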