BULK INSERT .csv file error

  • Question

  • hi,

    as my assignment I got a 32 GB .csv file that I need to load into SQL Server; I'm using the new version, 'Denali'.
    I can't open the file because it's so large, so I printed the first lines with Visual Studio

    as you can see, it has 5 columns
    so i created a table in sql server :

    CREATE TABLE AMRTABLE
    (
    EXTERNAL_ID INT,
    VALUETIME SMALLDATETIME,
    VALUE DECIMAL,
    VALIDATE VARCHAR(1),
    REASON VARCHAR(10) null
    )
    ---D:/AMRwaarde/AMR_EXPORT.csv
    ---D:/output.csv
    BULK INSERT AMRTABLE
    FROM 'D:/AMRwaarde/AMR_EXPORT.csv'
    WITH
    (
    FIELDTERMINATOR = ';',
    FIRSTROW = 2,
    ROWTERMINATOR = '\n'    ---also tested things like ' ', 'CHAR(10)' and 'CHAR(13)'
    )

    when I run the BULK INSERT I get the following error:

    The bulk load failed. The column is too long in the data file for row 1, column 5. Verify that the field terminator and row terminator are specified correctly.
    Msg 7301, Level 16, State 2, Line 1
    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".


    I also created an output.csv file with the first 10 lines of the file;
    when I run the bulk insert on it I get the same error, or sometimes '0 rows affected'
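One way to check the terminators on a file too big to open in an editor is to dump the first lines as raw bytes. This is a minimal Python sketch (not from the original post; the path in the comment is the one used above) — in binary mode the real row terminator ('\n' vs '\r\n') shows up unambiguously:

```python
# Minimal sketch: read the first lines of a huge CSV as raw bytes, so the
# actual field separator (';' vs ',') and row terminator ('\n' vs '\r\n')
# are visible without loading the whole file.
def peek_lines(path, n=5):
    """Return the first n raw lines of a file as bytes, terminators included."""
    lines = []
    with open(path, "rb") as f:        # binary mode: no newline translation
        for raw in f:                  # iterating bytes splits on b'\n' only
            lines.append(raw)
            if len(lines) == n:
                break
    return lines

# for raw in peek_lines('D:/AMRwaarde/AMR_EXPORT.csv'):
#     print(raw)                       # e.g. b'"123";"...";"OK"\r\n'
```

If the lines end in `b'\r\n'`, the ROWTERMINATOR for BULK INSERT should be '\r\n' (or the hex form '0x0d0a'), not '\n'.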

    can someone help me with this?
    kind regards
    matthias

    Monday, March 5, 2012 12:28

Answers

All replies

  • I found something here on the forum suggesting that it might be my " signs.
    so I changed them in output.csv: I removed them and separated everything with a ,

    then I ran the code again and that worked...
    now I can't change them in the 32 GB csv file, so they suggested I use an FMT (format) file.
    how do I get started with it, and how does it work?

    regards
    matthias
    Monday, March 5, 2012 13:02
  • To increase your productivity (and work satisfaction), use the SSIS Import/Export Wizard:

    http://www.sqlusa.com/bestpractices2008/administration/ssiswizardicon/


    Kalman Toth, SQL Server & Business Intelligence Training; SQL SERVER BEST PRACTICES


    • Edited by Kalman Toth Monday, March 5, 2012 13:06
    • Marked as answer by matthias1989 Tuesday, March 6, 2012 13:02
    Monday, March 5, 2012 13:05
  • Another solution could be using ";" as the FIELDTERMINATOR:

    BULK INSERT AMRTABLE
    FROM 'D:/AMRwaarde/AMR_EXPORT.csv'
    WITH
    (
    FIELDTERMINATOR = '";"',
    FIRSTROW = 2,
    ROWTERMINATOR = '\n'
    ) 
    but notice that the 1st and last columns will contain an additional " (double quote) character
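A quick way to see why (a minimal Python sketch, not from the reply above; the sample line is made up): splitting a fully quoted line on the '";"' terminator strips the quotes between fields only, leaving the opening and closing quotes attached to the outer fields:

```python
# Hypothetical sample row in the same quoted, semicolon-separated layout.
line = '"123";"2012-03-05 12:00";"42.5";"Y";"OK"'

# Splitting on the '";"' terminator leaves a stray double quote on the
# first and last fields.
fields = line.split('";"')
print(fields)   # ['"123', '2012-03-05 12:00', '42.5', 'Y', 'OK"']
```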


    ∀A ∃B | Person(A) ∧ Person(B) ∧ B Better_than A

    Monday, March 5, 2012 13:33
  • thx for your reply,
    but '";"' doesn't work; it still gives the following error:

    Msg 4866, Level 16, State 8, Line 1
    The bulk load failed. The column is too long in the data file for row 1, column 5. Verify that the field terminator and row terminator are specified correctly.
    Msg 7301, Level 16, State 2, Line 1
    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

    Monday, March 5, 2012 13:41
  • Check if the column values specified in your text file are convertible to the column data types of the table you are bulk-inserting into.

    If not, try declaring all the columns as character data and see if that works.


    ∀A ∃B | Person(A) ∧ Person(B) ∧ B Better_than A

    Monday, March 5, 2012 14:09
  • You need to use a format file, which should look like this:

    10.0
    6
    1 SQLCHAR 0 0 "\""       0 ""   ""
    2 SQLCHAR 0 0 "\";\""    1 col1 ""
    3 SQLCHAR 0 0 "\";\""    2 col2 ""
    4 SQLCHAR 0 0 "\";\""    3 col3 ""
    5 SQLCHAR 0 0 "\";\""    4 col4 Latin1_General_CI_AS
    6 SQLCHAR 0 0 "\"\r\n"   5 col5 Latin1_General_CI_AS

    The first line is the version number of the format. The next is the number of fields in the source file, and, yes, there are six fields in the file.

    The following lines describe the fields. The first column is the field number in the file. The next is the data type in the file, and this is always SQLCHAR for a text file in 8-bit format. Next is the prefix length, used only for binary data files. The fourth field is the fixed length. The fifth field is the terminator. The double quote is part of the terminator, and the first field is an empty field, whence six fields.

    The sixth column is the target column in the table; 0 means don't import. The seventh is the column name, but BULK INSERT does not use this. The last column is the collation.
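Typing these lines by hand is error-prone, so here is a small Python helper (my own sketch, not part of the reply above; the column names col1..colN and the collation are placeholders) that generates a format file of this shape for any column count:

```python
# Sketch of a generator for a non-XML bcp 10.0 format file describing a
# quoted, ';'-separated source.  char_cols lists the 1-based columns that
# should carry a collation; everything else is a placeholder.
def make_fmt(n_cols, char_cols=(), collation="Latin1_General_CI_AS"):
    rows = ["10.0", str(n_cols + 1)]   # +1 for the empty leading field
    # Field 1 consumes the opening double quote and maps to no column (0).
    rows.append('1 SQLCHAR 0 0 "\\"" 0 "" ""')
    for i in range(1, n_cols + 1):
        # The last field ends with '"\r\n'; the others end with '";"'.
        term = '"\\"\\r\\n"' if i == n_cols else '"\\";\\""'
        coll = collation if i in char_cols else '""'
        rows.append(f'{i + 1} SQLCHAR 0 0 {term} {i} col{i} {coll}')
    return "\n".join(rows)

# make_fmt(5, char_cols={4, 5}) reproduces the six-field layout above.
```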


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Monday, March 5, 2012 21:05
  • thx for your reply erland, but it didn't work as planned.
    but I recreated the file with Visual Studio without the quotes and then used the wizard; that worked well

    regards
    matthias
    Tuesday, March 6, 2012 13:03
  • To increase your productivity (and work satisfaction), use the SSIS Import/Export Wizard:

    http://www.sqlusa.com/bestpractices2008/administration/ssiswizardicon/


    Kalman Toth, SQL Server & Business Intelligence Training; SQL SERVER BEST PRACTICES


    BWHAAA-HAAAA!!!! I know this is an old post, Kalman, but I'm of totally the opposite opinion. I get more done more quickly by avoiding SSIS at all costs. In fact, I had a job whose primary purpose was to convert all SSIS packages to stored procedures, because SSIS was just too slow for what they wanted to do. After they got hit with a migration and had to move all those packages, they never wanted to see SSIS again.

    After I did my duty for them, I didn't either. :D


    --Jeff Moden

    Thursday, July 30, 2020 04:08
  • I found something here on the forum suggesting that it might be my " signs.
    so I changed them in output.csv: I removed them and separated everything with a ,

    then I ran the code again and that worked...
    now I can't change them in the 32 GB csv file, so they suggested I use an FMT (format) file.
    how do I get started with it, and how does it work?

    regards
    matthias

    It's probably not the " signs that were the problem. Some files end up having some garbage in them, and you already know how difficult it is to troubleshoot such large files, because almost nothing except some good code (like you did) will handle them.

    I fixed the issue I was having by having PowerShell make a copy of the file for me; the copy fixes junk like that. I found this fix quite by accident when I ran something similar to copy just the first 1000 rows so I could see what the rows looked like.

    If you make a .BAT file of the following and then drag the original "bad" file onto it, it'll make a repaired copy with "_FIXED" appended to the file name. (It needs to be all on one line in the .bat file.)

    powershell -command "& {get-content '%~1' -TotalCount 2000000000|set-content -path '%~dp1%~n1_FIXED%~x1'}"

    You could also execute the same command with the original filename using a call to XP_CmdShell or doing something in a PowerShell task in a job.
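    If PowerShell isn't an option, the same streaming copy can be sketched in Python (my own variant, not from the post; it also deletes the double quotes, since the file is far too big to edit in place — paths are placeholders):

```python
# Sketch: copy a huge CSV in fixed-size chunks, deleting every double-quote
# byte on the way, so the whole 32 GB never has to fit in memory.
def strip_quotes(src, dst, chunk_size=1024 * 1024):
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk_size)
            if not block:
                break
            fout.write(block.replace(b'"', b''))

# strip_quotes('D:/AMRwaarde/AMR_EXPORT.csv',
#              'D:/AMRwaarde/AMR_EXPORT_FIXED.csv')
```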

    Heh.. and, PLEASE, don't presume to lecture me about the security risk you might think xp_cmdshell is. Set up correctly, it's no more of a security risk than using a SELECT.


    --Jeff Moden

    Thursday, July 30, 2020 04:19