Hello All!
I know this has come up before and I have tried several of the solutions found within the forum but I just can't seem to import my file correctly and could use some input, please.
Sample file (less fields than actual file):
Name (str), Phone# (str), Description(str), Resolved(bool), Met(bool)
"Kay, Mary","123-4567","Used a "."not a"," in text", "1", "1"
The text is qualified with " and columns delimited with commas but the description field has embedded quotes and commas. Normally it works except if there embedded quotes and commas.
I have tried unqualified data and undouble, but that does not work either because of the embedded commas in quotes.
Do I need to do something before the data flow? Do I need to do custom code similar to undouble (I tried modifying undouble but using unqualified fields caused the source file to not like the data and go red)? Should the row be read as one field and parsed?
Thanks in advance for any help you can give!
Can you use a different delimiter, perhaps a tab, in assembling the file?|||I'm afraid not. This is being sent to me by another user and is not under my control.|||ChrisHelt wrote:
Should the row be read as one field and parsed?
You can try the ragged right approach.
Basically, you will read everything before the description column normally and the rest as one big column and use derived columns & expressions to parse it.
|||There is no way to make the built in processing work with the file you describe.How would you know when you saw a quote, it was part of the text and not the field deliminator?
You need to use a delimiter which is not in the text, like TAB, although that is not guaranteed either. I have used !##! for delimiters in this kind of text, because it is highly unlikely it will occur in the normal text.|||Thanks for all the suggestions.|||
I have to say that this has got to be one of the more annoying bugs in SSIS. We're on the sixth (seventh if you count OS/2 release) edition of SQL Server and their import routine can't handle embedded double quotes ANYMORE?
I just don't get it. That makes NO sense to me what so ever.
I was at least impressed that after I started jumping through the preview by sets of rows (and I had to be at the right point for it to actually fail, jumping to within 20-30 rows still showed the data with the embedded quotes, so it actually can handle it, it just chooses not to at certain times. When I moved back about 100 rows, then I got the message telling me that embedded quotes would not work and that I was SOL on importing data through this method.
When is SP3 coming out? Can I put in a request to have this little oversight fixed?
|||
Bill from Racine wrote:
I have to say that this has got to be one of the more annoying bugs in SSIS. We're on the sixth (seventh if you count OS/2 release) edition of SQL Server and their import routine can't handle embedded double quotes ANYMORE?
I just don't get it. That makes NO sense to me what so ever.
It makes perfect sense to me. If your data contains "embedded" quotes and commas, then the correct way to build that file is with a column delimiter such as a TAB, pipe, or some other character.
I just want to know how anything can parse this file:
"Column1","Col","umn2","Column3"
I see that and I see four columns, not three. Come on, give SSIS some credit. File builders need to understand the data instead of blindly choosing a default value of comma delimited. I NEVER accept comma delimited files and will work with the person/company who supplies them to correct the error (in my opinion.) Fixed width files are a sure fire way to get the data you desire.
Sorry, but people blaming SSIS for not handling bad data doesn't sit well with me.|||This problem is exactly why people who write EXPORTS should never use comma delimited files, commas are way to common in data. I always pick tab or | or something else which is very unlikely to occur in the data. In one case I picked #?# as a field delimiter.
It gets even worse when you try to import comment fields with free form text which can contain anything, including commas, single quotes, double quotes, CR, CR/LF, etc.|||
Tom Phillips wrote:
It gets even worse when you try to import comment fields with free form text which can contain anything, including commas, single quotes, double quotes, CR, CR/LF, etc.
This is easily solved though... Use fixed width files. The moral is to use the *correct* tool for the job. CSV files rarely cut the mustard, so to speak.
And sometimes, just *sometimes*, the data should be cleansed BEFORE exporting.|||
I'm guessing that all of the commenters to my post here may never actually tried to use the flat file import with embedded quotes.
For instance, my data does not have commas within the text. The only commas are the delimiter (and yes I agree - it's a lousy terminator, but sometimes you have no control over what you get) not within the actual field.
My fields looks like this:
"Thomas ""Tommy"" Barker", 123456, 3.009, 1005
Notice that there is a nickname given within the name field. That's causing SSIS to barf on the import, because there are embedded quotes - not an extra comma, just extra quotes.
DTS handles this just fine, even the data viewer can parse this 99% of the time. I found rolling back and forth over the offending record, one row at a time can sometimes get the same result, but most times, it reads it just fine. So there is some other issue in their parser that causes it to work sometimes and not others.
The embedded quotes has been something that flat files were created - so not being able to support it now seems kind of silly to me.
I was hoping that moving from DTS to SSIS - I could expect enhanced functionality - not degraded functionality.
Maybe in SP3.
|||
Phil,
That's one of the most annoying things about it. My feed has double quotes - the line actually looks like this:
"Thomas ""Tommy"" Barker", "field2", "etc"
I have been able to get SSIS to show me data in a data viewer and I can see the data just fine. But when I actually let the package go and run - it will bomb out with an error telling me it can't handle the embedded quotes.
Even MS through their feedback site has said that this was a "decision in design" and may be addressed at a later date.
Try adding a number of rows before and after and let the whole thing go in with the quotes as in my example and see if it works.
I was able to get the same data set returned in an XML format and it works fine, even though there are embedded quotes there, I guess because of the XML tags, it's able to make heads and tails of it.
Thanks for trying it out.
Bill
No comments:
Post a Comment