- Posted on
November 18, 2005, 8:35 pm
Please don't top post.
The same logic applies.
This should get you started. Please excuse the line wrapping.
$ perl -e 'print "Hello\r\nThis is a\r\nhideously\r\nbroken\r\nsentence\r\n"' |
    perl -0pe 's!\r\n(.)! $1!g'
Hello This is a hideously broken sentence
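For anyone without Perl to hand, roughly the same substitution can be sketched in Python. This is only an illustration of the idea, not a replacement for the one-liner above; the function name join_broken_lines is made up for the example.

```python
import re

def join_broken_lines(text: str) -> str:
    """Replace every CRLF that is followed by another character with a space.

    The CRLF at the very end of the text has nothing after it, so it is kept,
    just as in the Perl substitution s!\\r\\n(.)! $1!g.
    """
    return re.sub(r"\r\n(.)", r" \1", text)

sample = "Hello\r\nThis is a\r\nhideously\r\nbroken\r\nsentence\r\n"
print(join_broken_lines(sample), end="")
```

Like the Perl version, this works on the whole text at once rather than line by line, which is what lets the pattern see across line endings.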
That's one of the ways of doing it. But the data is not in the database
in tab-delimited format, so it doesn't matter.
When you export the data you can specify both field and record delimiters.
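As a rough illustration of what choosing both delimiters means, here is a small Python sketch using the standard csv module. It is not how Access does its export; the function name export_rows and the sample rows are invented for the example.

```python
import csv
import io

def export_rows(rows, field_delim="\t", record_delim="\r\n"):
    """Write rows to a string with explicit field and record delimiters."""
    buf = io.StringIO(newline="")  # newline="" leaves line endings untouched
    writer = csv.writer(buf, delimiter=field_delim, lineterminator=record_delim)
    writer.writerows(rows)
    return buf.getvalue()

# Invented sample data: three fields per record.
rows = [
    ["A100", "9.99", "Plain description"],
    ["B200", "19.50", "Another description"],
]
print(repr(export_rows(rows)))
print(repr(export_rows(rows, field_delim="|", record_delim="\n")))
```

Whatever tool does the export, the point is the same: the importing side has to be told which two delimiters were used.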
Remove the "x" from my email address
JDS Computer Training Corp.
Writing in news:alt.www.webmaster
From the safety of the cafeteria
Are you automating notepad to achieve this task or attempting to do this
manually? I can think of better methods of automation and I would advise
against manual operation for a daily task.
As for record terminators, I've no idea whether you need both CR & LF or
whether either one of those will suffice.
virtue is its own punishment
|Writing in news:alt.www.webmaster
| From the safety of the cafeteria
|> 8,500 records
|> I'll try removing them all (Notepad++ can find and remove them), then
|> re-import into Access and see what happens. This is a daily Access export to
|> text for upload to a web server. Aren't CRLFs needed to signify the end of
|> the record in a tab-delimited file?
|Are you automating notepad to achieve this task or attempting to do this
|manually? I can think of better methods of automation and I would advise
|against manual operation for a daily task.
A long-time TextPad user myself, this is the first I've heard of it. I'm going
to check it out.
- Norman L. DeForest
November 20, 2005, 2:20 pm
On Sat, 19 Nov 2005, DBLEXPOSURE wrote:
Could you create a fictitious version (with dummy data so you don't leak
any sensitive material) of such a file (or just do a search-and-replace on
a real problem file to substitute dummy data) and a manually-converted
version of the same dummy data that *can* be imported into Access and make
zipped copies of both of them available? That way, people can see
*exactly* what conversions are necessary and someone may be better able
to create a script or program for making such a conversion automatically
(if that is possible).
If something as simple as (1) "if a line ends with a delimiter, assume
that the next line is a continuation and join the two lines" or (2) "if a
line has an odd number of quotes, assume that it ends with a partial
string and the next line contains the rest of the string" or (3) "if there
are fewer than 6 items in a line, assume the next line to be a
continuation of the record" were possible, it could easily be automated.
Going by rule (1), this:
"a", 123, 4567, 619,
301, 65
"b", 582, 1039, 772,
520, 77
would become this:
"a", 123, 4567, 619, 301, 65
"b", 582, 1039, 772, 520, 77
(note, this may fail if items may be omitted, such as the "123" and "65"
in the first record and the "1039" in the second:
"a",, 4567, 619,
301,
"b", 582,, 772,
520, 77
since the omitted "65" could falsely imply that the next line is part of
the same record unless there is also a rule about the maximum number of
items in a record) and, using rule (2), this:
"Foo Corporation", "123 West
Main Street", "Cleveland", "OH"
would become this:
"Foo Corporation", "123 West Main Street", "Cleveland", "OH"
and, using rule (3) (if whitespace was a delimiter), this:
123 456 720 492
4015 2991
239 332 721 449
3981 3114
would become this:
123 456 720 492 4015 2991
239 332 721 449 3981 3114
However, without seeing an exact example of the data before and after
conversion, and the layout of the database, figuring out what rules can be
used is just a guessing game.
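To make the three candidate rules concrete, here is a minimal Python sketch of each. The function names, the comma delimiter, the double-quote character and the field count of 6 are all assumptions taken from the made-up examples above, not from the real export, so treat this as a starting point for experiments only.

```python
def join_if_trailing_delimiter(lines, delim=","):
    """Rule (1): a line that ends with the delimiter continues on the next line."""
    records, pending = [], ""
    for line in lines:
        line = line.rstrip()
        pending = f"{pending} {line}" if pending else line
        if not line.endswith(delim):
            records.append(pending)
            pending = ""
    if pending:  # input ended in the middle of a record
        records.append(pending)
    return records

def join_if_open_quote(lines, quote='"'):
    """Rule (2): an odd number of quotes so far means a string is still open."""
    records, pending = [], ""
    for line in lines:
        line = line.rstrip()
        pending = f"{pending} {line}" if pending else line
        if pending.count(quote) % 2 == 0:
            records.append(pending)
            pending = ""
    if pending:
        records.append(pending)
    return records

def join_if_too_few_fields(lines, expected=6):
    """Rule (3): keep joining lines until the record has the expected item count."""
    records, pending = [], []
    for line in lines:
        pending.extend(line.split())  # whitespace is the delimiter here
        if len(pending) >= expected:
            records.append(" ".join(pending))
            pending = []
    if pending:
        records.append(" ".join(pending))
    return records

print(join_if_trailing_delimiter(
    ['"a", 123, 4567, 619,', '301, 65', '"b", 582, 1039, 772,', '520, 77']))
print(join_if_open_quote(
    ['"Foo Corporation", "123 West', 'Main Street", "Cleveland", "OH"']))
print(join_if_too_few_fields(
    ['123 456 720 492', '4015 2991', '239 332 721 449', '3981 3114']))
```

Each function joins the pieces with a single space. Whether that is right (as opposed to joining with nothing at all) depends on the real data, which is one more reason to post a sample.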
Norman De Forest http://www.chebucto.ns.ca/~af380/Profile.html
"> Is there anything Spamazon DOESN'T sell?
Clues. The market's too small to justify the effort."
-- Stuart Lamble in the scary devil monastery, Fri, 13 May 2005
Sure there are. When I read the file in Notepad++ (show all characters) I
see cr/lf at the end of every record.
This record has three of them before the end of the record, which causes it to
break when loaded into MySQL.
AKR30930B 169.98 1 EACH 1.00 Akro Mils 30930B<br><br>Akro Mils 30930B beige
Procart. A customizable work
center for the assembly line warehouse or wherever you need portable
storage. High-density polyethylene structural foam construction is
dentresistant rust-proof and never needs painting. 400lb (180 kg) maximum
cart load. 200 lb. (90 kg) capacity per top and bottom shelf.
Assembly required. I-beam post makes loading and unloading or box-top cart.
Comfortable fullwidth handles give complete handling control.
<br><br>Limited to current inventory. AKRO MILS 30930B
They are 0x0A (\n); a CRLF is 0x0D 0x0A (\r\n). You have no CRLFs.
Assuming all records are supposed to start with some kind of part code like
AIM30-9202, you need a regex that will remove all \n's that are not
followed by a part code.
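A sketch of such a regex in Python, using a negative lookahead. The part-code pattern here is a guess based only on the two codes mentioned in this thread (AKR30930B and AIM30-9202), and it assumes the code is followed by a tab; both would need checking against the real file. Stray line breaks are replaced with a space rather than deleted outright so that words do not run together, which is also a choice, not a requirement.

```python
import re

# Hypothetical part-code shape: three capitals, a digit, then more capitals,
# digits or hyphens, followed by the tab that ends the first field.
PART_CODE = r"[A-Z]{3}[0-9][A-Z0-9-]*\t"

# A line break (LF, or CRLF if present) that is NOT followed by a part code
# and is NOT the final line break of the text.
STRAY_NEWLINE = re.compile(r"\r?\n(?!" + PART_CODE + r"|\Z)")

def remove_stray_newlines(text: str, replacement: str = " ") -> str:
    """Replace line breaks that do not start a new record."""
    return STRAY_NEWLINE.sub(replacement, text)

# Invented, shortened sample in the same spirit as the broken record.
sample = (
    "AKR30930B\t169.98\tAkro Mils beige\n"
    "Procart. A work\n"
    "center.\n"
    "AIM30-9202\t5.00\tWidget\n"
)
print(remove_stray_newlines(sample), end="")
```

The weak point is the same as with any of these heuristics: a description line that happens to begin with something shaped like a part code would be mistaken for a new record.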
Well, I do not understand why Notepad++ shows them as being CR/LF.
At any rate, they are scattered throughout the database in places they should
not be.
Sounds easy enough :^(
I would like to get this done within my Access database, as it is the source
for all exports now and in future. (Records are entered by any number of users
who may cut and paste from sources that contain \n's or \r\n's where they
might cause trouble.)
Looking in Access I see no way of changing the record delimiter. Scratching
my head... Thinking of adding a new field to end the record?
Probably because you're on Windows and it automatically shows them as Windows
end of lines (CRLF), but if you open the file in a hex editor (as I did)
you will see the line endings are all 0x0A (LF).
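If no hex editor is handy, the same check can be sketched in a few lines of Python by counting the raw bytes. The function name count_line_endings is made up for the example; for a real file the data must be read in binary mode (open(path, "rb")) so that nothing translates the line endings on the way in.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs and bare CRs in raw bytes."""
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf   # LFs not preceded by a CR
    cr_only = data.count(b"\r") - crlf   # CRs not followed by an LF
    return {"CRLF": crlf, "LF": lf_only, "CR": cr_only}

# Invented sample bytes with a mix of all three kinds of line ending.
print(count_line_endings(b"a\r\nb\nc\rd\n"))
```

A file that reports zero CRLF and a non-zero LF count has Unix-style line endings, whatever an editor chooses to display.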
Never used Access.
Just read the records one by one, apply the regex and update.