From: sabrina downard Date: 23:14 on 19 Jun 2005 Subject: Linefeeds How many years have the various operating systems been doing their own thing with regard to what marks the end of a line? Can we agree on something yet, please, for the love of bog? I really don't feel that I'm asking for too much, here, it not being 1985 anymore and all. hatefully, --s. p.s. Yes, iTunes -- exporting files that vi/awk/et al. on the same bedamned machine sees as two extremely long lines, and that only because one of the MP3s contained a comment which had a linefeed in it -- I'm looking at you.
From: Robert G. Werner Date: 04:00 on 20 Jun 2005 Subject: Re: Linefeeds

sabrina downard wrote:
> How many years have the various operating systems been doing their own
> thing with regard to what marks the end of a line? Can we agree on
> something yet, please, for the love of bog? I really don't feel that
> I'm asking for too much, here, it not being 1985 anymore and all.
>
> hatefully,
> --s.
>
> p.s. Yes, iTunes -- exporting files that vi/awk/et al. on the same
> bedamned machine sees as two extremely long lines, and that only because
> one of the MP3s contained a comment which had a linefeed in it -- I'm
> looking at you.
>

line\n\r
feeds\m
are^m
hard&return;

From: Dave Vandervies Date: 16:17 on 20 Jun 2005 Subject: Re: Linefeeds

Somebody claiming to be Robert G. Werner wrote:
>
> sabrina downard wrote:
> > p.s. Yes, iTunes -- exporting files that vi/awk/et al. on the same
> > bedamned machine sees as two extremely long lines, and that only because
> > one of the MP3s contained a comment which had a linefeed in it -- I'm
> > looking at you.
> >
> line\n\r
> feeds\m
> are^m
> hard&return;

    out=fopen("foo","w");
    fputs("Nope, line\n",out);
    fputs("feeds are\n",out);
    fputs("actually\n",out);
    fputs("really easy\n",out);
    fclose(out);

Any system that runs general-purpose programs has a C I/O library that
knows exactly how to do line feeds for that system, and most non-C
languages either have C at the back-end anyways or can easily be
coerced to use the C library for I/O.

The hard part is finding a cluestick big enough for all of the people
who think all the world's a unix system and bypass the C stdio library
"for efficiency" because "It doesn't really matter, text and binary are
the same". (Yeah, and the bugs and idiocies you're introducing are
really worth the few nanoseconds you save on millisecond-timed I/O
operations.)

The OP indicates that apparently not even all unix systems are unix in
this respect anymore...

dave

From: Michael G Schwern Date: 02:21 on 22 Jun 2005 Subject: Re: Linefeeds

On Mon, Jun 20, 2005 at 11:17:26AM -0400, Dave Vandervies wrote:
> The OP indicates that apparently not even all unix systems are unix in
> this respect anymore...

OS X is a bit schizoid in that the Aqua things use the Mac newline
(carriage return, 015) and the Unixy things use the Unix newline (012).
They generally do a decent job interacting with each other but vim and
XEmacs need to get a clue.

vim at least tries to take a stab at auto-detecting what newline style
is being used but it thinks a file with mac newlines is a DOS formatted
file (015 012). They couldn't go that extra inch to see if the 015 is
followed by a 012. XEmacs doesn't even try at all... at least I haven't
found the variable to flip to make it try.

Oddly enough I haven't had an issue lately with transferring text around
between different machines. I guess after 10 years of fairly ubiquitous
Interneting things are finally learning how to play nice with others.

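The "extra inch" described above, checking whether a 015 is followed by a 012, might look roughly like the C sketch below. It is only an illustration of the idea, not how vim actually detects file formats; the enum and function names are made up for this example, and it classifies a buffer by the first line terminator it meets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum nl_style { NL_NONE, NL_UNIX, NL_MAC, NL_DOS };

/*
 * Classify a buffer by the first line terminator found in it:
 *   012 alone        -> Unix
 *   015 then 012     -> DOS
 *   015 alone        -> classic Mac
 * Returns NL_NONE if the buffer holds no terminator at all.
 */
enum nl_style detect_newline(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\012')
            return NL_UNIX;
        if (buf[i] == '\015') {
            /* The "extra inch": peek at the byte after the 015. */
            if (i + 1 < len && buf[i + 1] == '\012')
                return NL_DOS;
            return NL_MAC;
        }
    }
    return NL_NONE;
}
```

A real editor would presumably sample more than one line ending and cope with mixed files; this sketch deliberately stops at the first one.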
From: Robert G. Werner Date: 02:42 on 22 Jun 2005 Subject: Re: Linefeeds Dave Vandervies wrote: > Somebody claiming to be Robert G. Werner wrote: > {snip] > > The OP indicates that apparently not even all unix systems are unix in > this respect anymore... > > > dave > I guess I was thinking more along the famous line by Barbie about math ...
From: Peter da Silva Date: 04:07 on 22 Jun 2005 Subject: Re: Linefeeds > The OP indicates that apparently not even all unix systems are unix in > this respect anymore... Old-sk00l Carbon apps think they're running under Mac OS. The library even maps "Mac HD : Users : Peter" into "/users/peter" so they don't even see UNIX file names.
From: David Champion Date: 20:06 on 25 Jun 2005 Subject: Re: Linefeeds

* On 2005.06.20, in <200506201517.IAA23114@xxxxxx.xxx>,
*	"Dave Vandervies" <dj3vande@xxxxxx.xxx> wrote:
> > >
> > line\n\r
> > feeds\m
> > are^m
> > hard&return;

True.

> out=fopen("foo","w");
> fputs("Nope, line\n",out);
> fputs("feeds are\n",out);
> fputs("actually\n",out);
> fputs("really easy\n",out);
> fclose(out);

True.

The problem isn't in I/O, it's in protocol. Everyone has a new and
improved way of indicating logical line breaks within their own
cross-platform specification. Traditionally, the Intarweb uses MS-DOS
line breaks, \r\n, for maximum naive portability, while some specific
platforms use either \r or \n solo. Each endpoint needs to be able to
recognize what it's receiving and match what it's sending.

I'm on a development team for an application -- a network listener
with a bunch of arbitrary purpose behind it -- where, mysteriously,
for reasons undiscovered, someone got \r\n backwards. It issues line
breaks as \n\r. This is fine if you're a raw terminal device, and it
doesn't really matter, but if you're a client application, this might
matter. And, in fact, the client I use most often doesn't recognize
\n\r as a line break; it recognizes it as two schizophrenic line breaks,
so I get everything in doublespace. This has caused me some amount of
teeth-grinding. I've had to turn vegetarian.

> Any system that runs general-purpose programs has a C I/O library that
> knows exactly how to do line feeds for that system, and most non-C
> languages either have C at the back-end anyways or can easily be
> coerced to use the C library for I/O.

So, the trouble is it's not the host system, it's the interchange.
What about data representations where a logical newline is zero-width
whitespace, used exclusively to prettify presentation of metadata? The
C library doesn't have a special XML mode, or a special LDAP mode, or a
special Joe's L33t RDBMS mode -- nor should it. At some point you just
have to accept that your application needs to have a brain, and also to
use it.

Personally -- and I'll admit that I'm speaking as a UNIX developer
here -- I wish C didn't differentiate text and binary, not because
they're the same, but because there's more than just text and binary in
that big bad world, and it's not the C library's job to know the
difference. It's just an illusion to think this alone is going to save
your ass.

> The hard part is finding a cluestick big enough for all of the people
> who think all the world's a unix system and bypass the C stdio library

Yeah, that's the Mac, right there. (Not really.)

> The OP indicates that apparently not even all unix systems are unix in
> this respect anymore...

Who said anything about UNIX systems? Maybe it's iTunes for Windows,
with Cygwin providing her %EDITOR% of choice.

Newlines are hard, and it's not UNIX's fault.

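The double-spacing effect described in this message can be reproduced with a small C sketch. It is not the code of the client or the listener mentioned above; the function name and the accept_lf_cr flag are hypothetical. With the flag off, only \r\n is merged into one break, so a \n\r stream counts twice as many breaks as it has lines; with the flag on, \n\r is also treated as a single break.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count logical line breaks in buf.
 * "\r\n" is always one break. "\n\r" is one break only when
 * accept_lf_cr is non-zero; otherwise it counts as two.
 * A lone "\r" or "\n" is always one break.
 * (Hypothetical helper, for illustration only.)
 */
size_t count_line_breaks(const char *buf, size_t len, int accept_lf_cr)
{
    size_t breaks = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        char c = buf[i];

        if (c != '\r' && c != '\n')
            continue;
        breaks++;

        if (i + 1 < len) {
            char next = buf[i + 1];

            if (c == '\r' && next == '\n')
                i++;    /* CR LF: swallow the LF */
            else if (accept_lf_cr && c == '\n' && next == '\r')
                i++;    /* LF CR: swallow the CR, if allowed */
        }
    }
    return breaks;
}
```

Turning the flag on is not free: in a stream that mixes conventions, a sequence such as \n\r\n becomes ambiguous, which is one reason each endpoint has to know what it is receiving rather than guess.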
From: peter (Peter da Silva) Date: 01:39 on 26 Jun 2005 Subject: Re: Linefeeds > Newlines are hard, and it's not UNIX's fault. In fact, UNIX followed the recommendations of the original ASCII standard that if a single character was used for a line separator it should be linefeed. Just about everyone else picked carriage return or both. Except DEC, of course. RSX text files had a one or two byte length, followed by an optional two byte line number, and the records themselves were either padded to a multiple of 80 bytes or jammed together with no separator, and blocks were either padded with nulls or lines could span block boundaries. You have to look at the file type and mode to see which was which. To read or write text files from Forth I gave up and called the Fortran runtime.
From: Martin Ebourne Date: 23:49 on 26 Jun 2005 Subject: Re: Linefeeds

On Sat, 2005-06-25 at 14:06 -0500, David Champion wrote:
> I'm on a development team for an application -- a network listener
> with a bunch of arbitrary purpose behind it -- where, mysteriously,
> for reasons undiscovered, someone got \r\n backwards. It issues line
> breaks as \n\r.

Probably not the reason here, but that's the Acorn line break. There's a
good reason for it being that way round: it saved several bytes and
quite a few machine cycles on the old BBC micro.

Cheers,

Martin.

From: Jarkko Hietaniemi Date: 06:57 on 27 Jun 2005 Subject: Re: Linefeeds Martin Ebourne wrote: > On Sat, 2005-06-25 at 14:06 -0500, David Champion wrote: > >>I'm on a development team for an application -- a network listener >>with a bunch of arbitrary purpose behind it -- where, mysteriously, >>for reasons undiscovered, someone got \r\n backwards. It issues line >>breaks as \n\r. > > > Probably not the reason here, but that's the Acorn line break. There's a > good reason for it being that way round: it saved several bytes and a > quite a few machine cycles on the old BBC micro. Ummm, could you elaborate. ASM fine in the explanation. > Cheers, > > Martin. > >
From: Martin Ebourne Date: 11:04 on 27 Jun 2005 Subject: Re: Linefeeds

Jarkko Hietaniemi <jhietaniemi@xxxxx.xxx> wrote:
> Martin Ebourne wrote:
>> Probably not the reason here, but that's the Acorn line break. There's a
>> good reason for it being that way round: it saved several bytes and
>> quite a few machine cycles on the old BBC micro.
>
> Ummm, could you elaborate. ASM fine in the explanation.

Well I'm a bit out of practice on the 6502. But something like this.

There are two main OS calls for writing a character to the screen:

  OSWRCH  - OS write character, the underlying call to write characters
  OSASCII - Same as OSWRCH but with character translation. In fact, all
            it does is translate the 'official' Acorn line ending to the
            underlying characters (which become the 'alternative' Acorn
            line break - should have been a bit more clear above), so it
            converts \r to \n\r. A key feature is that it returns with
            the accumulator untouched.

So in the OS syscall entry space we have:

  .osascii  CMP #13
            BNE oswrch
            LDA #10
            JSR oswrch
            LDA #13
  .oswrch   JMP (<jump table vector address>)

Clearly to write \r\n and still return with \r in the accumulator takes
another JSR, LDA, and RTS. 6 more bytes and plenty of cycles.

Of course, if Acorn had used \n for linebreaks instead of \r then the
code above would trivially produce \r\n and everything would have
matched up with both unix & dos so much better.

Cheers,

Martin.

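For readers who don't speak 6502, the translation that routine performs can be modelled in C roughly as follows. This is only a sketch of the byte-level effect described above (every \r, 13, comes out as \n\r, 10 then 13); it cannot show the fall-through trick that saves the bytes, and the function name and buffer interface are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy in[0..len) to out, expanding each CR (13) into LF CR (10, 13),
 * which is the output the assembly above produces for a CR.
 * Returns the number of bytes written, or 0 if out (capacity cap)
 * is too small. Hypothetical helper, for illustration only.
 */
size_t acorn_expand(const char *in, size_t len, char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        if (in[i] == 13) {
            if (n + 2 > cap)
                return 0;
            out[n++] = 10;      /* LDA #10 : JSR oswrch */
            out[n++] = 13;      /* LDA #13 : fall into oswrch */
        } else {
            if (n + 1 > cap)
                return 0;
            out[n++] = in[i];   /* BNE oswrch */
        }
    }
    return n;
}
```

Note that a return value of 0 is also what an empty input gives, so a caller that cares would need to check len separately.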
From: Peter da Silva Date: 12:01 on 27 Jun 2005 Subject: Re: Linefeeds

On Jun 27, 2005, at 5:04 AM, Martin Ebourne wrote:
> Of course, if Acorn had used \n for linebreaks instead of \r then the
> code above would trivially produce \r\n and everything would have
> matched up with both unix & dos so much better.

Not to mention actually following ASCII which specified two possible
encodings for a new line, either "linefeed-CarriageReturn" or "newline",
where "linefeed" and "newline" were alternate names for the same
position (0x0A, 0/10).

Using "<CR>" for a newline breaks the straightforward translation of
FORTRAN carriage control:

  If the first character is space, replace with linefeed.
  If the first character is plus, delete it.
  If the first character is 0, replace with linefeed-linefeed.
  If the first character is 1, replace with formfeed.
  Print the line followed by a carriage return.

This produces the correct result in either case. Using carriage return
for newline breaks FORTRAN, and in 1963 that was a big no-no.

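The carriage-control rules listed above are mechanical enough to sketch in C. The function below is a hypothetical illustration, not code from any FORTRAN runtime: it takes one NUL-terminated record whose first character is the control character and writes the translated line into a caller-supplied buffer. The list above does not say what to do with other control characters or with an empty record; treating both like a space is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Translate one FORTRAN carriage-control record into out (capacity cap,
 * including the terminating NUL). Returns the translated length, or 0
 * if it does not fit. Hypothetical helper, for illustration only.
 */
size_t fortran_cc(const char *line, char *out, size_t cap)
{
    const char *prefix;
    const char *text;
    size_t plen, tlen;

    assert(line != NULL && out != NULL);

    switch (line[0]) {
    case '+': prefix = "";     break;  /* overprint: delete the plus   */
    case '0': prefix = "\n\n"; break;  /* double space: two linefeeds  */
    case '1': prefix = "\f";   break;  /* new page: formfeed           */
    default:  prefix = "\n";   break;  /* space; anything else is an
                                          assumption, treated as space */
    }

    /* Skip the control character, unless the record is empty. */
    text = (line[0] != '\0') ? line + 1 : line;
    plen = strlen(prefix);
    tlen = strlen(text);

    if (plen + tlen + 2 > cap)         /* + CR + terminating NUL */
        return 0;

    memcpy(out, prefix, plen);
    memcpy(out + plen, text, tlen);
    out[plen + tlen] = '\r';           /* line, then carriage return */
    out[plen + tlen + 1] = '\0';
    return plen + tlen + 1;
}
```

Because the vertical motion comes first and the carriage return last, a plus record simply overprints the previous line, which is the behaviour the rules rely on.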
From: Michael G Schwern Date: 23:25 on 27 Jun 2005 Subject: Re: Linefeeds

On Mon, Jun 27, 2005 at 11:04:51AM +0100, Martin Ebourne wrote:
> Of course, if Acorn had used \n for linebreaks instead of \r then the
> code above would trivially produce \r\n and everything would have
> matched up with both unix & dos so much better.

\r\n makes sense to me as a newline, historically. It's a direct
translation of the commands to the line printer. Move the head to the
first column. Move down one row. \n\r makes sense in the same way.

\n I can understand for Unix as by the early 70s working on displays
rather than line printers is more common and it's no longer necessary to
give explicit commands. Though why they changed it... maybe they just
wanted to save one character per line?

But why use \r? \n I get, "move down one line" and moving back to the
first column is implicit. But \r... "move back to the first column" and
going to the next line is implicit? Doesn't seem right.

Unless, of course, they didn't consider 015 to be "carriage return" and
012 to be "newline"?
