| Topic: Science > Physics |
| User: "Chalky" |
|
| Title: Restricted ASCII? |
12 Nov 2006 04:25:03 AM |
ASCII is defined in wiki as an 8-bit system, developed from telegraphic
codes, to allow for 256 characters.
However, I have noticed that this set is truncated to 7 bits at
sci.astro.research to conform to its first commercial use as a
seven-bit teleprinter code (1963).
This can result in considerable garbling of 8-bit ASCII text.
Since I am pretty sure I have seen Schrodinger's equation spelled
correctly at sci.physics.research, I am curious to discover how endemic
this restriction of the ASCII set actually still is, in the Usenet
groups.
Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
-, and for copyright, below
° ± ©
Chalky
.
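Editorial aside on the test characters in the post above: where a copy of this post shows them as =B0 =B1 =A9, that is quoted-printable transfer encoding rather than damage to the text. The sketch below decodes that form and then shows what a path that simply clears the eighth bit would do to the same bytes. Latin-1 (ISO 8859-1) is an assumption here; the post does not say which 8-bit character set the poster's software used.

```python
import quopri

# The three characters as they appear in quoted-printable form.
qp_text = b"=B0 =B1 =A9"

# Undo the transfer encoding: each =XX becomes one raw byte.
raw = quopri.decodestring(qp_text)

# Assumption: the bytes are ISO 8859-1 (Latin-1).
text = raw.decode("latin-1")

# A strictly 7-bit path that clears the top bit of every byte.
stripped = bytes(b & 0x7F for b in raw).decode("ascii")

print(raw)       # b'\xb0 \xb1 \xa9'
print(text)      # ° ± ©
print(stripped)  # 0 1 )
```

Clearing the top bit is only one possible failure mode; a gateway could equally replace or drop such bytes.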
|
|
| User: "Chalky" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 04:46:29 AM |
|
|
Chalky wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
> However, I have noticed that this set is truncated to 7 bits at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
> This can result in considerable garbling of 8-bit ASCII text.
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
> ° ± ©
> Chalky
Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.
Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.
I wonder why that is?
Chalky
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 07:32:03 AM |
|
|
"Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
news:1163328389.468587.34850@h48g2000cwc.googlegroups.com...
Chalky wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
> However, I have noticed that this set is truncated to 7 bits at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
> This can result in considerable garbling of 8-bit ASCII text.
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
> ° ± ©
> Chalky
Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.
Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.
I wonder why that is?
Chalky
Returned from sci.physics, also absent auto-indent.
Androcles
.
|
|
|
| User: "Chalky" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 07:57:15 AM |
|
|
Sorcerer wrote:
> Returned from sci.physics.relativity, absent auto-indent.
Sorcerer wrote:
> Returned from sci.physics, also absent auto-indent.
> Androcles
Sorry, could you explain what you mean by this?
As far as I am aware, auto-indent is not an ascii code.
As far as I am aware, I did not employ an auto-indent in these
postings, anyway.
C
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 09:21:08 AM |
|
|
"Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
news:1163339834.991566.50860@i42g2000cwa.googlegroups.com...
|
| Sorcerer wrote:
|
| > Returned from sci.physics.relativity, absent auto-indent.
|
| Sorcerer wrote:
|
| > Returned from sci.physics, also absent auto-indent.
| > Androcles
|
| Sorry, could you explain what you mean by this?
| As far as I am aware, auto-indent is not an ascii code.
| As far as I am aware, I did not employ an auto-indent in these
| postings, anyway.
|
It was my test, much like yours.
When using Outlook Express, one selects "Reply Group" and the
window to edit the text appears. Normally the text is auto-indented
with a symbol "|", ">" or ":" (user selectable), but it looks as if
responding to 8-bit ASCII prevents that.
The other bug I have is long text without white space. A long URL, for
example, will corrupt all the text that follows, taking out ALL white
space, as you'll see below.
I did not type "...Variations.htmIt"; there was a carriage return/line feed
between "...Variations.htm" and "It".
Androcles.
Example:
http://www.androcles01.pwp.blueyonder.co.uk/Copernicus/LightCurveVariations.htmItwas my test, much like yours.When using Outlook Express, one selects "ReplyGroup"and the window to edit the text appears. Normally thetext isauto-indented, but it looks as if 8 bit ASCIIprevents that. The other bug islong text without white space.A long URL, for example, will corrupt all thetext that follows.
.
|
|
|
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 07:26:14 AM |
|
|
"Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
news:1163328389.468587.34850@h48g2000cwc.googlegroups.com...
Chalky wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
> However, I have noticed that this set is truncated to 7 bits at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
> This can result in considerable garbling of 8-bit ASCII text.
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
> ° ± ©
> Chalky
Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.
Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.
I wonder why that is?
Chalky
Returned from sci.physics.relativity, absent auto-indent.
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 07:58:06 AM |
|
|
"Sorcerer" <Headmaster@hogwarts.physics_e> wrote in message
news:WhF5h.157081$lT5.7614@fe2.news.blueyonder.co.uk...
|
| "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
| news:1163328389.468587.34850@h48g2000cwc.googlegroups.com...
|
| Chalky wrote:
|
| > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
| > codes, to allow for 256 characters.
| >
| > However, I have noticed that this set is truncated to 7 bits at
| > sci.astro.research to conform to its first commercial use as a
| > seven-bit teleprinter code (1963).
| >
| > This can result in considerable garbling of 8-bit ASCII text.
| >
| > Since I am pretty sure I have seen Schrodinger's equation spelled
| > correctly at sci.physics.research, I am curious to discover how endemic
| > this restriction of the ASCII set actually still is, in the Usenet
| > groups.
| >
| > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
| > -, and for copyright, below
| >
| > ° ± ©
| >
| > Chalky
|
| Interesting. All groups display correctly except
| sci.physics.relativity, which still displayed 8 bits, but translated
| into a more old-fashioned font.
|
| Looks like sci.astro.research is the only newsgroup actually restricted
| to 7 bits.
|
| I wonder why that is?
|
| Chalky
|
|
| Returned from sci.physics.relativity, absent auto-indent.
|
Returned from sci.physics, complete with auto-indent.
Androcles.
.
|
|
|
|
|
| User: "Chalky" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 07:41:30 AM |
|
|
Chalky wrote:
> Chalky wrote:
>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> codes, to allow for 256 characters.
>> However, I have noticed that this set is truncated to 7 bits at
>> sci.astro.research to conform to its first commercial use as a
>> seven-bit teleprinter code (1963).
>> This can result in considerable garbling of 8-bit ASCII text.
>> Since I am pretty sure I have seen Schrodinger's equation spelled
>> correctly at sci.physics.research, I am curious to discover how endemic
>> this restriction of the ASCII set actually still is, in the Usenet
>> groups.
>> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
>> -, and for copyright, below
>> ° ± ©
>> Chalky
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.
> I wonder why that is?
> Chalky
I have since noticed that the wiki reference and all references
therefrom are complete rubbish in all other respects, since they all
still restrict the displayed characters to the least significant 7 bits
of ASCII (i.e. restriction to Bell teleprinter code, circa 1963).
This erases all unique characteristics of Scandinavian (and Germanic)
languages, all unique characteristics of Latin languages (such as
French & Spanish), and all currencies other than the Yankee Dollar.
(Thus excluding the British Pound Sterling, the Euro, and the Japanese
Yen [to name a few important examples], as well as precluding the use
of any more advanced scientific notation.)
So, this is (probably) goodbye from me to sci.astro.research. (I can't
cope with this more-than-40-year-out-of-date ascii restriction. [Or, as
Captain Beefheart said more eloquently, "I cry, but I can't buy your
Veterans' Day Poppy."])
In view of this apparent dearth of up-to-date information on ascii on
the internet, I am now recommending (to the relevant management) that
the intRAnet version of the file http://1stlight.org/design/ascii.asp,
should now be included on the intERnet version of that site, too.
Chalky
.
|
|
|
|
| User: "" |
|
| Title: Re: Restricted ASCII? |
14 Nov 2006 06:37:07 PM |
|
|
Chalky wrote:
> Chalky wrote:
>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> codes, to allow for 256 characters.
>> However, I have noticed that this set is truncated to 7 bits at
>> sci.astro.research to conform to its first commercial use as a
>> seven-bit teleprinter code (1963).
>> This can result in considerable garbling of 8-bit ASCII text.
>> Since I am pretty sure I have seen Schrodinger's equation spelled
>> correctly at sci.physics.research, I am curious to discover how endemic
>> this restriction of the ASCII set actually still is, in the Usenet
>> groups.
>> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
>> -, and for copyright, below
>> ° ± ©
>> Chalky
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.
> I wonder why that is?
It's probably the same reason as all
backward compatibility problems.
Whoever set the newsgroup up may
have decided to use Fortran code
from the Mercury missions as the
standard for that group's servers.
> Chalky
.
|
|
|
|
| User: "Pat OConnell" |
|
| Title: Re: Restricted ASCII? |
12 Nov 2006 09:03:11 PM |
|
|
Chalky wrote:
> Chalky wrote:
>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> codes, to allow for 256 characters.
>> However, I have noticed that this set is truncated to 7 bits at
>> sci.astro.research to conform to its first commercial use as a
>> seven-bit teleprinter code (1963).
>> This can result in considerable garbling of 8-bit ASCII text.
>> Since I am pretty sure I have seen Schrodinger's equation spelled
>> correctly at sci.physics.research, I am curious to discover how endemic
>> this restriction of the ASCII set actually still is, in the Usenet
>> groups.
>> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
>> -, and for copyright, below
>> ° ± ©
>> Chalky
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.
Only characters 0 through 127 have been standardized as part of ASCII.
Each computer operating system (for instance DOS, Windows, Mac, and VMS)
displays its own symbol set for 128 through 255.
--
Pat O'Connell
[note munged EMail address]
Take nothing but pictures, Leave nothing but footprints,
Kill nothing but vandals...
.
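Editorial aside on Pat O'Connell's point above, that only codes 0 through 127 are standardized as ASCII and that systems assign their own symbols to 128 through 255: a short sketch makes it concrete. The four code pages chosen (Latin-1, Windows-1252, the DOS code page 437, and classic Mac OS Roman) are assumptions for illustration; the thread does not say which encodings any poster's software used.

```python
# Bytes of the three test characters (degree, plus-minus, copyright)
# as they are numbered in ISO 8859-1.
raw = bytes([0xB0, 0xB1, 0xA9])

# Illustrative code pages only; not taken from the thread.
for name in ("latin-1", "cp1252", "cp437", "mac_roman"):
    print(f"{name:10} {raw.decode(name)}")

# Strict ASCII assigns no meaning at all to bytes above 127.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("decodes as ASCII:", ascii_ok)
```

The first two code pages agree on these three bytes, while the DOS and Mac ones show different symbols for at least some of them, which is the mismatch the thread is circling around.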
|
|
|
| User: "Paul Schlyter" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 09:43:30 AM |
|
|
An interesting history of character codes, from Morse Codes through
Baudot Code to ASCII-1967 can be found here:
http://www.wps.com/projects/codes/
The author says, on that page:
# ASCII is and always was a seven bit code. I am shocked at the number of
# people and sources that claim it to be an 8-bit code. There are only 128
# character codes in ASCII.
# Many of the extensions to ASCII are 8 bits, but they are not ASCII.
--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/
.
|
|
|
|
| User: "Ben Rudiak-Gould" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 06:12:38 PM |
|
|
Pat O'Connell wrote:
> Each computer operating system (for instance DOS, Windows, Mac, and VMS)
> displays its own symbol set for 128 through 255.
Every *localized version* of every OS has a *default* character encoding
that many tools use in the absence of other encoding information. The local
default makes no sense (on either end) for documents that are being sent
between computers, like email and Usenet messages and HTML pages. Therefore
you should always specify the encoding of such a message explicitly within
the message itself, which makes the local default irrelevant.
-- Ben
.
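Editorial aside on Ben's advice above to state the encoding inside the message itself: the sketch below builds a message with Python's standard email package and declares the character set explicitly. UTF-8 and the subject line are assumptions chosen for illustration, not something the thread prescribes.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Restricted ASCII?"  # illustrative subject only

# Declare the encoding in the message instead of relying on a local default.
msg.set_content("\u00b0 \u00b1 \u00a9", charset="utf-8")

print(msg["Content-Type"])

# A receiving program can read the declaration back and decode accordingly.
declared = msg.get_content_charset()
round_trip = msg.get_content().strip()
print(declared, round_trip)
```

With the declaration present, the receiver's own default code page no longer enters into the decoding.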
|
|
|
|
| User: "Chalky" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 01:19:58 AM |
|
|
Pat O'Connell wrote:
> Only characters 0 through 127 have been standardized as part of ASCII.
> Each computer operating system (for instance DOS, Windows, Mac, and VMS)
> displays its own symbol set for 128 through 255.
Thanks for this info..
But, _good grief_, is this for real?
I had assumed that if I checked that a web page displayed correctly
with Netscape 4, Netscape 7, Mozilla, Firefox, Opera, and Internet
Explorer browsers, this would mean that it probably displayed
correctly, period.
Are you now telling me that I also have to run all these browsers on a
range of different computer makes, with different dates of manufacture,
before I can have any confidence in this?
If so, is there a reference link I can go to, to identify what the
problem characters are likely to be?
Or is it now necessary for each web designer to construct his own
graphics symbols for everything that is not already defined in 7 bit
US-ASCII ?
Chalky
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 01:44:19 AM |
|
|
"Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
news:1163402398.151550.250220@m7g2000cwm.googlegroups.com...
|
| Pat O'Connell wrote:
|
| > Only characters 0 through 127 have been standardized as part of ASCII.
| > Each computer operating system (for instance DOS, Windows, Mac, and VMS)
| > displays its own symbol set for 128 through 255.
|
| Thanks for this info..
|
| But, _good grief_, is this for real?
Yes.
| I had assumed that if I checked that a web page displayed correctly
| with Netscape 4, Netscape 7, Mozilla, Firefox, Opera, and Internet
| Explorer browsers, this would mean that it probably displayed
| correctly, period.
|
| Are you now telling me that I also have to run all these browsers on a
| range of different computer makes, with different dates of manufacture,
| before I can have any confidence in this?
If that gives you confidence, yes. I won't bother. When I built
a Nascom II in 1979 the character set above 127 was used for
graphics.
|
| If so, is there a reference link I can go to, to identify what the
| problem characters are likely to be?
Maybe, I don't know (or even care very much) :-)
| Or is it now necessary for each web designer to construct his own
| graphics symbols for everything that is not already defined in 7 bit
| US-ASCII ?
If he wants to.
I was accused of writing milliseconds (symbol roman m)
when I had written microseconds (symbol greek mu) by someone
not using Internet Explorer or Firefox. I am NOT changing it.
The reader can change his browser as far as I'm concerned.
See for yourself:
http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
Androcles.
.
|
|
|
| User: "Paul Schlyter" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 09:43:30 AM |
|
|
In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>,
Sorcerer <Headmaster@hogwarts.physics_e> wrote:
> ...........
> I was accused of writing milliseconds (symbol roman m)
> when I had written microseconds (symbol greek mu) by someone
> not using Internet Explorer or Firefox. I am NOT changing it.
> The reader can change his browser as far as I'm concerned.
> See for yourself:
> http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
> Androcles.
You could try to write:
μ
instead of:
<font face="Symbol">m</font>
though. That requires a Unicode font in the web browser to display properly,
but it worked fine on my browsers.
Or you could encode your web page as UTF-8 (and then of course also say so in
the header of the HTML) -- then you could write Greek letters directly in
your HTML code. That requires a UTF-8 capable browser, but most modern
browsers have that capability.
Note: the Unicode character code for Greek mu is: 0x03BC
--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/
.
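Editorial aside on Paul Schlyter's two suggestions above (an ASCII-only character reference for mu, or a UTF-8 encoded page): the sketch below checks both against the code point he gives, 0x03BC. The archive has rendered his reference as the letter itself, so which spelling he actually typed is not recoverable; the three shown here are simply the standard equivalents.

```python
import html

mu = "\u03bc"  # GREEK SMALL LETTER MU, code point 0x03BC (decimal 956)

# Three ASCII-only spellings of the same character in HTML.
for ref in ("&#956;", "&#x3BC;", "&mu;"):
    print(ref, "->", html.unescape(ref))

# With the page itself encoded as UTF-8, the letter is stored as two bytes.
utf8_bytes = mu.encode("utf-8")
print(utf8_bytes)  # b'\xce\xbc'
```

Either route identifies the character unambiguously; whether it then displays still depends on the reader having a font with a Greek mu.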
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 11:32:34 AM |
|
|
"Paul Schlyter" <pausch@saaf.se> wrote in message
news:eja37t$2165$1@merope.saaf.se...
| In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>,
| Sorcerer <Headmaster@hogwarts.physics_e> wrote:
| ..........
| > I was accused of writing milliseconds (symbol roman m)
| >when I had written microseconds (symbol greek mu) by someone
| >not using Internet Explorer or Firefox. I am NOT changing it.
| >The reader can change his browser as far as I'm concerned.
| >See for yourself:
| > http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
| >
| >Androcles.
|
| You could try to write:
|
| μ
|
| instead of:
|
| <font face="Symbol">m</font>
|
| though. That requires a Unicode font in the web browser to display
| properly, but it worked fine on my browsers.
|
| Or you could encode your web page as UTF-8 (and then of course also say so
| in the header of the HTML) -- then you could write Greek letters directly
| in your HTML code. That requires a UTF-8 capable browser, but most
| modern browsers have that capability.
|
| Note: the Unicode character code for Greek mu is: 0x03BC
Thank you, I could. But I'm not going to, and neither am I going to use
Chinese, Hebrew or Cherokee characters in anticipation of someone having
a browser that doesn't recognise Greek characters. Either he gets a browser
that does, or he fails to communicate. That's his problem, not mine.
Many years ago I went to Italy to speak with some engineers concerning
the electronics of Tornado, which was built jointly between Britain,
Germany and Italy.
http://archive.cs.uu.nl/pub/AIRCRAFT-IMAGES/Tornado.jpg
I discovered there were no electronics data books in Italian, they
were using American just as I was. So we both had to use a common
subset of English which I admit was easier for me to learn than he,
but had I been German we'd have been on equal footing.
Esperanto is a failure, English is a success. So... I'm not going to
rewrite or change my pages to Unicode. You are free to do so if you wish.
Androcles
.
|
|
|
| User: "Paul Schlyter" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 04:43:30 PM |
|
|
In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk>,
Sorcerer <Headmaster@hogwarts.physics_e> wrote:
> "Paul Schlyter" <pausch@saaf.se> wrote in message
> news:eja37t$2165$1@merope.saaf.se...
>> In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>,
>> Sorcerer <Headmaster@hogwarts.physics_e> wrote:
>> ..........
>>> I was accused of writing milliseconds (symbol roman m)
>>> when I had written microseconds (symbol greek mu) by someone
>>> not using Internet Explorer or Firefox. I am NOT changing it.
>>> The reader can change his browser as far as I'm concerned.
>>> See for yourself:
>>> http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
>>> Androcles.
>> You could try to write:
>> μ
>> instead of:
>> <font face="Symbol">m</font>
>> though. That requires a Unicode font in the web browser to display
>> properly, but it worked fine on my browsers.
>> Or you could encode your web page as UTF-8 (and then of course also say so
>> in the header of the HTML) -- then you could write Greek letters directly
>> in your HTML code. That requires a UTF-8 capable browser, but most
>> modern browsers have that capability.
>> Note: the Unicode character code for Greek mu is: 0x03BC
> Thank you, I could. But I'm not going to, and neither am I going to use
> Chinese, Hebrew or Cherokee characters in anticipation of someone having
> a browser that doesn't recognise Greek characters. Either he gets a browser
> that does, or he fails to communicate. That's his problem, not mine.
.....excuse me, but you really got this backwards. Using μ in your
HTML really requires the browser to recognize Greek characters -- it just
won't work on browsers recognizing no Greek characters....
> Many years ago I went to Italy to speak with some engineers concerning
> the electronics of Tornado, which was built jointly between Britain,
> Germany and Italy.
> http://archive.cs.uu.nl/pub/AIRCRAFT-IMAGES/Tornado.jpg
> I discovered there were no electronics data books in Italian, they
> were using American just as I was. So we both had to use a common
> subset of English which I admit was easier for me to learn than he,
> but had I been German we'd have been on equal footing.
> Esperanto is a failure, English is a success.
Currently it is, but in a century or two the situation may be different.
Maybe we're all speaking Chinese then..... :-)
You probably see no difference between a century and eternity though... <g>
> So... I'm not going to rewrite or change my pages to Unicode.
> You are free to do so if you wish.
> Androcles
--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 10:31:08 PM |
|
|
"Paul Schlyter" <pausch@saaf.se> wrote in message
news:ejaqps$2aq0$1@merope.saaf.se...
| In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk>,
| Sorcerer <Headmaster@hogwarts.physics_e> wrote:
|
| > "Paul Schlyter" <pausch@saaf.se> wrote in message
| > news:eja37t$2165$1@merope.saaf.se...
| >> In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>,
| >> Sorcerer <Headmaster@hogwarts.physics_e> wrote:
| >> ..........
| >> > I was accused of writing milliseconds (symbol roman m)
| >> >when I had written microseconds (symbol greek mu) by someone
| >> >not using Internet Explorer or Firefox. I am NOT changing it.
| >> >The reader can change his browser as far as I'm concerned.
| >> >See for yourself:
| >> > http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
| >> >
| >> >Androcles.
| >>
| >> You could try to write:
| >>
| >> μ
| >>
| >> instead of:
| >>
| >> <font face="Symbol">m</font>
| >>
| >> though. That requires a Unicode font in the web browser to display
| >> properly, but it worked fine on my browsers.
| >>
| >> Or you could encode your web page as UTF-8 (and then of course also say
| >> so in the header of the HTML) -- then you could write Greek letters
| >> directly in your HTML code. That requires a UTF-8 capable browser, but
| >> most modern browsers have that capability.
| >>
| >> Note: the Unicode character code for Greek mu is: 0x03BC
| >
| > Thank you, I could. But I'm not going to, and neither am I going to use
| > Chinese, Hebrew or Cherokee characters in anticipation of someone having
| > a browser that doesn't recognise Greek characters. Either he gets a
| > browser that does, or he fails to communicate. That's his problem, not mine.
|
| ....excuse me, but you really got this backwards. Using μ in your
| HTML really requires the browser to recognize Greek characters -- it just
| won't work on browsers recognizing no Greek characters....
Excuse me, but damnly my frank, I just don't give a dear. Have a nice day.
Androcles
|
|
| > Many years ago I went to Italy to speak with some engineers concerning
| > the electronics of Tornado, which was built jointly between Britain,
| > Germany and Italy.
| > http://archive.cs.uu.nl/pub/AIRCRAFT-IMAGES/Tornado.jpg
| > I discovered there were no electronics data books in Italian, they
| > were using American just as I was. So we both had to use a common
| > subset of English which I admit was easier for me to learn than he,
| > but had I been German we'd have been on equal footing.
| >
| > Esperanto is a failure, English is a success.
|
| Currently it is, but in a century or two the situation may be different.
| Maybe we're all speaking Chinese then..... :-)
|
| You probably see no difference between a century and eternity though... <g>
|
| > So... I'm not going to rewrite or change my pages to Unicode.
| > You are free to do so if you wish.
| > Androcles
|
| --
| ----------------------------------------------------------------
| Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
| e-mail: pausch at stockholm dot bostream dot se
| WWW: http://stjarnhimlen.se/
.
|
|
|
|
| User: "Jeff Root" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 06:19:30 PM |
|
|
My web pages don't specify the character set because I
didn't know which character set to specify, and I'd rather
let the user's operating system & browser go with what
they think is right rather than specifying one which
doesn't work for some people.
However, recently I wanted to use a centered dot. I found
it in Arial (the font I was suggesting for the page) using
Windows Character Map. I copied the character and pasted
it into the page in my text editor. The editor is set to
use a different font, and the character did not display
correctly. Based on past experience, I went ahead anyway.
The dot showed up fine on the web page on my local hard
drive. But when I uploaded the page to the server and
viewed it from there, the character did not display.
It was suggested to me that I specify the character set.
But I still was unsure which set to specify, and wanted to
avoid unnecessary complications. I found that replacing
the pasted character with the HTML escape sequence ·
works, so I am currently going with that.
Any comments or suggestions? I will specify a character
set if I have confidence that it won't cause more problems
than it solves. If it really isn't needed, though, I'll
continue to omit it.
-- Jeff, in Minneapolis
.
|
|
|
| User: "Ben Rudiak-Gould" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 06:49:17 PM |
|
|
Jeff Root wrote:
> My web pages don't specify the character set because I
> didn't know which character set to specify, and I'd rather
> let the user's operating system & browser go with what
> they think is right rather than specifying one which
> doesn't work for some people.
The only way the page can not work for /some/ people is if you don't specify
an encoding. If you do specify an encoding, then it will either look right
to everyone or look wrong to everyone. In particular, if it looks right to
you, it'll look right to everyone. That's much better than leaving it up to
each user's browser.
> It was suggested to me that I specify the character set.
> But I still was unsure which set to specify, and wanted to
> avoid unnecessary complications. I found that replacing
> the pasted character with the HTML escape sequence ·
> works, so I am currently going with that.
Yes, stick with ASCII and you'll be fine. "·" is a sequence of ASCII
characters for document-encoding purposes.
-- Ben
.
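Editorial aside on Ben's remark above that the escape sequence for the centered dot is itself just ASCII: the sketch below produces a numeric character reference for the dot and confirms that every byte of it is below 128. Which escape Jeff actually used is not visible in the archive, so the numeric form here is an assumption; a named reference would be ASCII-only in the same way.

```python
import html

dot = "\u00b7"  # MIDDLE DOT, the centered dot

# Replace anything outside ASCII with a numeric character reference.
ascii_bytes = dot.encode("ascii", errors="xmlcharrefreplace")
print(ascii_bytes)  # b'&#183;'

# Every byte is below 128, so the page's declared or guessed 8-bit
# encoding cannot change how this sequence is read.
print(all(b < 128 for b in ascii_bytes))  # True

# A browser (or html.unescape) turns the reference back into the dot.
print(html.unescape(ascii_bytes.decode("ascii")) == dot)  # True
```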
|
|
|
| User: "John Liberty Bell" |
|
| Title: Re: Restricted ASCII? and the final test |
15 Nov 2006 09:45:13 PM |
|
|
Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test):
> Pat O'Connell wrote:
>> Chalky wrote:
>>> &#191; &#165; &#174; &#128; ?
>> These are escaped characters in HTML, and are written correctly above in
>> ASCII.
I suspect the reason most of us see (eg) &#191; and not the character
represented by &#191; in HTML, may be that the groups server has
cunningly inserted an extra invisible character within each of those
HTML instructions, to block our browsers and newsreaders from
displaying those HTML instructions, as single characters.
As Chalky discovered, if you instead simply paste in the corresponding
displayed HTML character when making a posting, many of us will then
see it. However, there are then some user dependent problems which can
arise:
1) If you are using an Outlook Express Newsreader, the indents will
foul up when you try to respond.
2) If you are using another (as yet unidentified [here]) user
interface, that character is translated instead into a string of (7
bit) ascii characters, so you don't see what was intended.
3) If, on the other hand, you are using an Internet browser interface,
there are no resultant problems for you, personally, UNLESS you post to
sci.astro.research. [This is because the moderator there falls into
category (2), and, if approved, the posting will then appear in the
altered form the moderator saw]
> Except for "&#128;", which is a control character, not the Euro symbol.
"&#128;" is a Euro symbol for Microsoft browsers and for other
'relaxed' ISO based browsers. "&#128;" is a control character only
under 'strict' ISO interpretation
> The Euro symbol is "€".
Yes, that is the Euro symbol in Unicode. It appears to work on all
browsers going back at least as far as Netscape 4.75. That,
incidentally, is the only browser I have tried which does not also
display the Euro symbol when fed €. Instead it displays €.
So I think I would agree, on the basis of this early Netscape test,
that Unicode is probably the best way to go (at least for HTML
scripting).
Regards
John
.
|
|
|
| User: "Richard Tobin" |
|
| Title: Re: Restricted ASCII? and the final test |
16 Nov 2006 07:32:14 AM |
|
|
In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>,
John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:
> "&#128;" is a Euro symbol for Microsoft browsers and for other
> 'relaxed' ISO based browsers. "&#128;" is a control character only
> under 'strict' ISO interpretation
This is nonsense. Character number 128 in some Microsoft character
sets is a Euro character. But the &#NNN; notation *means* the
character represented by that number in Unicode. That browsers
display a Euro symbol is at best an attempt to make broken pages
display correctly, and may well just be a consequence of the fonts
they use.
>> The Euro symbol is "€".
> Yes, that is the Euro symbol in Unicode. It appears to work on all
> browsers going back at least as far as Netscape 4.75.
Whether it works depends more on the fonts than on the browsers.
Browsers don't have code to handle each character.
> So I think I would agree, on the basis of this early Netscape test,
> that Unicode is probably the best way to go (at least for HTML
> scripting).
If it's not Unicode, it's not HTML either.
-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
.
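Editorial aside on Richard Tobin's distinction above between character number 128 in a Microsoft character set and number 128 in Unicode: the sketch below checks both. Windows-1252 is assumed as the Microsoft character set in question. The last lines rely on Python's html.unescape following the HTML5 parsing rules, under which a reference to 128 is repaired to the Euro sign, much as the browsers in the thread were observed to do.

```python
import html
import unicodedata

# Code point 128 in Unicode is a control character, not a printable symbol.
print(unicodedata.category(chr(128)))    # Cc

# Byte 0x80 in Windows-1252 is the Euro sign, whose code point is U+20AC.
euro = bytes([0x80]).decode("cp1252")
print(euro, ord(euro))                   # € 8364

# HTML5-style unescaping repairs the reference to 128 to the Euro sign;
# the reference to 8364 names the Euro sign directly.
print(html.unescape("&#128;") == euro)   # True
print(html.unescape("&#8364;") == euro)  # True
```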
|
|
|
|
| User: "Paul Schlyter" |
|
| Title: Re: Restricted ASCII? and the final test |
16 Nov 2006 01:42:37 AM |
|
|
In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>,
John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:
Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test):
Pat O'Connell wrote:
Chalky wrote:
¿ ¥ ® € ?
These are escaped characters in HTML, and are written correctly above in
ASCII.
I suspect the reason most of us see (eg) &#191; and not the character
represented by &#191; in HTML, may be that the groups server has
cunningly inserted an extra invisible character within each of those
HTML instructions, to block our browsers and newsreaders from
displaying those HTML instructions, as single characters.
There are no such invisible characters inserted here, and I see
&#191; too....
There's an easier way to accomplish this: make sure a line like
this is present in the message header:
Content-Type: text/plain; charset="us-ascii"
A compliant news reader should then display this as pure ASCII, not as HTML
escape characters. Of course, if web based, the news reader must do
some processing, such as replacing the &#191; with e.g. &amp;#191;
--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/
.
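As a rough illustration of the header suggestion above, the following Python sketch (not from the thread) builds a plain-text message declared as us-ascii using the standard `email.message.EmailMessage` class. The body text is made up for the example, and nothing here is claimed about how any particular news server or reader behaves.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Re: Restricted ASCII? and the final test"

# The body holds the six literal ASCII characters & # 1 9 1 ; and no markup.
# Declaring charset="us-ascii" marks the body as plain 7-bit text.
msg.set_content("&#191; is just six ASCII characters here\n", charset="us-ascii")

content_type = msg.get_content_type()   # media type without parameters
charset = msg.get_content_charset()     # charset parameter, lower-cased
body = msg.get_content()

print(content_type, charset)
print(body, end="")
```

A reader that honours this declaration has no reason to treat the body as HTML, so the character reference stays visible as typed.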
|
|
|
| User: "John Liberty Bell" |
|
| Title: Re: Restricted ASCII? and the final test |
16 Nov 2006 04:51:52 AM |
|
|
Paul Schlyter wrote:
In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>,
John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:
Pat O'Connell wrote:
Chalky wrote:
¿ ¥ ® € ?
These are escaped characters in HTML, and are written correctly above in
ASCII.
I suspect the reason most of us see (eg) &#191; and not the character
represented by &#191; in HTML, may be that the groups server has
cunningly inserted an extra invisible character within each of those
HTML instructions, to block our browsers and newsreaders from
displaying those HTML instructions, as single characters.
There are no such invisible characters inserted here, and I see
&#191; too....
OK, so my blind guess can't be right.
There's an easier way to accomplish this: make sure a line like
this is present in the message header:
Content-Type: text/plain; charset="us-ascii"
Ah! So it IS called us-ascii
A compliant news reader should then display this as pure ASCII, not as HTML
escape characters. Of course, if web based, the news reader must do
some processing, such as replacing the &#191; with e.g. &amp;#191;
Ah So! Chalky did say he saw something like &amp;#191; when he
previewed the original posting, so modified the posting by pasting in
the correspondingly displayed characters, when he employed browser and
email clients directly to read the code.
However, when I previewed my own (later) postings, no such alteration
on preview was displayed.
Amazing how much seems to have changed in just a few days, isn't it?
John
.
|
|
|
| User: "George Dishman" |
|
| Title: Re: Restricted ASCII? and the final test |
16 Nov 2006 05:47:28 PM |
|
|
"John (Liberty) Bell" <john.bell@accelerators.co.uk> wrote in message
news:1163674312.513112.24390@b28g2000cwb.googlegroups.com...
Ah So! Chalky did say he saw something like &amp;#191; when he
previewed the original posting, so modified the posting by pasting in
the correspondingly displayed characters, when he employed browser and
email clients directly to read the code.
If you write HTML with notepad, you type "&amp;" to display
the ampersand sign. It is the browser that does the conversion
from HTML.
However, when I previewed my own (later) postings, no such alteration
on preview was displayed.
In a "posting" on Usenet, the "&amp;" should be passed through
unaltered and anyone using a compliant reader should see that.
If you view with a browser and it shows the ampersand then it is
broken, the interface should change "&amp;" to "&amp;amp;" so
the browser shows the characters.
Amazing how much seems to have changed in just a few days, isn't it?
Not really, ASCII is still 7-bit, just as it always has been.
George
.
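The round trip described above can be sketched in Python with the standard `html` module. This is only an illustration of the escaping step a web interface would need; the sample posting text is invented, and no actual groups server code is implied.

```python
import html

# Raw characters of a plain-text Usenet posting, exactly as the author typed them.
posted = "M&M and &amp; and &#191;"

# A web interface placing the posting inside an HTML page should escape it first...
page_fragment = html.escape(posted, quote=False)

# ...so that the browser's own decoding step gives back the original characters.
shown = html.unescape(page_fragment)

print(page_fragment)
print(shown)
```

If the escaping step is skipped, the browser decodes the author's text directly and the reader sees characters the author never typed.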
|
|
|
| User: "Jeff Root" |
|
| Title: Re: Restricted ASCII? and the final test |
16 Nov 2006 11:02:20 PM |
|
|
When writing HTML, is it better to just say "M&M" or to
write out the verbose equivalent &quot;M&amp;M&quot; ?
-- Jeff, in Minneapolis
.
|
|
|
| User: "David Woolley" |
|
| Title: Re: Restricted ASCII? and the final test |
17 Nov 2006 01:38:03 AM |
|
|
In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com>,
Jeff Root <jeff5@freemars.org> wrote:
When writing HTML, is it better to just say "M&M" or to
M&M is not HTML (undefined general entity) and, for an XHTML document
would cause a compliant XHTML browser to abort the document (entity
reference not closed with ;, which is a well-formedness violation -
does not match the syntactical definition of a document).
(Note that IE (including IE7) does not support XHTML served as XHTML but
only a subset of XHTML 1.0 (defined in appendix C of its specification)
served, in compatibility mode, as HTML, and intended to be treated as
a sort of broken HTML.)
write out the verbose equivalent &quot;M&amp;M&quot; ?
The &quot;'s are not needed, unless you use the string in an attribute
value, and even then, definitely for HTML, and I believe also for XHTML,
only if you use " rather than ' as the delimiter. (Using delimiters
is mandatory in XHTML, and is mandatory in HTML where most punctuation
characters are used.)
The most common place where &'s are invalidly left bare is form submission
like URLs. The HTML specification itself points this one out.
See <http://validator.w3.org/> to check whether or not a document is HTML.
See <http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx> for Microsoft's
policy on supporting XML in IE7.
.
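The point above about bare ampersands in form-submission URLs can be illustrated with a short Python sketch using the standard `urllib.parse` and `html` modules. The host `example.org` and the query parameters are hypothetical, chosen only for the example.

```python
import html
from urllib.parse import urlencode

# Build a form-submission style query string; parameters are joined with "&".
query = urlencode({"q": "ascii", "lang": "en"})
url = "http://example.org/search?" + query

# Written into an HTML attribute, each "&" has to become "&amp;". Left bare,
# "&lang" would start an entity reference as far as the parser is concerned.
link = '<a href="{}">search</a>'.format(html.escape(url, quote=True))

print(url)
print(link)
```

The URL the browser finally requests is unchanged, because it decodes the attribute value before following the link.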
|
|
|
|
| User: "Richard Tobin" |
|
| Title: Re: Restricted ASCII? and the final test |
17 Nov 2006 06:56:09 AM |
|
|
In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com>,
Jeff Root <jeff5@freemars.org> wrote:
When writing HTML, is it better to just say "M&M" or to
write out the verbose equivalent &quot;M&amp;M&quot; ?
There are a few contexts in SGML in which & can be used literally, but
this is not one of them (it's recognised as an entity reference open
delimiter because it's followed by a name start character). And HTML
itself recommends that you should use &amp;.
-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
.
|
|
|
|
| User: "Chris L Peterson" |
|
| Title: Re: Restricted ASCII? and the final test |
17 Nov 2006 12:09:37 AM |
|
|
On 16 Nov 2006 21:02:20 -0800, "Jeff Root" <jeff5@freemars.org> wrote:
When writing HTML, is it better to just say "M&M" or to
write out the verbose equivalent &quot;M&amp;M&quot; ?
The latter will give the output you are looking for under a wider
variety of conditions.
_________________________________________________
Chris L Peterson
Cloudbait Observatory
http://www.cloudbait.com
.
|
|
|
|
| User: "Richard Tobin" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 11:47:05 AM |
|
|
In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk>,
Sorcerer <Headmaster@hogwarts.physics_e> wrote:
Thank you, I could. But I'm not going to, and neither am I going to use
Chinese, Hebrew or Cherokee characters in anticipation of someone having
a browser that doesn't recognise Greek characters. Either he gets a browser
that does, or he fails to communicate. That's his problem, not mine.
But you didn't use a Greek character. You used an "m", and requested
a font in which you expect it to look like a mu. Why not just use
a mu?
-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
.
|
|
|
| User: "Sorcerer" |
|
| Title: Re: Restricted ASCII? |
13 Nov 2006 11:58:04 AM |
|
|
"Richard Tobin" <richard@cogsci.ed.ac.uk> wrote in message
news:ejab2p$vc2$1@pc-news.cogsci.ed.ac.uk...
Did you snip something? It seems I have too and now I can't find it.
Androcles
.
|
|
|