|
| Author |
special characters
|
|
| SuperDaemon 2005-02-04, 6:01 pm |
|
I would like to find out how I could enter special characters or uncommon
symbols in bash (or in vi). I know there was a thread a while ago about
entering the cent sign (i.e. ¢). Is there a tutorial somewhere, or a table where I
could look these up?
Thanks
| |
| Ed Morton 2005-02-05, 7:50 am |
|
SuperDaemon wrote:
> I would like to find out how I could enter special characters or uncommon
> symbols in bash (or in vi). I know there was a thread a while ago about
> entering the cent sign (i.e. ¢). Is there a tutorial somewhere, or a table where I
> could look these up?
>
> Thanks
Take a look at this URL if vim works for you:
http://www.vim.org/htmldoc/digraph.html
Ed.
| |
| Stephane CHAZELAS 2005-02-05, 5:49 pm |
| 2005-02-04, 18:21(+00), SuperDaemon:
>
> I would like to find out how I could enter special characters or uncommon
> symbols in bash (or in vi). I know there was a thread a while ago about
> entering the cent sign (i.e. ¢). Is there a tutorial somewhere, or a table where I
> could look these up?
[...]
There are plenty of ways to do it at different levels.
Note that ¢ is not represented the same way in every charset.
In a latin1 charset (or latin9, i.e. iso8859-15, which is the
new standard charset for Western European languages), it's
represented as the character number 162.
In the Unicode charset with a UTF-8 encoding, it's represented
as two bytes (194 and 162).
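The byte values above can be checked from any shell. This is a minimal sketch, assuming a POSIX printf, od and awk are available; the variable names are only illustrative. Octal escapes are used because they are the portable form for printf: 162 is 242 in octal, and 194 is 302.

```shell
# ¢ encoded as UTF-8: the two bytes 194 and 162 (octal 302 and 242).
# od prints the decimal byte values; awk only normalises od's spacing.
utf8_bytes=$(printf '\302\242' | od -An -tu1 | awk '{$1=$1; print}')

# ¢ encoded as latin1/latin9: the single byte 162 (octal 242).
latin1_bytes=$(printf '\242' | od -An -tu1 | awk '{$1=$1; print}')

echo "UTF-8:  $utf8_bytes"
echo "latin1: $latin1_bytes"
```

Piping any pasted character through the same od command shows which of the two representations a given terminal actually sent.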
If you want to be able to write:
echo ¢
You need either to tell your terminal to send the 162 character
(or the two 194 and 162 characters depending on your locale) to
bash upon the pressing of a key or key combination and tell bash
to insert those characters literally on the command line (which
it should do by default if the configuration is OK).
Or you can tell bash to insert a 162 character (or 194+162) when
it receives a character or a character list.
In the first case, if the terminal is an X terminal such as
xterm, you can tell X to generate the "cent" key symbol upon
pressing a key or key combination. And xterm already knows how
to convert that keysym to the correct character (or list of
characters) through the LC_CTYPE localisation variable.
There's even a special way to do this through the "Compose" or
"Multi_key" key.
For instance, on my British PC keyboard, there's an unused
MS Windows key on the bottom right hand side of the keyboard.
xev tells me that its keycode is 116. I can then tell X to map
that key to "Multi_key":
xmodmap -e 'keycode 116 = Multi_key'
Then, if I type <RMSWin><c><|>, xterm receives a "cent" keysym,
then it sends the character 162 to bash which inserts a 162
character on the command line (which xterm displays as ¢).
On some systems, you can have a look at
/usr/X11R6/lib/X11/locale/<charset>/Compose
to see the available "Compose" sequences.
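Such a file can be searched for the sequences that produce a given keysym. The sketch below assumes that each entry is a single line starting with <Multi_key> and naming the resulting keysym after a colon; that layout may differ on your system. The function name compose_sequences is hypothetical.

```shell
# Print the Compose entries that yield a given keysym.
#   $1: keysym name, e.g. cent
#   $2: path to a Compose file (the location depends on your system and charset)
compose_sequences() {
    # Keep lines that start with <Multi_key> and, after the colon,
    # contain the keysym as a whole whitespace-delimited word.
    grep -E "^<Multi_key>.*:.*[[:space:]]$1([[:space:]]|\$)" "$2"
}

# Usage, with compose_file set to the Compose file for your charset:
#   compose_sequences cent "$compose_file"
```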
You could also have mapped
<Altgr-C> to "cent":
xmodmap -e 'keycode 54 = c C cent copyright'
To have:
<c>: c
<Shift-c>: C
<Altgr-c>: ¢
<Altgr-Shift-c>: ©
(note that the above may not work on every X server: you need
to have <Altgr> mapped to Mode_switch; when xkb is used, it
often works differently).
Now, you can also tell xterm to send a specific character upon
the reception of a keysym. For instance,
xterm -xrm 'XTerm.VT100.translations: #override <Key>F11: string(0xA2)'
and xterm will send the character 0xA2 (162) upon pressing <F11>
Now, you can have an all-bash solution. That means it will work
only at bash prompt but that it will work for bash in every
terminal (not just the X ones):
bind '"\C-kCt": "¢"'
And bash will insert the ¢ character(s) when it receives the
'^K', 'C', 't' character sequence. Well, that's only in theory,
bash (actually readline within bash) key sequence support is
quite bogus.
You may prefer using zsh:
bindkey -s '\C-kCt' '\u00a2'
If you use the \u00a2 Unicode representation, that means the
above will work in any locale (for instance, zsh will send the
char 162 in a latin1 locale and 194+162 in a utf8 locale).
(Note that I chose ^KCt because it is the character sequence
that is used in the vim editor to insert a ¢).
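For shells without a \u escape, the same locale-dependent choice can be made by hand. This is only a sketch: the function name cent_sign is hypothetical, and it assumes that any locale whose name does not mention UTF-8 uses a latin1-style single-byte charset, which is not true in general.

```shell
# Print ¢ encoded for a locale.
#   $1: locale name (optional); defaults to the effective LC_CTYPE setting,
#       resolved in the usual order LC_ALL, then LC_CTYPE, then LANG.
cent_sign() {
    cent_locale=${1:-${LC_ALL:-${LC_CTYPE:-${LANG:-}}}}
    case "$cent_locale" in
        # UTF-8 locales: the two bytes 194 and 162
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) printf '\302\242' ;;
        # ASSUMPTION: everything else is treated as latin1/latin9 (byte 162)
        *) printf '\242' ;;
    esac
}

# Usage: echo "price: 5$(cent_sign)"
```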
Finally, note that bash, zsh and ksh93 have the $'...' special
kind of quotes that allows one to enter:
echo $'\xA2'
echo $'\242'
echo $'\u00a2' # zsh and ksh93 only
--
Stéphane
| |
| Robert Bonomi 2005-02-05, 5:49 pm |
| In article <M2PMd.324$UX3.0@newsread3.news.pas.earthlink.net>,
SuperDaemon <Superdaemon@DiskAndExecutionMON.biz> wrote:
>
>I would like to find out how I could enter special characters or uncommon
>symbols in bash (or in vi). I know there was a thread a while ago about
>entering the cent sign (i.e. ¢). Is there a tutorial somewhere, or a table where I
>could look these up?
>
>Thanks
Part of the answer is 'hardware dependent' -- what you have to do to generate
that special/uncommon symbol from your keyboard.
'The rest of the answer' is "Control-V" -- a meta-character that means "take
the next character _literally_, without regard for any special meaning it
might have".
Then there is the 'minor issue' of how those 'special/uncommon' symbols will
be displayed on _other_ devices -- e.g. your 'cent sign', above, showed here
as a lower-case letter 'o', with an accent grave -- no way in h*ll I'd have
guessed that you "meant" a cent-sign, if you hadn't expressly so stated. 
| |
| Stephane CHAZELAS 2005-02-05, 5:49 pm |
| 2005-02-05, 18:29(-00), Robert Bonomi:
[...]
> Then there is the 'minor issue' of how those 'special/uncommon' symbols will
> be displayed on _other_ devices -- e.g. your 'cent sign', above, showed here
> as a lower-case letter 'o', with an accent grave -- no way in h*ll I'd have
> guessed that you "meant" a cent-sign, if you hadn't expressly so stated. 
[...]
Then, your news reader has to be broken or not MIME-compliant.
The OP clearly stated in the headers:
Content-Type: text/plain; charset=iso-8859-1
And in iso-8859-1, character number 162 has to be displayed as a
Cent symbol, certainly not as « ò » (o accent grave).
Note that the characters with eighth bit on may be unusual or
uncommon in English speaking countries but they are not
elsewhere.
(even in Britain, £ (pound sign) is a non-ASCII character).
--
Stéphane
| |
| Sven Mascheck 2005-02-06, 7:47 am |
| Stephane CHAZELAS wrote:
> [...] And xterm already knows how to convert that keysym to the correct
> character (or list of characters) through the LC_CTYPE localisation
> variable.
Avoiding ambiguity: The locale (and esp. LC_CTYPE for printability)
is always of concern if you want more than ASCII.
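A quick way to see which locale setting is in effect for character handling is sketched below. It assumes the usual precedence (LC_ALL, then LC_CTYPE, then LANG, falling back to C) and uses the POSIX locale utility only when it is installed; the variable name is illustrative.

```shell
# Resolve the setting that governs character classification and encoding.
effective_ctype=${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}
echo "effective LC_CTYPE: $effective_ctype"

# If the locale utility is present, ask it for the codeset name as well.
if command -v locale >/dev/null 2>&1; then
    echo "charmap: $(locale charmap)"
fi
```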
> Note that the characters with eighth bit on may be unusual or
> uncommon in English speaking countries but they are not
> elsewhere.
>
> (even in Britain, £ (pound sign) is a non-ASCII character).
Interestingly the US _are_ prepared for 8bit: LC_CTYPE=en_US always
covers latin1, even without a codeset suffix like "en_US.iso8859_1".
> xterm -xrm 'XTerm.VT100.translations: #override <Key>F11: string(0xA2)'
(offtopic: ":@Alt_L<Key>c" or the like can be _handy_, that is,
overloading Alt/Meta-a/o/u/s/c/m - if not already in use by
an application running in xterm)
| |
| Robert Bonomi 2005-02-07, 8:47 pm |
| In article <slrnd0a8pl.4aj.stephane.chazelas@spam.is.invalid>,
Stephane CHAZELAS <this.address@is.invalid> wrote:
>2005-02-05, 18:29(-00), Robert Bonomi:
>[...]
>[...]
>
>Then, your news reader has to be broken or not MIME-compliant.
Thank you for playing.
The actual situation is "none of the above"
The news-reader software is fine.
The _dumb_terminal_ I'm using does not have the *HARDWARE* support needed.
There is _no_ 'cent sign' in the character generator 'memory'.
Given that situation, the news reader software has a very limited spectrum
of alternatives:
1) don't send _any_ symbol for that character. (in effect "deleting" it
from display)
2) substitute an ASCII [SP] (in effect 'ignoring' the character)
3) substitute a 'non-blank' filler character (indicating 'something
"unprintable" goes here' -- that *assumes* that the display device does
have something that can be so used, however)
4) pass the _unaltered_ symbol code to the device, and let the device
decide 'what, if anything' to do with it.
"Pass the buck" (option 4), means that "pass the blame" also occurs. The
latter *is* appealing to designers in _any_ field -- when you can say "it's
not *my* fault" that the 'right thing' did *not* happen, life is easier. 
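The four alternatives can be sketched as a small filter. This is an illustration only, not how any particular news reader works: the function name render_8bit is hypothetical, '?' is an arbitrary choice of filler, and every byte from 128 to 255 is treated as undisplayable.

```shell
# Apply one fallback strategy to bytes 128-255 (octal 200-377).
# Reads stdin, writes stdout.  $1: delete | space | filler | pass
render_8bit() {
    case "$1" in
        delete) LC_ALL=C tr -d '\200-\377' ;;        # 1) drop the character
        space)  LC_ALL=C tr '\200-\377' '[ *]' ;;    # 2) substitute an ASCII space
        filler) LC_ALL=C tr '\200-\377' '[?*]' ;;    # 3) substitute a visible filler
        pass)   cat ;;                               # 4) hand the byte to the device
        *) echo "unknown strategy: $1" >&2; return 2 ;;
    esac
}

# Usage: printf 'a\242b\n' | render_8bit filler
```

LC_ALL=C makes tr work on single bytes regardless of the caller's locale.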
>
>The OP clearly stated in the headers:
>Content-Type: text/plain; charset=iso-8859-1
>
>And in iso-8859-1, character number 162 has to be displayed as a
>Cent symbol, certainly not as « ò » (o accent grave).
You lie. "has to" is _not_ correct. "Should", or "is intended to" is closer
to reality.
Devices that do not have the physical capability to render the 'intended'
character *are* allowed to 'do the best they can, within their limitations'.
>Note that the characters with eighth bit on may be unusual or
>uncommon in English speaking countries but they are not
>elsewhere.
*ANY* character-set encoding "may be unusual or uncommon", somewhere, while
"not elsewhere".
BTW, "characters" do _not_ have an "eighth bit" (or any other 'bits', for
that matter). Characters are typographic symbols, with specific _shape_
characteristics. Numeric 'encodings' of characters have bits: sometimes
an 'eighth' bit; sometimes even higher-numbered ones.
>(even in Britain, £ (pound sign) is a non-ASCII character).
"BFD" applies. My original remark, as relates to "making sure that the
_receiving end_ can *DISPLAY*AS*INTENDED* the symbols used", does apply.
You can have a "MIME-compliant" message reader/displayer that does _not_
have support for some particular 'character-set' installed. If the sender
uses that character-set, the receiver will *not* see the message "as the
sender intended it".
For "communication" to occur, the sender and receiver must _agree_ on the
symbols used to exchange information. While it would seem, at first glance,
that this is an equal responsibility on both parties, in fact this is *NOT*
the case. The _sender_ does bear the greater responsibility. *THEY* have
the greater interest in ensuring that _communication_ occurs -- otherwise
they wouldn't be bothering to =send= the information in the first place.
This is not to say that the receiver has _no_ responsibility in the matter;
that is obviously not true.
The sender who _assumes_ *anything* about the capabilities of the recipient
is doing _exactly_ that -- making an ASSUMPTION. Even "assuming" ASCII as a
least-common-denominator is not necessarily reasonable, when dealing with
some Pacific Rim areas, for example.
In Western Europe, North America, and parts of Africa, you are *almost*sure*
to be safe in "assuming" the '48 printable' characters of the old IBM 026
keypunch, and the IBM 1403 printer. You can *probably* rely on the 95 printable
characters of 7-bit US-ASCII. Anything beyond that, and a 'wise man' *will*
check to make sure that the recipient can 'speak the same language' as far as
the symbols used for communication.
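A sender can at least check a message against the 95 printable US-ASCII characters before posting. The sketch below counts every byte outside that set; the function name is hypothetical, and newlines are tolerated as line separators while tabs and other control characters are counted.

```shell
# Count the bytes on stdin that are neither one of the 95 printable
# US-ASCII characters (octal 040-176) nor a newline.
count_beyond_ascii() {
    # Delete everything that is allowed, then count what is left.
    # The final tr strips the padding some wc implementations add.
    LC_ALL=C tr -d '\040-\176\n' | wc -c | tr -d ' '
}

# Usage, with message.txt standing in for the text to be posted:
#   count_beyond_ascii < message.txt
```

A result of 0 means the text stays within 7-bit printable ASCII.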
| |
| SuperDaemon 2005-02-08, 8:48 pm |
| SuperDaemon wrote:
>
> I would like to find out how I could enter special characters or uncommon
> symbols in bash (or in vi). I know there was a thread a while ago about
> entering the cent sign (i.e. ¢). Is there a tutorial somewhere, or a table where
> I could look these up?
>
> Thanks
Thanks to everyone who replied. I also found out about xmodmap and its
function.
Thanks.