Discussion:
Upcoming gfortran 15 will contain unsigned numbers
Thomas Koenig
2024-10-13 09:52:15 UTC
Hello world,

J3 has passed https://j3-fortran.org/doc/year/24/24-116.txt , a
proposal for unsigned numbers in Fortran, in their February/March
meeting.

Gfortran 15, to be released in the beginning of 2025, will contain
an experimental implementation of that proposal. The current
development version already has that documentation, so if you
are so inclined, you can download and compile it for yourself to
try it out already.

This feature can be accessed by using the -funsigned flag.
Some short documentation can be found at

https://gcc.gnu.org/onlinedocs/gfortran/Unsigned-integers.html

Test cases (also doubling as use examples) can be found at
https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=gcc/testsuite/gfortran.dg
(any file that has "unsigned" in its name).

Bug reports and comments are welcome.

Enjoy!

Best regards

Thomas
Gary Scott
2024-10-13 14:09:22 UTC
Post by Thomas Koenig
Hello world,
J3 has passed https://j3-fortran.org/doc/year/24/24-116.txt , a
proposal for unsigned numbers in Fortran, in their February/March
meeting.
Gfortran 15, to be released in the beginning of 2025, will contain
an experimental implementation of that proposal. The current
development version already has that documentation, so if you
are so inclined, you can download and compile it for yourself to
try it out already.
This feature can be accessed by using the -funsigned flag.
Some short documentation can be found at
https://gcc.gnu.org/onlinedocs/gfortran/Unsigned-integers.html
Test cases (also doubling as use examples) can be found at
https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=gcc/testsuite/gfortran.dg
(any file that has "unsigned" in its name).
Bug reports and comments are welcome.
Enjoy!
Best regards
Thomas
Yay? :) I'm so used to using larger signed integers and various tricks
that I'll probably never use it, but at least I have the option now.
Lawrence D'Oliveiro
2024-10-13 21:04:36 UTC
Post by Thomas Koenig
J3 has passed https://j3-fortran.org/doc/year/24/24-116.txt , a
proposal for unsigned numbers in Fortran ...
Do you mean “unsigned integers”?
Post by Thomas Koenig
https://gcc.gnu.org/onlinedocs/gfortran/Unsigned-integers.html
Thought so.
Lynn McGuire
2024-10-15 20:26:42 UTC
Post by Thomas Koenig
Hello world,
J3 has passed https://j3-fortran.org/doc/year/24/24-116.txt , a
proposal for unsigned numbers in Fortran, in their February/March
meeting.
Gfortran 15, to be released in the beginning of 2025, will contain
an experimental implementation of that proposal. The current
development version already has that documentation, so if you
are so inclined, you can download and compile it for yourself to
try it out already.
This feature can be accessed by using the -funsigned flag.
Some short documentation can be found at
https://gcc.gnu.org/onlinedocs/gfortran/Unsigned-integers.html
Test cases (also doubling as use examples) can be found at
https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=gcc/testsuite/gfortran.dg
(any file that has "unsigned" in its name).
Bug reports and comments are welcome.
Enjoy!
Best regards
Thomas
Any plans to support UTF16 or UTF8 in gfortran ?

Thanks,
Lynn
Steven G. Kargl
2024-10-15 20:46:59 UTC
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
gfortran has supported UTF-8 for a long time now. Here's
an example from the manual.


program character_kind
  use iso_fortran_env
  implicit none
  integer, parameter :: ascii = selected_char_kind ("ascii")
  integer, parameter :: ucs4  = selected_char_kind ('ISO_10646')

  character(kind=ascii, len=26) :: alphabet
  character(kind=ucs4,  len=30) :: hello_world

  alphabet = ascii_"abcdefghijklmnopqrstuvwxyz"
  hello_world = ucs4_'Hello World and Ni Hao -- ' &
                // char (int (z'4F60'), ucs4)     &
                // char (int (z'597D'), ucs4)

  write (*,*) alphabet

  open (output_unit, encoding='UTF-8')
  write (*,*) trim (hello_world)
end program character_kind
--
steve
Lynn McGuire
2024-10-15 21:51:32 UTC
Post by Steven G. Kargl
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
gfortran has supported UTF-8 for a long time now. Here's
an example from the manual.
program character_kind
use iso_fortran_env
implicit none
integer, parameter :: ascii = selected_char_kind ("ascii")
integer, parameter :: ucs4 = selected_char_kind ('ISO_10646')
character(kind=ascii, len=26) :: alphabet
character(kind=ucs4, len=30) :: hello_world
alphabet = ascii_"abcdefghijklmnopqrstuvwxyz"
hello_world = ucs4_'Hello World and Ni Hao -- ' &
// char (int (z'4F60'), ucs4) &
// char (int (z'597D'), ucs4)
write (*,*) alphabet
open (output_unit, encoding='UTF-8')
write (*,*) trim (hello_world)
end program character_kind
Thanks !

Lynn
Lawrence D'Oliveiro
2024-10-15 22:35:36 UTC
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
No UTF-16, please!

That was just a horrible backward-compatibility hack for the sake of those
platforms (*cough* Java, Windows NT *cough*) that embraced Unicode as
their native encoding just a little too soon.
Lynn McGuire
2024-11-09 03:00:55 UTC
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
No UTF-16, please!
That was just a horrible backward-compatibility hack for the sake of those
platforms (*cough* Java, Windows NT *cough*) that embraced Unicode as
their native encoding just a little too soon.
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?

Lynn
Lawrence D'Oliveiro
2024-11-09 04:37:37 UTC
Post by Lynn McGuire
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?
No!
Lynn McGuire
2024-11-09 05:23:56 UTC
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?
No!
A friend of mine was working on a PLC in Japan a couple of decades ago.
It was very custom Unix and UTF-32. He was shocked. He had several
problems porting their software to the PLC.

Lynn
Thomas Koenig
2024-11-09 11:57:41 UTC
Post by Lynn McGuire
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
No UTF-16, please!
That was just a horrible backward-compatibility hack for the sake of those
platforms (*cough* Java, Windows NT *cough*) that embraced Unicode as
their native encoding just a little too soon.
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?
Most certainly not:

$ echo $LANG
de_DE.UTF-8

Haven't seen anything but UTF-8 in a long time.
Lynn McGuire
2024-11-23 01:40:19 UTC
Post by Thomas Koenig
Post by Lynn McGuire
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
No UTF-16, please!
That was just a horrible backward-compatibility hack for the sake of those
platforms (*cough* Java, Windows NT *cough*) that embraced Unicode as
their native encoding just a little too soon.
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?
$ echo $LANG
de_DE.UTF-8
Haven't seen anything but UTF-8 in a long time.
Come over here to Windows. UTF-16 is the name of the game. It is a
total pain.

Microsoft is rumored to be working on a UTF-8 API for Win32 / Win64. I
will believe it when I see it.

Lynn
Lawrence D'Oliveiro
2024-11-23 02:08:19 UTC
Post by Lynn McGuire
Come over here to Windows. UTF-16 is the name of the game. It is a
total pain.
Microsoft (and Sun, with Java) adopted Unicode at precisely the wrong
time, back when everybody believed the Unicode folks who said that it
would remain a fixed-length 16-bit code.
Post by Lynn McGuire
Microsoft is rumored to be working on a UTF-8 API for Win32 / Win64. I
will believe it when I see it.
Linux simply ignored the issue. Filespecs passed to the kernel are split
at ASCII “/” characters and terminated by NUL. And those are the only byte
values with special interpretations; file/directory names are free to
contain anything else. As a result, it works seamlessly with UTF-8.
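
The byte-level rule described above can be illustrated with a small sketch. This is Python rather than Fortran, purely for brevity, and `split_path` is a hypothetical helper that mimics the described behaviour in user space; it is not how the kernel is actually written. It relies on the fact that every byte of a multi-byte UTF-8 sequence has its high bit set, so it can never collide with ASCII "/" (0x2F) or NUL (0x00).

```python
def split_path(raw):
    """Split a byte string the way the post describes: stop at the
    first NUL, then cut at every ASCII '/' byte. No decoding is done,
    so the function is agnostic about the encoding of the names."""
    raw = raw.split(b"\x00", 1)[0]          # NUL terminates the filespec
    return [part for part in raw.split(b"/") if part]


path = "/home/lynn/你好/données.txt"

# UTF-8: multi-byte sequences never contain 0x2F or 0x00,
# so the components survive the byte-wise split intact.
utf8_parts = [p.decode("utf-8") for p in split_path(path.encode("utf-8"))]
print(utf8_parts)

# UTF-16: even a plain ASCII letter carries a NUL byte, and some
# characters (U+262F, for example) carry a stray 0x2F byte.
print(b"\x00" in "a".encode("utf-16-le"))
print(b"/" in "\u262f".encode("utf-16-le"))
```

The same byte-wise split applied to UTF-16 data would truncate or mis-split names, which is why UTF-16 needs a separate wide-character API while UTF-8 does not.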
Lynn McGuire
2024-11-23 03:16:19 UTC
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Come over here to Windows. UTF-16 is the name of the game. It is a
total pain.
Microsoft (and Sun, with Java) adopted Unicode at precisely the wrong
time, back when everybody believed the Unicode folks who said that it
would remain a fixed-length 16-bit code.
Post by Lynn McGuire
Microsoft is rumored to be working on a UTF-8 API for Win32 / Win64. I
will believe it when I see it.
Linux simply ignored the issue. Filespecs passed to the kernel are split
at ASCII “/” characters and terminated by NUL. And those are the only byte
values with special interpretations; file/directory names are free to
contain anything else. As a result, it works seamlessly with UTF-8.
It was probably an economic decision on Microsoft's part with having to
ship different versions of DOS and Windows with English, French,
Russian, Chinese, Japanese, etc, etc, etc back in the 1980s. UTF-8 came
out in the middle 1990s ??? I suspect that they wanted to ship just one
version of Windows, Office, etc and have it automatically acclimate to
the user's desired main language.

Several countries mandated that large software shops ship their
software in the country's native language. France and Quebec were the
guiltiest of this. I actually have a French version of Microsoft Office
95 over in my cabinet.

Lynn
Thomas Koenig
2024-11-23 08:04:06 UTC
Post by Lynn McGuire
Post by Thomas Koenig
Post by Lynn McGuire
Post by Lawrence D'Oliveiro
Post by Lynn McGuire
Any plans to support UTF16 or UTF8 in gfortran ?
No UTF-16, please!
That was just a horrible backward-compatibility hack for the sake of those
platforms (*cough* Java, Windows NT *cough*) that embraced Unicode as
their native encoding just a little too soon.
Aren't all of the Unix and Linux boxen either UTF-16 or UTF-32 ?
$ echo $LANG
de_DE.UTF-8
Haven't seen anything but UTF-8 in a long time.
Come over here to Windows.
I should have qualified that with "on Linux".
Post by Lynn McGuire
UTF-16 is the name of the game. It is a
total pain.
I can well believe that.

At least Fortran has standard methods of using UTF-8.
Wolfgang Agnes
2024-11-23 12:18:11 UTC
[...]
Post by Lynn McGuire
Post by Thomas Koenig
$ echo $LANG
de_DE.UTF-8
Haven't seen anything but UTF-8 in a long time.
Come over here to Windows. UTF-16 is the name of the game. It is a
total pain.
How about UCS-2? Have you seen it? I find UCS-2 when I use
write-sequence from Common Lisp on the Windows Console.
Post by Lynn McGuire
Microsoft is rumored to be working on a UTF-8 API for Win32 / Win64.
I will believe it when I see it.
Same here.
Lawrence D'Oliveiro
2024-11-23 22:09:39 UTC
Post by Wolfgang Agnes
How about UCS-2?
“UCS-2” was the name of the encoding back when it was assumed that Unicode
was always going to be just 16 bits. After the coding was extended, those
“surrogate” ranges were introduced, to allow representation of the extra
characters within a 16-bit encoding, and so “UCS-2” was renamed to
“UTF-16”.

In short, “UTF-16” is basically “UCS-2 with surrogates”.
Wolfgang Agnes
2024-11-25 11:35:48 UTC
Post by Lawrence D'Oliveiro
Post by Wolfgang Agnes
How about UCS-2?
“UCS-2” was the name of the encoding back when it was assumed that Unicode
was always going to be just 16 bits. After the coding was extended, those
“surrogate” ranges were introduced, to allow representation of the extra
characters within a 16-bit encoding, and so “UCS-2” was renamed to
“UTF-16”.
In short, “UTF-16” is basically “UCS-2 with surrogates”.
Nice to know! Thanks. So, UCS means ``Universal Character Set''. I
thought it was a whole different character set. It's a bit difficult to
understand ``surrogates''. So many definitions come up such as ``Basic
Multilingual Plane''. Can you explain what surrogates are?
Lynn McGuire
2024-11-25 20:39:37 UTC
Post by Wolfgang Agnes
Post by Lawrence D'Oliveiro
Post by Wolfgang Agnes
How about UCS-2?
“UCS-2” was the name of the encoding back when it was assumed that Unicode
was always going to be just 16 bits. After the coding was extended, those
“surrogate” ranges were introduced, to allow representation of the extra
characters within a 16-bit encoding, and so “UCS-2” was renamed to
“UTF-16”.
In short, “UTF-16” is basically “UCS-2 with surrogates”.
Nice to know! Thanks. So, UCS means ``Universal Character Set''. I
thought it was a whole different character set. It's a bit difficult to
understand ``surrogates''. So many definitions come up such as ``Basic
Multilingual Plane''. Can you explain what surrogates are?
There is lots of information at
https://home.unicode.org/

And
https://stackoverflow.com/

Lynn
Lawrence D'Oliveiro
2024-11-25 23:35:34 UTC
Post by Wolfgang Agnes
It's a bit difficult to understand ``surrogates''.
The Unicode folks just decided that the ranges 0xD800-0xDBFF (1024 codes
of “high surrogates”) and 0xDC00-0xDFFF (1024 codes of “low surrogates”)
would be used in pairs to represent codes above 0xFFFF in UTF-16 encoding.
This gives an additional 1024×1024 = 1048576 different codes, which should
be enough to cover the entire (current) Unicode range, which officially
goes up to 0x10FFFF. At least, that’s what they’re saying right now.

In the full UCS-4 encoding, those ranges are considered invalid.
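
For anyone who wants to see the arithmetic, here is a minimal sketch of the pairing rule described above, written in Python for brevity. The function names `to_surrogates` and `from_surrogates` are made up for this illustration; the ranges and the 10-bit split follow directly from the two 1024-code ranges mentioned above.

```python
def to_surrogates(cp):
    """Split a code point above 0xFFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points 0x10000..0x10FFFF need surrogates")
    v = cp - 0x10000                 # 20-bit offset, 0 .. 0xFFFFF
    high = 0xD800 + (v >> 10)        # top 10 bits    -> 0xD800..0xDBFF
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits -> 0xDC00..0xDFFF
    return high, low


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))           # prints: 0xd83d 0xde00

# Cross-check against Python's built-in UTF-16 codec.
assert chr(0x1F600).encode("utf-16-be") == (
    high.to_bytes(2, "big") + low.to_bytes(2, "big")
)
```

The largest code point, 0x10FFFF, maps to the pair (0xDBFF, 0xDFFF), so the 1024×1024 combinations exactly cover the range 0x10000 to 0x10FFFF.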
