Date format description #1

Open
opened 1 year ago by adele.work · 17 comments

Instead of redefining every date format, maybe you should just refer to accepted RFC formats and give some samples:

Date format must conform to one of these specifications:

RFC 2822
http://www.faqs.org/rfcs/rfc2822.html
eg. "Mon 02 Jan 2006 15:04:05 MST" "Mon 02 Jan 2006 15:04 GMT"...

or

RFC 3339
http://www.faqs.org/rfcs/rfc3339.html
eg "1996-12-19 16:39:57-08:00" "1990-12-31T23:59:60Z"...

Owner

So to summarize the headache between me and myself last night:

  • I'd like this to stay as simple as possible, which might go against using "T" between date and time as RFC 3339 suggests, if I understand correctly. Also, not sure if "+01:00" or "-08:00" in RFC 2822 is more user friendly than GMT/CEST/…?
  • It must be displayed nicely, because the browser will just display the date as is, and seeing "T" and/or "+08:00" makes it less readable (IMO).

The issue I found yesterday is that a date standard doesn't imply the best readability for end users 🤔

That being said, I'd love to use a standard as it would make tinylog applications way simpler: https://git.bacardi55.io/bacardi55/gtl/src/branch/main/core/feeds.go#L16

I also just found RFC 7231
https://datatracker.ietf.org/doc/html/rfc7231#page-65
Sun, 06 Nov 1994 08:49:37 GMT
Sat, 30 Apr 2016 17:52:13 GMT

Could be a good one possibly…

Maybe allow all 3 as they are standards?

Poster

I understand, simplicity and readability are really the goal. Maybe your first proposition is the best. And limiting it to a subset of well-known formats is a good idea.

I hadn't thought about a date format before I read this RFC.

I'm using the Linux date command in vim, which prints this for me:

$ locale date_fmt
%a %d %b %Y %r %Z
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ date
Tue 08 Jun 2021 10:12:46 PM CEST

When I read the RFC I started thinking about it. I checked which RFC my date format matches.

-I[FMT], --iso-8601[=FMT]
              output  date/time  in  ISO 8601 format.  FMT='date' for date only (the default), 'hours', 'minutes', 'seconds', or 'ns' for date and time to the indicated precision.  Example: 2006-08-14T02:34:56-06:00

       -R, --rfc-email
              output date and time in RFC 5322 format.  Example: Mon, 14 Aug 2006 02:34:56 -0600

       --rfc-3339=FMT
              output date/time in RFC 3339 format.  FMT='date', 'seconds', or 'ns' for date and time to the indicated precision.  Example: 2006-08-14 02:34:56-06:00

It's almost RFC 5322, but there is no comma after the day name. I think that is too complicated. Of course it's easier for a programmer to handle fewer formats, but it could be hard for authors to think about.

And the other side of the problem: the format has to be easy for a human to write, not just easy for a computer parser.

I don't have a good answer. I will think about it. I see that there are some projects about date parsing madness: https://github.com/soniah/date_practice. ;-)

Poster

My locale config is:

$ locale date_fmt
%a %d %b %Y %T %Z

$ locale
LANG=fr_FR.UTF-8
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC="fr_FR.UTF-8"
LC_TIME="fr_FR.UTF-8"
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY="fr_FR.UTF-8"
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER="fr_FR.UTF-8"
LC_NAME="fr_FR.UTF-8"
LC_ADDRESS="fr_FR.UTF-8"
LC_TELEPHONE="fr_FR.UTF-8"
LC_MEASUREMENT="fr_FR.UTF-8"
LC_IDENTIFICATION="fr_FR.UTF-8"
LC_ALL=

$ date
mar. 08 juin 2021 22:51:46 CEST

Clearly, my default date format is not suitable for parsing (it contains French text).

We need to settle on a few well-defined formats that are useful for tinylog.

Owner

Date format can be a mess… I had to implement 8 different formats in gtl to be compatible with all tinylogs I found so far (and I'm sure I'm missing some).

	"2006-01-02 15:04:05 MST",
	"2006-01-02 15:04 MST",
	"Mon 02 Jan 2006 03:04:05 PM MST",
	"Mon 02 Jan 2006 03:04 PM MST",
	"Mon 02 Jan 2006 15:04 MST",
	"Mon Jan  2 15:04:05 MST 2006",
	"Mon Jan 02 15:04:05 MST 2006",
	"Mon Jan 02 03:04:05 PM MST 2006",

I tried to keep it "simple" with the 3 I found the most "readable", not the easiest to parse but this may be wrong…
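
For illustration, the "try every known layout in turn" approach described above could look roughly like this in Go. This is only a sketch and not gtl's actual code: the name matchLayout is made up for the example, and the layout list is simply copied from the comment above.

```go
package main

import (
	"fmt"
	"time"
)

// layouts mirrors the eight layouts quoted above (Go reference-time notation).
var layouts = []string{
	"2006-01-02 15:04:05 MST",
	"2006-01-02 15:04 MST",
	"Mon 02 Jan 2006 03:04:05 PM MST",
	"Mon 02 Jan 2006 03:04 PM MST",
	"Mon 02 Jan 2006 15:04 MST",
	"Mon Jan  2 15:04:05 MST 2006",
	"Mon Jan 02 15:04:05 MST 2006",
	"Mon Jan 02 03:04:05 PM MST 2006",
}

// matchLayout (hypothetical helper) returns the first layout that parses s,
// or "" when none of them does.
func matchLayout(s string) string {
	for _, l := range layouts {
		if _, err := time.Parse(l, s); err == nil {
			return l
		}
	}
	return ""
}

func main() {
	for _, s := range []string{
		"2021-06-08 22:12 CEST",
		"Tue 08 Jun 2021 10:12:46 PM CEST",
		"mar. 08 juin 2021 22:51:46 CEST", // French locale output: no layout matches
	} {
		fmt.Printf("%q -> %q\n", s, matchLayout(s))
	}
}
```

Every new style found in the wild means one more entry in the list, which is the maintenance cost being complained about here.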

Might be interesting to point out as well that the gemini feed protocol has a fixed date format YYYY-MM-DD… Should we simply keep the same?

I was wondering whether a date format is really that important for tinylog. Or do we think it is important because we are thinking about Lace and want to build a single timeline?

For example, if our software saved the last state of the tinylogs and showed "there are N new posts" after comparing that state with the current one, the date wouldn't be needed, would it?

Maybe we are too focused on Lace?

Maybe the date is part of a post, like any other text information. Sometimes it won't be parsed, and processing the tinylog will still be fine. Like the last example Bacardi found:

## 2021-4-29 
### 18:40 
...
### 14:30
...
Owner

Well, I think dates are important so you can have a nice timeline with all tinylog entries from the different authors mixed together.

If we don't have this per post, it will be a mess, no?
I like the example you mentioned, but I believe it is almost too complex for the gemini philosophy around simplicity of parsing responses.

Poster

I agree with @bacardi55: if you can't parse the date, you can't aggregate several tinylogs in a timeline.

I just saw a link to this RFC for the first time, and my immediate reaction is that the date formats are just way too complex. De facto standard gemlogs with just a simple yyyy-mm-dd work well and are in line with the simplicity of gemini and gemini-text itself. Why not just add a hh:mm (24 hour, UTC) time after the date and that's it? Just seems a lot simpler, and allows you to just sort alphabetically in scripts when/if you do not care about the exact times.

It would be sad if there was a need for complex parser code or third-party date-parse libraries just to figure out what order tinylog posts are in.
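
To make the "just sort alphabetically" point concrete, here is a small Go sketch. It assumes every stamp is zero-padded `yyyy-mm-dd hh:mm` and already in UTC; the helper name newestFirst is invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// newestFirst (hypothetical helper) returns a copy of stamps ordered newest
// first, using nothing but plain string comparison.
func newestFirst(stamps []string) []string {
	out := append([]string(nil), stamps...)
	sort.Sort(sort.Reverse(sort.StringSlice(out)))
	return out
}

func main() {
	stamps := []string{"2021-04-29 14:30", "2021-06-08 22:12", "2021-04-29 18:40"}
	for _, s := range newestFirst(stamps) {
		fmt.Println(s)
	}
}
```

This only works while every field is zero-padded and every author uses the same zone; a stamp like 2021-4-29 or a mix of offsets would break the ordering.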

And to add to my previous post, including timezones is just another can of worms that it would be great to not have anyone have to deal with. Timezones are not even constant over time. And then there are the daylight saving times and the ever-changing rules for those in different countries or parts of countries. A numeric offset in hours (and half-hours) would avoid much pain and make it more likely that the format becomes well-supported.

Owner

Thanks a lot for your feedback :)

I was agreeing with you until

> including timezones is just another can of worms that it would be great to not have anyone have to deal with

I think timezones are important for making sure the timeline makes sense and is indeed a timeline.
In terms of usability, asking people to convert their local time into UTC every time is not really user friendly, and I believe a bit too limiting.
Timezones, even during summertime, can be handled nicely.
My tinylog has entries in both the CET and CEST timezones and it works fine.

But I fully agree and I think moving the date format to a simple YYYY-MM-DD HH:MM TZ would be the best. But that also means that almost every tinylog author would need to change their format right now.

As I just wrote in #4:

> Interesting you say this, as when reading the 2 options I'm clearly more for the first one. I think the confusion comes from me trying to stay as close as possible to the original format from Drew and lace
> => gemini://friendo.monster/log/lace.gmi
>
> In this post, he indicates "date must be understandable by date -d"
>
> That's most probably because he wrote lace in bash, and the date format is very permissive… To be compatible with all known tinylogs I'm following, I had to implement 10 different formats. I tried to rationalize them to 3, but even that may be too tolerant and, as said in #1, maybe we should just enforce a simple one like YYYY-MM-DD HH:MM TZ (with HH in 24-hour format, avoiding the am/pm option). The first half is the same as gemlog feeds…

But maybe that's a "mandatory" move if we want to have a "real" standard…

Owner

Seems I very wrongly assumed that timezone abbreviations were a standard.

But, from:
https://en.wikipedia.org/wiki/List_of_time_zone_abbreviations

It is clearly not the case. BST can either be:

  • Bangladesh Standard Time: UTC+06
  • Bougainville Standard Time: UTC+11
  • British Summer Time: UTC+01

So we shouldn't rely on them otherwise it would mess with timelines…
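
A quick way to see the problem from a parser's side, sketched in Go. As far as I understand the standard time package, an abbreviation it cannot resolve in the given location is accepted but recorded with a zero offset, whereas a numeric offset is unambiguous. The helper name offsetSeconds is made up, and time.UTC is passed explicitly so the result does not depend on the machine's local zone.

```go
package main

import (
	"fmt"
	"time"
)

// offsetSeconds (hypothetical helper) parses s with the given layout and
// returns the UTC offset, in seconds, that Go attached to the result.
func offsetSeconds(layout, s string) (int, error) {
	t, err := time.ParseInLocation(layout, s, time.UTC)
	if err != nil {
		return 0, err
	}
	_, off := t.Zone()
	return off, nil
}

func main() {
	// Abbreviation: parses without error, but the offset is not known.
	off, err := offsetSeconds("2006-01-02 15:04 MST", "2021-06-08 22:12 BST")
	fmt.Println("BST    ->", off, err)

	// Numeric offset: no guessing involved.
	off, err = offsetSeconds("2006-01-02 15:04 -07:00", "2021-06-08 22:12 +01:00")
	fmt.Println("+01:00 ->", off, err)
}
```

So a tool that silently accepts BST may place the entry up to eleven hours away from where the author meant it.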

The alternative is to force UTC{+/-}HHMM but this format is not reader friendly…

I'm wondering about a metadata line such as timezone: UTC +0200 (or timezone: Europe/Paris), which would then allow the author to just use ## YYYY-MM-DD HH:MM Title
That would still give authors a clean (but formalized) date format while allowing tools to parse the timeline better

Might be cleaner and easier for authors as it is less work… 🤔

Thoughts?

Poster

Ouch, I'm discovering, like you, that TZ abbreviations are not standardized.

I think the best way is to put an Epoch timestamp :D

Your idea of a meta timezone is good but it should be optional. If a (traveling) author prefers to add a TZ to each entry, I think he/she should be able to. In that case, the meta timezone is the default timezone (used when an entry has none).

If the TZ is not in the title, it means that you have to rewrite the date when mixing several tinylogs (in fact I think you have to do it anyway)
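
The "default timezone with optional per-entry override" rule discussed above could be sketched like this in Go. Everything here is an assumption made for illustration: the function name entryEpoch, the choice of a numeric ±hh:mm offset per entry, and passing the feed-level default as a number of seconds rather than parsing a metadata line.

```go
package main

import (
	"fmt"
	"time"
)

// entryEpoch (hypothetical helper) parses "YYYY-MM-DD hh:mm" with an optional
// trailing "±hh:mm". defaultOffset (seconds east of UTC) stands in for the
// feed-level timezone metadata and is only used when the entry has no offset.
func entryEpoch(stamp string, defaultOffset int) (int64, error) {
	// An explicit per-entry offset wins.
	if t, err := time.Parse("2006-01-02 15:04 -07:00", stamp); err == nil {
		return t.Unix(), nil
	}
	// Otherwise interpret the wall-clock time in the default zone.
	def := time.FixedZone("", defaultOffset)
	t, err := time.ParseInLocation("2006-01-02 15:04", stamp, def)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

func main() {
	const feedDefault = 2 * 60 * 60 // as if the feed declared "timezone: UTC +0200"
	for _, s := range []string{"2021-06-08 22:12", "2021-06-08 22:12 -06:00"} {
		e, err := entryEpoch(s, feedDefault)
		fmt.Println(s, "->", e, err)
	}
}
```

Once every entry is reduced to an epoch value, entries from several tinylogs can be merged and sorted numerically, whatever the authors wrote.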

Owner

> I think the best way is to put an Epoch timestamp :D

Yeah, that would be easy for users :D

> If a (traveling) author prefers to add TZ to each entry, I think he/she could do it.

I'm a bit afraid for the readability of entries if we start adding more info 🤔

And when you say TZ, are you referring to the UTC+0200 format or the Europe/Paris one? (Or the abbreviation CEST?)

The way lace handles time right now is:

The author's time stamp is converted to epoch, some date math is done for how long ago, and the date is presented in the default date/time format of the reader.

It doesn't solve the problem but it does show that the format the author uses does not have to be the same as the reader sees.

So the standard could be for the author to use UTC, a UTC offset (+/-), or even epoch time, while Lace and/or gtl could put the time in local time and use the locale format of the reader. I personally like just using UTC.

I like the idea of the format changing to ## YYYY-MM-DD HH:MM TZ

That way the only deviation from the Gemini feed standard is the addition of time. I realize it is a pain, but should we not adhere to the standards already set for Gemini?

Again I am looking at the standard being for the author. The software the reader uses could put the timestamp in a different format as defined by the locale of the reader.
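
The flow described above (author stamp to epoch, some date math, then display in the reader's own terms) might look like this in Go. It is a sketch of the idea only, not how lace (a bash script) does it. The author-side layout with a numeric offset and the helper names toEpoch and minutesBetween are assumptions. Note that Go's standard library has no locale-aware date formatting, so the display layout below is an arbitrary choice.

```go
package main

import (
	"fmt"
	"time"
)

// layout is an assumed author-side format: "YYYY-MM-DD hh:mm ±hh:mm".
const layout = "2006-01-02 15:04 -07:00"

// toEpoch (hypothetical helper) converts an author's stamp to Unix seconds.
func toEpoch(stamp string) (int64, error) {
	t, err := time.Parse(layout, stamp)
	if err != nil {
		return 0, err
	}
	return t.Unix(), nil
}

// minutesBetween reports the whole minutes elapsed between two epoch values.
func minutesBetween(thenEpoch, nowEpoch int64) int64 {
	return (nowEpoch - thenEpoch) / 60
}

func main() {
	epoch, err := toEpoch("2021-06-08 22:12 +02:00")
	if err != nil {
		fmt.Println("unparseable:", err)
		return
	}
	fmt.Printf("posted %d minutes ago\n", minutesBetween(epoch, time.Now().Unix()))
	// Shown in the reader's zone, independent of what the author wrote.
	fmt.Println(time.Unix(epoch, 0).Local().Format("Mon 02 Jan 2006 15:04 MST"))
}
```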

Poster

## YYYY-MM-DD HH:MM TZ is certainly the best solution, letting the author choose their TZ format (+0200 or CEST). Even if abbreviations are not standardized, they are recognized by almost all date tools. If there is no TZ, UTC is the default.

Owner

After getting fed up with adding 8 new time formats in gtl, I've restricted the date format a bit more (564aff963d), as mentioned by @adele.work:

> <Date> format: The format extends the date format of the gemini feed standard (https://gemini.circumlunar.space/docs/companion/subscription.gmi) (YYYY-MM-DD):
>
> YYYY-MM-DD hh:mm TZ
>
> TZ should either be:
>
>   • A timezone abbreviation like UTC, CEST, ET, BST, …
>   • A valid UTC offset (eg: +02:00 for CEST, -05:00 for EST, …).
>
> Note: Timezone abbreviations are not standardized (eg BST is the abbreviation of 3 different timezones). For tools, the UTC offset is usually easier to interpret.
>
> If the timezone is not specified, UTC should be assumed.
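
For anyone implementing a reader, here is a rough Go sketch of a parser for the restricted format quoted above. It is not the code from the gtl commit; the name normalize and the choice to return the instant rendered in UTC are mine. Two caveats, as far as I understand Go's time package: abbreviations other than UTC that it cannot resolve end up with a zero offset, and very short abbreviations such as ET are likely rejected outright, which is one more argument for numeric offsets.

```go
package main

import (
	"fmt"
	"time"
)

// normalize (hypothetical helper) parses "YYYY-MM-DD hh:mm [TZ]" and returns
// the same instant rendered in UTC. A missing TZ is treated as UTC.
func normalize(s string) (string, error) {
	layouts := []string{
		"2006-01-02 15:04 -07:00", // numeric UTC offset, e.g. +02:00
		"2006-01-02 15:04 MST",    // abbreviation, e.g. UTC
		"2006-01-02 15:04",        // no TZ: UTC assumed
	}
	var firstErr error
	for _, l := range layouts {
		t, err := time.ParseInLocation(l, s, time.UTC)
		if err == nil {
			return t.UTC().Format("2006-01-02 15:04"), nil
		}
		if firstErr == nil {
			firstErr = err
		}
	}
	return "", firstErr
}

func main() {
	for _, s := range []string{
		"2021-06-08 22:12 +02:00",
		"2021-06-08 22:12 UTC",
		"2021-06-08 22:12",
	} {
		out, err := normalize(s)
		fmt.Printf("%-26s -> %s %v\n", s, out, err)
	}
}
```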
