

Interpret string according to given format

The first function is a rather low-level interface. It is nevertheless frequently used in user programs since it is better known. Its implementation and interface, though, are heavily influenced by the getdate function, which is defined and implemented in terms of calls to strptime.

Function: char * strptime (const char *s, const char *fmt, struct tm *tp)
The strptime function parses the input string s according to the format string fmt and stores the found values in the structure tp.

The input string can come from anywhere. It does not matter whether it was generated by a strftime call or made up directly by a program. Nor does the content have to be in any human-recognizable format; a date written like "02:1999:9", which is not understandable without context, is fine. As long as the format string fmt matches the format of the input string, everything goes.
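As an illustration, here is a minimal sketch that parses the string "02:1999:9". The field order (month, year, day) and the helper name parse_odd_date are assumptions made for this example; the text does not say how the string is meant to be read.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Parse a date written as month:year:day, e.g. "02:1999:9".
   The field order is an assumption of this sketch.
   Returns 0 on success, -1 if the input does not match.  */
int
parse_odd_date (const char *s, struct tm *tp)
{
  memset (tp, 0, sizeof *tp);
  return strptime (s, "%m:%Y:%d", tp) != NULL ? 0 : -1;
}
```

On success tm_mon holds the month counted from zero and tm_year holds the year minus 1900, as usual for struct tm.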

The format string consists of the same components as the format string for the strftime function. The only difference is that the flags _, -, 0, and ^ are not allowed. Several of the formats which strftime handled differently do the same work in strptime since differences like case of the output do not matter. For symmetry reasons all formats are supported, though.

The modifiers E and O are also allowed everywhere the strftime function allows them.

The formats are:

%a
%A
The weekday name according to the current locale, in abbreviated form or the full name.
%b
%B
%h
The month name according to the current locale, in abbreviated form or the full name.
%c
The date and time representation for the current locale.
%Ec
Like %c but the locale's alternative date and time format is used.
%C
The century of the year. It makes sense to use this format only if the format string also contains the %y format.
%EC
The locale's representation of the period. Unlike %C, it sometimes makes sense to use this format, since in some cultures it is required to specify years relative to periods instead of using the Gregorian years.
%d
%e
The day of the month as a decimal number (range 1 through 31). Leading zeroes are permitted but not required.
%Od
%Oe
Same as %d but the locale's alternative numeric symbols are used. Leading zeroes are permitted but not required.
%D
Equivalent to the use of %m/%d/%y in this place.
%F
Equivalent to the use of %Y-%m-%d which is the ISO 8601 date format. This is a GNU extension following an ISO C 9X extension to strftime.
%g
The year corresponding to the ISO week number, but without the century (range 00 through 99). Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set. This format is a GNU extension following a GNU extension of strftime.
%G
The year corresponding to the ISO week number. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set. This format is a GNU extension following a GNU extension of strftime.
%H
%k
The hour as a decimal number, using a 24-hour clock (range 00 through 23). %k is a GNU extension following a GNU extension of strftime.
%OH
Same as %H but using the locale's alternative numeric symbols.
%I
%l
The hour as a decimal number, using a 12-hour clock (range 01 through 12). %l is a GNU extension following a GNU extension of strftime.
%OI
Same as %I but using the locale's alternative numeric symbols.
%j
The day of the year as a decimal number (range 1 through 366). Leading zeroes are permitted but not required.
%m
The month as a decimal number (range 1 through 12). Leading zeroes are permitted but not required.
%Om
Same as %m but using the locale's alternative numeric symbols.
%M
The minute as a decimal number (range 0 through 59). Leading zeroes are permitted but not required.
%OM
Same as %M but using the locale's alternative numeric symbols.
%n
%t
Matches any white space.
%p
%P
The locale-dependent equivalent to `AM' or `PM'. This format is not useful unless %I or %l is also used. Another complication is that the locale might not define these values at all and therefore the conversion fails. %P is a GNU extension following a GNU extension to strftime.
%r
The complete time using the AM/PM format of the current locale. A complication is that the locale might not define this format at all and therefore the conversion fails.
%R
The hour and minute in decimal numbers using the format %H:%M. %R is a GNU extension following a GNU extension to strftime.
%s
The number of seconds since the epoch, i.e., since 1970-01-01 00:00:00 UTC. Leap seconds are not counted unless leap second support is available. %s is a GNU extension following a GNU extension to strftime.
%S
The seconds as a decimal number (range 0 through 61). Leading zeroes are permitted but not required. Please note the nonsense with 61 being allowed. This is what the Unix specification says. They followed the stupid decision once made to allow double leap seconds. These do not exist but the myth persists.
%OS
Same as %S but using the locale's alternative numeric symbols.
%T
Equivalent to the use of %H:%M:%S in this place.
%u
The day of the week as a decimal number (range 1 through 7), Monday being 1. Leading zeroes are permitted but not required. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set.
%U
The week number of the current year as a decimal number (range 0 through 53). Leading zeroes are permitted but not required.
%OU
Same as %U but using the locale's alternative numeric symbols.
%V
The ISO 8601:1988 week number as a decimal number (range 1 through 53). Leading zeroes are permitted but not required. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set.
%w
The day of the week as a decimal number (range 0 through 6), Sunday being 0. Leading zeroes are permitted but not required. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set.
%Ow
Same as %w but using the locale's alternative numeric symbols.
%W
The week number of the current year as a decimal number (range 0 through 53). Leading zeroes are permitted but not required. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set.
%OW
Same as %W but using the locale's alternative numeric symbols.
%x
The date using the locale's date format.
%Ex
Like %x but the locale's alternative date representation is used.
%X
The time using the locale's time format.
%EX
Like %X but the locale's alternative time representation is used.
%y
The year without a century as a decimal number (range 0 through 99). Leading zeroes are permitted but not required. Note that using this format without the %C format is questionable at best. The strptime function treats input values in the range 69 to 99 as the years 1969 to 1999 and values in the range 0 to 68 as the years 2000 to 2068. This heuristic may fail for some input data, so it is best to avoid %y completely and use %Y instead.
%Ey
The offset from %EC in the locale's alternative representation.
%Oy
The offset of the year (from %C) using the locale's alternative numeric symbols.
%Y
The year as a decimal number, using the Gregorian calendar.
%EY
The full alternative year representation.
%z
The offset from GMT in ISO 8601/RFC 822 format, such as +0200 or -0500.
%Z
The timezone name. Note: This is not really implemented currently. The format is recognized, input is consumed but no field in tm is set.
%%
A literal `%' character.
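The pivot that %y applies to two-digit years can be observed directly. The sketch below assumes the usual rule that the values 69 through 99 map to 1969 through 1999 and 0 through 68 map to 2000 through 2068; two_digit_year is a hypothetical helper, not part of the library.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Parse a two-digit year with %y and return the full calendar year,
   or -1 if the input does not match.  */
int
two_digit_year (const char *s)
{
  struct tm tm;

  memset (&tm, 0, sizeof tm);
  if (strptime (s, "%y", &tm) == NULL)
    return -1;
  /* tm_year counts years since 1900.  */
  return tm.tm_year + 1900;
}
```

Under that assumption "68" and "69" end up a century apart, which is the reason %Y is the safer choice whenever the input carries a four-digit year.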

All other characters in the format string must have a matching character in the input string. The exception is white space in the format string, which matches zero or more white space characters in the input string.
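The matching rules for literals and white space can be tried out with a small predicate. This is only a sketch; the name matches is hypothetical and the result structure is thrown away.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Return nonzero if INPUT can be matched against FMT.  */
int
matches (const char *input, const char *fmt)
{
  struct tm tm;

  memset (&tm, 0, sizeof tm);
  return strptime (input, fmt, &tm) != NULL;
}
```

With the format "%Y-%m-%d %H:%M", the single space in the format accepts one blank, several blanks, or none at all between the date and the time, whereas the literal '-' characters must appear exactly as written.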

The strptime function processes the input string from left to right. Each of the three possible input elements (white space, literal, or format) is handled one after the other. If the input cannot be matched to the format string, the function stops. The remainder of the format and input strings is not processed.

The return value of the function is a pointer to the first character not processed in this function call. If the input string contains more characters than the format string requires, the return value points right after the last consumed input character. If the whole input string is consumed, the return value points to the NUL byte at the end of the string. If strptime fails to match all of the format string, an error has occurred and the function returns NULL.
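The three possible outcomes of the return value (a pointer into the input, a pointer to the terminating NUL byte, or NULL) can be told apart as in the following sketch. Both helper names are hypothetical.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Parse an ISO date at the start of S.  On success store it in *TP
   and return a pointer to the unparsed remainder; on failure
   return NULL.  */
const char *
parse_date_prefix (const char *s, struct tm *tp)
{
  memset (tp, 0, sizeof *tp);
  return strptime (s, "%Y-%m-%d", tp);
}

/* Return nonzero only if the whole of S is an ISO date.  */
int
is_exact_date (const char *s)
{
  struct tm tm;
  const char *rest = parse_date_prefix (s, &tm);

  /* A complete match leaves REST pointing at the NUL byte.  */
  return rest != NULL && *rest == '\0';
}
```

Checking for the NUL byte is what distinguishes a complete match from a match followed by trailing text; strptime itself reports both as success.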

The specification of the function in the XPG standard is rather vague. It leaves out a few important pieces of information. Most importantly, it does not specify what happens to those elements of tm which are not directly initialized by the different formats. Implementations on different Unix systems vary here.

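The GNU libc implementation does not touch those fields which are not directly initialized. Exceptions are the tm_wday and tm_yday elements, which are recomputed if any of the year, month, or date elements changed. This has two implications. First, the caller should initialize the structure before calling strptime for a new input string, typically by setting all fields to zero, since fields that no format sets keep whatever values they had. Second, a single struct tm can be filled in by several consecutive strptime calls, for example one for a string with the date and one for a string with the time.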
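Because untouched fields are left alone, a broken-down time can be assembled from separate strings. The following is a minimal sketch that relies on the GNU libc behavior described above; other implementations may treat untouched fields differently, and the name parse_date_and_time is hypothetical.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Fill *TP from separate date and time strings with two strptime
   calls.  Returns 0 on success, -1 if either string fails to parse.  */
int
parse_date_and_time (const char *date, const char *time_str,
                     struct tm *tp)
{
  /* Clear the structure once, before the first call.  */
  memset (tp, 0, sizeof *tp);
  if (strptime (date, "%Y-%m-%d", tp) == NULL)
    return -1;

  /* Do not clear *TP here: the second call adds the time fields
     to the date fields already stored.  */
  if (strptime (time_str, "%H:%M:%S", tp) == NULL)
    return -1;

  return 0;
}
```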

The following example shows a function which parses a string which is supposed to contain the date information in either US style or ISO 8601 form.

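#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

const char *
parse_date (const char *input, struct tm *tm)
{
  const char *cp;

  /* First clear the result structure.  */
  memset (tm, '\0', sizeof (*tm));

  /* Try the ISO format first.  */
  cp = strptime (input, "%F", tm);
  if (cp == NULL)
    {
      /* Does not match.  Clear any fields the failed attempt may
         have set, then try the US form.  */
      memset (tm, '\0', sizeof (*tm));
      cp = strptime (input, "%D", tm);
    }

  /* NULL if neither form matched, else the first unparsed character.  */
  return cp;
}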
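The same idea extends to any number of candidate formats. The sketch below is one possible generalization, not part of the library: parse_with_formats is a hypothetical helper, and it additionally requires that the whole input be consumed. It spells out %F and %D as %Y-%m-%d and %m/%d/%y, which the format list describes as equivalent.

```c
#define _GNU_SOURCE   /* Make strptime visible in <time.h>.  */
#include <string.h>
#include <time.h>

/* Try each format in FORMATS (a NULL-terminated array) in turn and
   return the index of the first one that matches all of INPUT, or
   -1 if none does.  *TM is cleared before every attempt so that a
   failed attempt leaves nothing behind.  */
int
parse_with_formats (const char *input, const char *const formats[],
                    struct tm *tm)
{
  for (int i = 0; formats[i] != NULL; ++i)
    {
      memset (tm, 0, sizeof *tm);
      const char *end = strptime (input, formats[i], tm);
      if (end != NULL && *end == '\0')
        return i;
    }
  return -1;
}
```

Returning the index tells the caller which format matched, which is useful when the formats differ in what they set.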

