openat2 — open and possibly create a file (extended)
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/openat2.h>

long openat2(int dirfd, const char *pathname,
             struct open_how *how, size_t size);

Note: There is no glibc wrapper for this system call; see NOTES.
The openat2() system call is an extension of openat(2) and
provides a superset of its functionality.
The openat2() system call opens the file specified by pathname.
If the specified file does not exist, it may optionally (if
O_CREAT is specified in how.flags) be created.
As with openat(2), if pathname is a relative pathname, then it
is interpreted relative to the directory referred to by the file
descriptor dirfd (or the current working directory of the calling
process, if dirfd is the special value AT_FDCWD). If pathname is
an absolute pathname, then dirfd is ignored (unless how.resolve
contains RESOLVE_IN_ROOT, in which case pathname is resolved
relative to dirfd).
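As an illustration of the dirfd rules above, here is a minimal sketch that opens a file relative to a directory file descriptor. The helper name open_relative_to() is hypothetical, and the sketch assumes kernel headers that provide <linux/openat2.h> and SYS_openat2.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open `name` read-only, relative to directory `dir`. */
int open_relative_to(const char *dir, const char *name)
{
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dirfd < 0)
        return -1;

    struct open_how how;
    memset(&how, 0, sizeof(how));       /* zero-fill, as the page requires */
    how.flags = O_RDONLY | O_CLOEXEC;

    /* A relative `name` is resolved against dirfd; an absolute one
       would ignore dirfd because RESOLVE_IN_ROOT is not set here. */
    int fd = (int)syscall(SYS_openat2, dirfd, name, &how, sizeof(how));

    int saved = errno;                  /* keep the openat2 error across close() */
    close(dirfd);
    errno = saved;
    return fd;
}
```

A caller would close the returned descriptor with close(2) when done; on failure the helper returns -1 with errno set by whichever call failed.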
Rather than taking a single flags argument, an extensible
structure (how) is passed to allow for future extensions. The
size argument must be specified as sizeof(struct open_how).
The how argument specifies how pathname should be opened, and
acts as a superset of the flags and mode arguments to openat(2).
This argument is a pointer to a structure of the following form:
    struct open_how {
        u64 flags;    /* O_* flags */
        u64 mode;     /* Mode for O_{CREAT,TMPFILE} */
        u64 resolve;  /* RESOLVE_* flags */
        /* ... */
    };
Any future extensions to openat2() will be implemented as new
fields appended to the above structure, with a zero value in a
new field resulting in the kernel behaving as though that
extension field was not present. Therefore, the caller must
zero-fill this structure on initialization. (See the
"Extensibility" section of the NOTES for more detail on why this
is necessary.)
The fields of the open_how structure are as follows:
flags
This field specifies the file creation and file status flags to
use when opening the file. All of the O_* flags defined for
openat(2) are valid openat2() flag values.
Whereas openat(2) ignores unknown bits in its flags argument,
openat2() returns an error if unknown or conflicting flags are
specified in how.flags.
mode
This field specifies the mode for the new file, with identical
semantics to the mode argument of openat(2).
Whereas openat(2) ignores bits other than those in the range
07777 in its mode argument, openat2() returns an error if
how.mode contains bits other than 07777. Similarly, an error is
returned if openat2() is called with a nonzero how.mode and
how.flags does not contain O_CREAT or O_TMPFILE.
resolve
This is a bit-mask of flags that modify the way in which all
components of pathname will be resolved. (See path_resolution(7)
for background information.)
The primary use case for these flags is to allow trusted programs
to restrict how untrusted paths (or paths inside untrusted
directories) are resolved. The full list of resolve flags is as
follows:
RESOLVE_BENEATH
Do not permit the path resolution to succeed if any component of
the resolution is not a descendant of the directory indicated by
dirfd. This causes absolute symbolic links (and absolute values
of pathname) to be rejected.
Currently, this flag also disables magic-link resolution (see
below). However, this may change in the future. Therefore, to
ensure that magic links are not resolved, the caller should
explicitly specify RESOLVE_NO_MAGICLINKS.

RESOLVE_IN_ROOT
Treat the directory referred to by dirfd as the root directory
while resolving pathname. Absolute symbolic links are interpreted
relative to dirfd. If a prefix component of pathname equates to
dirfd, then an immediately following .. component likewise
equates to dirfd (just as /.. is traditionally equivalent to /).
If pathname is an absolute path, it is also interpreted relative
to dirfd.
The effect of this flag is as though the calling process had used
chroot(2) to (temporarily) modify its root directory (to the
directory referred to by dirfd). However, unlike chroot(2) (which
changes the filesystem root permanently for a process),
RESOLVE_IN_ROOT allows a program to efficiently restrict path
resolution on a per-open basis.
Currently, this flag also disables magic-link resolution.
However, this may change in the future. Therefore, to ensure that
magic links are not resolved, the caller should explicitly
specify RESOLVE_NO_MAGICLINKS.

RESOLVE_NO_MAGICLINKS
Disallow all magic-link resolution during path resolution.
Magic links are symbolic link-like objects that are most notably
found in proc(5); examples include /proc/[pid]/exe and
/proc/[pid]/fd/*. (See symlink(7) for more details.)
Unknowingly opening magic links can be risky for some
applications. Examples of such risks include the following:
If the process opening a pathname is a controlling process that
currently has no controlling terminal (see credentials(7)), then
opening a magic link inside /proc/[pid]/fd that happens to refer
to a terminal would cause the process to acquire a controlling
terminal.
In a containerized environment, a magic link inside /proc may
refer to an object outside the container, and thus may provide a
means to escape from the container.
Because of such risks, an application may prefer to disable magic
link resolution using the RESOLVE_NO_MAGICLINKS flag.
If the trailing component (i.e., basename) of pathname is a magic
link, how.resolve contains RESOLVE_NO_MAGICLINKS, and how.flags
contains both O_PATH and O_NOFOLLOW, then an O_PATH file
descriptor referencing the magic link will be returned.

RESOLVE_NO_SYMLINKS
Disallow resolution of symbolic links during path resolution.
This option implies RESOLVE_NO_MAGICLINKS.
If the trailing component (i.e., basename) of pathname is a
symbolic link, how.resolve contains RESOLVE_NO_SYMLINKS, and
how.flags contains both O_PATH and O_NOFOLLOW, then an O_PATH
file descriptor referencing the symbolic link will be returned.
Note that the effect of the RESOLVE_NO_SYMLINKS flag, which
affects the treatment of symbolic links in all of the components
of pathname, differs from the effect of the O_NOFOLLOW file
creation flag (in how.flags), which affects the handling of
symbolic links only in the final component of pathname.
Applications that employ the RESOLVE_NO_SYMLINKS flag are
encouraged to make its use configurable (unless it is used for a
specific security purpose), as symbolic links are very widely
used by end-users. Setting this flag indiscriminately (i.e., for
purposes not specifically related to security) for all uses of
openat2() may result in spurious errors on previously functional
systems. This may occur if, for example, a system pathname that
is used by an application is modified (e.g., in a new
distribution release) so that a pathname component (now) contains
a symbolic link.

RESOLVE_NO_XDEV
Disallow traversal of mount points during path resolution
(including all bind mounts). Consequently, pathname must either
be on the same mount as the directory referred to by dirfd, or on
the same mount as the current working directory if dirfd is
specified as AT_FDCWD.
Applications that employ the RESOLVE_NO_XDEV flag are encouraged
to make its use configurable (unless it is used for a specific
security purpose), as bind mounts are widely used by end-users.
Setting this flag indiscriminately (i.e., for purposes not
specifically related to security) for all uses of openat2() may
result in spurious errors on previously functional systems. This
may occur if, for example, a system pathname that is used by an
application is modified (e.g., in a new distribution release) so
that a pathname component (now) contains a bind mount.

RESOLVE_CACHED
Make the open operation fail unless all path components are
already present in the kernel's lookup cache. If any kind of
revalidation or I/O is needed to satisfy the lookup, openat2()
fails with the error EAGAIN. This is useful in providing a
fast-path open that can be performed without resorting to thread
offload, or other mechanisms that an application might use to
offload slower operations.
If any bits other than those listed above are set in
how.resolve, an error is returned.
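To make the resolve flags concrete, the following minimal sketch opens an untrusted path so that resolution cannot leave the directory referred to by dirfd. The helper name open_beneath() is hypothetical. It sets RESOLVE_NO_MAGICLINKS explicitly, following the advice given under RESOLVE_BENEATH, and assumes headers that provide <linux/openat2.h> and SYS_openat2.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open `untrusted_path` read-only, refusing any
   resolution that would leave the directory referred to by dirfd. */
int open_beneath(int dirfd, const char *untrusted_path)
{
    /* The designated initializer zero-fills every field not named here. */
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS,
    };

    return (int)syscall(SYS_openat2, dirfd, untrusted_path,
                        &how, sizeof(how));
}
```

With this helper, an absolute pathname or a pathname that climbs above dirfd with ".." is expected to fail with -1 rather than open a file outside the directory.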
On success, a new file descriptor is returned. On error, -1 is
returned, and errno is set to indicate the error.
The set of errors returned by openat2() includes all of the
errors returned by openat(2), as well as the following additional
errors:
E2BIG
An extension that this kernel does not support was specified in
how. (See the "Extensibility" section of NOTES for more detail on
how extensions are handled.)

EAGAIN
how.resolve contains either RESOLVE_IN_ROOT or RESOLVE_BENEATH,
and the kernel could not ensure that a ".." component didn't
escape (due to a race condition or potential attack). The caller
may choose to retry the openat2() call.

EAGAIN
RESOLVE_CACHED was set, and the open operation cannot be
performed using only cached information. The caller should retry
without RESOLVE_CACHED set in how.resolve.

EINVAL
An unknown flag or invalid value was specified in how.

EINVAL
mode is nonzero, but how.flags does not contain O_CREAT or
O_TMPFILE.

EINVAL
size was smaller than any known version of struct open_how.

ELOOP
how.resolve contains RESOLVE_NO_SYMLINKS, and one of the path
components was a symbolic link (or magic link).

ELOOP
how.resolve contains RESOLVE_NO_MAGICLINKS, and one of the path
components was a magic link.

EXDEV
how.resolve contains either RESOLVE_IN_ROOT or RESOLVE_BENEATH,
and an escape from the root during path resolution was detected.

EXDEV
how.resolve contains RESOLVE_NO_XDEV, and a path component
crosses a mount point.
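The two retry recommendations above (retrying after a racing ".." component, and retrying without RESOLVE_CACHED) can be sketched as follows. Both helper names are hypothetical, the retry limit is an arbitrary parameter chosen by the caller, and the fallback definition of RESOLVE_CACHED is only there for older kernel headers that lack it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef RESOLVE_CACHED
#define RESOLVE_CACHED 0x20     /* absent from older kernel headers */
#endif

/* Hypothetical helper: retry a RESOLVE_IN_ROOT open a bounded number of
   times when the kernel reports EAGAIN for a racing ".." component. */
int open_in_root_retry(int rootfd, const char *path, int max_tries)
{
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_IN_ROOT | RESOLVE_NO_MAGICLINKS,
    };

    for (int i = 0; i < max_tries; i++) {
        int fd = (int)syscall(SYS_openat2, rootfd, path, &how, sizeof(how));
        if (fd >= 0 || errno != EAGAIN)
            return fd;      /* success, or an error a retry will not fix */
    }
    errno = EAGAIN;         /* gave up after max_tries attempts */
    return -1;
}

/* Hypothetical helper: try a cache-only open first, then fall back to a
   normal open if the lookup cannot be served from the cache. */
int open_cached_first(int dirfd, const char *path)
{
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_CACHED,
    };

    int fd = (int)syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
    if (fd >= 0 || errno != EAGAIN)
        return fd;

    how.resolve = 0;        /* slow path: this call may block on I/O */
    return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
}
```

In a real application the slow path in open_cached_first() would more likely be handed to a worker thread than run inline. A kernel that does not know RESOLVE_CACHED would be expected to reject it as an unknown flag, which this sketch does not handle.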
This system call is Linux-specific.
The semantics of RESOLVE_BENEATH were modeled after FreeBSD's
O_BENEATH.
Glibc does not provide a wrapper for this system call; call it using syscall(2).
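A minimal sketch of such a call through syscall(2) is shown here. The wrapper name my_openat2() is hypothetical; it mirrors the prototype in the SYNOPSIS and assumes that the installed headers define SYS_openat2 and provide <linux/openat2.h>.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical local wrapper with the same signature as the SYNOPSIS.
   Returns a new file descriptor, or -1 with errno set. */
long my_openat2(int dirfd, const char *pathname,
                struct open_how *how, size_t size)
{
    return syscall(SYS_openat2, dirfd, pathname, how, size);
}
```

A typical call passes a zero-filled struct open_how and sizeof(struct open_how) as the last two arguments.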
In order to allow for future extensibility, openat2() requires
the user-space application to specify the size of the open_how
structure that it is passing. By providing this information, it
is possible for openat2() to provide both forwards- and
backwards-compatibility, with size acting as an implicit version
number. (Because new extension fields will always be appended,
the structure size will always increase.) This extensibility
design is very similar to other system calls such as
sched_setattr(2), perf_event_open(2), and clone3(2).
If we let usize be the size of the structure as specified by the
user-space application, and ksize be the size of the structure
which the kernel supports, then there are three cases to
consider:
If ksize equals usize, then there is no version mismatch and how
can be used verbatim.
If ksize is larger than usize, then there are some extension
fields that the kernel supports which the user-space application
is unaware of. Because a zero value in any added extension field
signifies a no-op, the kernel treats all of the extension fields
not provided by the user-space application as having zero values.
This provides backwards-compatibility.
If ksize is smaller than usize, then there are some extension
fields which the user-space application is aware of but which the
kernel does not support. Because any extension field must have
its zero values signify a no-op, the kernel can safely ignore the
unsupported extension fields if they are all-zero. If any
unsupported extension fields are nonzero, then -1 is returned and
errno is set to E2BIG. This provides forwards-compatibility.
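The three cases above can be modeled in user space as a single copy routine. This is an illustrative sketch of the rule as described here, not the kernel's actual code; the function name model_copy_struct() is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative model: copy a usize-byte structure supplied by the
   application into a ksize-byte structure understood by the kernel.
   Returns 0 on success, or -E2BIG if the application set a field that
   the kernel does not know about. */
int model_copy_struct(void *kdst, size_t ksize,
                      const void *usrc, size_t usize)
{
    const unsigned char *src = usrc;
    size_t common = usize < ksize ? usize : ksize;

    /* usize > ksize: the trailing, unsupported fields must be all-zero. */
    for (size_t i = ksize; i < usize; i++) {
        if (src[i] != 0)
            return -E2BIG;
    }

    /* ksize > usize: fields the application did not supply read as zero. */
    memset(kdst, 0, ksize);

    /* ksize == usize: this copy alone uses the structure verbatim. */
    memcpy(kdst, src, common);
    return 0;
}
```

The model leaves out the separate check that rejects a size smaller than any known version of struct open_how.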
Because the definition of struct open_how may change in the future (with new fields being added when system headers are updated), user-space applications should zero-fill struct open_how to ensure that recompiling the program with new headers will not result in spurious errors at runtime. The simplest way is to use a designated initializer:
    struct open_how how = {
        .flags = O_RDWR,
        .resolve = RESOLVE_IN_ROOT,
    };
or explicitly using memset(3) or similar:
    struct open_how how;
    memset(&how, 0, sizeof(how));
    how.flags = O_RDWR;
    how.resolve = RESOLVE_IN_ROOT;
A user-space application that wishes to determine which
extensions the running kernel supports can do so by conducting a
binary search on size with a structure which has every byte
nonzero (to find the largest value which doesn't produce an error
of E2BIG).
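The probing technique described above can be sketched as a binary search over a predicate. Everything named here is an assumption made for illustration: the 4096-byte upper bound, the helper names, and the stand-in predicate accepts_up_to_40() used for dry runs. The real predicate treats any outcome other than E2BIG as "size accepted", so a caller should first confirm that openat2() exists at all, since an unsupported system call would make every size look accepted.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define PROBE_MAX 4096      /* assumed upper bound for the search */

/* True unless the kernel rejects `size` nonzero bytes with E2BIG. */
bool kernel_accepts_size(size_t size)
{
    unsigned char buf[PROBE_MAX];

    if (size > sizeof(buf))
        return false;
    memset(buf, 0xff, sizeof(buf));     /* every byte nonzero */

    long ret = syscall(SYS_openat2, AT_FDCWD, ".", buf, size);
    if (ret >= 0) {                     /* not expected, but do not leak */
        close((int)ret);
        return true;
    }
    return errno != E2BIG;
}

/* Stand-in predicate for dry runs: behaves like a kernel whose
   structure is 40 bytes long. Purely hypothetical. */
bool accepts_up_to_40(size_t size)
{
    return size <= 40;
}

/* Largest size in [lo, hi] for which accepts() is true, assuming that
   accepts(lo) holds and that accepts() stays false once it is false. */
size_t largest_accepted_size(size_t lo, size_t hi, bool (*accepts)(size_t))
{
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;    /* round up to terminate */
        if (accepts(mid))
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

A probe of the running kernel would then be largest_accepted_size(sizeof(struct open_how), PROBE_MAX, kernel_accepts_size), on the assumption that the kernel supports at least the structure size known to the headers in use.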
This page is part of release 5.11 of the Linux man-pages project.
A description of the project, information about reporting bugs,
and the latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.
Copyright (C) 2019 Aleksa Sarai <cyphar@cyphar.com>
%%%LICENSE_START(VERBATIM)
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Since the Linux kernel and libraries are constantly changing, this manual page may be incorrect or out-of-date. The author(s) assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. The author(s) may not have taken the same level of care in the production of this manual, which is licensed free of charge, as they might when working professionally.
Formatted or processed versions of this manual, if unaccompanied by the source, must acknowledge the copyright and authors of this work.
%%%LICENSE_END