ARPA2 Common Libraries  2.6.2
Access Control for Documents and Folders

Services often need to make Access Control decisions on Documents and/or the Folders containing them. This is an ACL discipline to structure that.

Here are a few concepts that are useful to understand:

  • Document or File is a sequence of bytes, usually with some describing information such as a name.
  • Folder or Directory is a group of Documents and/or Folders, each indexed by name.
  • Path is a sequence of zero or more Folders, possibly with a Document at its end. The Path is read from left to right and follows each name to the next Folder and, possibly at the end, to a Document.
  • Link is a name in a Folder that holds a Path; the name serves as an alias for that Path within the holding Folder.
  • Volume is a collection of Paths and also the starting point for Links.
  • Canonical Path specifies a Volume and Path with any Links resolved.

These concepts are general. Document Access references documents under an Access Name that follows the //<volume>/<path> grammar. The <volume> string must not contain any / symbols, but @ is permitted, perhaps to hint at a user. The Access Name combines with an Access Domain to uniquely specify a Document.
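The grammar can be illustrated with a small parser. This is a minimal sketch, not part of the ARPA2 API: the names xsname_parts and xsname_split are hypothetical, and the handling of a single leading slash as "absent <volume>" is an assumption based on the simplified ARPA2 Reservoir forms described in this document.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type, not part of the ARPA2 API. */
struct xsname_parts {
	const char *volume;   /* start of <volume>, not NUL-terminated */
	size_t vollen;        /* length of <volume>, 0 when absent */
	const char *path;     /* NUL-terminated <path>, may be empty */
};

/* Split an Access Name into <volume> and <path> without copying.
 * "//<volume>/<path>" yields the volume span and the path after it.
 * "/<path>" is taken as an absent <volume> (assumption, see lead-in).
 * Returns false for anything that does not start with a slash, or
 * that has "//" without a non-empty <volume> and a separating slash.
 */
bool xsname_split (const char *xsname, struct xsname_parts *out) {
	if ((xsname == NULL) || (xsname [0] != '/')) {
		return false;
	}
	if (xsname [1] != '/') {
		/* Absent <volume>: the path follows the single slash */
		out->volume = xsname + 1;
		out->vollen = 0;
		out->path = xsname + 1;
		return true;
	}
	const char *vol = xsname + 2;
	const char *slash = strchr (vol, '/');
	if ((slash == NULL) || (slash == vol)) {
		/* Missing separator or empty <volume> after "//" */
		return false;
	}
	out->volume = vol;
	out->vollen = (size_t) (slash - vol);
	out->path = slash + 1;
	return true;
}
```

Because the first / after the leading // ends the <volume>, a <volume> can never contain a slash, while @ passes through untouched.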

ARPA2 Reservoir as Default Volume

ARPA2 Reservoir serves as a Default Volume, denoted by an absent <volume> string.

ARPA2 Reservoir defines a few concepts worth knowing about:

  • Collection is a Folder named as a UUID and sitting directly underneath the Volume that represents a Domain.
  • Resource is a Document sitting directly underneath a Collection. Resources have a UUID as their name but may also have metadata in LDAP that describe them, and allow searching.
  • Index is a set of names that translate to Collection UUIDs, to be resolved under the Volume of the current Domain. An Index exists for a domain and for a user@domain; in addition, every Collection is not just a set of Resources but also an Index.

One might say that an Index defines Links, because the names contained in it resolve to a simple Path.

A canonical URI in ARPA2 Reservoir looks like //<domain>/<colluuid>/ for a Collection (note the trailing slash), or //<domain>/<colluuid>/<resuuid> for a Resource. Since the Access Domain is incorporated in a preceding phase, since users are mapped to their own <colluuid> and since the <volume> string is absent, the Access Names for ARPA2 Reservoir take the simplified forms /<colluuid>/ for a Collection and /<colluuid>/<resuuid> for a Resource.

ARPA2 Reservoir defines rights per Collection, and applies these to all contained Resources, as well as to continued paths. Aliases that do not start with a /<colluuid>/ have no special rights. In terms of the logic of Access Control, this means that the RIGHTS for the /<colluuid>/ form are looked up for a given Remote Selector, but that all other Paths into ARPA2 Reservoir resolve as KV, so they are merely permitted to be known about. Specifically note that the RIGHTS of a Collection pass over to all Resources held underneath it. This is a deliberate simplification of user management; Collections represent an access profile for all the Resources contained directly underneath them. To this end, anything after /<colluuid>/ is removed before the Access Rights lookup.
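The truncation rule can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: reservoir_lookup_len and is_uuid_text are hypothetical names, and the check assumes the usual 36-character textual UUID layout with hyphens at positions 8, 13, 18 and 23.

```c
#include <ctype.h>
#include <stddef.h>

#define UUID_TEXTLEN 36

/* Check for the textual UUID shape (assumed 8-4-4-4-12 hex layout).
 * A string shorter than 36 characters fails at its NUL terminator,
 * so no bytes beyond the string are read.
 */
static int is_uuid_text (const char *s) {
	for (size_t i = 0; i < UUID_TEXTLEN; i++) {
		unsigned char c = (unsigned char) s [i];
		if ((i == 8) || (i == 13) || (i == 18) || (i == 23)) {
			if (c != '-') {
				return 0;
			}
		} else if (!isxdigit (c)) {
			return 0;
		}
	}
	return 1;
}

/* Hypothetical helper: return the length of the "/<colluuid>/" prefix
 * that would be used for the rights lookup, or 0 when the Access Name
 * does not start with that form (and so would have no special rights).
 */
size_t reservoir_lookup_len (const char *xsname) {
	if ((xsname == NULL) || (xsname [0] != '/') || (xsname [1] == '/')) {
		return 0;
	}
	if (!is_uuid_text (xsname + 1)) {
		return 0;
	}
	if (xsname [1 + UUID_TEXTLEN] != '/') {
		return 0;
	}
	return 1 + UUID_TEXTLEN + 1;
}
```

With this sketch, a Collection and any Resource underneath it yield the same 38-character lookup prefix, which mirrors how a Collection's RIGHTS pass over to its Resources.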

Operators can Define Volumes

Any <volume> name can be introduced by operators as required, and its RIGHTS can be bestowed as desired. The form of the Access Name is //<volume>/<path> and this combines with an Access Domain under which it is defined. The use of // at the start sets these operator-defined Volumes apart from Access Names as used by ARPA2 Reservoir. End the <path> with / if and only if it is a directory; never start a <path> with a slash however, as that is part of the prefixed //<volume>/ form.

An example of an Access Name for an internally shared NFS export for a company's products could be //products/Food/Organic/BloodOrange.md and an example Access Name tied to a user could be //john@homedirs/Letters/Love/mary.tex.

As far as Access Control is concerned, the <user> must be supplied in lowercase and the remainder should be mapped to lowercase insofar as it is case-insensitive. The general idea is that the Access Name is a UTF-8 string with application-specific ideas of normalisation; Access Control refrains from half measures and simply does nothing along these lines.

Access Type for Documents

To use these forms of Access Name, both for ARPA2 Reservoir and operator-defined Volumes, the following Access Type is used under Access Control:

51af068f-49dd-3fd4-a94d-37052073e98e

This value was allocated on http://uuid.arpa2.org for this purpose.

Document Access API

The customary Rules operations for key derivation are used:

  • rules_dbkey_domain() to derive a Domain Key for the Access Domain and optional Database Secret;
  • rules_dbkey_service() to derive a Service Key from the Domain Key and the Access Type for Document Access;
  • rules_dbkey_selector() is normally used internally to index into the database for the Remote Identity, or an abstraction thereof, in case the database is used as ACL store.

The function to call for retrieval of the Access Rights to a Document or Folder is:

#include <stdint.h>
#include <stdbool.h>
#include <arpa2/identity.h>
#include <arpa2/access_document.h>

bool access_document (const a2id_t *remote, char *xsname,
                const uint8_t *opt_svckey, unsigned svckeylen,
                const char *opt_acl, unsigned acllen,
                access_rights *out_rights,
                a2act_t *optout_actor);

The remote represents the (remote) user intending to access the document with the given xsname, as defined by Access Name above. We emphasise that the user is (in the most general case) remote, to avoid reducing our view to one that is strictly local. The ACL can be set up with Rules to allow only local access, but all mechanisms are general enough to allow a mixture of local and remote users, and so is the API.

When the opt_acl is not NULL, it points to a concatenation of Rules, each of which ends in a NUL character. This is the situation where an explicit Ruleset is supplied by the application, supposedly because it is configured into the context that triggers the access request. The total length of the Ruleset, including the very last NUL character, is set in acllen.
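The buffer layout for an explicit Ruleset can be sketched as below. The helper ruleset_pack is hypothetical and not part of the ARPA2 API; it only shows how Rules, each followed by its NUL character, add up to the length that would be passed as acllen. The Rule text itself is treated as opaque here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `count` Rule strings into `buf`, each
 * followed by its terminating NUL. Returns the total length including
 * the very last NUL, or 0 when the Rules do not fit in `buflen` bytes.
 */
size_t ruleset_pack (char *buf, size_t buflen,
                const char *const *rules, size_t count) {
	size_t used = 0;
	for (size_t i = 0; i < count; i++) {
		size_t len = strlen (rules [i]) + 1;   /* include the NUL */
		if (len > buflen - used) {
			return 0;
		}
		memcpy (buf + used, rules [i], len);
		used += len;
	}
	return used;
}
```

The return value would be cast to unsigned before being supplied as acllen, with buf supplied as opt_acl.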

When opt_acl is NULL, its acllen is ignored and lookups are done in the database. The opt_svckey then provides a Service Key of length svckeylen, as derived with rules_dbkey_service() and customarily configured in application contexts that trigger access requests. Applications may be sent a hexadecimal form of the Service Key, but it must be converted back into binary form before calling this function.
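Converting a hexadecimal Service Key back to binary could look like the sketch below. The function svckey_from_hex is a hypothetical name introduced for illustration; the library may well offer its own facility for this, which this document does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Map one hexadecimal digit to its value, or -1 when it is not one. */
static int hexval (char c) {
	if ((c >= '0') && (c <= '9')) return c - '0';
	if ((c >= 'a') && (c <= 'f')) return c - 'a' + 10;
	if ((c >= 'A') && (c <= 'F')) return c - 'A' + 10;
	return -1;
}

/* Hypothetical helper: decode a NUL-terminated hex string into `out`.
 * Fails on odd length, non-hex characters or when `out` is too small.
 * On success, *outlen holds the number of bytes written.
 */
bool svckey_from_hex (const char *hex, uint8_t *out, size_t outsize,
                unsigned *outlen) {
	size_t n = 0;
	while (hex [0] != '\0') {
		int hi = hexval (hex [0]);
		/* Only look at the second digit when the first was valid */
		int lo = (hi >= 0) ? hexval (hex [1]) : -1;
		if ((lo < 0) || (n >= outsize)) {
			return false;
		}
		out [n++] = (uint8_t) ((hi << 4) | lo);
		hex += 2;
	}
	*outlen = (unsigned) n;
	return true;
}
```

The resulting buffer and length would then be passed as opt_svckey and svckeylen, with opt_acl set to NULL.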

When optout_actor is not NULL, it may receive an Actor Identity if one is defined in the applicable access rule. Actor Identities are local aliases for the remote identity, and may be used for such things as Groups, where a group member identity would be revealed and not the authenticated Remote Identity.

When this function succeeds, it returns true; in case of operational problems it returns false and sets errno to a com_err code. Not finding any Rules for the input data is not an error; the function simply returns V or visitor rights, the lowest grade available. With either return value, out_rights can be relied on; for false it will not include the V right that is otherwise always included.

Attributes for Document Access

The following attribute is defined:

  • =g<scene>+<actor>@<xsdomain> specifies the full Actor Identity, with the Access Domain as <xsdomain>; the domain may not be available to access_document(). This is returned in the optout_actor field, to be used in unison with the originally supplied remote to mark an internal or localised identity for the Remote Identity. When processing documents, it may be beneficial to use this value for activity logging, because it can protect the privacy of remote as long as no abuse tracking is required.

Writing Rules for Document Access

Rules that specify Document Access consist of rights assigned under Remote Selectors for a given Access Name. When the database is used, the Remote Selectors are part of the database indexing scheme; when explicit Rulesets are given, including in an LDAP repository from which the database is filled, Remote Selectors are explicitly specified with ~selector and apply to the RIGHTS specifications that follow.

When multiple Remote Selectors match a Remote Identity, the most concrete of these wins. When multiple RIGHTS apply at the same concreteness level, so with the same Remote Selector, they are all combined with bitwise OR, meaning that it suffices if a flag is specified in only one of multiple locations. In any case, even when no Remote Selector matches at all, the V right is included as a "zero" form; it merely specifies the right to visit, but not even to know about the existence of a Document or Folder, let alone read or write it.
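The combination logic can be sketched as follows. Everything named demo_ is hypothetical: the flag values are invented for illustration (the real ones come from the library headers) and the numeric concreteness ranking is an assumption that stands in for however the library orders Remote Selectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented flag values, for illustration only. */
#define DEMO_RIGHT_V 0x01u
#define DEMO_RIGHT_K 0x02u
#define DEMO_RIGHT_R 0x04u
#define DEMO_RIGHT_W 0x08u

/* One matching RIGHTS specification and how concrete its selector is
 * (assumed ranking: a higher number is a more concrete selector).
 */
struct demo_match {
	unsigned concreteness;
	uint32_t rights;
};

/* Keep only the RIGHTS at the most concrete level, combine those
 * with bitwise OR, and always include the visitor right.
 */
uint32_t demo_combine (const struct demo_match *m, size_t count) {
	uint32_t rights = 0;
	unsigned best = 0;
	for (size_t i = 0; i < count; i++) {
		if ((i == 0) || (m [i].concreteness > best)) {
			/* A more concrete selector replaces earlier results */
			best = m [i].concreteness;
			rights = m [i].rights;
		} else if (m [i].concreteness == best) {
			/* Same level: flags accumulate */
			rights |= m [i].rights;
		}
	}
	return rights | DEMO_RIGHT_V;
}
```

In this sketch, matches at a less concrete level contribute nothing once a more concrete selector has matched, and an empty match list still yields the visitor right.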

The meaningful Access Rights are defined in <arpa2/access.h> and include

  • A for administrative access by humans,
  • S for automation access to make administrative changes,
  • F configuration access to a service,
  • T operation access to start or stop a service without knowing its contents,
  • D deletion access to resources,
  • C creation access to resources,
  • X execution access to make a resource do something, such as accepting connections,
  • W write access to change a resource,
  • R read access to retrieve or see a resource,
  • P prove access to check or prove properties about a resource that are not shown,
  • K know access to be aware of the existence of a resource,
  • O owner access to possess a resource without being able to work on it,
  • V visitor access to be kept in the blind.

These rights are pretty much ordered from the highest to the lowest. It is possible to specify them individually with names like ACCESS_WRITE or to include all lower ones with ACCESS_WRITE_DOWN for writing, or with ACCESS_WRITE_UP when matching. Whether this makes sense depends on the application and its desire to be broad in its usage patterns.