ARPA2 Common Libraries
2.6.1
The API for Access Control is split into layers that guide its processing.
Access Control can be considered from different angles:
- Bottom layer: Loading and parsing database rules.
  - #comment to ignore a (marker) word.
  - flags stored in bits and callback to upper layer.
  - =xval stored in variable x and callback to upper layer.
  - ^trigger via callback to upper layer.
  - ~selector in rulesets via callback to upper layer.
- Upper layer: Semantics for a specific Access Type. This part is specific to an application, of which Communication Access is a possible example. These are separately documented because of the semantic differentiation.
  - flags, ^trigger and =xval offerings.
- Management view: Passes over databases and treats them as a bulk data store. (This work may not be completely done.)
  - #label markers.

Policies are defined with a Ruleset; for the application of Access Control these are commonly known as an Access Control List or ACL. Each Ruleset consists of one or more Rules, each being a UTF-8 string with a terminating NUL character. (Note that NUL is not a separator but a terminator.)
Rules consist of words in the low-level grammar:
- ^trigger to trigger callbacks from the bottom layer to the upper layer.
- =xval for 26 kept variables like x that are provided with any callback.
- FLAGS to spell up to 26 flags like F, L, ... as uppercase letters.
- ~sel to capture ARPA2 Selectors to match (not in the database form).
- #label to label a Rule so management tools may recognise them as theirs.

Localised Rulesets are perfect for manual management; database Rulesets are better suited for automation and scalable deployment.
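The word grammar above can be illustrated with a minimal sketch that classifies each word of a Rule by its first character. This is not the library's parser: the names sketch_kind, sketch_classify and sketch_count_kinds are hypothetical, and the sketch assumes that words are separated by single spaces, which the text above does not state.

```c
#include <stddef.h>
#include <string.h>

/* Word kinds of the low-level grammar (hypothetical names). */
enum sketch_kind {
	SKETCH_TRIGGER, SKETCH_XVAL, SKETCH_FLAGS,
	SKETCH_SELECTOR, SKETCH_LABEL, SKETCH_UNKNOWN
};

/* Classify one word of the given length by its first character.
 * A word made of uppercase letters only is taken to be FLAGS.
 */
static enum sketch_kind sketch_classify (const char *word, size_t len) {
	if (len == 0) {
		return SKETCH_UNKNOWN;
	}
	switch (word[0]) {
	case '^': return SKETCH_TRIGGER;
	case '=': return SKETCH_XVAL;
	case '~': return SKETCH_SELECTOR;
	case '#': return SKETCH_LABEL;
	default:
		for (size_t i = 0; i < len; i++) {
			if (word[i] < 'A' || word[i] > 'Z') {
				return SKETCH_UNKNOWN;
			}
		}
		return SKETCH_FLAGS;
	}
}

/* Count the words of each kind in one NUL-terminated Rule.
 * ASSUMPTION: words are separated by space characters.
 */
static void sketch_count_kinds (const char *rule, size_t counts[6]) {
	memset (counts, 0, 6 * sizeof counts[0]);
	while (*rule != '\0') {
		size_t len = strcspn (rule, " ");
		if (len > 0) {
			counts[sketch_classify (rule, len)]++;
		}
		rule += len;
		if (*rule == ' ') {
			rule++;
		}
	}
}
```

With a made-up Rule such as "#manual =ggroup ^member FL ~@example.com", the counts come out as one word of each kind except SKETCH_UNKNOWN.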
Localised Rulesets are kept in an application context, and drive applications like Access Control. They are parsed by the lower layer and callbacks are fed into the upper layer as always. The most concrete match wins, subject to application policies, and the order of Rules in a Ruleset is of no importance. Do not depend on Rule order, to avoid surprising changes to your policies. Such surprises may be caused by software upgrades at any time, and the implied non-determinism must be taken into account while creating Rulesets.
Database Rulesets are stored in a key-value database. If not overridden at build time, that database is located in /var/lib/arpa2/rules but the environment variable $ARPA2_RULES_DIR can override that. This is a directory which the database considers its working environment. Note that all utilities that employ ARPA2 Rules adhere to this setting, including the Group Logic and Access Control. It is not uncommon for an environment variable to influence program behaviour, but it should be noted as a security precaution that externally provided overrides may be a cause of unintended access.
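The lookup order described above can be sketched as follows. This is an illustration and not the library's code: the names sketch_default_rules_dir, sketch_pick_rules_dir and sketch_rules_dir are hypothetical, and treating an empty variable as unset is an assumption.

```c
#include <stdlib.h>

/* Default location named in the text, unless overridden at build time. */
static const char sketch_default_rules_dir[] = "/var/lib/arpa2/rules";

/* Choose between an environment value and the default.
 * ASSUMPTION: an empty value counts as "not set".
 */
static const char *sketch_pick_rules_dir (const char *env_value) {
	if (env_value == NULL || env_value[0] == '\0') {
		return sketch_default_rules_dir;
	}
	return env_value;
}

/* Resolve the rules directory for the current process. */
static const char *sketch_rules_dir (void) {
	return sketch_pick_rules_dir (getenv ("ARPA2_RULES_DIR"));
}
```

Splitting the decision from the getenv() call keeps the policy easy to test. It also gives a program that does not trust its environment a single place to skip the override.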
The database is indexed with keys composed of an optional Database Secret, Access Domain (forming a value known as the Domain Key, granting domain-specific administration), plus the Access Type (at this point it may yield a value known as the Service Key, which can be spread in hexadecimal form to service applications) and an Access Name and an ARPA2 Selector (forming the final Request Key used to query the database).
The Remote Selector goes through an iteration process, where the most concrete match is final. Note that empty Rulesets are removed from the database, so be sure to add something to stop iteration with a Ruleset.
Remote Selectors are part of the database lookup key, so they are not explicitly stored in an Access Control database. They are however part of a Localised Ruleset.
To allow management of the database, two mechanisms are usable: #manual overrides or #pulley bot-inserted Rules.
Iteration over content should be private; without a key, one should be left mostly in the blind about contents. This protects from harvesting of (identity) information.
To this end, keys are hashed and never stored in plain form. This also helps to make the database more efficient.
We plan to encrypt database contents too, based on parts of the key that are not present in the database lookup key. We currently use that to select a Ruleset, and insert it in the beginning of each (of multiple) values for a (partial) lookup key. We may end up using it as encryption "entropy", such as an IV or salt. The quotes emphasise that entropy may be rather thin, especially after already matching the database lookup key.
Variables with =xval grammar are stored but do not lead to callback. Storage takes the form of a char * to a UTF-8 entry and an unsigned length with the number of bytes (not code points). This usually points somewhere in the current Access Rules, so NUL termination does not apply.
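The storage form described above (a pointer plus a byte length, without NUL termination) might look like the following sketch. The struct and function names are hypothetical, the pointer is made const for safety where the text says char *, and the sketch assumes that variables are named by lowercase letters a to z.

```c
#include <stdbool.h>
#include <stddef.h>

/* One stored variable: points into the Rule text, no NUL terminator. */
struct sketch_var {
	const char *ptr;   /* first byte of the UTF-8 value */
	unsigned len;      /* number of bytes, not code points */
};

/* The 26 kept variables (ASSUMPTION: named 'a' through 'z'). */
struct sketch_vars {
	struct sketch_var slot[26];
};

/* Store a word of the form =xval: 'x' names the variable and the
 * remaining bytes are its value.  Returns false for other words.
 */
static bool sketch_store_xval (struct sketch_vars *vars,
			const char *word, unsigned wordlen) {
	if (wordlen < 2 || word[0] != '=') {
		return false;
	}
	if (word[1] < 'a' || word[1] > 'z') {
		return false;
	}
	struct sketch_var *var = &vars->slot[word[1] - 'a'];
	var->ptr = word + 2;
	var->len = wordlen - 2;
	return true;
}
```

Because the value points into the surrounding Rule, the byte after the value is whatever follows in the Rule (a space or the Rule's NUL), so code that uses it must always honour the length.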
Triggers with ^trigger grammar cause a callback with that string and the current set of 26 =xval variables. When the return is false, it indicates that the upcoming Access Rights are not considered an option.
Flags with FLAGS grammar are parsed into a 32-bit flag field, where A is in bit 0, up to Z in bit 25. They cause a callback that the upper layer may consider the final answer. Any ^trigger callbacks are specific for the next FLAGS and a failure-returning callback causes the suppression of the following FLAGS. Triggers are forgotten after this point and processing continues.
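The bit layout and the way a failing trigger suppresses the next FLAGS can be sketched as below. This is an illustration under assumptions rather than the library's implementation: all sketch_ names are hypothetical, the words are given as an already split array, and the trigger callback is simplified to receive only the trigger name (the text says the real callback also sees the 26 variables).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map uppercase letters to bits: 'A' is bit 0, up to 'Z' in bit 25. */
static bool sketch_parse_flags (const char *word, size_t wordlen,
			uint32_t *out_flags) {
	uint32_t flags = 0;
	for (size_t i = 0; i < wordlen; i++) {
		if (word[i] < 'A' || word[i] > 'Z') {
			return false;
		}
		flags |= UINT32_C (1) << (word[i] - 'A');
	}
	*out_flags = flags;
	return true;
}

/* Simplified upper-layer trigger callback (hypothetical signature). */
typedef bool (*sketch_trigger_cb) (const char *trigger);

#define SKETCH_MAX_DELIVERED 8
struct sketch_result {
	uint32_t delivered[SKETCH_MAX_DELIVERED];
	size_t count;
};

/* Walk the words; a trigger that returns false suppresses the next
 * FLAGS word, after which triggers are forgotten.
 */
static void sketch_process_words (const char **words, size_t nwords,
			sketch_trigger_cb on_trigger,
			struct sketch_result *res) {
	bool suppress = false;
	uint32_t flags;
	res->count = 0;
	for (size_t i = 0; i < nwords; i++) {
		const char *w = words[i];
		if (w[0] == '^') {
			if (!on_trigger (w + 1)) {
				suppress = true;
			}
		} else if (w[0] != '\0'
				&& sketch_parse_flags (w, strlen (w), &flags)) {
			if (!suppress && res->count < SKETCH_MAX_DELIVERED) {
				res->delivered[res->count++] = flags;
			}
			suppress = false;
		}
	}
}

/* Example callback: refuses the (made-up) trigger named "deny". */
static bool sketch_reject_deny (const char *trigger) {
	return strcmp (trigger, "deny") != 0;
}
```

For the made-up word sequence ^allow R ^deny W X, this sketch delivers R and X but not W, because the failing ^deny applies only to the FLAGS word that follows it.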
Remote Selectors with ~rsel grammar cause a callback, but would be an error when passing over a database Ruleset.
Labels with #label grammar do not cause callbacks.
Endrule causes a callback, allowing the upper layer to summarise and reset. The 26 =xval variables and the FLAGS are available to the callback. After the callback returns, this data is cleared for a fresh start with the next Rule in the Ruleset, if any.
Endruleset causes a callback, allowing the upper layer to summarise and possibly trigger a final action. The 26 =xval variables are cleared at this point, and so are the FLAGS. It is up to the endrule summaries to provide application-specific information to this callback. During this callback, the space allocated for the ruleset is still locked in memory, so references into that are still valid; after this call returns this memory will be unlocked, so any such references are no longer guaranteed to be valid.
Various applications of Rules need unique keys to scope access patterns. These keys are derived top-to-bottom, using irreversible digest algorithms, so that a key to one scope cannot be used to derive the key for a peering scope, let alone a predecessor in the derivation chain. The keys that are derived can therefore be installed in applications with minimal leakage of credentials for others that may be managed differently or elsewhere.
API calls that make database lookups start with a scattering key, involving the following elements:
The mapping is made in 2 stages, to allow maximum control and the ability to gradually delegate control.
The Database Secret is mixed with the Domain, and passed through a message digest (secure hash) to produce a byte sequence. This is the first stage. It yields the Domain Key:
The inputs to this call are the optional Database Secret in opt_dbkey, which is skipped when it is NULL; the length must be set in dbkeylen. In addition, the Access Domain is provided in xsdomain, in UTF-8 notation and terminated with a NUL character. Note that Punycode is considered a local notation mechanism for DNS and not used anywhere else in the ARPA2 infrastructure; it is simply too specific and too confusing in comparison to UTF-8. The xsdomain must be in all-lowercase notation.
When the function call is successful, it returns true and sets the key in domkey, with its size available through the C sizeof operator. Upon error, the call returns false and errno is set to a com_err value.
The second stage mixes this output with the binary Access Type, again via a message digest. The output is called a Service Key and often represented in hexadecimal for configuration convenience, but it is to be mapped back to its binary form for continued use with the Rules system.
This routine takes in the domkey and domkeylen as produced by rules_dbkey_domain() and combines them with the UUID in binary form in xstype. None of these arguments is optional. The output is produced in svckey, which has a static length at compile time, to be derived with sizeof if needed.
The function returns true on success and false with a com_err code in errno on failure.
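The two-stage derivation can be illustrated with a self-contained sketch. It does not use the library and does not reproduce its digest or its way of mixing inputs: the names sketch_key, sketch_digest, sketch_domain_key and sketch_service_key are hypothetical, and the 64-bit FNV-1a function is only a stand-in that is NOT a secure hash. The point is the shape of the chain, in which each stage digests the previous key together with the next element.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* STAND-IN digest: 64-bit FNV-1a.  It is not cryptographically secure
 * and only takes the place of the message digest the text refers to.
 */
#define SKETCH_DIGEST_INIT UINT64_C (0xcbf29ce484222325)

static uint64_t sketch_digest (uint64_t state, const void *data, size_t len) {
	const unsigned char *bytes = data;
	for (size_t i = 0; i < len; i++) {
		state ^= bytes[i];
		state *= UINT64_C (0x100000001b3);
	}
	return state;
}

typedef uint64_t sketch_key;

/* Stage 1: optional Database Secret mixed with the Access Domain. */
static sketch_key sketch_domain_key (const void *opt_secret, size_t secretlen,
			const char *domain) {
	uint64_t state = SKETCH_DIGEST_INIT;
	if (opt_secret != NULL) {
		state = sketch_digest (state, opt_secret, secretlen);
	}
	return sketch_digest (state, domain, strlen (domain));
}

/* Stage 2: Domain Key mixed with the binary Access Type (a UUID).
 * The key bytes are digested in host byte order, a sketch shortcut.
 */
static sketch_key sketch_service_key (sketch_key domkey,
			const uint8_t type_uuid[16]) {
	uint64_t state = sketch_digest (SKETCH_DIGEST_INIT,
				&domkey, sizeof domkey);
	return sketch_digest (state, type_uuid, 16);
}
```

A service application that is only handed the output of stage 2 can keep deriving keys below it. With a real secure hash in place of the stand-in, it would have no practical way back to the Domain Key or the Database Secret.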
There actually is a third level, but it is not normally used by programs. This is rules_dbkey_selector(), and it derives the binary key for database indexing when trying to locate a given ARPA2 Selector. The general strategy for locating a Rule that links to a given ARPA2 Identity in the database is to iterate from concrete to abstract forms for the Identity and derive a database index for each in turn; the first that matches will be used and the search stops.
The first action is to open the database, whose location is hard-wired.
The default action for operations is to open the database for reading alone. Many of these are permitted to run in parallel, unlike editing operations which may be more constrained, but also much less frequent. All these operations return true on success, or false on failure with errno set to a com_err code.
To iterate over values in the database, construct a loop with two operations, like
The operations are typed as follows:
The out_dbdata from rules_dbget() represents the Ruleset, while the rule output from rules_dbloop() and rules_dbnext() represents only one Rule at a time. Since Rules end in a NUL character, their size is not part of the return. As soon as no further Rule is found, the latter two routines return false.
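The first/next loop shape can be shown with in-memory stand-ins that walk a Ruleset buffer one NUL-terminated Rule at a time. These are not the library functions and do not share their signatures: sketch_cursor, sketch_loop, sketch_next and sketch_count_rules are hypothetical names, and the buffer replaces the database value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over a Ruleset held in memory. */
struct sketch_cursor {
	const char *pos;   /* start of the next Rule */
	const char *end;   /* one past the last byte of the Ruleset */
};

/* Deliver the next Rule, or return false when none is left. */
static bool sketch_next (struct sketch_cursor *cur, const char **rule) {
	if (cur->pos >= cur->end) {
		return false;
	}
	const char *nul = memchr (cur->pos, '\0',
				(size_t) (cur->end - cur->pos));
	if (nul == NULL) {
		return false;   /* unterminated tail is not a Rule */
	}
	*rule = cur->pos;
	cur->pos = nul + 1;
	return true;
}

/* Start the iteration and deliver the first Rule, if any. */
static bool sketch_loop (struct sketch_cursor *cur,
			const char *ruleset, size_t rulesetlen,
			const char **rule) {
	cur->pos = ruleset;
	cur->end = ruleset + rulesetlen;
	return sketch_next (cur, rule);
}

/* The loop idiom: one call to start, one call to continue. */
static size_t sketch_count_rules (const char *ruleset, size_t rulesetlen) {
	struct sketch_cursor cur;
	const char *rule;
	size_t count = 0;
	if (sketch_loop (&cur, ruleset, rulesetlen, &rule)) do {
		count++;   /* a real caller would process rule here */
	} while (sketch_next (&cur, &rule));
	return count;
}
```

The if/do/while form handles an empty Ruleset without a special case, because the body never runs when the first call already returns false.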
The rules_dbget() function points a database cursor at the Ruleset it found last. This pointer is moved when the function is called again, but also when the transactions change or the database closes. For safe code, do not assume that the database cursor is valid after a false return from rules_dbloop() or rules_dbnext().
Applications normally use rules_dbopen_rdonly() to read from the database and be ignorant about trunking. The general form, however, allows read/write mode by setting rdonly to false, and it also specifies a trunk to select for.
These functions return true on success, or otherwise false with a com_err code in errno. This applies to the remainder of the functions too.
To add or delete rules in a database, use
In this, the prekey of size prekeylen is constructed from the Access Domain and Access Type and optionally a Database Secret. The xskey is the Access Name to use. The rules of size ruleslen form a Ruleset, concatenating a number of Rules, each of which ends in a NUL character.
When the opt_selector is provided, it will be used to index in the database; otherwise, ~selector bits in the rules/ruleslen are used to determine the database entries to update. In the latter case, knowledge from the rules/ruleslen is taken apart and may get distributed over multiple database records, so that the fast index mechanism based on Selector iteration can be used. Since the exact same procedure is used for adding and deletion, this mostly remains transparent to the caller. This mechanism simplifies editing the database content from input that takes the form of general Rules that involve Selectors, such as might be configured in LDAP and automatically pulled in by a daemon; this is how we envision configuration data to be exchanged between sites without the propagation delays caused by the trade-offs in simpler caching mechanisms. When security concerns play a role, it pays to be able to make fast updates.
A final call mirrors the get operation with a set operation,
Note how it is possible to set two database values in one stroke. This is for convenience while offering efficient updates to longer stretches of Rules. The rules_dbset() function assumes that rules_dbget() was run before, and returned successfully. This is why no digest is required in rules_dbset(); it operates on the current database cursor position, which is locked.
To add rules with this function, set in0_dbdata to the result of rules_dbget() and set in1_dbdata to the newly added rules. Each of these segments ends in a NUL character on each of the contained Rules, including the last one. If you prefer to insert at the beginning, just reverse the use of in0_dbdata and in1_dbdata.
To delete rules with this function, set in0_dbdata to the part of rules_dbget() before the rule that will be removed, and in1_dbdata to the part after the rule to be removed. Again, any Rule must end in a NUL character; it is however possible that either data field is empty when it contains no Rules at all.
Databases are indexed by keys that are specific to a Domain, Access Type, Access Name and a Selector. They may be further scattered if a Database Secret was initially incorporated. The scattering is random and uses long enough keys to allow for merging data from different sources, as long as they differ on at least one of these parameters.
Combining sources can be efficient. Remember that a database of size N usually needs only log(N) pages to find a target, and that this is done with a memory-mapped database. The base of the logarithm is around 250, so one page delivers up to 250 keys, two page loads deliver 31250 up to 62500, three page loads reach 7.8 to 15.6 million keys, and so on.
These numbers are approximate, but the exponential growth curve is accurate. Merging in an extra database barely impacts search efficiency. This is what you get when you design for scale!
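The upper figures quoted above are powers of the fan-out, which a short sketch can reproduce. The function names are hypothetical and the fan-out of 250 is the approximate value from the text. The lower figures in the text are half of the upper ones, and the sketch does not model them.

```c
#include <stdint.h>

/* Largest number of keys reachable with the given number of page
 * loads, when every page fans out to fanout entries.
 */
static uint64_t sketch_max_keys (unsigned page_loads, uint64_t fanout) {
	uint64_t keys = 1;
	for (unsigned i = 0; i < page_loads; i++) {
		keys *= fanout;
	}
	return keys;
}

/* Smallest number of page loads that reaches nkeys keys.
 * The fanout must be at least 2 for the loop to terminate.
 */
static unsigned sketch_page_loads (uint64_t nkeys, uint64_t fanout) {
	unsigned loads = 1;
	uint64_t reach = fanout;
	while (reach < nkeys) {
		reach *= fanout;
		loads++;
	}
	return loads;
}
```

With a fan-out of 250, a million keys fit within three page loads. Doubling that to two million by merging a second source of the same size still takes three, which is the sense in which merging barely impacts search efficiency.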
The different uploads are not distinguished in any way; it is assumed that a full match on the keys (or, effectively, a secure hash computed over it) implies a fully warranted offer for those keys.
For administrative purposes, it is useful to be able to separate the various keys (and their associated Rulesets) by source. One might for example use this to reset the information held from one particular uplink. These uplinks are therefore identified with a Trunk Identifier, a number (in 32 bits by default) that is part of every Ruleset, but ignored by the Access Control logic. It is however useful for source-dependent bulk operations, in spite of having merged data sources.
When subscribing to an LDAP uplink, you would specify its Trunk Identifier. The value 1 is reserved for manual entries/overrides, but 2 and over are available for assignment to automated Trunk subscriptions.
TODO: Bulk operations have not been implemented yet; they would iterate over entries and include support for removal. Individual elements can be edited.