Non-human Identity (NHI) for Workloads and AI Agents: Current State and a Call for Industry Collaboration
Non-human identity (NHI) is undergoing a transformation aimed at reducing the attack surface, as industry has rapidly and dramatically increased the number of Application Programming Interface (API) connections, both internal and external. Much of that growth can be credited to AI-enabled process automation and the speed at which work can be accomplished today.
The initial work to replace vulnerable authentication credentials, including passwords and API keys, is well underway and converging on solutions. Several solutions are emerging, each following the same pattern but differing in the underlying structures used to represent the data and in the sources of the public/private key pairs.
Taking a step back, a few of the problems that allowed both passwords and API keys to be used in successful attacks include:
Passwords and API keys are typically long-lived and must be stored in a part of the system that survives a reboot, such as hidden in the filesystem. They may be protected in some way, for example by a key store.
Passwords and API keys are reused again and again to authenticate an identity that can perform a function, which creates potential attack vectors on the wire.
These problems have been addressed with a common overarching approach. A format describing the properties of the thing, in this case a workload, optionally including system information, is used to create an identifier and to connect that identifier to its credentials. The credentials are public/private key pairs, which may be issued in a number of ways. One method uses a certificate authority, which is considered a strong validator of the identity, the properties of the keys, and the keys themselves. A second method uses system-generated keys issued from hardware whose trust anchor for the root key is held immutably in a Trusted Platform Module (TPM). A third method is a key pair generated in software without the backing of a more formal, structured key issuance process; this may be considered a raw public/private key pair.
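To make the shared pattern concrete, here is a minimal sketch of an identifier derived from workload and system properties, tagged with the route its keys came from. Everything named here is an assumption for illustration: the `nhi://` scheme, the property names, the hashing approach, and the `KeySource` labels are hypothetical and are not taken from any of the emerging specifications.

```python
import hashlib
import json
from dataclasses import dataclass
from enum import Enum


class KeySource(Enum):
    """The three key-issuance routes described above (labels are illustrative)."""
    CERTIFICATE_AUTHORITY = "ca"
    HARDWARE_ROOTED = "tpm"
    SOFTWARE_RAW = "raw"


@dataclass(frozen=True)
class WorkloadIdentity:
    identifier: str        # derived from workload and system properties
    key_source: KeySource  # where the public/private key pair came from


def derive_identifier(properties: dict, trust_domain: str) -> str:
    """Build a deterministic identifier from workload properties.

    Hypothetical scheme: SHA-256 over a canonical JSON encoding, so the
    same properties always yield the same identifier regardless of the
    order in which they were collected.
    """
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"nhi://{trust_domain}/{digest[:32]}"


# Example property names are made up for this sketch.
props = {"service": "billing", "environment": "prod", "host": "node-17"}
identity = WorkloadIdentity(
    identifier=derive_identifier(props, "example.org"),
    key_source=KeySource.HARDWARE_ROOTED,
)
print(identity.identifier, identity.key_source.value)
```

Which properties belong in that input is exactly the kind of question where industry guidance would help; the sketch only shows that the identifier and the key source are separate pieces of the same record.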
Recognizing that the same pattern is in use across the emerging methods helps when analyzing each one to gain a broad understanding. These methods address the problems with older credential methods because keys can be issued on a rapidly rotating basis, with lifetimes of five minutes not being uncommon. Because of this quick turnover, and because issuance occurs on active systems, the keys are held in memory rather than on disk, eliminating another big problem.
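The rotation idea can be sketched in a few lines. This is an illustration under stated assumptions, not an implementation of any of the emerging methods: `RotatingCredential` and its parameters are hypothetical names, and `secrets.token_bytes` merely stands in for whatever actually issues a public/private key pair (a certificate authority, a TPM, or software).

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Credential:
    key_material: bytes  # placeholder for a private key; lives only in memory
    expires_at: float    # deadline on the injected clock


class RotatingCredential:
    """Holds one short-lived credential in memory and replaces it on expiry."""

    def __init__(
        self,
        issue: Callable[[], bytes],
        ttl_seconds: float = 300.0,  # five minutes, as mentioned above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._issue = issue
        self._ttl = ttl_seconds
        self._clock = clock
        self._current: Optional[Credential] = None

    def get(self) -> Credential:
        """Return the current credential, issuing a new one if it has expired."""
        now = self._clock()
        if self._current is None or now >= self._current.expires_at:
            self._current = Credential(self._issue(), now + self._ttl)
        return self._current


# Demo with a fake clock so the rotation is visible without waiting.
fake_now = [0.0]
holder = RotatingCredential(
    issue=lambda: secrets.token_bytes(32),  # stand-in issuer, an assumption
    clock=lambda: fake_now[0],
)
first = holder.get()
fake_now[0] = 299.0
print(holder.get() is first)  # still inside the five-minute window
fake_now[0] = 300.0
print(holder.get() is first)  # window elapsed, a new credential was issued
```

Nothing in the sketch touches the filesystem, which is the point: a credential that expires in minutes has no reason to survive a reboot.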
Several standards efforts are defining these methods, along with the protocols used with each, to standardize API security (e.g., IETF WIMSE, IETF SSH, IETF Web Bot Auth). With these emerging as solved problems that still require additional standardization, the next set of problems is in the early phase of discussion: industry is starting to think about the identity of these workloads and other NHI functions.
NHI identifiers may be derived manually or through an automated process. In some cases it is critical to have the identity assured in some way, which may require a manual process involving a certificate authority that is bound to a policy and a set of operational procedures aligned to an assurance level. At the opposite end of the spectrum, there may be extremely short-lived processes handling no critical data, for which little to no assurance actually makes sense.
The next discussion emerging is whether identity and credential issuance should be categorized in ways similar to what was done for human identities in National Institute of Standards and Technology (NIST) Special Publication 800-63B, with its authenticator assurance levels (AAL). The challenge for industry is to determine what this should look like and where the work should be completed. This could be explored collaboratively in the IETF to align with the wider work on identity credentials, potentially via the new practical-cybersecurity mailing list.
If aligned to the levels defined in the referenced NIST document, the categories might look like:
AAL3 Very High Confidence: Hardware assured, with identities following strict processes for naming and assurance and issued by a certificate authority of a certain assurance level.
AAL2 High Confidence: In-system hardware assured, for example using a TPM to digitally sign evidence assuring that the workload name is as identified.
AAL1 Some Confidence: Software-generated identities with no assurance on the identity and correlated credentials. For both this level and the hardware-assured levels, industry guidance on which system or workload properties should be used to generate the identity could be beneficial.
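As a conversation starter only, the categories above could be expressed as a simple classification over issuance evidence. This sketch does not define the levels; the `IssuanceEvidence` fields and the decision rules are assumptions that loosely mirror the list above, and real criteria would need to come from the collaborative work being proposed.

```python
from dataclasses import dataclass
from enum import IntEnum


class AssuranceLevel(IntEnum):
    """Ordered so that a higher value means higher confidence."""
    AAL1 = 1  # some confidence
    AAL2 = 2  # high confidence
    AAL3 = 3  # very high confidence


@dataclass(frozen=True)
class IssuanceEvidence:
    # All three fields are hypothetical inputs chosen for this sketch.
    hardware_rooted: bool  # e.g. a TPM signs evidence about the workload name
    ca_issued: bool        # issued by a CA operating at a stated assurance level
    strict_naming: bool    # identifier followed a strict naming process


def classify(evidence: IssuanceEvidence) -> AssuranceLevel:
    """Map issuance evidence to the illustrative categories listed above."""
    if evidence.hardware_rooted and evidence.ca_issued and evidence.strict_naming:
        return AssuranceLevel.AAL3
    if evidence.hardware_rooted:
        return AssuranceLevel.AAL2
    return AssuranceLevel.AAL1


def meets(evidence: IssuanceEvidence, required: AssuranceLevel) -> bool:
    """Check whether a workload's evidence satisfies a relying party's minimum."""
    return classify(evidence) >= required


tpm_only = IssuanceEvidence(hardware_rooted=True, ca_issued=False, strict_naming=False)
print(classify(tpm_only).name, meets(tpm_only, AssuranceLevel.AAL3))
```

A relying party could then state a minimum level per API, so that short-lived, low-risk workloads are not forced through the same process as those handling critical data.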
The purpose of this blog is not to define the levels, but rather to open up this conversation more broadly for industry input as well as to begin to think about where the work should be completed.
Others have developed proposals in the past and may simply have been ahead of industry’s readiness to take on the problem.