Editor's Note: This is an especially significant article for its teaching value and application. It addresses philosophical and practical issues of special interest to everyone, especially IT and network security, administrators of online learning programs, financial institutions, and government.
This paper defines and relates authentication and identification and provides a secure means to protect computer communications. This technique can be implemented now using scripts linked to this article. Google lists hundreds of blogs and comments in response to its original publication of this concept in OLDaily.
Part 1. The Problem of Identity
Each of us has an identity. We are composed of a single physical entity - the human body - to which, typically, a name or sign is attached: 'Stephen Downes', 'The King of England', 'Jennifer 8. Lee', 'Prince'. Identity is important. It is - in a literal sense - who we are. Through identity we distinguish ourselves from each other, and through this distinction a host of cultural and social artifacts flow: attribution of authorship, ownership of houses, permission to drive, residency, citizenship, the right to vote, and more.
The problem of identity has traditionally been posed as an ontological problem. What is it, philosophers have asked, that makes an individual person an individual person? Today we generally approach this with a materialist response: a person is the body that contains the person. The problems posed by philosophers - problems such as the potential migration of souls, of brain or body transplants, of possession and transubstantiation, we leave to the philosophers. Produce the body and you have the person. Habeas corpus.
The problem of identity is today an epistemological problem. How do you know that this person is who this person claims to be? It is described, not from the point of view of the cogito, not from the point of the view of the person wondering who they are, but from the point of view of a third person, the one who seeks to know, "Who goes there?" and to be able to be satisfied with the response. The problem of knowledge, when it is connected to questions of personal identity, is tied into the fabric of society. Without the capacity to know, most of our customs and institutions would founder. The ritual of marriage, the assignment of criminal responsibility, the right to access a home: none of these would be possible without a third party being able to state an informed opinion about your or my identity.
In the virtual world, this problem is magnified. The virtual presence is corpus-free. There is no body to present to a third party as evidence that we are indeed who we say we are. The traditional connection between signification (the use of a sign to attribute identity) and instantiation (the actual instance of a human body) has been lost. The question, "Who goes there?" attains a new significance when there is no means to follow up with the demand, "Step forward and be recognized." And without this, the second part, it seems, the fabric of society so well known in the physical world cannot be migrated to the virtual world.
The title of this paper suggests that the answer to the problem consists of two parts: authentication and identification. This is partially true. It would be more accurate, I think, to say that it consists of two approaches. On the one hand, we have the assertion that I am a certain person. That is 'identification'. It is the specific process of attaching an identity to a presence - either a physical presence or, in the context of our current enquiry, a virtual presence. And on the other hand we have the verification - the means of proof that what I say is true, that there is sufficient evidence for my claim.
This is not as easy a distinction as it may seem. In every instance of identification there is a flavour of authentication, and in every instance of authentication there is a flavour of identification. In my own self-identification as 'Stephen Downes', for example, there needs to be some means by which I know that this is true. Typically, when I self-identify, I consult my own memory (a process so easy and habitual I do not even notice having done it) to find the specific string of characters or sounds that correspond, more or less uniquely, to me. Without memory, self-identification is impossible; the first utterance of amnesiacs (at least in the movies) is, "Who am I? I don't know who I am?" And not, say, "I can't remember the name of the capital of France."
Because of the complexity of contemporary society, a name is seldom sufficient to uniquely identify an individual. Search Google and you will find references to numerous other instances of people named 'Stephen Downes'. Consequently, I use supplementary strings in order to distinguish myself: I have a Canadian Social Insurance Number. I have a unique email address. And over the years I have accumulated a clutter of other identification marks: bank account numbers, drivers' licenses, passport numbers, and more. And I have established a unique set of relations with other entities: a marriage certificate, a property deed and address, a telephone number, a birth date.
Of course, I cannot remember all of these things (usually I'm good for only two or three of them) and so I carry tokens to assist me. Here now I am not relying on my memory (save for the fact that I have, say, an account at a bank or an Air Canada frequent flyer number). My knowledge of the particular identification mark consists entirely in my having of the token containing that number. Ask me to repeat my credit card number without looking at my card and I am lost. Telephone number? I am always looking at my telephone to remember what it is. Though these numbers may constitute a part of my identity (where we think of 'identity' precisely and only as the means of establishing my unique personhood) my knowledge of my own identity requires verification via these external tokens. It requires, even for me, authentication.
Conversely, authentication is impossible without identification. There must be, at some point, a mechanism whereby I say, "This is who I am," in order for that claim to be verified. True, such claims are often more or less implicit. The license plate on my car, for example, functions as an ongoing self-identification statement (or would, if I had a car). Placing a bank card into an ATM is an act of self-identification. Presenting my body in front of the members of the parole board, or in front of a prospective employer, or in front of a college registrar, is also a means of self-identification. In each of these instances, identification is a necessary first step occurring prior to authentication. The proof will follow, but it must follow, the claim.
Even though the distinction is therefore somewhat ambiguous, it is nonetheless possible to draw the distinction in a rough and ready fashion. A lot will ride on this distinction, so it is worth being as clear as possible at the outset.
Identification is the act of claiming an identity, where an identity is a set of one or more signs signifying a distinct entity.
Authentication is the act of verifying that identity, where verification consists in establishing, to the satisfaction of the verifier, that the sign signifies the entity.
There are two major types of assertion to be considered:
I claim that I am P, and I am P. Here I am making a true claim. This (it should be emphasized) is the normal case. I make a claim about myself, and the claim is true. Most of us (even criminals) most of the time want our identity claims to be successful.
I claim that I am P, and I am not P. Here I am making a false claim; I am stating that I am someone other than who I actually am. Identity theft is the making of such a claim on a consistent basis for the purpose of monetary gain (usually money belonging to the person who I am claiming to be). But in fact false claims of identity are common: forging a cheque, presenting false ID at a bar or tavern, falsely representing oneself as an architect or a marine biologist, using Bugmenot - these are all instances of this second case.
Aside from the presentation of a physical body (and sometimes accompanied by the presentation of a physical body) identity claims may take only one of two forms; these are, in fact, the same forms we use to remind ourselves of our own identity:
First, we may make an assertion. That is, we produce, through an utterance, an act of writing, or a keyboard entry, an appropriate sign that signifies our own identity. For example, I may say, "I am Stephen Downes." Or I may sign my name to a document. Or I may key my PIN number into an ATM. Each of these is the assertion that "I am so-and-so."
Second, we may present a token. That is, we produce a physical object on which an appropriate sign signifying our identity has been embedded. In some cases, tokens may contain more than one sign - for example, a driver's license will contain a name, a signature, and a photograph. In other cases, tokens may contain a reference only to the bearer - the presentation of money, for example, is a token that somebody (nominally, the government) owes the bearer a certain value of goods; presentation of the token is in essence the claim "I am the person to whom the government owes this amount of goods."
It is common at this juncture to confuse an identity claim with authentication. For example, the presentation of a bank card (a token) to a bank machine, combined with an assertion (the keying of a PIN), is often taken to constitute a type of authentication. However, it is not; it is nothing more than the claim to be a certain person.
Importantly, nothing inherent in the bank card and PIN prevents the possibility that 'I claim to be P, and I am not P'. In fact, this happens all the time. I give my card to my wife, tell her the PIN number, and say, "Take out an extra $40 for me too." In a similar manner, there is nothing inherent in a passport, driver's license, assertion that "I am Stephen Downes," claim that "I am an architect," etc., that precludes the possibility of it being a false assertion. Thus we have fake ID, fake passports, counterfeit money, and sleazy ladies' men at a pick-up bar.
To put it in slogan form: when you present your driver's license to the police officer, that's an identity claim. When the police officer compares the photo on the license with your face, that's authentication.
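The slogan can be put in code as a minimal sketch. The names here (`IdentityClaim`, `photo_on_token`, `authenticate`) are hypothetical and appear nowhere in the schemes discussed; a plain string stands in for a face. The point is only that the claim is inert data, and that authentication is a separate act performed by the verifier against something outside the claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityClaim:
    """An identity claim: a sign plus whatever the token carries (hypothetical model)."""
    name: str             # the sign being asserted, e.g. the name on a license
    photo_on_token: str   # stand-in for the photograph printed on the license


def authenticate(claim: IdentityClaim, observed_face: str) -> bool:
    """The verifier's act: compare the token's photo with the face actually present.

    Nothing inside the claim itself can make this comparison succeed or fail;
    it depends on evidence the verifier gathers independently.
    """
    return claim.photo_on_token == observed_face


# Presenting the license is only the claim.
claim = IdentityClaim(name="Stephen Downes", photo_on_token="face-A")

# The officer's comparison is the authentication.
print(authenticate(claim, observed_face="face-A"))
print(authenticate(claim, observed_face="face-B"))
```

The same claim object yields different results depending on who is standing in front of the verifier, which is why the claim alone cannot be self-authenticating.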
Nothing in the claim prevents it from being a false claim. This is true notwithstanding a long history of efforts to make claims self-authenticating. But the only sure evidence of identity is the presentation of the thing itself - and in the case of people, of the person him or herself. Any entity distinct from the person may be forged, faked, stolen, loaned, lost or otherwise dissociated from the person. That this is a logical possibility is tautologically true; and when the stakes are sufficiently high, the logically possible becomes probable.
In the virtual world, moreover, the body is never present. Hence, the only thing a person or service ever sees is the claim. Hence, it is always a logical possibility that the identity claim may misrepresent the person being identified. On the internet, no identity claim can ever be self-verifying. In order to know that a person's claim that "I am P" is in fact true, there must be a reliance on some process of authentication. Or so, at least, it would seem.
As mentioned above, authentication is the process of verifying an identity claim. A billion words (more or less) have been written on the subject of authentication, but for brevity's sake we will skip most of them here.
The idea of authentication is to present the person or service with evidence that attests to the truth of a statement of the form "I am P." And while numerous techniques are employed in the process of authentication, they break down into two major categories:
First, the testimony of some third party who can attest to the truth of the statement that I am P, or
Second, the presentation of an artifact that is in some way knowably unique to the person and which also attests to the truth of the statement that "I am P."
Below we will look at some authentication systems intended for use on the internet. But it is important first to observe and to argue that, with some few (and generally unacceptable) alternatives, no system of authentication succeeds.
This is a strong claim. It needs clarification. It is important to recognize that by 'succeeds' we mean here 'proving beyond reasonable doubt that "I am P" is true.' But what would constitute reasonable doubt? This depends on the circumstances. If you want to give an advertising flyer to 'Stephen Downes', then your standards of proof are pretty low. But if you want to give a million dollars to 'Stephen Downes' then (I would hope) your standards are higher.
Authentication is, indeed, a classic epistemological problem. Absolute certainty is impossible to obtain; therefore, the standards of proof are adjusted to meet the circumstances.
It is easy to see how any sort of authentication could fail.
First, consider the testimony of some third party. This is a very common form of authentication. It typically takes the form, "X asserts that 'I am P' is true" where X is the identity of a trusted third party. In systems relying on identity brokers, authentication servers, and the like, authentication takes this form.
However, where you had one problem, you now have two.
First, how do you know that the statement 'I am X' is true? After all, in order to trust statements from an authentication service, it is necessary to know that it is in fact the authentication service making the statement. But what is to prevent someone from asserting "I am X" in cases where it is not true?
Second, how does X know that the statement 'I am P' is true? X is faced with the same problem you are: in order for X to authenticate the statement that 'I am P' X must be able to prove that the statement is true. But how is X to do this? X has at his or her disposal only the same tools that you have at your disposal. X must either rely on some trusted third party (in which case we go through the cycle again), or X must rely on some artifact that is knowably unique to the person in question.
For the purposes of this argument, we can ignore the first problem (though in practice the designers of authentication systems cannot).
The second problem, meanwhile, is merely an instance of the original problem. After all, if an authentication broker could establish that 'I am P' is true, then so (in principle) could you. Conversely, if there is no way for you to establish that 'I am P' is true, there is no way for a third party to establish that 'I am P' is true.
The problem of authentication thus resolves to this: the presentation of an artifact that is in some way knowably unique to the person and which also attests to the truth of the statement that "I am P."
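The regress just described can be illustrated with a small sketch. The `Verifier` type and `resolve` function are hypothetical, introduced only for illustration: each verifier either checks an artifact itself or defers to another party, and following the chain of testimony always ends at whoever checks an artifact, or never ends at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verifier:
    """A party asked to vouch for 'I am P' (hypothetical model)."""
    name: str
    trusts: Optional["Verifier"] = None  # third party whose testimony it relies on
    checks_artifact: bool = False        # True if it inspects an artifact directly


def resolve(verifier: Verifier) -> str:
    """Follow the chain of testimony until some party checks an artifact."""
    seen = set()
    current = verifier
    while current is not None:
        if current.name in seen:
            return "circular"       # the parties only vouch for one another
        seen.add(current.name)
        if current.checks_artifact:
            return current.name     # testimony bottoms out in an artifact check
        current = current.trusts
    return "unresolved"             # nobody in the chain checks anything


bank = Verifier("bank", checks_artifact=True)
broker = Verifier("broker", trusts=bank)
site = Verifier("site", trusts=broker)
print(resolve(site))
```

However many brokers are placed in the chain, the answer is only as good as the artifact check at its end.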
And here is why authentication ultimately fails: there is no such artifact. The only entity that is necessarily unique to a person is, necessarily, the person him or herself. Any other entity may, at one time or another, be associated with another person. A key, a card, a telephone, a computer, a specially marked deck of playing cards - any of these may change hands at any given time, any of these may be altered to record false information, and any of these may be forged or duplicated. Even the body itself, in some circumstances, fails this test: a person may be coerced, a person's fingers may be cut off - the writers of Law and Order and Crime Scene Investigation have contrived no end to the number of ways even a person's body can offer misleading evidence.
Is this the end of authentication? Of course not. But here we get to the heart of how authentication really works. At its core, authentication depends on some sort of proxy standing in for the person being authenticated. In other words, it depends not on the person uttering "I am P" but on some sort of stand-in uttering "I am P", and then leaves the question of the relation between the proxy and the person up to the person.
In a sense, it's like the police identifying a car, and then holding the owner responsible for the actions of the car. If a radar camera captures a photograph of a car running a red light, no attempt is made to identify the driver of the car; rather, the car is deemed to be the offender, and therefore, the owner of the car is liable for fines or suspensions. The car, in this case, is a proxy for the person, and while it may not be possible to establish beyond a doubt whether the car signifies the person, it is possible to establish that the car is the car.
Online, while we may not be able to identify the person using the computer, we can establish the identity of the computer (within certain bounds). Thus liberated, we now have a legion of authentication schemes. For example:
IP-based authentication - a computer is deemed authenticated if and only if it accesses the internet through a limited range of IP addresses. Since IP addresses are owned, and since it is difficult to spoof an IP address, a computer reporting to be connected through the appropriate IP address is deemed to be authenticated.
Processor-based authentication - a computer (or an Ethernet card, using a MAC address) is deemed to be authenticated if and only if it provides an authorized hardware address to the authentication service.
Trusted computing - a computer is deemed to be authenticated if and only if it provides credentials obtained from a 'trusted' programming space within the computer, that is, a part of the computer's program that is inaccessible to the computer user.
The process of authentication, therefore, involves the establishment of a unique identity for the computer (or some essential part of the computer, such as its Ethernet card), and the transmission of that identity to the authentication service, whether that authentication service is the original service provider or some trusted third party that will provide testimony to the service provider.
It ties access, in other words, to a specific device, rather than to a specific person.
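A minimal sketch of the first two schemes follows, using Python's standard `ipaddress` module. The allowlists are hypothetical and use address ranges reserved for documentation; a real service would obtain the connecting address and hardware address from its own network stack. Note what the checks actually establish: something about the device or connection, and nothing about who is sitting at the keyboard.

```python
import ipaddress

# Hypothetical allowlists, for illustration only.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
ALLOWED_HARDWARE = {"00:00:5e:00:53:01"}


def ip_authenticated(address: str) -> bool:
    """IP-based scheme: accept any connection from an allowed address range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


def hardware_authenticated(mac: str) -> bool:
    """Hardware-based scheme: accept any device reporting an allowed address."""
    return mac.lower() in ALLOWED_HARDWARE


# Whoever uses the machine passes; whoever uses another machine fails.
print(ip_authenticated("192.0.2.17"))
print(ip_authenticated("198.51.100.5"))
print(hardware_authenticated("00:00:5E:00:53:01"))
```

Neither function takes a person as an argument, which is the sense in which access is tied to a device rather than to an individual.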
There is no doubt that this is the direction in which the authentication industry as a whole is moving. Machine identification is already the norm in the mobile phone industry, where the vendor has control over the hardware and programming of the phone. Microsoft's trusted computing initiative seeks to "create secure compartmentalization of data and applications" that cannot be accessed by the computer owner. My laptop uses a secure wireless networking card. To access journal subscriptions through CISTI my IP must be authenticated either directly or through a VPN.
But there is also no doubt that these developments are not being met with open arms. There is a large community devoted to hacking mobile phones. Microsoft's Longhorn has met with widespread criticism. Critics have charged, reasonably, that using the computer as a proxy for authentication locks the user into a hardware dependence; his or her content is tied to a specific machine, a specific hardware configuration, a specific vendor. As I once commented, only half-jokingly, "Trusted computing will bring to Microsoft Word the reliability and stability of Outlook and Exchange."
Beneath that, though, is a sentiment probably more accurately captured by opposition to things like red light cameras. Such mechanisms usurp my ownership of my own identity. If my assertion that "I am P" had little credibility before, it has no credibility in an era when authentication is based on machine ID and license plate number. It strips away my control over the use of my identity, as I now have no ability to allow or deny the release of that identity to third parties. And it impacts my autonomy, as now I may use what was once thought of as mine only under strictly controlled circumstances.
In my opinion, the unhappy situation brought about by proxy authentication is based on a misunderstanding of the concepts of identity and authentication generally. The general distaste for proxy solutions (of which there will be increasing empirical evidence as such solutions become more widespread) illustrates a gulf between our underlying values regarding identity and the manner in which it has manifested itself online.
It was once the case (or so legend tells us) that a man's word was his bond. What that meant was that it would be such a loss to a person to be caught, say, misrepresenting his own identity, that it was almost inconceivable that he would do such a thing. This cost was reflected not in prison sentences (though if you were caught by the authorities the penalties were severe) but by the person's greatly diminished standing in the community. A man who could not be trusted would not be able to take advantage of the many small favours essential in medieval life, or in later days, would not be able to pay for goods at the hardware store merely by signing a cheque.
Above I mentioned something that may have been passed over on first reading, something as startling as it is true: Automatic Teller Machines (ATMs) do not depend on authentication at all; they depend solely on identification. This may seem counter-intuitive to most people; after all, what more secure system is there than the ATM network? Yet, when the card is presented to the machine and a PIN typed into the keypad, the machine takes it on faith that the presenter of the card is, in fact, the person authorized to do so. It does not use biometrics to scan the user, it does not validate the user's thumbprint against a third party authentication service. The mere possession of the card and the mere typing of the PIN number is sufficient to withdraw all the cash from a person's account, no questions asked. Anybody could do it, even smart animals.
What makes the ATM network so secure? As in the case of a man's word, the cost of allowing the misrepresentation of one's own identity is much greater than any benefit that could be obtained. Were I to allow open access to my bank card and to publish my PIN on the internet, it is a virtual certainty that my bank account would be drained of money by other people. So it is in my best interest to remain in possession of my card and to keep my PIN to myself, or at the very least, to restrict their distribution to people I know well and trust completely.
If we examine existing systems of identification, it is easy to observe that the vast majority of them operate in exactly the same way. I do not loan my driver's license to another person, for example, because I would then be responsible for the actions of that person, which could get me in legal or financial trouble. I do not give out the password to my computer because then somebody could get into the system and delete files, rewrite web pages, and engage generally in the practice now called 'hacking'. I do not make copies of my house keys and distribute them to everyone I know because I would then feel much less secure in the continued ownership of my possessions.
Moreover, there is a whole range of similar incentives that convince me not to adopt someone else's identity (and not merely legal incentives). When I write an exam at the university, for example, I make sure to write my own name on the paper, in order to receive a grade. When I publish an article, I place my own name in the byline, in order to receive credit. When I sign a cheque, I sign my own name, in order to receive the cash. I give my employers accurate information regarding my name and address to ensure that I am paid for the work that I do.
The point here is, self-identification can be trusted if it is in the interest of the self to self-identify accurately. Indeed, I can be trusted not only to correctly assert that 'I am P' but to do so in such a way that I, at least, can know that the information provided could be known by no other person or that the token provided could be possessed by no other person. When sufficiently motivated, I can prove my own identity to my own satisfaction.
Indeed, on reflection, we can see that exactly the same principle applies even to proxy authentication systems. Suppose, for example, that access to a video game is authenticated by a hardware serial number. Well, what prevents me from simply giving my computer to my friend and letting him play the game for a week? Nothing - except that I would then be without my computer. What prevents me from sharing my Cisco wireless card with people in the neighbourhood? Nothing, again, except that I would now be without wireless access. Similarly, I could share ring-tones with my friends by circulating my mobile phone, or enable neighbours to read online journals by letting them use the computer in my office.
Logically, no authentication system is more secure than self-identification. It is not more secure because, in the end, no authentication system consists of anything over and above self-identification. Without self-identification, authentication would not work at all. And no more rigorous standard of identification can be applied over and above self-identification. Even if we had computers that sampled our DNA and would not function unless this input were verified at a national DNA registry, the system would not be able to prevent my spitting into the DNA reader and letting my friend have a free-for-all.
What authentication actually does is two-fold: first, and most of all, it increases the cost of my incorrectly self-identifying, by attaching self-identification to devices I would not want to part with, such as my computer or my phone. And second, it increases the difficulty of falsely self-identifying by requiring specific hardware, software or network properties. But it should be evident that when the benefit obtained by falsely self-identifying exceeds the cost, then there will be significant motivation to do so. And with the cost of computer components dropping all the time, it would seem, therefore, unwise to tie identification to the computer.
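The argument reduces to a simple comparison, sketched below. The function and parameter names are hypothetical and the figures are invented purely to show the shape of the reasoning: misrepresentation becomes attractive once the benefit exceeds the value of the proxy that must be given up plus the effort of defeating the hardware, software or network requirement.

```python
def tempted_to_misidentify(benefit: float, proxy_value: float, effort_cost: float) -> bool:
    """Return True when falsely self-identifying pays off.

    benefit      -- what is gained by the false claim (hypothetical units)
    proxy_value  -- what the proxy device is worth to its owner
    effort_cost  -- cost of meeting the hardware, software or network requirement
    """
    return benefit > proxy_value + effort_cost


# An expensive computer is a strong deterrent against lending one's identity.
print(tempted_to_misidentify(benefit=50, proxy_value=1000, effort_cost=20))

# A cheap, disposable device is not.
print(tempted_to_misidentify(benefit=50, proxy_value=10, effort_cost=20))
```

As the value of the proxy falls, the inequality tips toward misrepresentation without any change in the authentication scheme itself.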
Privacy and Control
As mentioned above, one of the advantages of self-identification, as opposed to authentication, is that I can control who I reveal my identity to. The control of my identity is, in other words, in my own hands. If a person or a site requires that I reveal my email address to them, it remains my choice whether or not to reveal it. If, on the other hand, my identity is authenticated by means of, say, hardware address, then I am unable to control the release of my identification information; every site gets it. And if every site gets it, then it follows that, if I release any additional information to any site, every site could get it as well, because the site has a 'trusted' association between a hardware address and an email address. Revealing one - which I cannot help but to do - reveals all.
The question of control raises the issue of privacy, and the question of privacy is a common concern with respect to authentication systems. In my opinion, privacy isn't so much a question of legislation (because people will break the law) and it isn't so much a question of technology (because technology can be circumvented) as it is a question of trust: can the user trust the service provider to respect the user's rights with respect to personal data?
And the answer, of course, is "no." There is no shortage of evidence that shows that if corporations and government entities can share personal information, they will. From the long reach of Carnivore to the carnivorous reach of Equifax, it is evident that personal data will be distributed well beyond the user's original intent. Even if the intentions of the company or the government agency are benign, there is no shortage of people willing to try to steal that personal information. Moreover, it is likely the case that companies will treat authentication information in the same manner as users; so long as the cost of sharing this information with others is greater than the cost of keeping it, the information will not be shared; but once the cost of keeping it exceeds the cost of sharing it (as is the case in virtually every corporate takeover, potential lawsuit or government action) the information will be shared.
At the heart of this issue, though, is the question of who has the right to answer the question, "Who am I?" And there are two possible approaches here, approaches coinciding (not coincidentally) with the initial distinction drawn between identification and authentication. In the case of identification, the mantra is, "You are who you say you are," where the guarantee lies in the user's interest to correctly self-identify. While in the case of authentication, it is, "You are who we say you are," where the guarantee lies in the authenticator's interest to correctly identify others. And since it is clear that the authenticator's interests will, at least from time to time, conflict with the user's interests, it seems likely that users would prefer self-identification over authentication.
For after all, the objectives of the two systems are also different. In the case of identification, the objective of a correct self-identification (and the protection of that identification) is to protect and promote the user's interests. A person will self-identify, as described above, in order to get something or to keep something. In the case of authentication, however, the objective is to promote the service provider's interests. It is to keep unauthorized people out, to protect assets; it is to enable the reliable collection of user information and user data.
Who holds the right to answer the question, "Who am I?" It is, it should be, a fundamental principle of a democratic society that each person holds the right to control their own identity, to say who they are, to have exclusive rights over the sentence that begins, "I am..." And this is the case because, without this fundamental right, no rights exist whatsoever. When the right to assert who you are is controlled by someone else, your identity is owned by someone else, and a person whose identity is owned does not own any of the attributes commonly associated with identity: attribution of authorship, ownership of houses, permission to drive, residency, citizenship, the right to vote, and more.
I think that people understand this, and I think that this is why there is an often unstated but often perceived undercurrent of dissent as one's right to one's own identity is eroded. I think that this assertion of one's individuality is what lies behind acts of creativity, acts of vandalism, and most everything in between. It is our desire to recognize individuality that leads to teams placing names and numbers on team uniforms, the personalization of news articles, the elevation of obviously talentless individuals to stardom. It is not that any of these actions is intrinsically valuable, it's that each one is a means of our enabling the expression of who we are - we look at Paris Hilton and we say, "That could be me, if I was a different person." And we either shudder or breathe a sigh of relief, depending on who in fact we are.
Though the development of authentication systems will no doubt continue to be a source of considerable churn and considerable investment in the near future, it should be evident from these considerations that authentication is (a) not necessary, (b) won't work, and (c) is not desired.
It is not necessary because, given sufficient incentive, people will correctly and honestly self-identify. And this barrier is much lower than may be supposed. Even given today's prevailing system of authentication (user registration and login with a password), and even in cases where there is no intrinsic benefit to the user, the majority of users supply accurate information, even where there is no email confirmation (I can't find the reference to this off the top of my head, however, if you dig through the Online News mailing list archives, it is there). For the benefit of obtaining access to a community or of reading some free (advertising supported) content, people will self-register accurately.
It won't work because, as argued in this article, no system of authentication provides any more security than a system of self-identification. Authentication will not work at all unless it is tied to a proxy, the identity of which can be established online, which means that the security of the authentication is no greater than the value of the proxy to the user. With cheap computation, computers on a USB (reference is out there somewhere), disposable telephones, e-paper, and more just beyond the horizon, it seems clear that the value of the physical asset to which authentication is being tied will continue to decline, at which point authentication will provide no disincentive against misrepresentation of identity whatsoever. Authentication is useless if not tied to the person, and can be tied to the person only with the compliance of the person, which in effect reduces it to self-identification.
And it is not desired because authentication essentially involves the transfer of control over one's own identity from oneself to a service provider or identity broker, and as a consequence, enables the breach of the user's security and privacy whenever it is in the interests of that service provider or broker to do so. It moreover undermines the individual's fundamental right to determine and express who they are.
So where to now?
As I mentioned earlier in this paper, the creation of authentication systems is a major industry. The creation of self-identification systems, by contrast, has remained virtually unchanged since the days of the first login prompt. On website after website, users are asked to supply their login credentials, a process as predictable as the typing of a PIN number into a keypad.
Indeed, on the wider internet, service providers face in general not a choice between authentication and identification, but rather, a choice between identification and nothing at all. This choice exists because there is a significant disincentive for users to log in. Leaving aside the problem of spam email and user tracking, logging in to website after website is a tedious process for which the reward is minimal. Many users gravitate toward sites that do not require registration, partially because of security concerns, but mostly because they're easier to access. Websites linking to other websites (most especially blogs) link almost exclusively to open access websites (check Blogdex for evidence of this).
Moreover, on the internet at large, there is no capability for a person to have an identity (beyond an email address). Rather, each new registration at each new website creates a new identity. What gets credited to 'datamouse2001' on Yahoo! is not related in any way to what gets credited to 'StephenDownes' on NewsTrolls, even though they are the same person (and the same person who, moreover, has dozens of accounts - usually 'Downes' - on dozens of websites). Worse, these accounts are not in any important way mine - something Netscape Netcenter users discovered to their dismay when the company was taken over by AOL.
We need a mechanism for self-identification. We need a mechanism where clear and unambiguous control is placed in the user's hands, a mechanism that enables the user to declare to every site (or none, if that's their choice), "I am me!" And a way to do this automatically, unambiguously, with as little effort as possible.
It is my belief, and my contention, that were such a system to become widely available, much of the apparent pressure for authentication would disappear, and we could rely on self-identification to carry the same load online it has always done offline.
Authentication and Identification
Part 2. mIDm - Self-Identification and the World Wide Web
My thanks to Scott Wilson, James Farmer, Scott Leslie, Luc Belliveau, Rod Savoie, and Seb Schmoller for contributing to this article.
The idea behind mIDm - pronounced "My - Dee - Me" - is that people using the web can log in once, on their own website, and then forget about logging in anywhere else. It is, in essence, single sign-on for the people.
Billions of words have been written about user identity on the web. Numerous solutions have been proposed: to name a few, Passport, Liberty Alliance, LID, SxIP, PKI, CoSign and more...
Equally obvious, however, is the fact that no identity management solution has taken hold in any large measure on the World Wide Web. While it would be premature and in a certain sense outright wrong to call any of these initiatives a failure, it nonetheless remains true that for the vast majority of people, on the vast majority of websites, identity continues to be managed via a simple login with a username and a password.
The bulk of the initiatives listed above - if not all of them - are attempting to build something more. Sure, all of them offer some form of single sign-on - that is, a system whereby you enter your username and password once, and then access resources from a number of sites. But in addition, they are also attempting to provide some mechanism for authenticating these logins, that is, some way of asserting that the information supplied in these web forms is true.
And in order to ensure that the assertion is true, these systems employ some sort of central registry or authentication service. Part of this is driven out of pure practicality: how could a website know where to look for information about the user unless the user is registered somewhere? And part of this is driven by the desire for verification: while the website may not implicitly trust the user, it does trust the authentication service.
The purpose of this proposal is to eliminate the need for any central registry or authentication service. That does not mean that it decrees that they must not exist; certainly, there will always be a need for some sort of guarantor, some sort of third party opinion about the person in question. Rather, it means that such registries and authentication services need not exist, that everything the website needs to know about users can come from the users themselves.
The key differences, therefore, between what I propose and other systems, are:
a) You can self-declare the location of your identity server
b) You can self-identify, that is, you can state for yourself who you are and (say) how you can be reached
Which leads to the point of yesterday's paper, and the reason why I wrote it:
c) And self-authentication is good enough (and more to the point, any 'stronger' form of authentication doesn't buy you any greater security than self-authentication does)
What this does, in effect, is to establish a regime where a person's own declaration is the primary source of their identity, their own identity server; they do not need to depend on a proxy (such as a university registration, employment in a corporation, subscription to an internet service provider, or whatever). Sure, they may at a later time refer to some external agency to provide a reference or recommendation, but even this referral is at the user's discretion.
Moreover, since people choose their own identification server, the level of security they require may be as weak or as strict as they desire. If a simple login with cookie support is enough (as it is for the vast majority of people on the vast majority of websites) then this is all they use; if they want secure sockets layer with IP verification, then they may opt for this as well.
Moreover, by creating a mechanism by which anyone may self-identify, it also creates a mechanism whereby any web service may request identification. A website does not need to belong to a federation, be some part of a trusted network, or some such other secret society. The self-identification network is open: anybody can play.
In the sections below I will provide some computer code, written in a programming language called Perl. The code provided is not the self-identification service I am proposing. Eventually, I would hope that it will be an instance of it. But not yet.
What I have provided is merely a proof of concept. That is, I have written the minimal amount of code necessary to show that what I am proposing will work. Based on input that I have already received, I can say that this code will definitely change over the next few days and weeks.
Moreover, it is important to emphasise at this point that the code is not the proposal. The code is merely an instance of the proposal. It is my expectation (already fulfilled to a degree) that versions of the same proposal will be written in other languages, such as Python or PHP. It is moreover my expectation that application-specific code, such as Drupal or WordPress modules, will also be created.
Finally - it is necessary to stress again - what mIDm is not is an authentication service. That is, websites have to take the user's word that they are who they say they are. But what it does do is to provide any user who wants it with a unique identity. Also, it is not by itself a solution to other problems, such as comment spam. Though such solutions will rely on a system such as mIDm, they will require a second part (which, yes, I will illustrate in a subsequent work).
What I am trying to prove here is that we can get a free, open and distributed system of single sign-on self-identification off the ground using nothing more than Notepad, some common understandings, and a little ingenuity. And what I believe we will prove, in the long run, is that this is all we ever needed.
The proposal is dead simple.
You - a web user - create a website on which you create a program you can log in to (you don't have to do this yourself - you could use a program someone else created to do the same job - but the point is, you could do it yourself).
You then place the address of that program - its URL - into your browser.
Then, any time you go to a website, if that website wants to know who you are, it gets the URL from your browser and sends a request to the program. "Who is this?" the website will ask. "This is me!" the program will reply.
How does the website know that you've sent it to your program, and not someone else's? The same way Feedster or Technorati or Blogshares allows you to 'claim' a blog. It gives you a little bit of code which you then place into your program. Because you have to log into the program, only you could have placed the information there. So once the website gets the little bit of code back from the program, it is satisfied that you, indeed, are the person described by this program.
In a sense, it's no more and no less secure than having you type your personal information into a form. Sure, you could lie - but that's not the point here. The point is that this is a mechanism by which you, the web user, can make a declaration about who you are.
Now, in the code provided at http://www.downes.ca/idme.htm, the messages sent back and forth are very simple - too simple, actually. The 'little bit of code' is nothing more than the current time. The response back is nothing more than the little bit of code.
In the final version, these messages will be a little more complex (but not a lot more complex). They will be, in particular, valid instances of the Security Assertion Markup Language (SAML) V2.0. This means that statements made by mIDm will be predictable - everybody will know how to make a request, everybody will know how to read a response. And it means that your own little self-identification server will speak the same language as the centralized identity servers - just in the same way your home-grown web site speaks the same HTML that Yahoo! or Google speak, just the way your own little cut-and-paste RSS feeds speak the same language as those produced by LiveJournal.
How Does It Work?
In a nutshell:
A user declares the name of his or her private website - the location of an mIDm script on their own server (or a server provided by a host, such as an online community of their choosing)
When the user attempts to access a remote website, the remote website redirects their browser to that mIDm server with an access key (sometimes called a 'handle', though I don't like that name).
The mIDm server accepts and stores the key. The idea here is that only a person with access to the mIDm server can store that particular key.
The mIDm server redirects the user back to the remote website.
Upon the user's return, the remote website independently requests the key from the mIDm server.
If the key is returned, then the remote website accepts that the mIDm address provided by the user is valid, and hence, may request additional information (such as, say, FOAF data) from the mIDm server.
Now it should be clear from the outset that this system does not prevent the user from adopting an alternative identity. Nor does it prevent several people from sharing a single identity. This is not the purpose of the mIDm system. The sole purpose is to guarantee that the information being provided by the mIDm server is in fact being provided by the user requesting access. In essence, it is as secure as (and no more secure than) requiring a UserID and a password to access a website.
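The six steps above can be traced in a few lines of Python. This is only a sketch under stated assumptions, not the Perl proof of concept: the class and function names (MidmServer, RemoteSite, sign_on) are hypothetical, the two parties are plain objects rather than web servers, and the key is a random token where the proof of concept simply uses the current time.

```python
import secrets


class MidmServer:
    """Hypothetical stand-in for a user's own mIDm script."""

    def __init__(self):
        self.logged_in = False      # set by the user's own login system
        self.stored_keys = set()

    def store_key(self, key):
        # Step 3: a key is stored only while the user is logged in.
        if not self.logged_in:
            return False            # the real script would send the user to a login page
        self.stored_keys.add(key)
        return True

    def has_key(self, key):
        # Step 5: answer the remote website's independent request.
        return key in self.stored_keys


class RemoteSite:
    """Hypothetical website that wants to know who the visitor is."""

    def issue_key(self):
        # Step 2: a random token is an assumption made for this sketch.
        return secrets.token_hex(8)

    def verify(self, server, key):
        # Steps 5 and 6: accept the declared address only if the key comes back.
        return server.has_key(key)


def sign_on(site, server):
    key = site.issue_key()            # step 2: redirect out with a key
    server.store_key(key)             # step 3: stored only if logged in
    return site.verify(server, key)   # steps 4 to 6: return and check
```

Calling sign_on before the user has logged in to their own server returns False; once logged_in is set, the same call returns True. Nothing in the exchange tells the remote website whether the declaration is true, only that it comes from someone who controls that mIDm address.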
Scott Leslie provides this image of the process:
More precisely, what is proposed is an instance of 'SP Initiated: Redirect -> Artifact' binding. See Figure 18, pg. 25, of the SAML 2.0 Technical Overview (PDF).
Set-Up - User End
Step One - Install the mIDm script
The mIDm script is a CGI script that runs on the user's web server (or is provided by a website host). This script checks the user's browser cookies for a valid USERID (the code provided uses two cookies, named 'person_title' and 'person_id', but can be altered to accept any cookie values already set by the server). If the cookies are not present the script exits (the code provided redirects the user to a login screen).
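The cookie check described here might look something like the following in Python. It is a rough sketch rather than the script itself: the function name check_login and the LOGIN_URL value are hypothetical, and it only tests that both cookies are present and non-empty. Confirming that the values belong to a real session is the job of whatever login system the script is tied to.

```python
from http.cookies import SimpleCookie

# Cookie names used by the code provided; change them to match your own login system.
REQUIRED_COOKIES = ("person_title", "person_id")
LOGIN_URL = "/login"  # hypothetical address of the login screen


def check_login(cookie_header):
    """Return (True, person_id) if both cookies are set, else (False, LOGIN_URL)."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    values = {name: cookies[name].value for name in REQUIRED_COOKIES if name in cookies}
    if len(values) == len(REQUIRED_COOKIES) and all(values.values()):
        return True, values["person_id"]
    # Cookies missing: the caller should redirect the browser to the login screen.
    return False, LOGIN_URL
```

A CGI script would pass in the contents of the HTTP_COOKIE environment variable and, on a False result, send a redirect to the returned address.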
To install the script, copy the code listing immediately below (Listing 1) and save it as a file on your web server. Edit the cookie values if necessary. On Linux / Apache systems, chmod the script to 755 (in other words, run-enable the script). Test the script by typing the script address in a browser. You should see the message 'mIDm script installed OK'.
Note: you can't just install this script out of the box and expect it to work. It needs to be tied to a login system. The example provided below is tied to the downes.ca login system. I will, at a future point, provide a script that handles login as well as identification. But this isn't that script.
Note: for most users, access to this script will simply be something provided by their web community of choice and no installation will be required.
Step Two - Declare your mIDm Location
Using a Firefox browser go to the User Agent Switcher Extension website and install the user agent switcher.
Once the extension is installed and the browser restarted, select 'Tools' from the menu bar, then select 'User Agent Switcher', then 'Options', then 'Options'. In the box that pops up, select 'User Agents' from the left-hand menu. A list of user agent names will be displayed; select one of those or add a new one (I simply selected 'Internet Explorer'). Click 'Edit'. In the 'Edit User Agent' box that pops up, in the second line (where it says 'User Agent'), add a semi-colon and then the address of your mIDm script.
The following image shows (part of) the URL of an mIDm script added to the 'User Agent' line (circled in red):
Click 'OK', then 'OK' again to close the popup boxes. Then select 'Tools' > 'User Agent Switcher' again and select the user agent that you just altered from the list.
Luc Belliveau also reports that the User Agent can be changed in Internet Explorer by amending the IE registry entry: "In key [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\5.0\User Agent], the "Version" string can be added to change it, you can even change the platform."
Note: this slightly cumbersome process relies on an existing extension to amend the user agent. Presumably, someone will write a simple extension and/or plugin that will simply allow you to input the location of your mIDm script, and will automatically append it to the end of your existing user agent.
Warning: Messing around with your User Agent may cause some websites to react in an odd way. I am testing this now and have found no ill effects so far. But you've been warned.
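To make the convention concrete, here is a small Python sketch of both ends: appending the mIDm address to a user agent string, and recovering it on the website side. The function names are hypothetical, and the rule used to find the address (the last semi-colon-separated field that begins with http:// or https://) is an assumption of this sketch, not something the proposal specifies.

```python
def add_midm_url(user_agent, midm_url):
    """Append the mIDm script address to a user agent string after a semi-colon."""
    return f"{user_agent}; {midm_url}"


def find_midm_url(user_agent):
    """Return the last semi-colon-separated field that looks like a web address, or None."""
    for field in reversed(user_agent.split(";")):
        field = field.strip()
        if field.startswith(("http://", "https://")):
            return field
    return None
```

For example, add_midm_url applied to an Internet Explorer user agent and http://www.downes.ca/cgi-bin/idme.cgi gives a string from which find_midm_url returns that same address. An address that itself contains a semi-colon would not survive this simple split.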
Set-Up - Server End
If you have a web server and would like to enable single sign-on, do the following:
Copy the script below (Listing 2) to your website and chmod it to 755, as with the mIDm script. Edit the URL for your script.
Note: the script supplied does nothing more than say whether the user has been verified or not. OK, so I have no imagination. In a fuller version, once the user has been verified information would be obtained about the user and this information used by the website to provide personalized services.
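Before it can report anything, the server-end script has to send the browser to the user's mIDm address with a key and a way back. The Python sketch below shows one way that address could be built and read. The query parameter names 'key' and 'return_to' are hypothetical, chosen for illustration; the final version is meant to carry this information in SAML messages instead.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_redirect(midm_url, key, return_to):
    """Address the remote website sends the browser to (parameter names are hypothetical)."""
    query = urlencode({"key": key, "return_to": return_to})
    # Use '&' if the mIDm address already carries a query string.
    separator = "&" if urlsplit(midm_url).query else "?"
    return f"{midm_url}{separator}{query}"


def read_redirect(url):
    """What the mIDm script would pull out of that address: (key, return_to)."""
    params = parse_qs(urlsplit(url).query)
    return params["key"][0], params["return_to"][0]
```

The mIDm script stores the key and then redirects the browser to the return address, after which the remote website makes its own request for the key.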
Try It Before You Install
I have installed a test case of the two scripts on my website.
First, install the User Agent Switcher as instructed above and set the following as your mIDm URL: http://www.downes.ca/cgi-bin/idme.cgi
To try it out, try to access the single_signon script: here. You will notice that you are bounced to the login script.
Now try to access the single_signon script on a different server (newstrolls.com): here. You will notice that you are bounced to the same login script on downes.ca.
So go to the login script: here and log in as UserID: tester password: tester
Now go back to the single_signon script: here. You will notice that you are now verified.
Finally, go back to the single_signon script on newstrolls.com: here. You will notice that you are now verified on the NewsTrolls site as well.
The Road Ahead
So what needs to happen before we can start implementing this system?
It can be done now, with the tools we have now. For after all, we already have the two major components we need: the place to store the location of our own authentication servers, and the language (SAML) websites and identity services can use to talk to each other.
Don't like the code I provide here? Write your own in Python, PHP or whatever. Think my login system is way too loose? Embed your code in Drupal, Plone or whatever - write it as a module, write it as base code.
Nothing new needs to be invented. We don't need to wait for some major authentication project to come along and manage it all for us. We can do it ourselves. We should do it ourselves.
Of course - if you do want to wait, the code provided here will be cleaned up and written more rigorously. You will be able to simply copy the code to your website, make minor modifications, and be up and running. After all, it's not rocket science.
Or, at least, it shouldn't be.
Yes, there will be a part three: Applications of mIDm. Moreover, I will be updating the code listings at http://www.downes.ca/idme.htm. Stay tuned.
It is recommended that you go to the website, http://www.downes.ca/idme.htm, to receive the latest version of this script. Bookmark this site and go there for updates in the future.
Copyright © National Research Council Canada