Unfortunately, any URI may have to be read by a human at some point, even if it was never meant to be. I can't recall a recent case, but I know I have had to retype URIs from a printout and from an information kiosk into my tablet or mobile phone.
So, I would say that one should either implement dual URIs (numeric and "meaningful") or at least:
1) make numeric URIs as short as possible, for example, by using hexadecimals rather than decimals for numbers.
So a URI like http://data.europeana.eu/agent/123512771
would become: http://data.europeana.eu/agent/75CA7C3
Hexadecimal usage is good, because:
- for large enough numbers it always generates shorter URIs than decimal (cf. http://stackoverflow.com/questions/9458614/at-what-point-do-hexadecimal-representations-of-numbers-take-up-less-chars-than), and it never generates longer ones;
- it has extremely widespread usage in computers and is trivial to implement in any programming language; you can adopt hex IDs for objects without any need for translation to decimals at any point;
- it has a limited "vocabulary" of [0-9] plus [A-F], making groups easier to read off and repeat than with the whole alphabet.
2) the digits should be grouped at predefined positions if the ID is expected to be longer than 5-6 digits, again making it easier to remember and retype.
I suspect software product keys have evolved to be written mostly in groups of 4, like AB12-CD34-56EF, for much the same reason.
So in my opinion an ideal URI might well be something like:
which, while avoiding the need to describe any particular object, still retains some readability.
In decimal that would be object ID 129512528382627, which, even if grouped in fours, would be 1295-1252-8382-627: at least one group longer.
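Points 1) and 2) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the dash separator and the left-to-right grouping are hypothetical choices, not anything Europeana has implemented.

```python
BASE = "http://data.europeana.eu/agent/"  # base taken from the examples in this thread


def to_grouped_hex(object_id: int, group: int = 4, sep: str = "-") -> str:
    """Render a non-negative integer ID as uppercase hex, split into groups from the left."""
    if object_id < 0:
        raise ValueError("object_id must be non-negative")
    digits = format(object_id, "X")  # uppercase hexadecimal, no "0x" prefix
    return sep.join(digits[i:i + group] for i in range(0, len(digits), group))


def from_grouped_hex(token: str, sep: str = "-") -> int:
    """Recover the integer ID from a grouped hex token (case-insensitive)."""
    return int(token.replace(sep, ""), 16)


if __name__ == "__main__":
    print(BASE + to_grouped_hex(123512771))    # ends in 75CA-7C3
    print(to_grouped_hex(129512528382627))     # 75CA-7C3F-8AA3
    print(from_grouped_hex("75CA-7C3F-8AA3"))  # 129512528382627
```

For the 15-digit decimal ID above this gives three groups of four hex digits, against four groups in decimal.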
Just my 2 cents.
Deputy director, CTO
National Library of Latvia
From: Discussion list for Europeana Technical Developments [mailto:[log in to unmask]] On Behalf Of Lizzy Jongma
Sent: Tuesday, March 24, 2015 4:57 PM
To: [log in to unmask]
Subject: Re: Your advice on minting URIs for contextual entities
If you are going to work with identifiers in a linked (open) data situation then you need to be certain that the URIs don't change.
Although human-readable URIs sound like a good idea, because we can read them, in my experience they are also an invitation to debate and change... Johannes Sebastian Bach may become Johann Sebastiaan Bach etc. Words change, names change and people have a tendency to change human-readable stuff...
We (at the Rijksmuseum) created persistent URIs without any human-readable reference for technical purposes (e.g. machines reading them): since they are just a set of numbers, no one feels the urge to debate them. And we have human-readable URLs (server optimized, to a certain level; could be improved): we can change the website, the names of objects, etc., but the technical references stay in place.
This is why I voted for option 1 in your poll.
Happy to hear others' opinions and the outcome of your poll!
From: Discussion list for Europeana Technical Developments [mailto:[log in to unmask]] On Behalf Of Antoine Isaac
Sent: Tuesday, March 24, 2015 3:45 PM
To: [log in to unmask]
Subject: Your advice on minting URIs for contextual entities
We're about to mint identifiers (URIs) for contextual entities to be used in Europeana. This will concern concepts, agents or places to be used for enrichment, and a couple of other things. The data will be adapted from external or providers' datasets, and will eventually have to be available as linked data on data.europeana.eu.
After internal discussions, we have to choose between two options:
1. A bare numerical identifier, as in
2. A number combined with a human-readable label, as in http://data.europeana.eu/agent/12345_johannes_sebastian_bach
In any case URIs would lead to machine-readable data for software clients, while humans would be directed to pages like . But human-readable labels in identifiers would help to identify and discuss the resources more easily. So option 2 is very tempting.
However, option 2 is slightly harder to implement. Also, we would have to choose one field in the data, and one language (as we do for other communication, including this mail). Both field and language could change from one source to the other, when we merge different datasets.
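To make the extra work in option 2 concrete, here is a minimal Python sketch of minting such a URI and of recovering the number from it, so that a changed label need not break resolution. Everything here is an assumption for illustration: `slugify`, `mint_uri` and `numeric_id` are hypothetical helpers, the bare form without a label is inferred by analogy with the option 2 example, and the choice of field and language for the label is left to the caller.

```python
import re
import unicodedata


def slugify(label: str) -> str:
    """Lowercase ASCII slug with underscores, e.g. 'Johannes Sebastian Bach' -> 'johannes_sebastian_bach'."""
    ascii_label = (
        unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "_", ascii_label.lower()).strip("_")


def mint_uri(entity_type: str, number: int, label: str = "") -> str:
    """Build a URI from a number and an optional label (assumed layout, not a confirmed one)."""
    base = f"http://data.europeana.eu/{entity_type}/{number}"
    slug = slugify(label)
    return f"{base}_{slug}" if slug else base


def numeric_id(uri: str) -> int:
    """Extract the leading number of the last path segment, ignoring any label after it."""
    last_segment = uri.rstrip("/").rsplit("/", 1)[-1]
    match = re.match(r"\d+", last_segment)
    if match is None:
        raise ValueError(f"no numeric identifier in {uri!r}")
    return int(match.group())


if __name__ == "__main__":
    uri = mint_uri("agent", 12345, "Johannes Sebastian Bach")
    print(uri)              # http://data.europeana.eu/agent/12345_johannes_sebastian_bach
    print(numeric_id(uri))  # 12345
```

Resolving on the numeric part alone means that two spellings of the same label map to the same entity, at the price of the additional parsing and label-selection logic.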
We're curious to hear whether you have a preference! We have created a small poll:
Note that it is not a majority vote. We may end up not having the resources to implement the more complex option. Also, one could have a killer argument for one option that defeats all other considerations :-). You can leave comments at the bottom of the poll page.
Thanks a lot for the advice!