Traditional schemes like microformats force the user to know location hierarchies: a city must go in the locality field, a landmark in the extended address field, and so on. Right now, the contributor must understand a location hierarchy (see below). This is usually no problem for locations in the person's own country, but it is not a safe assumption for other time periods or other locales. I just wonder if this cognitive load is necessary.

  • Country
    • Subdivision1 (constituent country in UK, oblast in Russia, and so on)
      • County (there is no classification for Regions of England, such as South West England, which is both a geographic region and a governmental unit: an EU parliament constituency.)
        • Locality - can be problematic for large cities, e.g. London: Greater London occupies the status of a county containing the City of Westminster, but the wikipedia:City of London proper does not, and is actually a fairly small area that comprises the historic core of London. Russia has the concept of federal cities, so for example St Petersburg and Moscow are technically subdivision1s, not localities.
          • Street address - rather than encumbering Forms with all variations of location types, it was decided to simply overload street address so that it could be used as a catchall for locations smaller than a city and for landmarks, such as city squares, neighborhoods, mountains, lakes, and buildings (Empire State Building, St. Mark's Cathedral). However, no matter how well documented, it is unintuitive that Mt. Everest would be a street address.

There is no point in making the user jump through hoops to specify locations if it delivers them no practical benefit (oops, but there is a practical benefit: reporting of locations in tables, see #report3 below). I am just wondering if there aren't other ways to deliver these practical benefits. After all, we are asking them to use disambiguated place names. Why should the user have to know what county a particular village in England is located in? The main purpose of these hierarchies is search scoping, e.g. show me all the births for that period in a wider area. These searches may be actual queries, but more likely would be categorizations like [[Concept:Births in Texas in the 1840s]].

Maybe SMW efficiency issues would still force us to encode this Is-part-of hierarchy persistently in the record for each person. But can't we generate these values via a bot run? As an alternative, why not just ask the user to specify place1, place2, place3 and so on, so that all they need concern themselves with is putting a Wikipedia name for the narrowest place they know. Failing that, they specify the names in order of containment, just like the order of place names on an envelope. It is certainly more intuitive. I just wonder if it could be pulled off in the forms and queries so that this looseness would not come back to haunt us.
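
A rough sketch of what such a bot run might compute, assuming the wiki can supply an is-part-of lookup. The `PART_OF` table and the `containment_chain` function are hypothetical names made up for illustration, not anything that exists in SMW:

```python
# Hypothetical is-part-of data. In practice a bot might read this from
# wiki properties; the names and structure here are assumptions.
PART_OF = {
    "Carrollton, Illinois": "Greene County, Illinois",
    "Greene County, Illinois": "Illinois",
    "Illinois": "United States",
}


def containment_chain(place, part_of=PART_OF):
    """Return [place, parent, grandparent, ...] by following is-part-of links."""
    chain = [place]
    while chain[-1] in part_of:
        parent = part_of[chain[-1]]
        if parent in chain:  # guard against cyclic data
            break
        chain.append(parent)
    return chain


print(containment_chain("Carrollton, Illinois"))
```

With something like this, the contributor only supplies the narrowest place, and the wider places come out in envelope order.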

Report 2

So far, I don't see why we need to offer contributors the opportunity to make errors on place name classifications.

I am going through the makeshift location trees I was creating from wp infoboxes, e.g. those that generate props like Property:locality of county. This is good stuff for disambiguating place names, but should the user have to know this stuff? I still don't see the compelling need for them to grasp this complexity. There are some small advantages. For instance, on a form, if we know the field is a county, we can offer only county names for autocompletion. Ambiguous names can overlap, e.g. City of London (locality) and Greater London (county), which is probably a bad example because neither would show up in the word wheel.

Report 3

Stop everything. How do tables work? Say I do a query on people born with a certain last name. I want to display locations. Now if I have fields like birth place1, birth place2, and so on, then place2 might be a county in one case and a city in another. So until I have a bot that magically identifies the correct type of place (subdiv to locality), I can't sort that table or even ask the reader to expect useful information in that column, because the same location, like Greene County, Illinois, is not necessarily stored in the same slot.
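
To make the problem concrete, here is a small sketch with two made-up person records that use loose positional slots. The records and the `slots_holding` helper are hypothetical, for illustration only:

```python
# Two hypothetical person records using loose positional slots.
rows = [
    {
        "name": "Person A",
        "birth place1": "Carrollton",
        "birth place2": "Greene County, Illinois",
    },
    {
        "name": "Person B",
        "birth place1": "Greene County, Illinois",
        "birth place2": "Illinois",
    },
]


def slots_holding(rows, place):
    """Return the set of positional slots in which a given place name appears."""
    return {
        slot
        for row in rows
        for slot, value in row.items()
        if slot.startswith("birth place") and value == place
    }


# The same county lands in different columns, so no single column sorts on it.
print(sorted(slots_holding(rows, "Greene County, Illinois")))
```

Sorting the table on either column would mix counties with cities or states.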

So that is the practical benefit of asking the user to know which classification a place name falls into. Certainly, if we tell folks to give the most specific place name that applies and that also has a Wikipedia article, then we are home free and can derive all the other names in their proper slots.
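
A minimal sketch of that derivation, assuming we have both an is-part-of lookup and a known type for each place. `PART_OF`, `PLACE_TYPE`, and `typed_slots` are hypothetical names for illustration; the real data would have to come from the wiki:

```python
# Hypothetical lookups; real values would be harvested from wiki properties.
PART_OF = {
    "Carrollton, Illinois": "Greene County, Illinois",
    "Greene County, Illinois": "Illinois",
    "Illinois": "United States",
}
PLACE_TYPE = {
    "Carrollton, Illinois": "locality",
    "Greene County, Illinois": "county",
    "Illinois": "subdivision1",
    "United States": "country",
}


def typed_slots(place, part_of=PART_OF, place_type=PLACE_TYPE):
    """Walk up from the most specific place and file each name under its type."""
    slots = {}
    seen = set()
    current = place
    while current is not None and current not in seen:
        seen.add(current)  # guard against cyclic data
        kind = place_type.get(current)
        if kind is not None and kind not in slots:
            slots[kind] = current
        current = part_of.get(current)
    return slots


print(typed_slots("Carrollton, Illinois"))
```

Because every record then stores the county under the same key, a table can sort or group on that column regardless of how specific the contributor's original place was.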

Report 4

I really would like the user to be able to specify exceptions, but the problem is how to gracefully indicate the hierarchy in the UI. For example, Regions of England really should be superior to county level, but subordinate to the subdivision1 value "England". Introducing the language of subdiv2, 3 etc. might cover that, but it is a special case. What about landmarks like a lake or mountain? Maybe we just create an "Other locations:" list and the user can just shove all the stuff they don't know how to classify in there. This field would be fair game for bots to come along and upgrade into future fields if we develop them, such as landmark, region, subdiv2 and so on.
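
One way a bot might do that upgrade, sketched under the assumption that each record is a simple mapping with an `other_locations` list and that the bot has a lookup from place name to the new field. Both the record layout and the `upgrade_other_locations` function are hypothetical:

```python
def upgrade_other_locations(record, field_for_place):
    """Move entries out of 'other_locations' into typed fields when the type is known.

    Returns a new record; entries with no known field, or whose field is
    already filled, stay in 'other_locations'.
    """
    upgraded = dict(record)
    remaining = []
    for place in record.get("other_locations", []):
        field = field_for_place.get(place)
        if field is not None and field not in upgraded:
            upgraded[field] = place
        else:
            remaining.append(place)
    upgraded["other_locations"] = remaining
    return upgraded


record = {
    "country": "Nepal",
    "other_locations": ["Mount Everest", "Somewhere unclassified"],
}
print(upgrade_other_locations(record, {"Mount Everest": "landmark"}))
```

Anything the bot cannot classify simply stays in the list, so nothing the contributor entered is lost.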

~ Phlox 19:57, 8 July 2009 (UTC)