Forums: Index > Watercooler > Value of page name standards

(Material from 2007 moved from another forum where it didn't quite fit. — Robin Patterson (Talk) 04:38, June 22, 2010 (UTC) )

Okay, so I love the redesign on the main portal! :D I like the clear help information in the beginning, that should help a lot on getting people started.

However, I am still seeing a flaw in not alerting people there is a naming convention in place... I know it sounds a little abrupt, but I really think there should be something really early on about naming conventions. Like:

Welcome to the Genealogy Wiki! This is a place where we all get together and work on our common Genealogy. Before adding your trees, please note that there are naming conventions in place here to prevent duplication of work.

The only reason I say that is that just one link could make the difference between my tree linking into Joe Smith's tree and, before we know it, our having 10 records that are all almost duplicates and that will have to be excised and deleted.

I know we have been talking about people checking every possible iteration of a name before adding an ancestor, and with trees of 15 or so people, that might be feasible... but I am afraid that most of us have trees in excess of 1500 records... and you can't check every single one of them... it already takes me 10 minutes a record to put them in (and that's assuming they have no strange information). I'm looking at putting in 250 hours just to get up to speed...

Now, before I continue on this thread: what is the real threat to the Genealogy database? I mean, am I just being oversensitive? What happens if we have two Joe Smiths, one called Joe Smith (1800-1850) and the other called Joe Smith (1800 1850)... with all the associated ancestors following the same path (so there are two sets of 50 or so records, one set with a "-" and the other without)? Is this a real problem? Or am I simply being too sensitive to this whole thing? Aabh 05:55, 17 July 2007 (UTC)

Nice set of observations. Re: Naming Conventions---we can add something to the main portal, or elsewhere. One of the main problems we have is knowing and understanding the specific things that are needful. It's not always obvious where people will go astray, and where a helpful pointer might be placed that will save everybody a hassle. So I appreciate your pointing this out.

Re: Duplicate articles---The key point is that duplicate articles dealing with the same person can hide out under slight variations of the article title. Some of those variations occur because people don't always follow the conventions (e.g., including spaces around the "-" in the dates). Sometimes they occur because people have a different opinion about the DOB or DOD---perhaps someone will give the dates as "(1754-1821)", but someone else will give them as "(c1754-aft1821)". Or perhaps someone will want to insert a byname, e.g., "John Walker (c1735-c1817) aka Indian Killer". (When I started working the wiki I liked to use these bynames---mostly because I was dealing with a number of folks with similar names, who were often being confused. Had I to do it over again, I'd not go this route. The reasons being that it a) makes for a long title, especially when you add subpages, and b) makes it hard for searches to find.)
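To make the title variations above concrete, here is a minimal sketch of how titles could be reduced to a common key so that near-duplicates group together. It is not an existing wiki tool: the helper names `normalize_title` and `find_near_duplicates` are hypothetical, and it assumes titles follow the "Name (dates)" pattern with four-digit years. Qualifiers such as "c" and "aft" are deliberately ignored, so a match only marks a pair as worth a human look, not as a confirmed duplicate.

```python
import re
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Reduce a person-page title to a comparison key (hypothetical helper)."""
    # Capture the name and the first parenthesised date range; anything
    # after the closing parenthesis (such as a byname) is ignored.
    match = re.match(r"^\s*(.*?)\s*\(([^)]*)\)", title)
    if not match:
        # No date range found: fall back to the lowercased, space-collapsed title.
        return " ".join(title.lower().split())
    name, dates = match.groups()
    # Assumption: years are four digits; "c", "aft", spaces and "-" are dropped.
    years = re.findall(r"\d{4}", dates)
    name_key = " ".join(name.lower().split())
    return f"{name_key} ({'-'.join(years)})"


def find_near_duplicates(titles):
    """Group titles that share a normalized key; return groups of two or more."""
    groups = defaultdict(list)
    for title in titles:
        groups[normalize_title(title)].append(title)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    sample = [
        "Joe Smith (1800-1850)",
        "Joe Smith (1800 1850)",
        "John Walker (c1735-c1817) aka Indian Killer",
        "John Walker (1735-1817)",
        "Mary Jones (1754-1821)",
    ]
    for key, found in find_near_duplicates(sample).items():
        print(key, "<-", found)
```

Because differing opinions about dates (say, 1754 versus 1755) would still produce different keys, a sketch like this catches only formatting variations, not genuine disagreements about the data.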

Your question is "Is this really a problem, or am I being overly sensitive?". It depends on the perspective. Some folks REALLY want only a single article for each person---a la Wikipedia protocol. I don't think that's a problem myself, and the Wikipedia protocol doesn't exactly apply here anyway. I believe it unrealistic to think that everyone is going to agree on basic person data. Different people will show different mothers and fathers, not to mention spouses, children, DOBs etc. Just check Ancestry's database---you'll find no total and complete agreement on almost anyone. Yes, there probably is only one "right answer"---but which one is it? Yes, there are ways to prove which one is "right", but there just are not a heck of a lot of folks following the professional standards that allow such discrimination. So, until we have universal acceptance of something like the Standards of Proof, we are going to have some duplication of articles.

Currently, that's not much of a problem, but that's because relatively few people are using the site. Once that changes, we can expect some conflicts between articles, and I imagine that there will be some duplication---some of which will be incidental ("I didn't realize there was an existing article about a particular person") and some deliberate ("I know there's an existing article about this person, but I don't agree with it").

I imagine that eventually we will have a bot that will seek out such duplications, and disambiguate them, or perhaps flag them to the authors to think about merging them. Many problems of this sort can be resolved with a proper discussion of alternative interpretations. If someone were indeed following the "Standards of Proof", this would be automatic---one of the standards of proof is that alternative viewpoints be interpreted, and an explanation provided as to the thinking for accepting one perspective over another---but again, most folks aren't really into that kind of precision. (I might add that some (many?) view ANY disagreement with their personal views as personal attacks. In other cases people have minds like a steel trap---once an idea gets in there, it will never again see the light of day. In those circumstances, finding a meeting of the minds is not usually possible. "Hardover" positions cannot usually be modified.)

So, no, I personally don't think you are being overly sensitive. There is a potential problem here, and it's something that has been discussed at some length, offline, by various parties. On the other hand, it's a problem that has solutions. Not sure what those solutions are at the moment, but when the problem comes up, I'm sure we will work something out. In the meantime, I wouldn't worry too much about creating duplicate articles. If doing a search to be sure that a given title does not already exist is becoming onerous, perhaps the best choice would be to let that go. After all, if you DO hit on an already existing article, and duplicate (exactly) an existing title, the system is going to immediately flag it for you anyway. (It won't let you create an exact duplicate.) Of course, you may be creating a duplicate under a slight variation in name. All the more reason to follow the naming conventions, as they minimize that potential. Bill 12:08, 17 July 2007 (UTC)

As is common, I disagree with nothing Bill has said above.
Since Aabh wrote first above, I have emphasised at least one occurrence of the idea that people should try to follow the naming conventions. Doubtless one of us three will add another in a prominent place - "early" (whatever that means - people can read wikis in any order they choose!).
We have visual checks on duplications: all of them apart from those mentioned above are categories. If you put "your" people in the standard recommended categories for surname, birth year, and death year, then find them in those categories (mostly not too enormous yet), any near-duplicate is likely to appear fairly close to your person in at least one of those lists.
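The category check described above can also be sketched in code, for anyone who would rather not scan the lists by eye. This is only an illustration under stated assumptions: the function name `candidate_pairs` is hypothetical, the category labels in the usage example are invented placeholders rather than the wiki's actual category names, and the page data is assumed to be available as a plain mapping from title to categories.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(pages, min_shared=2):
    """Return pairs of page titles that share at least `min_shared` categories.

    `pages` maps each page title to the set of category labels on that page
    (hypothetical input format, not a real wiki API).
    """
    # Invert the mapping: category label -> titles filed under it.
    by_category = defaultdict(set)
    for title, categories in pages.items():
        for category in categories:
            by_category[category].add(title)

    # Count how many categories each pair of titles has in common.
    shared = defaultdict(int)
    for members in by_category.values():
        for first, second in combinations(sorted(members), 2):
            shared[(first, second)] += 1

    return sorted(pair for pair, count in shared.items() if count >= min_shared)


if __name__ == "__main__":
    # Placeholder category labels for surname, birth year, and death year.
    pages = {
        "Joe Smith (1800-1850)": {"surname:Smith", "born:1800", "died:1850"},
        "Joe Smith (1800 1850)": {"surname:Smith", "born:1800", "died:1850"},
        "Ann Smith (1802-1850)": {"surname:Smith", "born:1802", "died:1850"},
    }
    for pair in candidate_pairs(pages, min_shared=3):
        print(pair)
```

Requiring all three categories to match keeps the list short. Lowering `min_shared` to 2 would also surface people who merely share a surname and one date, which may or may not be useful depending on how large the categories have become.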
Can you handle Java? If so, please see whether our GEDCOM page is any help: it links to a page with a name like "Loading Gedcoms", which explains how most of our first 3,000 articles were added, several per minute.
(Long past my proper bedtime again.) Robin Patterson 15:55, 17 July 2007 (UTC)

See Forum:Concepts for another way to notice duplicates. — Robin Patterson (Talk) 04:38, June 22, 2010 (UTC)