Whoever makes the initial proposal: please sign it. Discuss proposals on the talk page. This page should give a concise description of what each operation does, not rationale or evaluative statements.

Proposals with sufficient interest or value for Genealogy Wikia then move to Current Tasks for coding and implementation.

Genealogy spider

Automatically scan the patent database, the NPS Civil War site, and FamilySearch for hits on people in our database. Post a suggestion on the matching page containing a package of the mined info. If a human confirms the data, it is added to the main templates. (Can that be done from the template code, or does the bot have to make a second pass? I think the bot, because something has to edit the data on the page.)

Picture upload page

Genealogy:Image transfer requests is a page where users add links to individual images, or categories of images, needed on Genealogy. The bot moves the images over unless objections are voiced on the talk page.

Proposal by ~ Phlox 19:31, 29 October 2007 (UTC)

Spam Patrol

A bot monitors the recent changes list 24/7. When it detects suspicious activity (e.g. links to spam sites, racial epithets, other vandalism, or prohibited terms), the bot could do any of the following:

  • alert any admin currently logged in with a message (talk or IRC)
  • politely inquire on the user's talk page about their activities (the subtext being that the activity has been observed, so don't waste your time here if you intend to do harm)
  • take action if no admin is logged on and vandalism is clearly occurring (e.g. an anonymous user blanking articles), for example by blocking them for a few hours
Proposal by ~ Phlox 19:31, 29 October 2007 (UTC)
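
The escalation options listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Change` record, the `SPAM_LINK` pattern, and the action names are all hypothetical, and a real bot would read the recent changes feed and a maintained blacklist rather than the toy data used here.

```python
import re
from dataclasses import dataclass

# Hypothetical spam pattern; a real bot would load a maintained blacklist.
SPAM_LINK = re.compile(r"https?://\S*(casino|pills)\S*", re.IGNORECASE)


@dataclass
class Change:
    """One entry from recent changes (fields are assumptions for this sketch)."""
    user: str
    is_anonymous: bool
    old_text: str
    new_text: str


def choose_action(change, admin_online):
    """Pick one of the proposed responses for a single edit."""
    # Blanking: the page had content before and has none now.
    blanked = bool(change.old_text.strip()) and not change.new_text.strip()
    # Spam: a blacklisted link appears that was not there before.
    added_spam = bool(SPAM_LINK.search(change.new_text)) and not SPAM_LINK.search(change.old_text)

    if not (blanked or added_spam):
        return "ignore"
    if admin_online:
        return "alert_admin"          # let a human decide
    if blanked and change.is_anonymous:
        return "block_temporarily"    # clear-cut case, no admin around
    return "ask_on_talk_page"         # polite inquiry
```

The bot only blocks on its own in the narrow case the proposal names (an anonymous user blanking an article while no admin is logged on); everything else is handed to a human or handled with a talk page note.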

Additional Time structure

In performing the move on Birth and Death years, I noticed that WP has a lot more useful time structure that we need. E.g. for year 1910 births, I see additional material we could have:

I propose I move the above stuff right away. (Edit: Done 31/11/2007)

Other material we could consider later would help identify the period of photos and give signs of the times, so people get a feel for what the period was like for their ancestors:

Proposer: ~ Phlox 20:22, 29 October 2007 (UTC)

Auto recognition of Place

Placenames are hierarchical sets, but the names of the sets vary from country to country.

Info pages will have the generic name as a field.

  • Country
    • Subdiv1 (First-level administrative country subdivision) eg State, Department (fr), Province (CAN), Bundesland (de), oblast "область" (ru), prefecture (
      • Subdiv2 (akin to counties/ boroughs/ districts "Kreis")
        • Settlement (village, town, city)

Place encodings in Infobox person fields are often copy-pasted from sources such as FamilySearch or genealogy programs, where the encoding may look like this:


Which, interpreted for the USA, means King County, Washington state.

However, often the country is not indicated, region names are used ("Normandy"), or reference is made to historic entities that no longer exist (Yugoslavia). Also, since much information comes from censuses, the units may be census subdivisions that have only a sporadic relation to governmental units.

Automatic scanning of placenames will likely be error-prone due to these factors. What the bot can do is the following:

  • Top-level subdivision names for major western countries are hard-coded into the bot.
    • (all fifty U.S. states, all German federal states, all departments of France (number to be confirmed), and so on)
  • States/provinces/oblasts go into the subdiv1 field.
  • A match on one of these names indicates the country, so that value will be set if not specified.
  • If two comma-delimited names precede this entity, then the first is assumed to be the settlement, and the second is assumed to be Subdiv2 (county, Kreis, district, borough).
  • If a single field precedes the recognized top-level subdivision, then it is assumed to be a settlement.
  • The "Settlement" field may be renamed to aliases, as supported in the various language and regional Infobox person templates, e.g. Parish, town, village.
  • If the settlement or ZIP/postcode is known, the latitude and longitude will be looked up using that new standard for Lat/Long that Google Maps uses.
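
The comma-delimited heuristics above can be sketched as below. This is a minimal illustration, not the bot's actual code: `SUBDIV1_TO_COUNTRY` is a tiny hypothetical subset of the hard-coded lists, `parse_place` is an assumed name, and treating a name that follows the subdivision as an explicit country is an assumption added for the sketch.

```python
# Illustrative subset only; the proposal calls for complete hard-coded lists.
SUBDIV1_TO_COUNTRY = {
    "Washington": "USA",
    "Ohio": "USA",
    "Bayern": "Germany",
    "Ontario": "Canada",
}


def parse_place(text):
    """Split a comma-delimited place string into the generic hierarchy fields."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    result = {"settlement": None, "subdiv2": None, "subdiv1": None, "country": None}

    for i, part in enumerate(parts):
        if part not in SUBDIV1_TO_COUNTRY:
            continue
        result["subdiv1"] = part
        # Assumption: a name after the subdivision is an explicit country;
        # otherwise the country is implied by the subdivision match.
        result["country"] = parts[i + 1] if i + 1 < len(parts) else SUBDIV1_TO_COUNTRY[part]

        before = parts[:i]
        if len(before) >= 2:
            # Two names precede: settlement, then Subdiv2 (county, Kreis, ...).
            # With more than two, only the nearest two are used (an assumption).
            result["settlement"], result["subdiv2"] = before[-2], before[-1]
        elif len(before) == 1:
            # A single preceding name is assumed to be a settlement.
            result["settlement"] = before[0]
        break

    return result
```

Note that under the single-field rule an input such as "King County, Washington" would put "King County" in the settlement field, which is one concrete example of why the scan is expected to be error-prone and needs human review.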

Gedcom bot

This was discussed in Forum:Gedcom bot and should be noted as part of the list. ~ Phlox 05:36, 16 November 2007 (UTC)

Fetch WP aliases for article names

  1. walk all enwp and usedwp
  2. for each wp article, walk all inbound references
  3. for each redirect article, copy to genealogy
~ Phlox 16:12, 16 November 2007 (UTC)
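
A rough sketch of steps 2 and 3, assuming the inbound references have already been collected into memory. The `inbound_links` structure and the function name `redirects_to_copy` are hypothetical; fetching the data from WP and writing pages to Genealogy are left out.

```python
def redirects_to_copy(inbound_links):
    """Return {alias title: redirect wikitext} for every inbound redirect.

    inbound_links maps an article title to a list of
    (source_title, is_redirect) pairs describing pages that link to it.
    """
    pages = {}
    for article, sources in inbound_links.items():
        for source_title, is_redirect in sources:
            if is_redirect:
                # Standard MediaWiki redirect syntax pointing at the article.
                pages[source_title] = "#REDIRECT [[%s]]" % article
    return pages
```

Ordinary inbound links are skipped, so only alias titles would be created on Genealogy.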

Remove one word from county talk forum intro

"To start an article" not "To to start an article". I've done one manually. (The whole idea looks great!!) Robin Patterson 01:04, 18 November 2007 (UTC)

Done ~ Phlox 08:19, 19 November 2007 (UTC)

History articles

cemeteries by state

WP has a category wikipedia:Category:Cemeteries in the United States

Proposal: scan the articles for county, store in Countyname County of Statename/cemeteries.
  • cat for state and county.
  • Move and put county navbox at top.
~ Phlox 21:00, 20 November 2007 (UTC)
I don't like the subpage idea - though I concede that what you say above leaves quite a bit of scope in the "Move" instruction and may mean that the subpage is only an intermediate step for the bot - and I doubt if we need it unless there's a good bot reason. We are away ahead of WP in some aspects of cemeteries and have established the prosaic natural form Cemeteries in Greene County, Ohio. Not many of them yet, but they can fit in perfectly with the new standard county header except for the subpage format. Here's the relevant listing from project:Cemeteries:
You will see that most or all of that was written before your bot got serious. An expression such as "unless there are over 100 articles of all sorts, in which case the larger counties should have their own cemetery-specific category" predates the idea that if we want a standard sort of article for several U.S counties we are better to get a bot to create all 3100+ instead of creating one or two as we feel like it.
Here are the current categories in that form, copied straight from the namespace listing (editing out most of the list that's not "Cemeteries in ... County..."):
Cemeteries in Atchison County, Missouri
Cemeteries in Blair County, Pennsylvania
Cemeteries in Buchanan County, Missouri
Cemeteries in Dallas County, Alabama
Cemeteries in Fremont County, Iowa
Cemeteries in Georgia
Cemeteries in Georgia (U.S. state)
Cemeteries in Maricopa County, Arizona
Cemeteries in Norfolk
Cemeteries in Norfolk, England
Cemeteries in Norfolk County, Ontario
It is generally good to give such a category a main article with the same pagename (for maximum simplicity using Template:Catmore), though I accept that it's not essential. Someone (presumably you) has already moved the only such page so that it is in the subpage form: - are you very keen to see them all like that, or may we have plain English as in the original name? The categories are so few that a manual change (if they needed changing) would be quicker than instructing a bot; but you have to convince this closet radical that the bot really would prefer a subpage instead of a plain English page name for each article.
Robin Patterson 13:02, 23 March 2008 (UTC)

submodule Map of places to county/state/country articles.

  • Given a scanned city name, including aliases like NYC, return a county/ state/ country.
  • Given a county name, return the same.


  • Implemented as a global table loaded at startup time. Such a table still needs to be produced.


  • Scan articles for coord template, and insert a googlemap for that location.
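
The place lookup described above could be sketched like this. The table contents, the `ALIASES` mapping, and the function name `lookup_place` are all hypothetical; the real table would be generated and loaded at startup.

```python
# Hypothetical global table: normalized place name -> (county, state, country).
# None marks a level that cannot be filled in (NYC spans several counties).
PLACE_TABLE = {
    "seattle": ("King County", "Washington", "USA"),
    "new york city": (None, "New York", "USA"),
    # County names repeat across states, so real keys would need qualifying.
    "king county": ("King County", "Washington", "USA"),
}

# Hypothetical alias table for scanned short forms.
ALIASES = {"nyc": "new york city"}


def lookup_place(name):
    """Return (county, state, country) for a city or county name, or None."""
    key = name.strip().lower()
    key = ALIASES.get(key, key)  # resolve aliases such as "NYC"
    return PLACE_TABLE.get(key)
```

Returning `None` for unknown names lets the caller leave the article untouched instead of guessing.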

Create info pages

  • For articles listed in Special:Ancientpages, upgrade to an info article.
  • For all others, simply extract what you can and place it in an info page.


  • Info Person- a Python class.
    • GetFromGedcom - produce a filled-in info struct given a GEDCOM file name. The file may be:
      • Stored locally
        • Load: pass the filename to the database object, get the database back, then query it to extract the object.
      • On an Image: page
      • At an http address
    • GetFromInfoPage- returns the current info page.
    • PutInfoPage - writes the current info page, with options overwrite, updateIfEmpty*, updateIfHigher confidence*. Items marked * are not planned for now.
    • GetFromWpArticle
    • GetFromText - given text passed to the routine, extract what you can from it
    • GetFromArticle - attempt to extract info from Genealogy article.
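
A minimal skeleton of such a class, covering only GetFromText, GetFromInfoPage, and PutInfoPage with the overwrite option. Everything here is an assumption for illustration: the snake_case method names, the two year fields, the crude regular expressions, and the plain dict `store` standing in for the wiki.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InfoPerson:
    """Sketch of the Info Person struct; the real field list is not decided here."""
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None

    @classmethod
    def get_from_text(cls, name, text):
        """GetFromText: extract what we can from free text (very rough heuristics)."""
        # First four-digit number in the same sentence after "born" / "died".
        born = re.search(r"\bborn\b[^.]*?(\d{4})", text, re.IGNORECASE)
        died = re.search(r"\bdied\b[^.]*?(\d{4})", text, re.IGNORECASE)
        return cls(
            name,
            int(born.group(1)) if born else None,
            int(died.group(1)) if died else None,
        )

    @classmethod
    def get_from_info_page(cls, store, name):
        """GetFromInfoPage: return the stored info page, or None if absent."""
        return store.get(name)

    def put_info_page(self, store, overwrite=False):
        """PutInfoPage: write this info page; refuse to clobber unless overwrite."""
        if self.name in store and not overwrite:
            return False
        store[self.name] = self
        return True
```

The GEDCOM, WP article, and Genealogy article loaders would follow the same pattern as `get_from_text`: class methods that return a filled-in instance.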

extract from census page tabular data pages