Familypedia
[Illustration: 025 AWB illustrations for AWB manual.png]
Shortcuts: awb, AWB

AutoWikiaBrowser is our name for AutoWikiBrowser as adapted for Wikia.

Overview

What does it do? AWB automates edits of Wikia articles. It can be run in fully automated mode, but it is designed for assisted editing: it removes the drudgery of routine tasks such as spell checking and standard modifications related to formatting and categorization conventions. After the edits are made, the user may make additional corrections before saving.

What's the basic concept? In AWB, you make a "list" of articles you want to edit. Even if you want to edit only one, you still make a list. You can make lists from text files, categories, and many other sources. Then you tell it to start, and it will load each article, perform the requested operations, then wait for you to save the article before proceeding. Alternatively, you may request that the saves be done automatically. Operation will continue until all articles in the list have been processed.
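
The list-driven workflow described above can be sketched in a few lines of Python. This is only an illustration of the concept, not AWB's actual code: the function `run_assisted` and its `load`, `transform`, `confirm` and `save` parameters are hypothetical names, and skipping pages that come out unchanged is an assumption of the sketch.

```python
from typing import Callable, Iterable, List


def run_assisted(
    titles: Iterable[str],
    load: Callable[[str], str],
    transform: Callable[[str], str],
    confirm: Callable[[str, str, str], bool],
    save: Callable[[str, str], None],
) -> List[str]:
    """Walk the list in order and return the titles that were saved."""
    saved = []
    for title in titles:
        original = load(title)
        proposed = transform(original)
        if proposed == original:
            continue  # assumption: pages with no change are skipped
        # Assisted mode asks the user; automatic mode would always say yes.
        if confirm(title, original, proposed):
            save(title, proposed)
            saved.append(title)
    return saved


# In-memory stand-in for the wiki, for illustration only.
pages = {"Page A": "teh cat", "Page B": "no typo here"}
saved = run_assisted(
    titles=["Page A", "Page B"],
    load=lambda title: pages[title],
    transform=lambda text: text.replace("teh", "the"),
    confirm=lambda title, old, new: True,
    save=lambda title, text: pages.update({title: text}),
)
```

Passing a `confirm` function that always returns True corresponds to automatic saving; one that asks the user corresponds to assisted editing.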

Setup and first run

If you run Windows 2000 or later, on a Mac or PC, download AWB from SourceForge.

  • If you are not an admin, request AWB approval for your account at c:Wikia talk:AutoWikiBrowser/CheckPage. If you have trouble getting approval, contact a Familypedia administrator.
  • There is no installation program. Just unzip the files to a convenient location such as c:\program files\AWB
  • To start, click on the AutoWikiBrowser.exe file, or place a shortcut to it on your desktop.

Configure for Wikia

  • Menu File.Profiles...: press the Add button and enter just your username for now. Click Add to save, then close the box.
  • Menu Options.Preferences…: click the "Site" tab, choose wikia in the first pulldown, then type the prefix genealogy in the next box. Click the OK button to close this box.
  • Menu item File.Log in: the upper window should take you to the genealogy login window. Type your username and password to log in.

For a first run, try this: only one article will be affected, and only if you press the Save button. Follow these steps:

  • In the lower left corner, there is a box entitled “make a list”. This is called the "Make list" box (see illustration above). Enter the name of an article in the box just to the left of the + button. Hit the plus button.
  • Just to the right of the "make a list" box is a set of tabs. Click the Options tab. Uncheck all the items. Under "Find and replace", tick the "Enabled" checkbox. Click "Normal settings".
  • A find and replace dialog will appear. In the find column type some word or phrase to search for. In the "replaces with" column type the text to replace with. Click Done.
  • Click the Start tab. Click Start. The article wikitext will appear in the upper screen. If your "find" expression is found, the upper left-hand pane will show the original text, and the right will show the same passage with the replacement text highlighted. If you don't like one of the highlighted changes, click it and it will not be accepted. In the lower right corner, there is an edit box. This wikitext has the replacement, and you may make further edits if you wish. Otherwise, click the green Save button under the Start tab to keep the change, or Ignore to discard the change. There may be a substantial wait while the article is saved.

Manual: See http://www.green.org/index.php?option=com_awiki&view=mediawiki&article=Wikipedia:AWB/UM

Getting help: There is a substantial help community at green.org (see the bulletin board). Please exhaust other ways of figuring out your problem before contacting Phlox. This is still in the experimental stage, so there are liable to be lots of serious deficiencies.

Gotchas

  • You will lose all your search and replace strings as well as any of your other hard work configuring an assisted edit run if you forget to "save settings". If you save as default, it will automagically reload. Whatever loaded list you have is also saved, so this is a handy way of saving a session so that it may be resumed later.
  • During times of high server loading, AWB will sometimes fail to detect that it has successfully saved changes. It will then retry the save. You will notice this in history for an article where there are multiple saves each within seconds of the previous. The solution is to use tab option Skip. Click radio button "contains" and specify some search string that would be true after you had made an edit. Consider making a trivial change such as adding a <!-- comment --> to indicate that your bot had already processed the article.
  • In your preferences, on the Edit tab, be sure that "Enable Rich Text Editing" (near the top) is unchecked and "Disable Category Tagging" (near the bottom) is checked. If you do not, you will not be able to match category wikitext.
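
The skip-marker idea from the second gotcha above can be sketched as follows. This is a minimal Python illustration rather than anything AWB itself runs: the marker text and the function names are hypothetical, and the point is only that an edit which leaves a recognisable trace can be detected and skipped on a retry.

```python
# Hypothetical marker; any string that is only present after your edit will do.
MARKER = "<!-- processed by bot -->"


def needs_processing(wikitext: str) -> bool:
    """Mirror of the Skip tab's 'contains' test: skip pages that carry the marker."""
    return MARKER not in wikitext


def process(wikitext: str) -> str:
    """Append the marker once; a second call leaves the page untouched."""
    if not needs_processing(wikitext):
        return wikitext
    return wikitext.rstrip("\n") + "\n" + MARKER + "\n"


once = process("Some article text\n")
twice = process(once)
```

Here `twice` is identical to `once`, which is the property that stops a retried save from producing a second edit.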

Simple example

AWB does not require sophisticated programming to effect powerful changes at Familypedia.

Example: preventing damage to info pages

It is clear that categories on info pages are not being handled correctly by the new WYSIWYG editor. Pages have been damaged unknowingly by users who simply save an info page with or without edits. One workaround is to convert [[Category:Info pages]] to {{Category-Info pages}}. Simple search and replace for all info pages would prevent further damage. Steps:
  1. Options tab: click enabled checkbox under the "find and replace" heading.
  2. Click Normal settings
  3. Under the find column, put [[category:info pages]] in the first box.
  4. Under the "replace with" column, put {{Category-Info pages}}
  5. Click done. Now we can immediately start running this search and replace on articles.
  6. Type an info page name in the box labeled "Make list" in the lower left corner, next to the plus button, and hit the plus button
  7. Start tab: click the start button. You will then see the article displayed, with the Category-Info pages template replacement made.
  8. Click save to save the article.
  9. Return to step #6.
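
The replacement configured in the steps above amounts to a case-insensitive literal substitution. A minimal Python sketch, assuming case-insensitive matching (as the lowercase search string in step 3 suggests); the function name `convert_info_category` is hypothetical:

```python
import re

# Literal search string from step 3, matched without regard to case.
CATEGORY_LINK = re.compile(re.escape("[[category:info pages]]"), re.IGNORECASE)

# Replacement text from step 4.
TEMPLATE = "{{Category-Info pages}}"


def convert_info_category(wikitext: str) -> str:
    """Replace the plain category link with the template call."""
    return CATEGORY_LINK.sub(TEMPLATE, wikitext)


before = "Some info page text\n[[Category:Info pages]]\n"
after = convert_info_category(before)
```

Running the function a second time on `after` changes nothing, so pages that have already been converted are safe to reprocess.
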
Already-spoilt pages
Occasionally you will find one of the pages that CategorySelect has spoilt. It will have the category in the bottom line between "noinclude" tags. Simplest workaround for that is to delete the invisible "newline" at the end of the line above so that the opening "noinclude" tag moves up to sit in that line.

Getting large lists of pages - The entire list of info articles may be retrieved for you:

  1. In place of manually typing in an info page name, in step 6 above, you may retrieve the entire list of info pages
  2. In the same "Make list" box in the lower left of the AWB main screen, next to the Source: label, use the pulldown list to select "Category".
  3. Below this in the box now labeled Category, type "info pages"
  4. Press button: Make a list
  5. You may now begin your run. Under Start tab, click start.
  6. Press save if you like the change, for each info page.
  7. For automatic mode, under the Bots tab click Auto Save. Walk away for a few hours and a few thousand articles have been automatically updated. (The already spoilt pages mentioned above won't get the desired improvement, but at least they now won't get worse and can be easily fixed by hand when they come to light.)

You are an instant hero and French women will name their babies after you. A person with Swiss ancestry may send you a multi-tooled pocket-knife. Congratulations on your super hero status. Now let's go save the world.

Advanced AWB

Large numbers of changes

If you are making more than 30 pages of changes per day using AWB, please set up a separate account. The convention is to add "Bot" to your user name (e.g. PhloxBot) so that other users know who to contact about bots running amok. Then contact an admin and request the Bot flag for your account. Administrators: send a Special:Contact request stating the name of the bot account you have approved for bot status. Bot accounts can be blocked, so there is no reason not to approve these freely, except where the user has not yet proven their ability to handle a bot. We want to be able to remove the flooding effect of bots.

Advanced: Regular expressions

The real magic is performed with complex search and replace expressions that programmers refer to as "regular expressions" (regex). There is nothing particularly regular about them, but you can match very complicated patterns even if they have many variations. There are several regular expression resources on the net. This one is a good cheat sheet, and there are some applications that help construct and test expressions. AWB has a regex tester built in.

Examples
Match expression:
(?<parameter>(state|county|region|shire[_ ]county|metropolitan[_ ]county)\s*=\s*)({{[^}]*}} *)*(\[\[(Image|File)[^\]]*\]\] *)*({{wp\||\[\[)(?<location>[^|]*)\|??.*(}}|\]\])

Replace expression:
${parameter}${location}

This example looks for a parameter named state, county, region, shire county or metropolitan county, and converts any link found in its value into the plain-text name of the linked article. It also strips any template or image preceding the link. It demonstrates capturing text into named groups (parameter and location) and then using those groups in the output.
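
To experiment with the expression outside AWB, it can be ported to Python. AWB uses .NET regular expression syntax, so this sketch makes two mechanical changes: named groups are written `(?P<name>...)` instead of `(?<name>...)`, and the replacement uses `\g<name>` instead of `${name}`. Literal braces are also escaped for clarity. The function name `strip_location_links` and the sample inputs are made up for illustration.

```python
import re

# Python port of the match expression above (same structure, Python syntax).
LOCATION_LINK = re.compile(
    r"(?P<parameter>(state|county|region|shire[_ ]county|metropolitan[_ ]county)\s*=\s*)"
    r"(\{\{[^}]*\}\} *)*"                  # any templates before the link
    r"(\[\[(Image|File)[^\]]*\]\] *)*"     # any images before the link
    r"(\{\{wp\||\[\[)"                     # start of a {{wp|...}} or [[...]] link
    r"(?P<location>[^|]*)\|??.*"           # target name, then optional label text
    r"(\}\}|\]\])"                         # end of the link
)


def strip_location_links(wikitext: str) -> str:
    """Replace a linked location value with the plain article name."""
    return LOCATION_LINK.sub(r"\g<parameter>\g<location>", wikitext)


plain = strip_location_links("county = [[File:Flag.png]] [[Kent]]")
piped = strip_location_links("state = [[Ohio|Ohio, USA]]")
templ = strip_location_links("region = {{wp|Kent}}")
```

With these single-line inputs the three results are "county = Kent", "state = Ohio" and "region = Kent". Note that `[^|]*` also matches newlines, so on a multi-line template the match can run on into the following parameter; trying the expression in AWB's regex tester on real page text before an automatic run is a sensible precaution.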

Example scripts

AWB allows your conversion script to be saved as an XML file so that it can be shared with other users. If you have created a script that others might be able to benefit from, please share it here.

Large runs

AWB settings that might be helpful if a bot run is halting:

  • Preferences.Site tab: Wait 60 seconds before web control times out.
  • See the AWB docs for optimizing, such as turning off IE active scripts and the display of images, sounds and animations.
  • On the AWB bot tab, click resave "Nudge" if stuck, and skip page if first nudge does not work.


Administrators needing to halt a bot run: Use block to halt the run. If the contributor has a bot account, this will not interfere with their manual editing. Note that AWB will resume as soon as the block time expires, so set enough hours that the operator will notice the halt and fix it. Monitor their talk page if it is not a bot account.

If you will be making runs involving hundreds of articles in automatic mode, please create a bot account. Besides avoiding block situations, this avoids flooding recent changes and the IRC channel list with presumably trusted bot edits, which would make vandalism difficult to monitor. So that people can easily tell who owns the bot, create one whose name is a form of your own, for example PhloxBot, AMK152Bot, Rtolbot. Post your need for a bot account on the forum, and when the community has expressed support, a Helper can be contacted to assign a bot flag. This allows bots to be managed more effectively by Wikia staff.

Very large runs involving thousands of pages or imposing heavy loads on the server are best run at low-load times, between 12:00 AM and 3:00 AM PST. Longer lag times of 30-60 seconds can be used to lighten the burden. Runs involving more than 60K articles should be done by arrangement with Wikia staff so that they may be done on the server side. Use Special:Contact to reach staff for this.
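
As a rough planning aid, the figures above can be turned into simple arithmetic. The sketch below counts only the lag between saves and ignores page load and save time, so real runs will take longer; the function names are hypothetical.

```python
def run_hours(pages: int, delay_seconds: float) -> float:
    """Lower bound on run time in hours: the lag between saves only."""
    return pages * delay_seconds / 3600


def pages_in_window(window_hours: float, delay_seconds: float) -> int:
    """Upper bound on the number of pages that fit in a time window."""
    return int(window_hours * 3600 // delay_seconds)


hours_for_1000 = run_hours(1000, 30)   # 1000 pages at a 30-second lag
fits_at_30s = pages_in_window(3, 30)   # 12:00 AM to 3:00 AM window, 30-second lag
fits_at_60s = pages_in_window(3, 60)   # same window, 60-second lag
```

On these assumptions, 1000 pages at a 30-second lag need a little over 8 hours, and the three-hour low-load window holds at most 360 pages at 30 seconds or 180 pages at 60 seconds, so a run of thousands of pages will span several nights or need a shorter lag.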

Automated runs should be for non-controversial changes. If in doubt, make a posting on Watercooler explaining what the run does and revealing all implications that other contributors might care about. It is also helpful to provide information on the bot's User page to explain what the current run is doing and what the upcoming bot runs are. Bots may not be used to enforce a POV in an ongoing dispute. Anyone who feels this is happening (whether or not the bot is being run by a long-time contributor or administrator) should contact an administrator via Familypedia:Administrators' noticeboard or directly via a talk page.

Potential uses at Familypedia
