use MediaWiki::Bot;

my $bot = MediaWiki::Bot->new({
    assert     => 'bot',
    host       => 'de.wikimedia.org',
    login_data => { username => "Mike's bot account", password => "password" },
});

my $revid = $bot->get_last("User:Mike.lifeguard/sandbox", "Mike.lifeguard");
print "Reverting to $revid\n" if defined($revid);
$bot->revert('User:Mike.lifeguard', $revid, 'rvv');
my $bot = MediaWiki::Bot->new({
    host => 'en.wikipedia.org',
});
Calling "MediaWiki::Bot->new()" will create a new MediaWiki::Bot object. The only parameter is a hashref with keys:
Refer to <http://mediawiki.org/wiki/Extension:AssertEdit>.
In addition, it will be integrated into the default useragent (which may not be used if you set agent yourself). The message will tell you that $useragent is logged out, so use a descriptive one if you set it.
Please refer to the MediaWiki documentation prior to changing this from the default.
1 provides only error messages; 2 provides further detail on internal operations.
For example:
my $bot = MediaWiki::Bot->new({
    assert     => 'bot',
    protocol   => 'https',
    host       => 'secure.wikimedia.org',
    path       => 'wikipedia/meta/w',
    login_data => { username => "Mike's bot account", password => "password" },
});
For backward compatibility, you can specify up to three parameters:
my $bot = MediaWiki::Bot->new('My custom useragent string', $assert, $operator);
This form is deprecated, will never do auto-login or autoconfiguration, and emits deprecation warnings.
If you don't set a parameter, its previous value is used. If it has never been set, the default settings are 'http', 'en.wikipedia.org' and 'w'.
For example:
$bot->set_wiki({
    protocol => 'https',
    host     => 'secure.wikimedia.org',
    path     => 'wikipedia/meta/w',
});
For backward compatibility, you can specify up to two parameters:
$bot->set_wiki($host, $path);
This form is deprecated, and will emit deprecation warnings.
Logs the user $username in, optionally using $password. First, an attempt will be made to use cookies to log in. If this fails, an attempt will be made to use the password provided to log in, if any. If the login was successful, returns true; false otherwise.
$bot->login({
    username => $username,
    password => $password,
}) or die "Login failed";
Once logged in, attempt to do some simple auto-configuration. At present, this consists of:
You can skip this autoconfiguration by passing "autoconfig => 0"
For backward compatibility, you can call this as
$bot->login($username, $password);
This form is deprecated, and will emit deprecation warnings. It will never do autoconfiguration or SUL login.
Single User Login
On WMF wikis, "do_sul" specifies whether to log in on all projects. The default is false. But even when false, you still get a CentralAuth cookie for, and are thus logged in on, all languages of a given domain ("*.wikipedia.org", for example). When set, a login is done on each WMF domain so you are logged in on all ~800 content wikis. Since "*.wikimedia.org" is not possible, we explicitly include meta, commons, incubator, and wikispecies.
Basic authentication
If you need to supply basic auth credentials, pass a hashref of data as described by LWP::UserAgent:
$bot->login({
    username   => $username,
    password   => $password,
    basic_auth => {
        netloc => "private.wiki.com:80",
        realm  => "Authentication Realm",
        uname  => "Basic auth username",
        pass   => "password",
    },
}) or die "Couldn't log in";
$bot->logout();
The logout method logs the bot out of the wiki. This invalidates all login cookies.
my $text = $bot->get_text('My page');
$text .= "\n\n* More text\n";
$bot->edit({
    page    => 'My page',
    text    => $text,
    summary => 'Adding new content',
    section => 'new',
});
This method edits a wiki page, and takes a hashref of data with keys:
An MD5 hash is sent to guard against data corruption while in transit.
You can also call this as:
$bot->edit($page, $text, $summary, $is_minor, $assert, $markasbot);
This form is deprecated, and will emit deprecation warnings.
$bot->move($from_title, $to_title, $reason, $options_hashref);
This moves a wiki page.
If you wish to specify more options (like whether to suppress creation of a redirect), use $options_hashref, which has keys:
my @pages = ("Humor", "Rumor"); foreach my $page (@pages) { my $to = $page; $to =~ s/or$/our/; $bot->move($page, $to, "silly 'merricans"); }
my @hist = $bot->get_history($title, $limit, $revid, $direction);
Returns an array containing the history of the specified $page_title, with $limit number of revisions (default is as many as possible).
The array returned contains hashrefs with keys: revid, user, comment, minor, timestamp_date, and timestamp_time.
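For instance, a short sketch that prints the author and timestamp of the last five revisions of a page, using the keys listed above:

my @hist = $bot->get_history('Main Page', 5);
foreach my $rev (@hist) {
    print "$rev->{revid} by $rev->{user} at $rev->{timestamp_date} $rev->{timestamp_time}\n";
}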
A blank page will return wikitext of "" (which evaluates to false in Perl, but is defined); a nonexistent page will return undef (which also evaluates to false in Perl, but is obviously undefined). You can distinguish between blank and nonexistent pages by using defined:
my $wikitext = $bot->get_text('Page title');
print "Wikitext: $wikitext\n" if defined $wikitext;
my $pageid = $bot->get_id("Main Page");
die "Page doesn't exist\n" if !defined($pageid);
my @pages = ('Page 1', 'Page 2', 'Page 3');
my $thing = $bot->get_pages(\@pages);
foreach my $page (keys %$thing) {
    my $text = $thing->{$page};
    print "$text\n" if defined($text);
}
$buffer = $bot->get_image('File:Foo.jpg', {width=>256, height=>256});
Download an image from a wiki. This is derived from a similar function in MediaWiki::API. This one allows the image to be scaled down by passing a hashref with height & width parameters.
It returns raw data in the original format. You may simply spew it to a file, or process it directly with a library such as Imager.
use File::Slurp qw(write_file);
my $img_data = $bot->get_image('File:Foo.jpg');
write_file('Foo.jpg', { binmode => ':raw' }, \$img_data);
Images are scaled proportionally. (height/width) will remain constant, except for rounding errors.
Height and width parameters describe the maximum dimensions. A 400x200 image will never be scaled to greater dimensions. You can scale it yourself; having the wiki do it is just lazy & selfish.
my $revid = $bot->get_last("User:Mike.lifeguard/sandbox", "Mike.lifeguard"); print "Reverting to $revid\n" if defined($revid); $bot->revert('User:Mike.lifeguard', $revid, 'rvv');
$bot->undo($title, $revid, $summary, $after);
Reverts the specified $revid, with an edit summary of $summary, using the undo function. To undo all revisions from $revid up to but not including this one, set $after to another revid. If not set, just undo the one revision ($revid).
See <http://www.mediawiki.org/wiki/API:Edit#Parameters>.
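A brief sketch (the page title, summary, and revids below are placeholders):

# Undo a single revision
$bot->undo('Project:Sandbox', 123456, 'Undoing a test edit');

# Undo a span of revisions by also passing $after
$bot->undo('Project:Sandbox', 123456, 'Undoing several test edits', 123460);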
my $revid = $bot->get_last('User:Mike.lifeguard/sandbox', 'Mike.lifeguard');
if (defined($revid)) {
    print "Reverting to $revid\n";
    $bot->revert('User:Mike.lifeguard', $revid, 'rvv');
}
Returns an array containing the $limit most recent changes to the wiki's main namespace. The array contains hashrefs with keys title, revid, old_revid, and timestamp.
my @rc = $bot->update_rc(5);
foreach my $hashref (@rc) {
    my $title = $hashref->{'title'};
    print "$title\n";
}
The ``Options hashref'' is also available:
# Use a callback for incremental processing:
my $options = { hook => \&mysub, };
$bot->update_rc($options);
sub mysub {
    my ($res) = @_;
    foreach my $hashref (@$res) {
        my $page = $hashref->{'title'};
        print "$page\n";
    }
}
The first parameter is a hashref with the following keys:
An ``Options hashref'' can be used as the second parameter:
my @rc = $bot->recentchanges({ ns => 4, limit => 100 });
foreach my $hashref (@rc) {
    print $hashref->{title} . "\n";
}

# Or, use a callback for incremental processing:
$bot->recentchanges({ ns => [0,1], limit => 500 }, { hook => \&mysub });
sub mysub {
    my ($res) = @_;
    foreach my $hashref (@$res) {
        my $page = $hashref->{title};
        print "$page\n";
    }
}
The hashref returned might contain the following keys:
For backwards compatibility, the previous method signature is still supported:
$bot->recentchanges($ns, $limit, $options_hashref);
Additional optional parameters are:
A typical query:
my @links = $bot->what_links_here("Meta:Sandbox", undef, 1, { hook => \&mysub });
sub mysub {
    my ($res) = @_;
    foreach my $hash (@$res) {
        my $title    = $hash->{'title'};
        my $is_redir = $hash->{'redirect'};
        print "Redirect: $title\n" if $is_redir;
        print "Page: $title\n" unless $is_redir;
    }
}
Transclusions are no longer handled by what_links_here() - use ``list_transclusions'' instead.
Other parameters are:
Set max to limit the number of queries performed.
Set hook to a subroutine reference to use a callback hook for incremental processing.
Refer to the section on ``linksearch'' for examples.
A typical query:
$bot->list_transclusions("Template:Tlx", undef, 4, { hook => \&mysub });
sub mysub {
    my ($res) = @_;
    foreach my $hash (@$res) {
        my $title    = $hash->{'title'};
        my $is_redir = $hash->{'redirect'};
        print "Redirect: $title\n" if $is_redir;
        print "Page: $title\n" unless $is_redir;
    }
}
my @pages = $bot->get_pages_in_category('Category:People on stamps of Gabon');
print "The pages in Category:People on stamps of Gabon are:\n@pages\n";
The options hashref is as described in ``Options hashref''. Use "{ max => 0 }" to get all results.
my @pages = $bot->get_all_pages_in_category($category, $options_hashref);
Returns an array containing the names of all pages in the specified category (include the Category: prefix), including sub-categories. The $options_hashref is described fully in ``Options hashref''.
my @categories = $bot->get_all_categories();
print "The categories are:\n@categories\n";
Use "{ max => 0 }" to get all results. The default number of categories returned is 10, the maximum allowed is 500.
Additional parameters are:
Set max in $options to get more than one query's worth of results:
my $options = { max => 10, }; # I only want some results
my @links = $bot->linksearch("slashdot.org", 1, undef, $options);
foreach my $hash (@links) {
    my $url  = $hash->{'url'};
    my $page = $hash->{'title'};
    print "$page: $url\n";
}
Set hook to a subroutine reference to use a callback hook for incremental processing:
my $options = { hook => \&mysub, }; # I want to do incremental processing
$bot->linksearch("slashdot.org", 1, undef, $options);
sub mysub {
    my ($res) = @_;
    foreach my $hashref (@$res) {
        my $url  = $hashref->{'url'};
        my $page = $hashref->{'title'};
        print "$page: $url\n";
    }
}
If you really care, a true return value is the number of pages successfully purged. You could check that it is the same as the number you wanted to purge - maybe some pages don't exist, or you passed invalid titles, or you aren't allowed to purge the cache:
my @to_purge = ('Main Page', 'A', 'B', 'C', 'Very unlikely to exist');
my $size = scalar @to_purge;

print "all-at-once:\n";
my $success = $bot->purge_page(\@to_purge);

if ($success == $size) {
    print "@to_purge: OK ($success/$size)\n";
}
else {
    my $missed = @to_purge - $success;
    print "We couldn't purge $missed pages (list was: "
        . join(', ', @to_purge)
        . ")\n";
}

# OR
print "\n\none-at-a-time:\n";
foreach my $page (@to_purge) {
    my $ok = $bot->purge_page($page);
    print "$page: $ok\n";
}
my %namespace_names = $bot->get_namespace_names();
Returns a hash linking the namespace id, such as 1, to its named equivalent, such as ``Talk''.
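For example, a short sketch that prints each namespace id alongside its name:

my %namespace_names = $bot->get_namespace_names();
foreach my $id (sort { $a <=> $b } keys %namespace_names) {
    print "$id => $namespace_names{$id}\n";
}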
Additional parameters are:
my @pages = $bot->image_usage("File:Albert Einstein Head.jpg");
Or, make use of the ``Options hashref'' to do incremental processing:
$bot->image_usage("File:Albert Einstein Head.jpg", undef, undef, { hook => \&mysub, max => 5 });
sub mysub {
    my $res = shift;
    foreach my $page (@$res) {
        my $title = $page->{'title'};
        print "$title\n";
    }
}
my @data = $bot->global_image_usage('File:Albert Einstein Head.jpg');
The keys in each hashref are title, url, and wiki. $results is the maximum number of results that will be returned (not the maximum number of requests that will be sent, like "max" in the ``Options hashref''); the default is to attempt to fetch 500 (set to 0 to get all results). $filterlocal will filter out local uses of the image.
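As a sketch, assuming $results and $filterlocal are the second and third positional parameters:

# 0 = fetch all results; 1 = filter out local uses
my @data = $bot->global_image_usage('File:Albert Einstein Head.jpg', 0, 1);
foreach my $use (@data) {
    print "$use->{wiki}: $use->{title} ($use->{url})\n";
}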
This method is deprecated, and will emit deprecation warnings.
my $blocked = $bot->is_blocked('User:Mike.lifeguard');
Checks if a user is currently blocked.
This method is deprecated, and will emit deprecation warnings.
If you pass in an arrayref of images, you'll get out an arrayref of results.
my $exists = $bot->test_image_exists('File:Albert Einstein Head.jpg');
if ($exists == 0) {
    print "Doesn't exist\n";
}
elsif ($exists == 1) {
    print "Exists locally\n";
}
elsif ($exists == 2) {
    print "Exists on Commons\n";
}
elsif ($exists == 3) {
    print "Page exists, but no image\n";
}
$bot->get_pages_in_namespace($namespace, $limit, $options_hashref);
Returns an array containing the names of all pages in the specified namespace. The $namespace_id must be a number, not a namespace name.
Setting $page_limit is optional, and specifies how many items to retrieve at once. Setting this to 'max' is recommended, and this is the default if omitted. If $page_limit is over 500, it will be rounded up to the next multiple of 500. If $page_limit is set higher than you are allowed to use, it will silently be reduced. Consider setting key 'max' in the ``Options hashref'' to retrieve multiple sets of results:
# Gotta get 'em all!
my @pages = $bot->get_pages_in_namespace(6, 'max', { max => 0 });
my $count = $bot->count_contributions($user);
Uses the API to count $user's contributions.
($timed_edits_count, $total_count) = $bot->timed_count_contributions($user, $days);
Uses the API to count $user's contributions in the last $days days, as well as the user's total number of contributions (if needed).
Example: If you want to get a user's contributions for the last 30 and 365 days, plus their total number of edits, you would write something like this:
my ($last30days, $total) = $bot->timed_count_contributions($user, 30);
my $last365days = $bot->timed_count_contributions($user, 365);
You could also get the total number of edits by calling count_contributions separately:
my $total = $bot->count_contributions($user);
and use timed_count_contributions only in scalar context, but that would mean one more call to the server (and thus more server load), which you can avoid because timed_count_contributions already returns both values as a two-element list.
my $latest_timestamp = $bot->last_active($user);
Returns the last active time of $user in "YYYY-MM-DDTHH:MM:SSZ".
my ($timestamp, $user) = $bot->recent_edit_to_page($title);
Returns the timestamp and username for the most recent (top) edit to $page.
my @recent_editors = $bot->get_users($title, $limit, $revid, $direction);
Gets the most recent editors to $page, up to $limit, starting from $revision and going in $direction.
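For example, a small sketch that prints the five most recent editors of a page; this assumes the return is a plain list of usernames, as the call above suggests:

my @recent_editors = $bot->get_users('Main Page', 5);
print "$_\n" foreach @recent_editors;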
for ("Mike.lifeguard", "Jimbo Wales") { print "$_ was blocked\n" if $bot->was_blocked($_); }
Returns whether $user has ever been blocked.
This method is deprecated, and will emit deprecation warnings.
my $expanded = $bot->expandtemplates($title, $wikitext);
Expands templates on $page, using $text if provided, otherwise loading the page text automatically.
my @users = $bot->get_allusers($limit, $user_group, $options_hashref);
Returns an array of all users. Default $limit is 500. Optionally specify a $group (like 'sysop') to list that group only. The last optional parameter is an ``Options hashref''.
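For example, a sketch that lists members of the sysop group; this assumes each returned element is a plain username:

# List up to 100 users in the sysop group (assumes plain usernames are returned)
my @sysops = $bot->get_allusers(100, 'sysop');
print "$_\n" foreach @sysops;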
my @wikis = ("enwiki", "kowiki", "bat-smgwiki", "nonexistent"); foreach my $wiki (@wikis) { my $domain = $bot->db_to_domain($wiki); next if !defined($domain); print "$wiki: $domain\n"; }
You can pass an arrayref to do bulk lookup:
my @wikis = ("enwiki", "kowiki", "bat-smgwiki", "nonexistent"); my $domains = $bot->db_to_domain(\@wikis); foreach my $domain (@$domains) { next if !defined($domain); print "$domain\n"; }
my $db = $bot->domain_to_db($domain_name);
As you might expect, does the opposite of ``db_to_domain'': Converts a domain name (meta.wikimedia.org) into a database/wiki name (metawiki).
Additional parameters are:
my @prefix_pages = $bot->prefixindex("User:Mike.lifeguard");

# Or, the more efficient equivalent
my @prefix_pages = $bot->prefixindex("Mike.lifeguard", 2);

foreach my $hashref (@prefix_pages) {
    my $title = $hashref->{'title'};
    if ($hashref->{'redirect'}) {
        print "$title is a redirect\n";
    }
    else {
        print "$title is not a redirect\n";
    }
}
Additional optional parameters are:
my @pages = $bot->search("Mike.lifeguard", 2); print "@pages\n";
Or, use a callback for incremental processing:
my @pages = $bot->search("Mike.lifeguard", 2, { hook => \&mysub }); sub mysub { my ($res) = @_; foreach my $hashref (@$res) { my $page = $hashref->{'title'}; print "$page\n"; } }
The second is the familiar ``Options hashref''.
my $log = $bot->get_log({
    type => 'block',
    user => 'User:Mike.lifeguard',
});
foreach my $entry (@$log) {
    my $user = $entry->{'title'};
    print "$user\n";
}

$bot->get_log({
    type => 'block',
    user => 'User:Mike.lifeguard',
}, {
    hook => \&mysub,
    max  => 10,
});
sub mysub {
    my ($res) = @_;
    foreach my $hashref (@$res) {
        my $title = $hashref->{'title'};
        print "$title\n";
    }
}
my $is_globally_blocked = $bot->is_g_blocked('127.0.0.1');
Returns the IP/range block currently in place that affects the given IP/range, if any. The return is a scalar containing an IP/range if one is found (evaluates to true in boolean context), and undef otherwise (evaluates to false in boolean context). Pass in a single IP or CIDR range.
print "127.0.0.1 was globally blocked\n" if $bot->was_g_blocked('127.0.0.1');
Returns whether an IP/range was ever globally blocked. You should probably call this method only when your bot is operating on Meta - this method will warn if not.
my $was_locked = $bot->was_locked('Mike.lifeguard');
Returns whether a user was ever locked. You should probably call this method only when your bot is operating on Meta - this method will warn if not.
my $page = 'Main Page';
$bot->edit({
    page    => $page,
    text    => rand(),
    summary => 'test',
}) unless $bot->get_protection($page);
You can also pass an arrayref of page titles to do bulk queries:
my @pages = ('Main Page', 'User:Mike.lifeguard', 'Project:Sandbox');
my $answer = $bot->get_protection(\@pages);
foreach my $title (keys %$answer) {
    my $protected = $answer->{$title};
    print "$title is protected\n" if $protected;
    print "$title is unprotected\n" unless $protected;
}
This method is deprecated, and will emit deprecation warnings.
$bot->patrol($rcid);
Marks a page or revision identified by the $rcid as patrolled. To mark several RCIDs as patrolled, you may pass an arrayref of them. Returns false and sets "$bot->{error}" if the account cannot patrol.
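For example, a sketch that patrols several revisions at once; the RCIDs here are placeholders (in practice they would come from recent changes data):

my @rcids = (12345, 12346, 12347);
$bot->patrol(\@rcids) or warn "This account cannot patrol\n";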
$bot->email($user, $subject, $body);
This allows you to send emails through the wiki. All 3 of $user (without the User: prefix), $subject and $body are required. If $user is an arrayref, this will send the same email (subject and body) to all users.
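For instance, a sketch that sends the same notice to two operators:

my @operators = ('Mike.lifeguard', 'Jimbo Wales');
$bot->email(\@operators, 'Bot run finished', "The nightly bot run completed successfully.\n");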
my @pages = $bot->top_edits("Mike.lifeguard", { max => 5 });
foreach my $page (@pages) {
    $bot->rollback($page, "Mike.lifeguard");
}
Note that accessing the data with a callback happens before filtering the top edits is done. For that reason, you should use ``contributions'' if you need to use a callback. If you use a callback with top_edits(), you will not necessarily get top edits returned. It is only safe to use a callback if you check that it is a top edit:
$bot->top_edits("Mike.lifeguard", { hook => \&rv }); sub rv { my $data = shift; foreach my $page (@$data) { if (exists($page->{'top'})) { $bot->rollback($page->{'title'}, "Mike.lifeguard"); } } }
my @contribs = $bot->contributions($user, $namespace, $options);
Returns an array of hashrefs of data for the user's contributions. $ns can be an arrayref of namespace numbers. $options can be specified as in ``linksearch''.
Specify an arrayref of users to get results for multiple users.
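For example, a short sketch that prints the titles of a user's contributions in the main namespace; the title key is assumed to be present in each returned hashref, as with the other list methods:

my @contribs = $bot->contributions('Mike.lifeguard', 0, { max => 1 });
foreach my $contrib (@contribs) {
    print "$contrib->{title}\n";
}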
$bot->upload({ data => $file_contents, summary => 'uploading file' });
$bot->upload({ file => $file_name, title => 'Target filename.png' });
Upload a file to the wiki. Specify the file by either giving the filename, which will be read in, or by giving the data directly.
$bot->upload_from_url({
    url     => 'http://some.domain.ext/pic.png',
    title   => 'Target_filename.png',
    summary => 'uploading new pic',
});
If uploading from a URL is enabled on your target wiki (meaning $wgAllowCopyUploads is set to true in LocalSettings.php) and you have the appropriate user rights, you can use this function to upload files to your wiki directly from a remote server.
my @usergroups = $bot->usergroups('Mike.lifeguard');
The hashref can have 3 keys:
$category ="Cat\x{e9}gorie:moyen_fran\x{e7}ais"; # latin1 string $bot->get_all_pages_in_category($category); # OK $category = "Cat". pack("U", 0xe9)."gorie:moyen_fran".pack("U",0xe7)."ais"; # unicode string $bot->get_all_pages_in_category($category); # OK $category ="Cat\x{c3}\x{a9}gorie:moyen_fran\x{c3}\x{a7}ais"; # unicode data without utf-8 flag # $bot->get_all_pages_in_category($category); # NOT OK $bot->get_all_pages_in_category($category, { skip_encoding => 1 }); # OK
If you need this, it probably means you're doing something wrong. Feel free to ask for help.
The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit <http://www.perl.com/CPAN/> to find a CPAN site near you, or see <https://metacpan.org/module/MediaWiki::Bot/>.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007