The Article Title
        
          This is the first paragraph of the summary.
          This is the second paragraph of the summary.
        
        
          
            
            First paragraph of this section.
            Second paragraph of this section.
          
          
            First paragraph of this section.
            Second paragraph of this section.
            
              
              First paragraph of this sub-section.
              Second paragraph of this sub-section.
            
          
        
      
EXAMPLE
    The following example computes and prints the median, mean, and standard
    deviation, taken over all the articles in the corpus, of the fraction of
    distinct words in a summary that also occur in the body of the article.
      use strict;
      use warnings;
      use Text::Corpus::Summaries::Wikipedia;
      use Statistics::Descriptive;
      use File::Slurp;
      use Encode;
      my $corpus = Text::Corpus::Summaries::Wikipedia->new;
      my $statistics = Statistics::Descriptive::Full->new;
      foreach my $textFilePair (@{$corpus->getListOfTextFiles})
      {
        # Collect the distinct lowercased words of the summary; grep drops the
        # empty string that split returns when the text starts with a non-letter.
        my $summary = lc decode ('utf8', read_file ($textFilePair->{summary}, binmode => ':raw'));
        my %summaryWords = map {($_, 1)} grep {length} split (/\P{Letter}+/, $summary);
        my $totalUniqueSummaryWords = keys %summaryWords;
        next unless $totalUniqueSummaryWords;
        # Remove every summary word that also occurs in the body.
        my $body = lc decode ('utf8', read_file ($textFilePair->{body}, binmode => ':raw'));
        delete @summaryWords{split (/\P{Letter}+/, $body)};
        my $totalUniqueSummaryWordsNotInBody = keys %summaryWords;
        # Record the fraction of distinct summary words found in the body.
        $statistics->add_data (1 - $totalUniqueSummaryWordsNotInBody / $totalUniqueSummaryWords);
      }
      print 'count: ', $statistics->count(), "\n";
      print 'median: ', $statistics->median(), "\n";
      print 'mean: ', $statistics->mean(), "\n";
      print 'standard deviation: ', $statistics->standard_deviation(), "\n";
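    To see what the per-article fraction means without downloading a corpus,
    the same calculation can be sketched on two in-memory strings. The helper
    name overlap_fraction and the sample strings are assumptions made for
    illustration only; they are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Corpus::Summaries::Wikipedia):
# returns the fraction of distinct summary words that also occur in the body,
# or an empty return value when the summary contains no words.
sub overlap_fraction
{
  my ($summary, $body) = @_;

  # Distinct lowercased words of the summary, skipping empty fields.
  my %summaryWords = map {($_, 1)} grep {length} split (/\P{Letter}+/, lc $summary);
  my $total = keys %summaryWords;
  return unless $total;

  # Delete every summary word that also appears in the body.
  delete @summaryWords{split (/\P{Letter}+/, lc $body)};
  my $missing = keys %summaryWords;

  return 1 - $missing / $total;
}

# Summary words: the, quick, cat. Only "quick" is absent from the body,
# so the fraction is 1 - 1/3.
printf "%.2f\n", scalar overlap_fraction ('The quick cat', 'the cat sat');  # prints 0.67
```

    A value of 1 means every distinct word of the summary also occurs in the
    body; a value of 0 means none does.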
SCRIPTS
    The script create_summary_corpus.pl makes a corpus for summarization
    testing using this module.
INSTALLATION
    Use CPAN to install the module and all its prerequisites:
      perl -MCPAN -e shell
      cpan> install Text::Corpus::Summaries::Wikipedia
BUGS
    This module creates corpora by parsing Wikipedia pages. The XPath
    expressions used to extract links and text will become invalid as the
    format of those pages changes, which will cause some corpora not to be
    created.
    Please email bug reports or feature requests to
    "bug-text-corpus-summaries-wikipedia@rt.cpan.org", or submit them through
    the web interface at