R – How to print all the lines between the previous and next empty lines when a match is found

perl

I've racked my brain trying to come up with a solution, but in vain. Any guidance would be appreciated.

_data_
mascot
friend
ocean
\n
parsimon
**QUERY**
apple
\n
jujube
\n
apricot
maple
**QUERY**
rose
mahonia
\n

Given the search keyword QUERY, the output should be:

parsimon
**QUERY**
apple

apricot
maple
**QUERY**
rose
mahonia

I wrote some code, but it doesn't work the way I would like:

#!/usr/bin/perl

use strict; 
use warnings;

open my $fh, '<', 'FILE' or die "Cannot open: $!";
my @file = <$fh>;
close $fh;

for (0 .. $#file) {   # read from the first line to the last
  if($file[$_] =~ /QUERY/){  # if the contents of a particular line matches the query pattern
        my $start = $_-- until $file[$_--] =~ /^$/; #check the previous line for an empty line. continue until success. store the index of the empty line to $start.
        my $end = $_++ until $file[$_++] =~ /^$/; #check the next line for an empty line. continue until success. store the index of the empty line to $end.

print "\n @file[$start..$end]"; #print all lines between the stored indexes
}
}

I also tried something like this, but it gave a syntax error:

if($file[$_] =~ /QUERY/){
        my $start = $_-4 if $file[$_-4] =~ /^$/;
      continue  my $start = $_-3 if $file[$_-3]=~/^$/;
  ------
my $end = $_+4 until $file[$_+4] =~ /^$/;
.....

print "\n @file[$start..$end]";
}
.....

The only thing I've managed so far is printing everything between the matching line and the next empty line, using the following code:

for (0 .. $#file) {
  if($file[$_+1] =~ /QUERY/) {
   print $file[$_] until $file[$_++]=~/^$/;

Can someone point me in the right direction?
Thanks!

Mike

Edit

I think brian d foy's solution to my problem is the best, by which I mean the most efficient. But Jeff's solution is the most helpful: I benefited a lot from his detailed line-by-line explanations and, even better, with only a few tweaks to his code I can do something else, for example print all lines between the lines starting with a digit when a pattern is found. And Kinopiko's code is the very code I was hoping to be able to write.

Best Answer

Wow, you guys really like doing a lot of work in those answers. Remember, in text processing, Perl makes the easy things easy (and the hard things possible). If you're doing a lot of work for something that's easy to explain, you're probably missing the easy way. :)

Just redefine a line to be a paragraph and print the matching paragraphs as you read them. You can change Perl's idea of a line by setting the input record separator, $/, to be the line-ending that you want. When you use the line input operator, you'll get back everything up to and including what is in $/. See perlvar for the details on Perl special variables:

#!perl

{
    local $/ = "\n\n";

    while( my $group = <DATA> ) {
        print $group if $group =~ /\Q**QUERY**/;
    }
}


__DATA__
mascot
friend
ocean

parsimon
**QUERY**
apple

jujube

apricot
maple
**QUERY**
rose
mahonia

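ghostdog74 posted his one-liner version, which I modified slightly. The separator is set in a BEGIN block so that it is already in effect when -n reads the first record:

perl -ne "BEGIN { $/ = qq(\n\n) } print if /\Q**QUERY**/" fileA fileB ...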

Perl has a special command-line switch for this, though. You set the input record separator with -0 followed by an octal value, and the special value 00 selects paragraph mode:

perl -00 -ne "print if /\Q**QUERY**/" fileA fileB ...
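Inside a script, the counterpart of -00 is setting $/ to the empty string. A minimal sketch of how that differs from a literal "\n\n" when paragraphs are separated by more than one blank line; read_records is a hypothetical helper and the sample text is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): split $text into
# records using $sep as the input record separator.
# An empty string selects paragraph mode, like -00 on the command line.
sub read_records {
    my ($text, $sep) = @_;
    local $/ = $sep;
    # Read from an in-memory filehandle so the sketch needs no real file.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

# Assumed sample: the first two paragraphs are separated by THREE blank lines.
my $text = "mascot\n\n\n\nparsimon\n**QUERY**\napple\n\njujube\n";

my @fixed     = read_records($text, "\n\n");   # literal two-newline separator
my @paragraph = read_records($text, "");       # paragraph mode

printf "separator \"\\n\\n\": %d records\n", scalar @fixed;
printf "paragraph mode:   %d records\n", scalar @paragraph;

# Print the paragraphs that contain the query string.
print grep { /\Q**QUERY**\E/ } @paragraph;
```

With a literal "\n\n" the extra blank lines come back as a record of their own, while paragraph mode treats a run of blank lines as a single separator. For input like the sample data, where exactly one blank line separates the groups, both settings pick out the same paragraphs.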

The perlrun documentation shows you all the nifty things you can do on the command line.
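The variation mentioned in the question's edit (blocks delimited by lines starting with a digit) can't be handled by $/ alone, because $/ is a plain string, not a regex. A rough sketch of one way to group the lines by hand; blocks_matching is a hypothetical helper, and dropping the delimiter lines from the output is an assumption about what is wanted:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answers): split @$lines into
# blocks separated by lines matching $delim_re, then return the blocks that
# contain at least one line matching $query_re.
# Assumption: delimiter lines themselves are not part of any block.
sub blocks_matching {
    my ($lines, $delim_re, $query_re) = @_;
    my @blocks = ([]);
    for my $line (@$lines) {
        if ($line =~ $delim_re) {
            push @blocks, [];                  # delimiter starts a new block
        }
        else {
            push @{ $blocks[-1] }, $line;      # ordinary line joins the current block
        }
    }
    return grep {
        my $block = $_;
        scalar grep { $_ =~ $query_re } @$block;
    } @blocks;
}

# Made-up sample: lines starting with a digit act as separators.
my @lines = map { "$_\n" } qw(1 mascot friend 2 parsimon **QUERY** apple 3 jujube);

for my $block (blocks_matching(\@lines, qr/^\d/, qr/\Q**QUERY**\E/)) {
    print @$block;
}
```

This reads the whole input into memory first, unlike the record-separator approach, so it trades some efficiency for a delimiter that can be any pattern.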
