Windows – How to grep (Select-String) the output of curl (Invoke-WebRequest) in PowerShell

curl grep powershell windows

I'm trying to curl a website and grep it for a specific line in PowerShell. How can I do that?

Here's the equivalent of what I'm trying to do, but in Bash:

user@host:~$ curl -s www.isxkcdshittytoday.com | grep YES
title="YES">YES</a>
user@host:~$ 

But when I run what I'd expect to be the equivalent in PowerShell, I get the entire response, not just the line that matches my sls (grep) expression:

PS C:\Users\user> curl -UseBasicParsing isxkcdshittytoday.com | sls -ca YES

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Is xkcd shitty today?</title>
<link rel="alternate" title="Is xkcd shitty today?" href="rss.xml" type="application/rss+xml" />

<script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-26584170-2', 'isxkcdshittytoday.com');
  ga('send', 'pageview');
</script>

</head>

<body style="text-align: center; padding-top: 200px;">

<a href="rss.xml" style="font-weight: bold; font-size: 120pt;
font-family: Arial, sans-serif; text-decoration: none; color: black;"
title="YES">YES</a>

</body>
</html>



PS C:\Users\user>

Why doesn't the PowerShell command above behave the way I expect? How can I pipe the output of curl into sls as an alternative to grep in PowerShell?

Best Answer

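Select-String does not break its pipeline input into lines for you. It tests each incoming object as a whole, and here the entire response arrives as a single object, so a match prints the whole response.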

The solution is to split the text into an array of actual lines and then use Select-String on them:

$r = Invoke-WebRequest www.isxkcdshittytoday.com
$lines = $r.Content.Split([Environment]::NewLine)
$lines | Select-String 'yes'

or, on a single line and with shortcut aliases:

(curl www.isxkcdshittytoday.com).Content.Split([Environment]::NewLine) | sls 'yes'

Please note that in Windows PowerShell curl is an alias for Invoke-WebRequest, which returns a complex object of type HtmlWebResponseObject rather than plain text. To get the actual text of the response you have to read its Content property, which is a String (and as such can be split). Your example only runs at all because the response object is implicitly converted to a string.
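
To see the difference without touching the network, here is a minimal sketch. The $content variable is a made-up stand-in for the Content string of a response, not real output from the site:

```powershell
# Hypothetical stand-in for (Invoke-WebRequest ...).Content: ONE multi-line string.
$content = "<html>`n<body>`ntitle=`"YES`">YES</a>`n</body>`n</html>"

# Piped as-is, Select-String receives a single input object,
# so a successful match reports the whole string as the matching "line".
$whole = $content | Select-String -CaseSensitive 'YES'
$whole.Line -eq $content          # True

# Split first, and every line becomes its own input object.
$perLine = $content -split "`n" | Select-String -CaseSensitive 'YES'
$perLine.Line                     # title="YES">YES</a>
```

The first pipeline behaves like the command in the question; the second behaves like grep.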

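Please also note that splitting a text into lines is not as easy as it seems. [Environment]::NewLine is the line ending of the machine running the command (CRLF on Windows), which need not match the one the server sent, so a page that uses bare LF may not be split the way you expect. Splitting on a regular expression such as '\r?\n' with the -split operator handles both cases.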
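
As a rough illustration of a split that does not depend on the line-ending style, the sketch below uses the -split operator with a regular expression. The sample strings are invented for the demonstration; in practice you would use the Content property of the response:

```powershell
# Hypothetical sample bodies: the same three lines, once with CRLF and once with bare LF.
$samples = @(
    "<body>`r`ntitle=`"YES`">YES</a>`r`n</body>",
    "<body>`ntitle=`"YES`">YES</a>`n</body>"
)

$results = foreach ($text in $samples) {
    # '\r?\n' matches a line feed optionally preceded by a carriage return.
    $lines = $text -split '\r?\n'
    [pscustomobject]@{
        LineCount = $lines.Count
        Match     = ($lines | Select-String -CaseSensitive 'YES').Line
    }
}
$results
```

Both samples should split into three lines and yield the same single match, regardless of which line ending they use.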
