You'll recall that PowerShell's pipeline allows object data to be passed between cmdlets. This section discusses the key built-in object manipulation cmdlets that allow you to "slice and dice" object data before passing it along the pipeline, where it may subsequently be used as input to further cmdlets or to generate some kind of output, for example, a text file, CSV file, XML file, console output, HTTPS stream, etc.
We'll begin by discussing the three most fundamental, and therefore most commonly used, object manipulation cmdlets (Where-Object, Select-Object and Sort-Object).
We need some form of input data in order to use (and experiment with) the PowerShell pipeline. For the purposes of this primer, we'll be using a small selection of PowerShell's Get cmdlets, for example, Get-Process, Get-Service and Get-WmiObject. PowerShell provides many cmdlets for retrieving system or application data, with the actual number of cmdlets depending on the version of PowerShell being used and on the currently loaded modules and snap-ins.
The first of the "slice and dice" operations is achieved using Where-Object. This cmdlet achieves the same result as SQL's where clause, filtering the required object instances (think rows) from the pipeline. Only objects that match the criteria provided will continue to be passed along the pipeline.
The example below retrieves all Windows Services whose name begins with "W" that are in a running state:
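A minimal sketch of such a filter, assuming the ServiceName and Status properties exposed by the objects that Get-Service returns (the $runningW variable name is purely illustrative):

```powershell
# Keep only the services whose name starts with "W" and that are currently running.
$runningW = Get-Service |
    Where-Object { $_.ServiceName -like 'W*' -and $_.Status -eq 'Running' }

# Display the filtered services.
$runningW
```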
In order to apply a Where-Object operation, we must first know which properties are available for the object class that we are dealing with. Using the above example, how did we know that the ServiceName and Status properties were available for the objects returned by Get-Service, and how did we know that Running was a valid status value? The answer was first discussed in an earlier section, where we used a combination of the Get-Member and Out-GridView cmdlets to explore object classes. To recap, the Get-Member cmdlet can be used to view the available methods and properties for a given class, for example:
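A short sketch of that recap, piping the Get-Service output into Get-Member (the $members variable is only there so the result can be inspected afterwards):

```powershell
# List the methods and properties of the class returned by Get-Service.
$members = Get-Service | Get-Member

# Display the member list; the header also shows the object's type name.
$members
```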
And the Out-GridView cmdlet provides a useful method of viewing an object's property values, for example:
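As a sketch, and assuming a Windows desktop session in which Out-GridView is available, every property value can be sent to the grid as follows:

```powershell
# Expand all properties of each service, then show them in an interactive grid window.
$services = Get-Service | Select-Object -Property *
$services | Out-GridView -Title 'Get-Service properties'
```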
Using the above information, we can see that the Get-Service cmdlet returns numerous object properties. This information allows us to build suitable criteria for our Where-Object cmdlet.
The Where-Object cmdlet has a built-in alias of ?, which can be used for brevity. However, I strongly recommend not using aliases in production scripts or online examples, and instead recommend limiting their use to command-line interaction, for example:
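A minimal command-line sketch using the ? alias (the Status test is simply an illustrative condition):

```powershell
# Interactive shorthand only: "?" is an alias for Where-Object.
Get-Service | ? { $_.Status -eq 'Running' }
```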
More importantly, PowerShell V3.0 introduced a shorthand variation of the Where-Object cmdlet that is more akin to natural language and reads like SQL's where clause. The syntax of this variation is very prescriptive and requires that:
The example below demonstrates this shorthand form and uses the where alias in place of Where-Object:
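A minimal sketch of the shorthand form, again using the Status property purely as an illustration. Note that the property name comes first, followed by the comparison operator and the value, with no script block and no $_ variable. Only a single comparison can be expressed this way, so anything more complex still needs the script-block form.

```powershell
# Shorthand (PowerShell 3.0 and above): property name, operator, value.
$running = Get-Service | where Status -eq 'Running'

# Display the running services.
$running
```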
When dealing with data, it is always a good idea to dispense with unwanted data as early as possible, in order to keep the number of in-memory data operations (moves, copies, etc.) to a minimum. This holds true for PowerShell's pipeline: wherever possible, a filter operation should be applied at the earliest opportunity. Thankfully, most of PowerShell's Get cmdlets support one or more filtering parameters. The table below lists just a few examples:
Cmdlet | Filtering Parameter | Example |
---|---|---|
Get-ADUser | -Filter | Get-ADUser -Filter {(ObjectClass -eq 'user') -and (mail -like '*.powershellprimer.com')}; |
Get-ADUser | -LDAPFilter | Get-ADUser -LDAPFilter '(&(objectCategory=person)(objectClass=user)(userAccountControl:1.2.840.113556.1.4.803:=2))'; |
Get-ChildItem | -Filter | Get-ChildItem -LiteralPath C:\ -Recurse -Filter *.xml; |
Get-NetTCPConnection | -State | Get-NetTCPConnection -State Listen; |
Get-Service | -Include | Get-Service -Include @('a*', 'w*'); |
Get-Service | -DependentServices | Get-Service -DependentServices 'rpcss'; |
Get-WmiObject | -Filter | Get-WmiObject -Class Win32_Volume -Filter 'DriveType = 3'; |
The performance characteristics of various coding techniques are often discussed (or argued) at great length. However, the most pragmatic way to determine the relative efficiencies of different coding techniques is to actually take measurements. It's for this reason that PowerShell V3.0 (and above) includes the Measure-Command cmdlet, which can be used to generate empirical performance metrics. For example, the code examples below demonstrate the time taken to enumerate all .XML files in %SystemRoot% on a given reference system. The first example performs the file-type filter as a post-enumeration operation, whereas the second example includes the filtering as part of the file enumeration operation:
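The two measurements can be sketched as follows. This assumes a recursive search and suppresses access-denied errors with -ErrorAction SilentlyContinue; both choices, and the variable names, are assumptions made for illustration. Timings will vary from system to system.

```powershell
# Technique 1: enumerate everything, then filter on the file extension afterwards.
$postFilter = Measure-Command {
    Get-ChildItem -Path $env:SystemRoot -Recurse -ErrorAction SilentlyContinue |
        Where-Object { $_.Extension -eq '.xml' }
}

# Technique 2: let Get-ChildItem apply the filter during enumeration.
$earlyFilter = Measure-Command {
    Get-ChildItem -Path $env:SystemRoot -Recurse -Filter *.xml -ErrorAction SilentlyContinue
}

# Compare the elapsed times (each result is a System.TimeSpan).
'Post-enumeration filter : {0:N0} ms' -f $postFilter.TotalMilliseconds
'Filter during enumeration: {0:N0} ms' -f $earlyFilter.TotalMilliseconds
```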
On the reference system, the second technique was nearly 4½ times faster than the first. Similar timings can be observed with other data operations, and it is therefore very important for the developer to consider what data is being manipulated and what filtering options are available. And, of course, it's also important to actually carry out performance testing rather than make assumptions.
The second of the "slice and dice" operations is achieved using Select-Object. Whereas Where-Object filtered just rows, the Select-Object cmdlet can be used to filter rows and object properties (think columns). Once again, to make good use of this cmdlet, we must first understand which columns are available. And again, we can use the techniques described above to achieve this.
The row filtering capabilities of Select-Object are as follows:
Filtering Parameter | Description | Example |
---|---|---|
-First | Returns the first n rows/objects passed along the pipeline. | Get-EventLog -LogName System -EntryType Error \| Select-Object -First 10; |
-Index | Returns specific rows/objects, based on their position in the data being passed along the pipeline. | Get-WmiObject -Class Win32_SystemSlot \| Select-Object -Index 0, 1, 2; |
-Last | Returns the last n rows/objects passed along the pipeline. | (Invoke-WebRequest -Uri 'https://www.google.co.uk/#q=Hello%2C+World!').Links \| Select-Object -Last 5; |
-Skip | Skips the first n rows/objects passed along the pipeline. | Get-ChildItem -File -Filter *.queued \| Sort-Object -Property LastWriteTime -Descending \| Select-Object -Skip 1; |
-Unique | Returns a list of unique row/object values based on the given property. | Get-Process \| Select-Object -Unique ProcessName; |
The -First, -Last and -Skip parameters can be combined, for example:
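A small sketch using a range of integers as stand-in pipeline data, so that the effect of each combination is easy to see:

```powershell
# Skip the first two objects, then return the next three.
$middle = 1..10 | Select-Object -Skip 2 -First 3     # 3, 4, 5

# Return the first two and the last two objects.
$ends = 1..10 | Select-Object -First 2 -Last 2       # 1, 2, 9, 10

$middle
$ends
```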
The Select-Object cmdlet's primary purpose is filtering object properties (again, think columns), much like we do with an SQL select statement, for example:
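A minimal sketch of such a property selection, assuming the svchost processes and the three properties described in the text that follows:

```powershell
# Return just three properties (columns) for every svchost process.
$svchost = Get-Process -Name svchost |
    Select-Object -Property ProcessName, Id, PrivateMemorySize

$svchost
```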
Here, we've returned just the ProcessName, process Id (PID) and PrivateMemorySize (in bytes) values for all SVCHOST.EXE processes.
Probably the most powerful aspect of the Select-Object cmdlet is its ability to add calculated properties to objects in the pipeline. The format for adding a calculated property is as follows:
...where:
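As a sketch of the general shape, a calculated property is a hashtable with a Name key and an Expression key. The NameLength property used here is a made-up illustration, not something taken from the original listing.

```powershell
# General shape of a calculated property:
#   Name       - the name of the new property (column)
#   Expression - a script block, evaluated once per object, in which $_ is the current object
$calculated = @{ Name = 'NameLength'; Expression = { $_.ProcessName.Length } }

# Pass the hashtable to -Property alongside ordinary property names.
$result = Get-Process | Select-Object -Property ProcessName, $calculated
$result
```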
A simple example of this capability involves using a calculated property to change the way an existing property is presented. For example, we can easily use Get-Process to return the process name, process ID and private memory size of each svchost process:
But what if we wanted the private memory size to be reported in megabytes (MB) instead? This is where a calculated property can be used. In this case, we'll create a new property which divides the existing private memory size by 1MB. This can be seen below:
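A sketch of this step; the PrivateMemorySizeMB property name is an assumption chosen for illustration:

```powershell
# Keep the original properties and add a calculated one that reports the size in MB.
$svchostMB = Get-Process -Name svchost |
    Select-Object -Property ProcessName, Id, PrivateMemorySize,
        @{ Name = 'PrivateMemorySizeMB'; Expression = { $_.PrivateMemorySize / 1MB } }

$svchostMB
```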
We can then dispense with the original PrivateMemorySize property (column):
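Continuing the sketch (with the same assumed PrivateMemorySizeMB name), the original byte-based column is simply left out of the property list:

```powershell
# Report only the name, process ID and the calculated size in MB.
$svchostTrimmed = Get-Process -Name svchost |
    Select-Object -Property ProcessName, Id,
        @{ Name = 'PrivateMemorySizeMB'; Expression = { $_.PrivateMemorySize / 1MB } }

$svchostTrimmed
```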
The Sort-Object cmdlet simply performs a sort operation against the object data being passed along the pipeline. The sort evaluation is performed against the property (or properties) specified by the -Property parameter. By default, the sort operation is case-insensitive and returns results in ascending order; use the -CaseSensitive parameter to enforce case-sensitivity and the -Descending parameter to reverse the sort order. Finally, use the -Unique parameter to remove duplicate values prior to sorting.
The example below returns all .EXE files in %SystemRoot%\System32 and orders the result by file size (largest first):
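A minimal sketch of that sort, assuming the Length property (file size in bytes) is the sort key:

```powershell
# List the .EXE files in System32, largest first.
$largestFirst = Get-ChildItem -Path "$env:SystemRoot\System32" -Filter *.exe -File |
    Sort-Object -Property Length -Descending

$largestFirst
```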
Up until now, we have been manipulating the object data being passed along the pipeline. We've used Where-Object and Select-Object to filter the object data and Sort-Object to re-order (sort) the object data. As already discussed, if we do nothing more at this stage, PowerShell will "burst" the object data and render the results to the console as a textual representation.
However, if we wish to do something with the object data, we can use the ForEach-Object cmdlet. The example below uses Get-ChildItem to enumerate all .XML files in a given directory. Then, using ForEach-Object, it carries out an operation against each file that is found. In this particular case, the operation begins by determining if the current file is locked (we use Rename-Item to ascertain this). If the file is locked, a warning message is displayed; otherwise, processing continues for the given file.
As seen above, the current pipeline object being processed can be referenced using the $_ or $PSItem automatic variables.
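One way this could be sketched is shown below. The directory, the sample file and the ".locktest" suffix are all hypothetical, and the rename-and-rename-back probe is an assumption about how the lock test might be written, not necessarily the original listing. A rename typically fails while another process holds the file open.

```powershell
# Hypothetical working directory and sample file, created so the sketch is self-contained.
$directory = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'XmlDemo'
New-Item -ItemType Directory -Path $directory -Force | Out-Null
Set-Content -Path (Join-Path -Path $directory -ChildPath 'sample.xml') -Value '<root />'

$processed = Get-ChildItem -LiteralPath $directory -Filter *.xml -File | ForEach-Object {
    $file = $_
    $probeName = "$($file.Name).locktest"    # temporary name, assumed not to exist already

    try {
        # Rename the file and then rename it back; either step throws if the file is locked.
        $renamed = Rename-Item -LiteralPath $file.FullName -NewName $probeName -PassThru -ErrorAction Stop
        Rename-Item -LiteralPath $renamed.FullName -NewName $file.Name -ErrorAction Stop
    }
    catch {
        Write-Warning "Skipping locked file: $($file.FullName)"
        return    # exit the script block and move on to the next pipeline object
    }

    # Placeholder for the real per-file processing.
    "Processing $($file.FullName)"
}

$processed
```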
Note: within a ForEach-Object script block, a continue call does not move on to the next pipeline object (as a continue call would in a normal loop). If you wish to jump to the next object in the pipeline, make a return call to exit the current script block.
Note: foreach is also a built-in alias for ForEach-Object. This is often confused with the standalone foreach loop construct, and presents another example as to why aliases should be avoided.
Sometimes it might be useful to carry out some kind of pre-processing and/or post-processing when calling ForEach-Object; this can be achieved using the -Begin and -End parameters, e.g.:
The PowerShell code in the -Begin script block will execute once, before the pipeline objects are enumerated. The PowerShell code in the -End script block will execute once, when all of the pipeline objects have been processed.
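A small sketch using integers as stand-in pipeline data; the $total variable is illustrative. The -Begin block initialises a running total, the -Process block runs once per object, and the -End block emits the result.

```powershell
# Sum the numbers 1 to 5 using pre-processing, per-object and post-processing blocks.
$sum = 1..5 | ForEach-Object -Begin { $total = 0 } -Process { $total += $_ } -End { $total }

$sum    # 15
```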
The ForEach-Object cmdlet also has a built-in alias of %, which, again, can be used for brevity, for example:
The above example also uses the gci alias for Get-ChildItem, and launches Microsoft Paint for each JPEG file in the current directory. Once again, aliases should be reserved for command-line interaction only.
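A command-line sketch of this shorthand, assuming the JPEG files use a .jpg extension and that mspaint.exe is on the path:

```powershell
# Interactive shorthand only: gci = Get-ChildItem, % = ForEach-Object.
gci *.jpg | % { mspaint.exe $_.FullName }
```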
The Tee-Object cmdlet is very similar to the UNIX tee command, in that it concurrently outputs pipeline data to the specified file (or variable) whilst continuing to pass object data along the pipeline. Although this may be useful in a PowerShell script, it can be particularly useful at the command line, for example:
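A minimal sketch of such a command; the log file name is hypothetical and C:\TEMP is assumed to exist:

```powershell
# Record the details of each .TXT file in a log, then pass the same objects on for deletion.
Get-ChildItem -Path C:\TEMP -Filter *.txt |
    Tee-Object -FilePath C:\TEMP\deleted-files.log |
    Remove-Item
```

Appending -WhatIf to the Remove-Item call is a cautious way to try this out, since the deletions are then only reported rather than carried out.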
In this example, we use Get-ChildItem to enumerate all .TXT files in C:\TEMP. We then use Tee-Object to write the file details to a log file whilst simultaneously passing the file objects along the pipeline to Remove-Item, which removes (deletes) the file(s).
As we've seen, the PowerShell pipeline is a powerful tool. However, as the adage goes, "With great power comes great responsibility". It is all too easy to write a "deep" pipeline operation that performs a potentially crippling commit of some sort in its final stages. The default action for most cmdlets is to continue upon failure; this means it's possible for a pipeline operation to complete having failed to process one or more objects along the way.
Coupled with the fact that most of the built-in cmdlets are able to deal with multiple input objects (i.e. arrays of objects), this behaviour means that system-altering pipeline operations need to be approached with care.
The simplified example below attempts to demonstrate this. In this example we create a set of text files, open one of them for writing (therefore creating an exclusive lock) and then attempt to delete them all:
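The scenario can be sketched as follows. The PipelineDemo directory and the file names are hypothetical, and the exclusive lock is taken with the .NET File.Open method. On Windows, the locked file survives the Remove-Item call while the other files are deleted.

```powershell
# Create a hypothetical scratch directory containing three text files.
$dir = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'PipelineDemo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
1..3 | ForEach-Object {
    Set-Content -Path (Join-Path -Path $dir -ChildPath "file$_.txt") -Value "Test file $_"
}

# Open file2.txt for writing with no sharing, which takes an exclusive lock on it.
$lockedPath = Join-Path -Path $dir -ChildPath 'file2.txt'
$stream = [System.IO.File]::Open($lockedPath, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Write, [System.IO.FileShare]::None)

# Attempt to delete every file; the failure on the locked file does not stop the pipeline.
Get-ChildItem -Path $dir -Filter *.txt | Remove-Item

# Release the lock.
$stream.Close()
```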
As you can see, the locked file could not be deleted, and the Remove-Item call generated an error. However, the pipeline operation continued. In order to prevent the pipeline continuing when an error occurs, we must specify -ErrorAction Inquire or -ErrorAction Stop when calling any cmdlets that may fail. However, this approach doesn't scale well and isn't appropriate for enterprise-class operations, such as manipulating Microsoft Active Directory user objects, Microsoft Exchange mailboxes or Hyper-V virtual machines en masse.
An arguably more robust method of dealing with large-scale enterprise pipeline operations is discussed below.
Most of PowerShell's built-in cmdlets that are capable of changing the system state (for example, those beginning with Add, Clear, Remove, Start, Stop, etc.) support the -Confirm and -WhatIf parameters. The -Confirm parameter can be used to explicitly state that the cmdlet must prompt the user for confirmation before executing (specify -Confirm:$true). The -WhatIf parameter causes PowerShell to show the user what the cmdlet would do, if run without -WhatIf, and is particularly useful for sanity-checking potentially risky command-line operations.
The example below demonstrates the -WhatIf parameter being used to test a directory removal operation and gives the user an opportunity to spot that they may be in the wrong directory.
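A minimal sketch, using a hypothetical scratch directory that is created first so that the command has something to report on:

```powershell
# Hypothetical scratch directory, created here so the example is self-contained.
$target = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'WhatIfDemo'
New-Item -ItemType Directory -Path $target -Force | Out-Null

# -WhatIf reports what would be removed, and from where, without removing anything.
Remove-Item -Path $target -Recurse -WhatIf
```

The "What if" message includes the full target path, which is what gives the user the chance to notice a wrong directory before anything is deleted.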
As discussed above, it is often necessary to break out of a PowerShell pipeline before system-affecting operations are carried out. In this mode of operation, the pipeline is still used to obtain, filter and sort input objects. However, once we have our candidate objects (our work "queue"), we use a ForEach call (not to be confused with the ForEach-Object cmdlet) to process each item of work in a more programmatic, controlled fashion. This gives us an opportunity to detect and handle errors in a more robust way.
Continuing with the file deletion example above, we can type the following to ascertain the object class returned by Get-ChildItem, when dealing with file objects:
We can therefore create an array (covered later) to hold instances of System.IO.FileInfo objects, plus a single instance of the object to be used as an array enumerator:
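A short sketch, using %SystemRoot% simply because it is a directory that is known to contain files:

```powershell
# Inspect the class of the file objects that Get-ChildItem returns.
$fileMembers = Get-ChildItem -Path $env:SystemRoot -File | Get-Member

# The TypeName shown in the header is the class we need.
$fileMembers
```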
We can then populate the array using the Get-ChildItem output:
Finally, we can establish a variable to hold a return code and use the ForEach call to step through (enumerate) each member of the array (i.e.: each file):
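The three steps above can be sketched together as follows. The PipelineDemo directory, the variable names and the return code values are assumptions made for illustration; the scratch files are created first so that the sketch is self-contained.

```powershell
# Hypothetical scratch directory containing three text files to delete.
$dir = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'PipelineDemo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
1..3 | ForEach-Object {
    Set-Content -Path (Join-Path -Path $dir -ChildPath "file$_.txt") -Value "Test file $_"
}

# Step 1: a typed array for the work queue, plus a single object used as the enumerator.
[System.IO.FileInfo[]] $files = @()
[System.IO.FileInfo]   $file  = $null

# Step 2: populate the array from the pipeline (@() guarantees an array for 0 or 1 results).
$files = @(Get-ChildItem -LiteralPath $dir -Filter *.txt -File)

# Step 3: process each file in a controlled loop, tracking the outcome in a return code.
$returnCode = 0    # 0 = success, 1 = a file could not be removed (illustrative values)

foreach ($file in $files) {
    try {
        # -ErrorAction Stop turns a failure into an exception that the catch block can handle.
        Remove-Item -LiteralPath $file.FullName -ErrorAction Stop
        Write-Output "Removed $($file.FullName)"
    }
    catch {
        Write-Warning "Failed to remove $($file.FullName): $($_.Exception.Message)"
        $returnCode = 1
        break    # stop at the first failure rather than carrying on regardless
    }
}

$returnCode
```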
As you can see, this lets us properly detect and handle errors by means of an exception handler, which we'll cover in a later section. This technique gives us far more control over how objects (e.g. files, mailboxes, virtual machines, etc.) are processed when making changes to system state; more specifically, we can:
Please refer to the section on the foreach loop construct for further information.