ps1.soapyfrog.com

doing inappropriate things with powershell

Text and path gotchas

09 Jan 2007 01:00 pm // Rant

PowerShell is now my default Windows shell, but every now and then little things crop up that make me want to go back to CMD.exe or, better, cygwin/zsh.

Someone gave me an example of filtering an ASCII text file, stripping out the first 3 characters of each line.

On a Mac/UNIX box, you’d simply do:

cut -c4- infile.txt > outfile.txt
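
To make the effect concrete, here is a minimal sketch of that command on a tiny sample file. The scratch directory and the three sample lines are made up for illustration; only the cut invocation comes from the post.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: every line starts with a 3-character prefix.
printf '01 first line\n02 second line\n03 third line\n' > infile.txt

# Keep characters 4 onwards of each line, i.e. drop the first 3.
cut -c4- infile.txt > outfile.txt

# Show the result: the "NN " prefix is gone from every line.
cat outfile.txt
```

Note that -c counts characters, while -b counts bytes; for plain ASCII input like this the two give the same result.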

With PowerShell, you do:

gc infile.txt | % { $_.remove(0,3) } | sc outfile.txt

As an aside, on my machine the latter takes 80 times longer than the former to do the job on an 8MB text file, but we’ll skip over that :-) So, to the gotchas.

Gotcha 1 – text gets recoded

You might be tempted to do this instead:

gc infile.txt | % { $_.remove(0,3) } > outfile.txt

If you did, the resulting outfile.txt would be UTF-16, and so roughly double the size of the original (less the 3 characters cut from each line).

You might not notice this at first, as PowerShell handles this sort of thing transparently, but at some point something else might choke on it. You can be specific about the output encoding (for example, by piping to out-file -encoding ascii instead of using >), but that’s not something you’d necessarily think about.
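
The size doubling is easy to see outside PowerShell as well. The following is only a rough sketch from the UNIX side: it uses iconv to re-encode a small, made-up ASCII file as little-endian UTF-16 and compares byte counts. The file names are hypothetical, and a file written by PowerShell may additionally start with a byte-order mark and use different line endings, which this sketch leaves out.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: a few lines of plain ASCII text.
printf 'first line\nsecond line\n' > ascii.txt

# Re-encode as little-endian UTF-16: two bytes per ASCII character.
iconv -f ASCII -t UTF-16LE ascii.txt > utf16.txt

# Byte counts; $(( ... )) strips any padding that wc may add.
ascii_size=$(( $(wc -c < ascii.txt) ))
utf16_size=$(( $(wc -c < utf16.txt) ))
echo "ascii: $ascii_size bytes, utf16: $utf16_size bytes"
```

Running file on a suspect output file is a quick way to check which encoding you actually ended up with.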

Gotcha 2 – path weirdness

My working area is: c:\Documents and Settings\adrian\My Documents\proj\top. That’s quite a lot to deal with, so I usually make it shorter by doing this:

new-psdrive -name doc: -psprovider filesystem -root "c:\Documents and Settings\adrian\My Documents"

so my working directory is actually doc:\proj\top. Much shorter. However, when I run the first PowerShell command above, I get this:

PS doc:\proj\top> gc infile.txt | % { $_.remove(0,3) } | sc outfile.txt
Set-Content : The given path's format is not supported.
At line:1 char:42
+ gc infile.txt | % { $_.remove(0,3) } | sc  <<<< outfile.txt
The pipeline has been stopped.
At line:1 char:30
+ gc 100.txt | % { $_.remove( <<<< 0,3) } | sc outfile.txt
Set-Content : The given path's format is not supported.
At line:1 char:42

Odd that gc didn’t mind, but sc did.

You can’t even use resolve-path to fix it, because if outfile.txt doesn’t already exist you get an error… maybe you need to use new-item -type file or something first?

To be honest, I gave up. PowerShell should really be making things easier, not throwing up barriers at every turn.


One comment to “Text and path gotchas”

  1. On 10 Jan 2007 at 4:36 am, jsnover said:   

    Please file a feature on this. Seems like we need a Cmdlet like Remove-String that takes either a FILENAME or a stream of strings and removed.
    Instructions for how to do this can be found at: http://blogs.msdn.com/powershell/archive/2006/05/09/filing-bugs.aspx .

    Apologies for the inconvenience.

    Jeffrey Snover [MSFT]
    Windows PowerShell/MMC Architect
    Visit the Windows PowerShell Team blog at: http://blogs.msdn.com/PowerShell
    Visit the Windows PowerShell ScriptCenter at: http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx

Copyright © 2006-2009 Adrian Milliner

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.