<br><br><div class="gmail_quote">On Sat, Mar 5, 2011 at 11:46 PM, Mike Miller <span dir="ltr">&lt;<a href="mailto:mbmiller%2Bl@gmail.com">mbmiller+l@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

On Sat, 5 Mar 2011, Adam Morris wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Try \x{8a0} instead.  I think that \x normally accepts only two following hex digits, so you have to use \x{} for longer hexadecimal numbers.<br>

</blockquote>

<br>

You top posted, so I have to ignore you.<br>

<br>

Just kidding.  I did try that and that didn't work either.  Then I did this...<br>

<br>

perl -pe 's/[[:ascii:]]//g ; s/(.)/$1\n/g' file.txt | sort | uniq -c >| bad_chars.txt<br>

<br>

...and when I looked at the resulting bad_chars.txt file in emacs again, the characters looked different.  Before they were appearing as purple rectangles, but now they appeared as a pair of characters that looked like this: \302\240<br>


<br>

I could represent them exactly that way in perl and delete them.  I don't really get what was happening there.<br></blockquote><div><br>I'm guessing you were looking at variable-length Unicode characters (UTF-8), and your Perl filter, which works on bytes unless told otherwise, split each one into single octets.  \302\240 is octal for the bytes 0xC2 0xA0, which is the UTF-8 encoding of U+00A0 (no-break space).<br>

<br>-Rob<br><br></div></div>