Ok, I now have cygwin supporting UTF-8 to some extent using the following patch:
http://www.okisoft.co.jp/esc/utf8-cygwin/
So I can at least see the UTF-8 characters properly in the console now. It says it will properly use UTF-8 for file IO as well.
But the awk class I mentioned above is clearly dropping accented characters.

After saving the awk file itself as UTF-8 (NO BOM! That part is important), I can now properly match on UTF-8 characters typed into the file, such as ’
Only the accented characters left to figure out...
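
On the BOM point, in case it helps anyone else: here is a rough sketch for checking whether a script file starts with a UTF-8 BOM (the bytes EF BB BF) and printing it without one. The function names `has_bom` and `strip_bom` and the file name `myscript.awk` are made up for illustration, and it assumes `head -c` is available (it is in GNU coreutils). It does nothing for the accented character problem.

```shell
#!/bin/sh

# has_bom FILE: succeed if FILE starts with the UTF-8 BOM bytes EF BB BF.
has_bom() {
    # Dump the first three bytes as hex and strip od's spacing.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# strip_bom FILE: print FILE to stdout, skipping a leading BOM if present.
strip_bom() {
    if has_bom "$1"; then
        # tail -c +4 starts output at the fourth byte, i.e. after the BOM.
        tail -c +4 "$1"
    else
        cat "$1"
    fi
}
```

Usage would be something like `strip_bom myscript.awk > myscript.nobom.awk`, then running awk with the new file.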