SYNOPSIS

       unihist ([option flags])


DESCRIPTION

       unihist generates a histogram of the characters in its input, which
       must be encoded in UTF-8. By default, for each character it prints
       the frequency of the character as a percentage of the total, the
       absolute number of occurrences in the input, the UTF-32 code in
       hexadecimal, and, if the character is displayable, the glyph itself
       in UTF-8. Command line flags allow unwanted information to be
       suppressed. In particular, suppressing the percentages and counts
       yields a list of the unique characters in the input.
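
       The computation described above can be sketched in Python. This is
       an illustrative approximation, not the unihist source: the function
       names char_histogram and format_row, the column layout, and the use
       of str.isprintable() as the test for "displayable" are all
       assumptions made for the example.

```python
from collections import Counter


def char_histogram(text):
    """Return (percentage, count, code point, glyph) rows in code point order."""
    counts = Counter(text)
    total = sum(counts.values())
    rows = []
    for ch in sorted(counts):  # ascending order of character code
        n = counts[ch]
        # Assumption: "displayable" is approximated by str.isprintable().
        glyph = ch if ch.isprintable() else ""
        rows.append((100.0 * n / total, n, ord(ch), glyph))
    return rows


def format_row(row):
    """Format one row; this tab-separated layout is hypothetical."""
    pct, n, code, glyph = row
    return f"{pct:7.3f}\t{n}\t0x{code:06X}\t{glyph}"
```

       Calling char_histogram("abca") gives one row per distinct
       character, with 'a' accounting for half of the input. Dropping the
       first two fields of each row leaves a list of the unique characters.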

       Output  is produced ordered by character code. To sort it in descending
       order of frequency, pipe the output into the command:

              sort -k1 -n -r
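
       For readers post-processing the output in a script rather than a
       shell pipeline, the same reordering can be sketched in Python. The
       sketch assumes that each output line begins with the numeric
       percentage field; sort_by_frequency is a hypothetical helper, and
       the order of lines with equal percentages may differ from that
       produced by sort.

```python
def sort_by_frequency(lines):
    """Order histogram lines by their first numeric field, largest first."""
    # Assumption: the first whitespace-separated field is the percentage.
    return sorted(lines, key=lambda line: float(line.split()[0]), reverse=True)
```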

       By default, unihist handles all of Unicode. To reduce memory usage and
       increase speed, it may be compiled so as to handle only the Basic
       Multilingual Plane (plane 0) by defining BMPONLY.
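
       To give a sense of the saving, plane 0 covers code points U+0000
       through U+FFFF, while the full Unicode code space runs to U+10FFFF
       across 17 planes. The Python sketch below only illustrates that
       arithmetic; the names are hypothetical, and whether unihist stores
       its counts in a table sized this way is an assumption.

```python
BMP_SIZE = 0x10000       # code points U+0000..U+FFFF (plane 0)
UNICODE_SIZE = 0x110000  # code points U+0000..U+10FFFF (17 planes)


def plane_of(code_point):
    """Return the Unicode plane number (0..16) of a code point."""
    return code_point >> 16


def in_bmp(code_point):
    """True if the code point lies in plane 0."""
    return plane_of(code_point) == 0
```

       Under that assumption, a per-code-point counter table restricted to
       plane 0 would need one seventeenth of the entries of a table
       covering all of Unicode.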


COMMAND LINE FLAGS

       -c     Suppress printing of counts and percentages.

       -g     Suppress printing of glyphs.

       -h     Print usage information.

       -u     Suppress printing of the Unicode code as text.

       -v     Print version information.



SEE ALSO

       uniname (1)


REFERENCES

       Unicode Standard, version 5.0


AUTHOR

       Bill Poser
       billposer@alum.mit.edu


LICENSE

       GNU General Public License





