wc in 44 lines of C

ww: count words in 50 lines of C

Desiderata

  • Simplicity: Just count words, as delimited by: spaces, tabs, newlines.
  • No flags.
  • Avoid off-by-one errors.
  • Allow piping, as well as reading files.
  • Small.
  • Linux only.
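
As an illustration of the first and third points, here is a minimal sketch of counting words delimited only by spaces, tabs, and newlines. It is not the code in ww.c; the function name `count_words` and the `-1` error convention are assumptions made for this example. Counting a word at its first byte, rather than at the delimiter that follows it, is one way to avoid an off-by-one error when the input does not end in a newline.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from ww.c.
 * Counts words in a stream, where a word is a maximal run of bytes
 * that are not a space, tab, or newline.
 * Returns -1 if a read error occurred. */
long count_words(FILE *fp)
{
    long words = 0;
    int in_word = 0; /* 1 while the previous byte belonged to a word */
    int c;

    while ((c = getc(fp)) != EOF) {
        if (c == ' ' || c == '\t' || c == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1; /* first byte of a new word */
            words++;
        }
    }
    return ferror(fp) ? -1 : words;
}
```

Because the function takes a `FILE *`, the same code can be handed `stdin` for piped input or a stream returned by `fopen` for a named file.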

Comparison with wc

The GNU coreutils version (github, savannah) is a bit over 1K lines of C. It does many things and checks for many possible failure modes.

The busybox version (git.busybox.net) of wc is much shorter, at 257 lines, while striving to be POSIX-compliant, meaning it has flags.

The plan9port version of wc (github) implements some sort of table method, in 352 lines. So does the plan9 version, which is less well documented, but shorter.

Here is a version of wc from UNIX V7, at 86 lines, which allows for both word and line counts. I couldn't find a version in UNIX V6. Of all the versions, I think I understand this one best.

Steps:

  • Look into how C utilities both read from stdin and from files.
  • Program first version of the utility.
  • Compare with other implementations, see how they do it, after I've written my own version.
    • Compare with GNU coreutils.
    • Compare with musl/busybox implementations.
    • Maybe make some pull requests, if I'm doing something better? => doesn't seem like it

  • Install to ww, but check that the name ww is not already taken (installing to wc2 or something similar would not save many keypresses vs wc -w).
  • [ ] Could use zig? => Not for now
  • Look specifically at how other versions:
    • Distinguish between reading from stdin and reading from a file
      • If it doesn't have arguments, read from stdin.
    • Open files, read characters.
  • Write a version that counts lines.
  • Document reading from stdin typed by the user (end input with Ctrl+D).
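
Two of the steps above, choosing between stdin and a file and counting lines, can be sketched as follows. This is a rough illustration under stated assumptions, not the code in ww.c: the names `open_input` and `count_lines` are hypothetical, only the first file argument is considered, and a "line" is taken to mean one newline character.

```c
#include <stdio.h>

/* Hypothetical helper: pick the input stream from the command line.
 * With no arguments, read from stdin; otherwise open the first argument.
 * Returns NULL if the file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc < 2)
        return stdin;
    return fopen(argv[1], "r");
}

/* Hypothetical helper: count newline characters in a stream.
 * Returns -1 if a read error occurred. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    return ferror(fp) ? -1 : lines;
}
```

A `main` built on these would call `open_input(argc, argv)`, report a failure with `perror` if it returns NULL, and print the result of `count_lines`. When the program is run with no arguments and no pipe, it reads what is typed at the terminal until Ctrl+D signals end of input.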