wc/README.md

# ww: count words in 50 lines of C
## Desiderata
- Simplicity: just count words, as delimited by spaces, tabs, and newlines.
- No flags.
- Avoid off-by-one errors.
- Allow piping, as well as reading files.
- Small.
- Linux only.
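
A minimal sketch of the counting loop these desiderata imply, not necessarily how ww itself is written. The names `is_delimiter` and `count_words` are made up for illustration, and the sketch assumes stdio's `getc` is an acceptable way to read bytes. Counting a word at its first byte, rather than at the delimiter that ends it, sidesteps the usual off-by-one when the input does not end in a delimiter.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if c is one of the three delimiters
   named in the desiderata (space, tab, newline). */
static int is_delimiter(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Hypothetical function: count words in a stream, where a word is a
   maximal run of non-delimiter bytes. */
long count_words(FILE *fp)
{
    long words = 0;
    int in_word = 0; /* 1 while the previous byte belonged to a word */
    int c;

    while ((c = getc(fp)) != EOF) {
        if (is_delimiter(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++; /* count each word once, at its first byte */
        }
    }
    return words;
}
```

Calling `count_words(stdin)` from `main` would cover the piping case, and passing the result of `fopen` would cover reading a file.
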
## Comparison with wc.
The GNU coreutils version ([github](https://github.com/coreutils/coreutils/tree/master/src/wc.c), [savannah](http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/wc.c;hb=HEAD)) is a bit over 1K lines of C. It does many things and checks many possible failure modes. I think it detects whether it should be reading from stdin using a heavily wrapped `fstat` call.

The busybox version ([git.busybox.net](https://git.busybox.net/busybox/tree/coreutils/wc.c)) of wc is much shorter, at 257 lines, while striving to be [POSIX-compliant](https://pubs.opengroup.org/onlinepubs/9699919799/), which means it has flags.

The plan9port version of wc ([github](https://github.com/9fans/plan9port/blob/master/src/cmd/wc.c)) implements some sort of table method, in 352 lines. So does the [plan9](https://9p.io/sources/plan9/sys/src/cmd/wc.c) version, which is less well documented, but shorter.

[Here](https://github.com/dspinellis/unix-history-repo/blob/Research-V7-Snapshot-Development/usr/src/cmd/wc.c) is a version of wc from UNIX V7, at 86 lines, which allows for both word and line counts. I couldn't find a version in UNIX V6. Of all the versions, I think I understand this one best.
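
To make the `fstat` remark about the GNU version concrete, here is a small sketch of what `fstat` can report about a file descriptor. It is not GNU wc's code, and `describe_fd` is a hypothetical name. It only shows the kind of information a program has available when deciding how to treat stdin.

```c
#define _POSIX_C_SOURCE 200809L /* request POSIX declarations such as fstat */
#include <sys/stat.h>

/* Hypothetical helper, not taken from GNU wc: classify what a file
   descriptor refers to, using fstat(2). */
const char *describe_fd(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return "error";            /* e.g. fd is not open */
    if (S_ISREG(st.st_mode))
        return "regular file";     /* e.g. stdin redirected from a file */
    if (S_ISFIFO(st.st_mode))
        return "pipe";             /* e.g. stdin fed by another command */
    if (S_ISCHR(st.st_mode))
        return "character device"; /* typically a terminal */
    return "other";
}
```

Passing file descriptor 0 describes stdin. A much simpler utility can skip this entirely and decide from its argument count alone.
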
## Steps:
- [x] Look into how C utilities both read from stdin and from files.
- [x] Program first version of the utility
- [x] Compare with other implementations, see how they do it, after I've read my own version
  - [x] Compare with gnu utils.
  - [x] Compare with musl/busybox implementations.
  - ~~Maybe make some pull requests, if I'm doing something better? => doesn't seem like it~~
- [ ] Install to ww, but check that the name ww is not already taken (installing to wc2 or something similar would mean that you don't save that many keypresses vs wc -w)
- ~~[ ] Could use zig? => Not for now~~
- [ ] Look specifically at how other versions do stuff.
- [ ] Distinguish between reading from stdin and reading from a file
  - If it doesn't have arguments, read from stdin.
- [ ] Open files, read characters.
- [ ] Write version that counts lines
- [ ] Document reading from stdin when the user types the input (end with Ctrl+D)
- [ ] Write man files?
- [ ] Write a version for other coreutils? See <https://git.busybox.net/busybox/tree/coreutils/>. Would be a really nice project.
  - Simple utils.
  - zig?
  - https://github.com/leecannon/zig-coreutils
  - https://github.com/keiranrowan/tiny-core/tree/master
- [x] Add lc
  - Take into account what happens if file doesn't end in newline.
- [ ] Add chc, a character counter (the name cc is already taken by the C compiler)
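
Two of the items above, choosing between stdin and a file and handling a last line with no trailing newline, can be sketched together. This is an illustration under assumptions, not the code in this repository: `open_input` and `count_lines` are hypothetical names, and counting an unterminated final line is just one possible policy.

```c
#include <stdio.h>

/* Hypothetical helper: choose the input stream. With no arguments, read
   from stdin; otherwise open the first argument. Returns NULL if the
   file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc < 2)
        return stdin;
    return fopen(argv[1], "r");
}

/* Hypothetical function: count lines. Assumed policy: a final line
   without a trailing newline still counts as a line. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int last = '\n'; /* so that an empty stream yields 0 */
    int c;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++; /* unterminated final line */
    return lines;
}
```

For comparison, `wc -l` counts newline characters, so for the two-line input `a\nb` it reports 1, while the policy sketched here gives 2. When the program is run with no arguments at a terminal, pressing Ctrl+D at the start of a line signals end of input.
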