diff --git a/README.md b/README.md
index ae409a0..37834c8 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,38 @@
-Desiderata
+# ww: count words in 50 lines of C
 
-- Simple: Simple operation in terms of counting spaces and \n.
-- Avoid "off by one" errors; make sure an empty file is reported as such.
-  - Words as space or enter, followed by nonspace, followed by space? Make sure two spaces aren't two words?
-- Keep Linux only.
-- No flags. Only count words, not lines.
+## Desiderata
+
+- Simplicity: Just count words, as delimited by: spaces, tabs, newlines.
+- No flags.
+- Avoid off-by-one errors.
 - Allow piping, as well as reading files.
-  - Wonder how normal utilities handle this.
-- Could use zig? => Not for now
+- Small.
+- Linux only.
 
-Steps:
+## Comparison with wc.
+
+The GNU utils version ([github](https://github.com/coreutils/coreutils/tree/master/src/wc), [savannah](http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/wc.c;hb=HEAD)) is a bit over 1K lines of C. It does many things and checks many possible failure modes.
+
+The busybox version ([git.busybox.net](https://git.busybox.net/busybox/tree/coreutils/wc.c)) of wc is much shorter, at 257 lines, while striving to be [POSIX-compliant](https://pubs.opengroup.org/onlinepubs/9699919799/), meaning it has flags.
+
+The plan9port version of wc ([github](https://github.com/9fans/plan9port/blob/master/src/cmd/wc.c)) implements some sort of table method, in 352 lines. So does the [plan9](https://9p.io/sources/plan9/sys/src/cmd/wc.c) version, which is worse documented, but shorter.
+
+[Here](https://github.com/dspinellis/unix-history-repo/blob/Research-V7-Snapshot-Development/usr/src/cmd/wc.c) is a version of wc from UNIX V7, at 86 lines, and allowing for both word and line counts. I couldn't find a version in UNIX V6. Of all the versions, I think I understand this one best.
+
+## Steps:
 
 - [x] Look into how C utilities both read from stdin and from files.
 - [x] Program first version of the utility
-- [ ] Compare with other implementations, see how they do it, after I've read my own version
-  - [ ] Compare with gnu utils,
-  - Compare with musl/busybox implementations,
-  - Maybe make some pull requests, if I'm doing something better?
+- [x] Compare with other implementations, see how they do it, after I've read my own version
+  - [x] Compare with gnu utils.
+
+  - [x] Compare with musl/busybox implementations,
+  - ~~Maybe make some pull requests, if I'm doing something better? => doesn't seem like it~~
 - [ ] Install to ww, but check that ww is empty (installing to wc2 or smth would mean that you don't save that many keypresses vs wc -w)
-- [ ] ...
+- ~~[ ] Could use zig? => Not for now~~
+- [ ] Look specifically at how other versions
+  - [ ] Distinguish between reading from stdin and reading from a file
+    - If it doesn't have arguments, read from stdin.
+  - [ ] Open files, read characters.
+- [ ] Write version that counts lines
+- [ ]
diff --git a/makefile b/makefile
index 13bbdf6..5ea6258 100644
--- a/makefile
+++ b/makefile
@@ -15,7 +15,7 @@ OUT=ww
 DEBUG= #'-g'
 STANDARD=-std=c99
 WARNINGS=-Wall
-OPTIMIZED=-O0
+OPTIMIZED=-O3
 # OPTIMIZED=-O3 #-Ofast
 
 ## Formatter
@@ -29,6 +29,9 @@ build: $(SRC)
 format: $(SRC)
 	$(FORMATTER) $(SRC)
 
+install:
+	cp -n $(OUT) /bin/$(OUT)
+
 test: $(OUT)
 	/bin/echo -e "123\n45 67" | ./$(OUT)
 	/bin/echo -n "" | ./ww
diff --git a/ww b/ww
index c75e193..9f8de9b 100755
Binary files a/ww and b/ww differ
diff --git a/ww.c b/ww.c
index 072e529..1bd42fd 100644
--- a/ww.c
+++ b/ww.c
@@ -1,7 +1,6 @@
 #include <stdio.h>
-#include <unistd.h> // read, isatty
+#include <unistd.h>
 
-// STDIN_FILENO
 int process_fn(int fn)
 {
     char c[1];
@@ -41,6 +40,7 @@ int main(int argc, char** argv)
             perror("Could not open file");
             return 1;
         }
+        fclose(fp);
         return process_fn(fileno(fp));
     } else {
         printf("Usage: ww file.txt\n");
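
Review note on the last `ww.c` hunk: it adds `fclose(fp);` immediately before `return process_fn(fileno(fp));`, so the stream is already closed by the time its descriptor is read. Below is a minimal sketch of the ordering that was probably intended (read first, close afterwards), counting words delimited by spaces, tabs and newlines as the README's desiderata describe. The names `is_delim`, `count_words_fd` and `count_words_path`, the 4096-byte buffer, and the `-1` error convention are assumptions made for illustration; the body of the real `process_fn` is not visible in this diff.

```c
#define _POSIX_C_SOURCE 200809L /* for fileno() under -std=c99 */
#include <stdio.h>
#include <unistd.h>

/* The three delimiters named in the README: space, tab, newline. */
static int is_delim(char c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Count words readable from fd. Returns the count, or -1 on a read error.
 * A word starts at each transition from delimiter (or start of input)
 * to non-delimiter, so runs of delimiters never produce extra words. */
long count_words_fd(int fd)
{
    char buf[4096]; /* hypothetical buffer size */
    long words = 0;
    int in_word = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (is_delim(buf[i])) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;
                words++;
            }
        }
    }
    return n < 0 ? -1 : words;
}

/* Count words in the file at path, or in stdin when path is NULL. */
long count_words_path(const char *path)
{
    if (path == NULL) {
        return count_words_fd(STDIN_FILENO);
    }
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror("Could not open file");
        return -1;
    }
    long words = count_words_fd(fileno(fp));
    fclose(fp); /* close only after the descriptor has been fully read */
    return words;
}
```

A `main` could then call `count_words_path(argc > 1 ? argv[1] : NULL)` and print the result. For the input of the makefile's first test, `123\n45 67`, `count_words_fd` returns 3, and for empty input it returns 0.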