From 0b01807f10760f40e4196b83763d45523b65b239 Mon Sep 17 00:00:00 2001
From: NunoSempere
Date: Sat, 9 Sep 2023 11:01:04 +0200
Subject: [PATCH] improvements based on other wc versions

---
 README.md |  45 +++++++++++++++++++++++++++++++--------------
 makefile  |   5 ++++-
 ww        | Bin 17040 -> 17048 bytes
 ww.c      |   4 ++--
 4 files changed, 37 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index ae409a0..37834c8 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,38 @@
-Desiderata
+# ww: count words in 50 lines of C
 
-- Simple: Simple operation in terms of counting spaces and \n.
-- Avoid "off by one" errors; make sure an empty file is reported as such.
-  - Words as space or enter, followed by nonspace, followed by space? Make sure two spaces aren't two words?
-- Keep Linux only.
-- No flags. Only count words, not lines.
+## Desiderata
+
+- Simplicity: Just count words, as delimited by: spaces, tabs, newlines.
+- No flags.
+- Avoid off-by-one errors.
 - Allow piping, as well as reading files.
-  - Wonder how normal utilities handle this.
-- Could use zig? => Not for now
+- Small.
+- Linux only.
 
-Steps:
+## Comparison with wc.
+
+The GNU utils version ([github](https://github.com/coreutils/coreutils/tree/master/src/wc), [savannah](http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/wc.c;hb=HEAD)) is a bit over 1K lines of C. It does many things and checks many possible failure modes.
+
+The busybox version ([git.busybox.net](https://git.busybox.net/busybox/tree/coreutils/wc.c)) of wc is much shorter, at 257 lines, while striving to be [POSIX-compliant](https://pubs.opengroup.org/onlinepubs/9699919799/), meaning it has flags.
+
+The plan9port version of wc ([github](https://github.com/9fans/plan9port/blob/master/src/cmd/wc.c)) implements some sort of table method, in 352 lines. So does the [plan9](https://9p.io/sources/plan9/sys/src/cmd/wc.c) version, which is worse documented, but shorter.
+
+[Here](https://github.com/dspinellis/unix-history-repo/blob/Research-V7-Snapshot-Development/usr/src/cmd/wc.c) is a version of wc from UNIX V7, at 86 lines, and allowing for both word and line counts. I couldn't find a version in UNIX V6. Of all the versions, I think I understand this one best.
+
+## Steps:
 
 - [x] Look into how C utilities both read from stdin and from files.
 - [x] Program first version of the utility
-- [ ] Compare with other implementations, see how they do it, after I've read my own version
-  - [ ] Compare with gnu utils,
-  - Compare with musl/busybox implementations,
-  - Maybe make some pull requests, if I'm doing something better?
+- [x] Compare with other implementations, see how they do it, after I've read my own version
+  - [x] Compare with gnu utils.
+
+  - [x] Compare with musl/busybox implementations,
+  - ~~Maybe make some pull requests, if I'm doing something better? => doesn't seem like it~~
 - [ ] Install to ww, but check that ww is empty (installing to wc2 or smth would mean that you don't save that many keypresses vs wc -w)
-- [ ] ...
+- ~~[ ] Could use zig? => Not for now~~
+- [ ] Look specifically at how other versions
+  - [ ] Distinguish between reading from stdin and reading from a file
+    - If it doesn't have arguments, read from stdin.
+  - [ ] Open files, read characters.
+- [ ] Write version that counts lines
+- [ ]
diff --git a/makefile b/makefile
index 13bbdf6..5ea6258 100644
--- a/makefile
+++ b/makefile
@@ -15,7 +15,7 @@ OUT=ww
 DEBUG= #'-g'
 STANDARD=-std=c99
 WARNINGS=-Wall
-OPTIMIZED=-O0
+OPTIMIZED=-O3
 # OPTIMIZED=-O3 #-Ofast
 
 ## Formatter
@@ -29,6 +29,9 @@ build: $(SRC)
 format: $(SRC)
 	$(FORMATTER) $(SRC)
 
+install:
+	cp -n $(OUT) /bin/$(OUT)
+
 test: $(OUT)
 	/bin/echo -e "123\n45 67" | ./$(OUT)
 	/bin/echo -n "" | ./ww
diff --git a/ww b/ww
index c75e193bc3edcb46c1c653a9d1a660b9a5f806eb..9f8de9b396cd082cfcc684ec16716b7ebef81c76 100755
GIT binary patch
[base85 payload omitted: delta 1730 and delta 1696 for the compiled ww binary]

diff --git a/ww.c b/ww.c
index 072e529..1bd42fd 100644
--- a/ww.c
+++ b/ww.c
@@ -1,7 +1,6 @@
 #include <stdio.h>
-#include <unistd.h> // read, isatty
+#include <unistd.h>
 
-// STDIN_FILENO
 int process_fn(int fn)
 {
     char c[1];
@@ -41,6 +40,7 @@ int main(int argc, char** argv)
             perror("Could not open file");
             return 1;
         }
+        fclose(fp);
         return process_fn(fileno(fp));
     } else {
         printf("Usage: ww file.txt\n");
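
Review note on the last ww.c hunk: `fclose(fp)` now runs before `process_fn(fileno(fp))`, so the stream is already closed when its descriptor is read, and `read` would be expected to fail. Below is a minimal sketch of the ordering that avoids this, closing only after counting. It is not the code in ww.c: the body of `process_fn` is not visible in this patch, so `count_words_fd` and `count_words_path` are hypothetical names, the 4096-byte buffer is an arbitrary choice, and `open`/`close` stand in for the `fopen`/`fclose` pair used in the hunk. The delimiters (spaces, tabs, newlines) follow the README's desiderata.

```c
#include <fcntl.h>   // open
#include <stdio.h>   // perror
#include <unistd.h>  // read, close, ssize_t

/* Hypothetical helper: count words readable from file descriptor fd.
 * A word is a run of bytes other than space, tab, or newline.
 * Returns the count, or -1 if read() fails. */
long count_words_fd(int fd)
{
    char buf[4096];
    long words = 0;
    int in_word = 0; /* 1 while the previous byte belonged to a word */
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            char c = buf[i];
            if (c == ' ' || c == '\t' || c == '\n') {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;
                words++; /* count each word once, at its first byte */
            }
        }
    }
    return n < 0 ? -1 : words;
}

/* Hypothetical helper: open path, count its words, then close.
 * Returns the count, or -1 if the file cannot be opened or read. */
long count_words_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("Could not open file");
        return -1;
    }
    long words = count_words_fd(fd);
    close(fd); /* close only after the read loop has finished */
    return words;
}
```

For the "no arguments means stdin" step in the README, a caller could pass `STDIN_FILENO` to `count_words_fd` when `argc == 1` and call `count_words_path(argv[1])` otherwise. Tracking an `in_word` flag rather than counting delimiters is one way to keep consecutive spaces and empty input from producing off-by-one counts.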