The latest Twisted fixes the web.Agent code we need for proper async
support. There's still a daemonization bug that prevents 'wormhole
server start' from succeeding (it hangs).
I'm planning to leave non-EventSource "/get" in until after 0.6.0, then
remove it. I think it's cleaner for the logs to have the two
forms (EventSource and immediate) use different URLs.
At PyCon 2007, Robert "r0ml" Lefkowitz gave a keynote comparing the rise
of actual-paper literacy (the development of whitespace, punctuation,
sentences, pages, bookmarks, an index, argumentative forms, forensics,
rhetoric) with the rise of computer-language literacy (macros,
multicharacter variable names, loops, comments, OOP, reusable code,
collaborative review). He pointed out that many classical written
techniques do not yet have analogues in our programming practices,
citing "antithesis" as one such tool. In writing, antithesis is where
you lay out the opposite of the idea you really want to convey, to
explain what's wrong with it. By including antithesis, you can capture
some valuable knowledge, and might anticipate (and head off) future "but
what about X" arguments.
This branch documents a wrong turn: an API that I thought would be a
good idea, but which turned out to not be worth it. Rather than
discarding the branch entirely, I decided to merge the history (but not
the changes) into trunk, so I don't lose the decision-making process or
the implementation.
The impetus for this feature was the unfortunate extra round trip
introduced when I added "confirmation" messages in 3220014. Confirmation
messages were necessary to avoid a hang when "wormhole receive" was
given the wrong codephrase. The previous message flow was:
* sender->receiver: PAKE1
* receiver->sender: PAKE2
* sender->receiver: DATA
* receiver->sender: ACK
Both sides compute a key when they hear the other's PAKE message, but if
the wormhole codes are different, they will compute different keys. When
they discover this, they should raise a WrongPasswordError to notify
their users. But when exactly does this happen?
The receiver learns about this when they hear the DATA message,
and (before commit d1cf1c6) would hang up immediately, before allowing
the application code to send any ACK. As a result, the sender never sees
the ACK (which would be mis-encrypted, and thus reveal that the codes
were different), and waits forever.
Adding confirmations to the flow gives us:
* sender->receiver: PAKE1
* receiver->sender: PAKE2
* sender->receiver: CONFIRM1
* sender->receiver: DATA
* receiver->sender: CONFIRM2
* receiver->sender: ACK
Both sides send a CONFIRM message as soon as they hear the other's PAKE
message, before computing a shared key or returning control to
application code. The receiver's CONFIRM2 goes out before it processes
DATA. A moment later, in the same function call, the receiver gets a
decrypt error on the DATA message and aborts the connection. However, the
sender sees CONFIRM2 arrive, tries (and fails) to validate it, and can
abort the connection itself, giving the "wormhole send" user a clear
error message (WrongPasswordError).
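
A minimal sketch of how the sender can turn a bad CONFIRM2 into a
WrongPasswordError. This is illustrative only: `stand_in_key()`,
`make_confirm()` and `check_confirm()` are hypothetical helpers, the "key"
is a plain hash standing in for the real PAKE output, and the MAC
construction is an assumption rather than the actual wire format.

```python
import hashlib
import hmac


class WrongPasswordError(Exception):
    """Raised when the peer's confirmation does not match our key."""


def stand_in_key(code: str) -> bytes:
    # NOT a real PAKE: a placeholder so that equal codes give equal keys
    # and different codes give different keys.
    return hashlib.sha256(code.encode("utf-8")).digest()


def make_confirm(key: bytes, side: str) -> bytes:
    # Hypothetical confirmation message: a MAC over a per-side label.
    label = b"confirm:" + side.encode("ascii")
    return hmac.new(key, label, hashlib.sha256).digest()


def check_confirm(key: bytes, peer_side: str, received: bytes) -> None:
    # Recompute what the peer should have sent if it holds the same key.
    expected = make_confirm(key, peer_side)
    if not hmac.compare_digest(expected, received):
        raise WrongPasswordError("peer used a different wormhole code")


sender_key = stand_in_key("4-purple-sausages")
receiver_key = stand_in_key("4-purple-sausage")  # mistyped on the receiver

# The receiver emits CONFIRM2 before it ever looks at DATA.
confirm2 = make_confirm(receiver_key, "receiver")

try:
    check_confirm(sender_key, "receiver", confirm2)
    outcome = "keys match"
except WrongPasswordError:
    outcome = "WrongPasswordError"
```

Because CONFIRM2 leaves the receiver before the DATA decrypt fails, the
sender always has something to validate, even when the receiver hangs up
without sending an ACK.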
The sender is now sending two messages in close succession: CONFIRM1 and
DATA. Both are sent in response to the incoming PAKE2 message, and in an
ideal world both would be sent in the same round trip. In the hopes of
achieving this, I spent quite a bit of time changing the architecture on
both client and server sides, and improving the server API:
* POST to the server would accept multiple messages, not just one
* the EventSource "watch" API could deliver multiple messages in a
single line
Those changes worked. However, when I finally came to change the sender
to put both messages in a single call, I found that I could not: the
messages come from very different places. The CONFIRM1 is sent just
after waiting for (and receiving) PAKE2, in `_get_key()`. The DATA
message is sent after getting the key, in `send_data()`. Despite both
happening in the same turn of the event loop (or, equivalently, in the
same stack frame), the Wormhole API would have to be unpleasantly
changed to make it possible for both messages to go out together. In
particular, `_get_key()` is called from both `send_data()` (which sends
DATA) and `get_verifier()` (which deliberately does not). The least-bad
approach I could come up with was to have CONFIRM1 be accumulated in a
Nagle-like queue until the caller allowed all messages to be sent.
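
For the record, the Nagle-like queue would have looked roughly like the
sketch below. `OutboundQueue`, `cork()` and `uncork()` are hypothetical
names, and `post` stands for whatever callable delivers a batch of
messages to the server in one request; none of this is existing API.

```python
class OutboundQueue:
    """Hold outbound messages while corked, then send them as one batch."""

    def __init__(self, post):
        self._post = post      # callable taking a list of messages
        self._pending = []
        self._corked = False

    def cork(self):
        # Start accumulating instead of sending immediately.
        self._corked = True

    def send(self, message):
        self._pending.append(message)
        if not self._corked:
            self.flush()

    def uncork(self):
        # The caller has finished queueing: release everything at once.
        self._corked = False
        self.flush()

    def flush(self):
        if self._pending:
            batch, self._pending = self._pending, []
            self._post(batch)


posts = []                        # each entry is one simulated POST
queue = OutboundQueue(posts.append)

queue.cork()
queue.send("CONFIRM1")            # would come from _get_key()
queue.send("DATA")                # would come from send_data()
queue.uncork()
```

After `uncork()`, `posts` holds a single batch containing both messages.
The awkward part is deciding who calls `cork()` and `uncork()`, since
`get_verifier()` reaches `_get_key()` without ever sending DATA.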
In the end I decided it wasn't worth the complexity. Sufficiently
motivated senders can manually pipeline the two messages without
explicit API support (there's no reason an async sender must wait for
CONFIRM1 to be delivered before sending DATA down the same wire). And
receivers don't really need their "watch" (EventSource) API to deliver
batches of messages instead of single ones: apps should treat messages
as an unordered set anyways. I also realized that the prioritization
aspect of the new "get_first_of" API was unnecessary: any client that
wants a CONFIRM message for key confirmation would be just as well
served by any DATA message (either can be used for key confirmation).
The important property is that we accept CONFIRM *in addition to*
DATA, because in some error cases we'll never see the DATA (ACK).
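
A rough illustration of accepting either message for key confirmation,
with no priority between them. `first_of()` here is a hypothetical helper,
not the abandoned "get_first_of" API, and the (phase, body) tuples are an
assumed representation of received messages.

```python
def first_of(messages, phases):
    """Return the first (phase, body) pair whose phase is in `phases`.

    `messages` is an iterable of (phase, body) pairs in arrival order.
    Returns None if no message qualifies.
    """
    for phase, body in messages:
        if phase in phases:
            return (phase, body)
    return None


# Either phase proves (or disproves) that the peer holds the same key.
KEY_CONFIRMING = {"CONFIRM", "DATA"}

# Normal case: DATA happens to arrive first, and that is good enough.
normal = [("PAKE", b"pake-msg"), ("DATA", b"ciphertext"), ("CONFIRM", b"mac")]

# Error case: the peer aborted before sending DATA, so only CONFIRM exists.
aborted = [("PAKE", b"pake-msg"), ("CONFIRM", b"mac")]

normal_choice = first_of(normal, KEY_CONFIRMING)
aborted_choice = first_of(aborted, KEY_CONFIRMING)
```

In both cases the caller gets something it can check against its own key,
which is all that key confirmation requires.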
So, having watched the reasons for these changes crumble to the ground,
I decided to not land them. But the lessons learned in the process were
still valuable, so I'm including this branch in the mainline history
even though the actual code changes were abandoned.
In the Twisted-style code, the close_on_error() decorator forces the
return value to be a Deferred, which is all wrong for internal uses of
derive_key() (verification string and confirmation message). It might be
useful to have a synchronous form of close_on_error(), but since the
actual close() is async, that's not very straightforward.
So for now, tolerate unclosed Wormhole objects when someone calls
derive_key() too early, or with a non-unicode type string.
This removes the 'allocations' table entirely, and cleans up the way we
prune old messages. This should make it easier to summarize each
connection (for usage stats) when it gets deallocated, as well as making
pruning more reliable.