magic-wormhole/docs/api.md
Brian Warner 6ee09f5316 add demo of twisted flow, update docs
python -m wormhole.twisted.demo send-text TEXT -> CODE
python -m wormhole.twisted.demo receive-text CODE -> TEXT
2015-06-20 19:18:29 -07:00

171 lines
6.7 KiB
Markdown

# Magic-Wormhole
This library provides a primitive function to securely transfer small amounts
of data between two computers. Both machines must be connected to the
internet, but they do not need to have public IP addresses or know how to
contact each other ahead of time.
Security and connectivity is provided by means of an "invitation code": a
short string that is transcribed from one machine to the other by the users
at the keyboard. This works in conjunction with a baked-in "rendezvous
server" that relays information from one machine to the other.
## Modes
This library will eventually offer multiple modes.
The first mode provided is "transcribe" mode. In this mode, one machine goes
first, and is called the "initiator". The initiator contacts the rendezvous
server and allocates a "channel ID", which is a small integer. The initiator
then displays the "invitation code", which is the channel-ID plus a few
secret words. The user copies the invitation code to the second machine,
called the "receiver". The receiver connects to the rendezvous server, and
uses the invitation code to contact the initiator. They agree upon an
encryption key, and exchange a small encrypted+authenticated data message.
## Examples
The synchronous+blocking flow looks like this:
```python
from wormhole.transcribe import Initiator
from wormhole.public_relay import RENDEZVOUS_RELAY
mydata = b"initiator's data"
i = Initiator("appid", RENDEZVOUS_RELAY)
code = i.get_code()
print("Invitation Code: %s" % code)
theirdata = i.get_data(mydata)
print("Their data: %s" % theirdata.decode("ascii"))
```
```python
import sys
from wormhole.transcribe import Receiver
from wormhole.public_relay import RENDEZVOUS_RELAY
mydata = b"receiver's data"
code = sys.argv[1]
r = Receiver("appid", RENDEZVOUS_RELAY)
r.set_code(code)
theirdata = r.get_data(mydata)
print("Their data: %s" % theirdata.decode("ascii"))
```
## Twisted
The Twisted-friendly flow looks like this:
```python
from twisted.internet import reactor
from wormhole.public_relay import RENDEZVOUS_RELAY
from wormhole.twisted.transcribe import SymmetricWormhole
outbound_message = b"outbound data"
w1 = SymmetricWormhole("appid", RENDEZVOUS_RELAY)
d = w1.get_code()
def _got_code(code):
print "Invitation Code:", code
return w1.get_data(outbound_message)
d.addCallback(_got_code)
def _got_data(inbound_message):
print "Inbound message:", inbound_message
d.addCallback(_got_data)
d.addBoth(lambda _: reactor.stop())
reactor.run()
```
On the other side, you call `set_code()` instead of waiting for `get_code()`:
```python
w2 = SymmetricWormhole("appid", RENDEZVOUS_RELAY)
w2.set_code(code)
d = w2.get_data(my_message)
```
You can call `d=w.get_verifier()` before `get_data()`: this will perform the
first half of the PAKE negotiation, then fire the Deferred with a verifier
object (bytes) which can be converted into a printable representation and
manually compared. When the users are convinced that `get_verifier()` from
both sides are the same, call `d=get_data()` to continue the transfer. If you
call `get_data()` first, it will perform the complete transfer without
pausing.
## Application Identifier
Applications using this library must provide an "application identifier", a
simple bytestring that distinguishes one application from another. To ensure
uniqueness, use a domain name. To use multiple apps for a single domain, just
use a string like `example.com/app1`. This string must be the same on both
clients, otherwise they will not see each other. The invitation codes are
scoped to the app-id.
Distinct app-ids reduce the size of the connection-id numbers. If fewer than
ten initiators are active for a given app-id, the connection-id will only
need to contain a single digit, even if some other app-id is currently using
thousands of concurrent sessions.
## Rendezvous Relays
The library depends upon a "rendezvous relay", which is a server (with a
public IP address) that delivers small encrypted messages from one client to
the other. This must be the same for both clients, and is generally baked-in
to the application source code or default config.
This library includes the URL of a public relay run by the author.
Application developers can use this one, or they can run their own (see
src/wormhole/servers/relay.py) and configure their clients to use it instead.
## Polling and Shutdown (TODO)
The reactor-based (Twisted-style) forms of these objects need to establish
TCP connections, re-establish them if they are lost, and sometimes (for
transports that don't support long-running connections) poll for new
messages. They may also time out eventually. Longer delays mean less network
traffic, but higher latency.
These timers should be matched to the expectations, and expected behavior, of
your users. In a file-transfer application, where the users are sitting next
to each other, it is appropriate to poll very frequently (perhaps every
500ms) for a few minutes, then give up. In an email-like messaging program
where the introduction is establishing a long-term relationship, and the
program can store any outgoing messages until the connection is established,
it is probably better to poll once a minute for the first few minutes, then
back off to once an hour, and not give up for several days.
The `schedule=` constructor argument establishes the polling schedule. It
should contain a sorted list of (when, interval) tuples (both floats). At
`when` seconds after the first `start()` call, the polling interval will be
set to `interval`.
The `timeout=` argument provides a hard timeout. After this many seconds, the
sync will be abandoned, and all callbacks will errback with a TimeoutError.
Both have defaults suitable for face-to-face realtime setup environments.
## Serialization
TODO: only the Twisted form supports serialization so far
You may not be able to hold the Initiator/Receiver object in memory for the
whole sync process: maybe you allow it to wait for several days, but the
program will be restarted during that time. To support this, you can persist
the state of the object by calling `data = w.serialize()`, which will return
a printable bytestring (the JSON-encoding of a small dictionary). To restore,
use the `from_serialized(data)` classmethod (e.g. `w =
SymmetricWormhole.from_serialized(data)`).
There is exactly one point at which you can serialize the wormhole: *after*
establishing the invitation code, but before waiting for `get_verifier()` or
`get_data()`. If you are creating a new code, the correct time is during the
callback fired by `get_code()`. If you are accepting a pre-generated code,
the time is just after calling `set_code()`.
To properly checkpoint the process, you should store the first message
(returned by `start()`) next to the serialized wormhole instance, so you can
re-send it if necessary.
## Detailed Example
```python
```