This package takes a markdown file and creates a new markdown file in which each link is accompanied by an archive.org link, in the format `[...](original link) ([a](archive.org link))`.
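As a rough illustration of that output format, here is a minimal sketch. The function name `format_link_with_archive` is hypothetical and not part of this package, and the archive URL passed in is a placeholder rather than a real snapshot.

```shell
#!/bin/bash
# Hypothetical helper (not part of longnowformd.sh): print one markdown link
# followed by its archive link, in the format described above.
# Arguments: link text, original URL, archive URL.
format_link_with_archive() {
  local text="$1" url="$2" archive_url="$3"
  printf '[%s](%s) ([a](%s))\n' "$text" "$url" "$archive_url"
}

# Placeholder archive URL, for illustration only.
format_link_with_archive "Example" "https://example.com" \
  "https://web.archive.org/web/20200101000000/https://example.com"
```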
- Add [this file](https://github.com/NunoSempere/longNowForMd/blob/master/longnowformd.sh) to your path, for instance by moving it to the `/usr/bin` folder and giving it execute permissions, or
This utility requires [archivenow](https://github.com/oduwsdl/archivenow) as a dependency, which itself requires a Python installation. It can be installed with
For a reasonably sized file, the process will take a long time, so this is more of a "fire and forget, and then come back in a couple of hours" tool. The process can be safely stopped and restarted at any point, and archive links are remembered, but the errors file is created again each time.
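One way such "remembering" could work is a plain text cache that is consulted before each request. The sketch below is an assumption about the general idea, not a description of what the script actually does: the file name `links.archived.txt`, its two-column layout, and both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical cache file; each line is "<original url> <archive url>".
cache_file="${CACHE_FILE:-links.archived.txt}"

# Print the cached archive URL for a link; return non-zero if it is absent.
lookup_archived() {
  local url="$1"
  [ -f "$cache_file" ] || return 1
  awk -v u="$url" '$1 == u { print $2; found = 1; exit } END { exit !found }' "$cache_file"
}

# Record a freshly obtained archive URL so that a restart can skip this link.
remember_archived() {
  printf '%s %s\n' "$1" "$2" >> "$cache_file"
}
```

With something like this, a restarted run would call `lookup_archived "$url"` first and only contact the archive when the lookup fails.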
## To do
- Deal elegantly with images. Right now, they are also archived, and have to be removed manually afterwards.
- Possibly: Throttle requests to the Internet Archive less. Right now, I'm sending a link roughly every 12 seconds, and then sleeping for a minute every 15 requests. This is probably too much throttling (the theoretical limit is 15 requests per minute), but I think that it does reduce the error rate.
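The throttling described in the last item can be sketched as a loop with two pauses. This is an illustrative sketch, not the script's actual code: the function name `throttled_each` and the three variables are hypothetical, with defaults taken from the numbers quoted above.

```shell
#!/bin/bash
# Pause lengths are variables so they can be tuned (or set to 0 for a dry run).
pause_between="${PAUSE_BETWEEN:-12}"   # seconds to wait after every request
pause_batch="${PAUSE_BATCH:-60}"       # extra seconds to wait after each batch
batch_size="${BATCH_SIZE:-15}"         # number of requests per batch

# Hypothetical helper: read links from stdin, one per line, and run the
# command given as the first argument once per link, pausing in between.
throttled_each() {
  local handler="$1" count=0 link
  while IFS= read -r link; do
    # Redirect stdin so the handler cannot consume the remaining links.
    "$handler" "$link" </dev/null
    count=$((count + 1))
    sleep "$pause_between"
    if [ $((count % batch_size)) -eq 0 ]; then
      sleep "$pause_batch"
    fi
  done
}
```

Lowering `pause_between` or `pause_batch` would be the knob to turn when experimenting with less throttling.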
```
## sed -r 's/([0-9]*\.)/\n\1/g': Makes lists nicer.
## tr -s " ": Replaces multiple spaces
}
## Use: pandocodt FileNameWithoutExtension
```
Then, convert the file back to HTML with
```
function pandocmd(){
source="$1.md"
output="$1.html"
pandoc -r gfm "$source" -o "$output"
## sed -i 's|\[ \]\(([^\)]*)\)| |g' "$source" ## This removes links around spaces, which are very annoying. See https://unix.stackexchange.com/questions/297686/non-greedy-match-with-sed-regex-emulate-perls
}
## Use: pandocmd FileNameWithoutExtension
```
(this requires changing the name of the output file from `Source.md.longnow` to `Source.longnow.md` before running `$ pandocmd Source.longnow`)
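The rename step can be done with a single `mv`. The wrapper below is only a convenience sketch, and the function name `rename_longnow_output` is hypothetical.

```shell
#!/bin/bash
# Rename "<name>.md.longnow" to "<name>.longnow.md" so that pandocmd,
# which appends ".md" to its argument, can find the file.
rename_longnow_output() {
  local base="$1"   # file name without extension, e.g. "Source"
  mv -- "$base.md.longnow" "$base.longnow.md"
}
```

For example, `rename_longnow_output Source` followed by `pandocmd Source.longnow`.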
Then copy and paste the HTML into a Google Doc and fix formatting mistakes.