feat: add tex.
This commit is contained in:
parent
99fc9fc9ee
commit
ad65ae7e18
87
blog/2023/03/01/computable-solomoff/src/main.tex
Normal file
87
blog/2023/03/01/computable-solomoff/src/main.tex
Normal file
|
@ -0,0 +1,87 @@
|
||||||
|
\documentclass[12pt,authoryear]{elsarticle}
|
||||||
|
\makeatletter
|
||||||
|
\def\ps@pprintTitle{%
|
||||||
|
\let\@oddhead\@empty
|
||||||
|
\let\@evenhead\@empty
|
||||||
|
\def\@oddfoot{\centerline{\thepage}}%
|
||||||
|
\let\@evenfoot\@oddfoot}
|
||||||
|
|
||||||
|
%% The amssymb package provides various useful mathematical symbols
|
||||||
|
|
||||||
|
\begin{document}
|
||||||
|
|
||||||
|
\begin{frontmatter}
|
||||||
|
|
||||||
|
|
||||||
|
\title{A computable version of Solomonoff induction}
|
||||||
|
|
||||||
|
%% use optional labels to link authors explicitly to addresses:
|
||||||
|
%% \author[label1,label2]{}
|
||||||
|
%% \address[label1]{}
|
||||||
|
%% \address[label2]{}
|
||||||
|
|
||||||
|
\author[1]{Nu\~{n}o Sempere}
|
||||||
|
|
||||||
|
\address[1]{Quantified Uncertainty Research Institute, Mexico}
|
||||||
|
|
||||||
|
\begin{abstract}
|
||||||
|
I present a computable version of Solomonoff induction.
|
||||||
|
\end{abstract}
|
||||||
|
|
||||||
|
\begin{keyword}
|
||||||
|
%% keywords here, in the form: keyword \sep keyword
|
||||||
|
Solomonoff \sep induction \sep computable
|
||||||
|
\end{keyword}
|
||||||
|
|
||||||
|
\end{frontmatter}
|
||||||
|
|
||||||
|
\section{The key idea: arrive at the correct hypothesis in finite time}
|
||||||
|
|
||||||
|
\begin{enumerate}
|
||||||
|
\item Start with a finite set of turing machines, $\{T_0, ..., T_n\}$
|
||||||
|
\item If none of the $T_i$ predict your trail bits, $(B_0, ..., B_m)$, compute the first $m$ steps of Turing machine $T_{n+1}$. If $T_{n+1}$ doesn't predict them either, go to $T_{n+2}$, and so on\footnote{Here we assume that we have an ordering of Turing machines, i.e., that $T_i$ is simpler than $T_{(i+1)}$}
|
||||||
|
\item Observe the next bit, purge the machines from your set which don't predict it. If none predict it, GOTO 2.
|
||||||
|
\end{enumerate}
|
||||||
|
|
||||||
|
|
||||||
|
Then in finite time, you will arrive at a set which only contains the simplest TM which describes the process generating your train of bits. Proof:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item Let $T_j$ be the simplest machine which describes the process generating your trail of bits. Then by virtue of it being such, all TM machines $\{T_1,...,T_{j-1}\}$ are not the simplest machine generating your trail of bits. Meaning that at some point, they predict something distinct from your trail of bits, meaning that at some point step 2. of the process above discards them. In particular, the step after discarding $T_{j-1}$ is arriving at $T_j$, and staying there forever.
|
||||||
|
\item Therefore, this process arrives at the simplest TM which describes the process generating your trail of bits in finite time.
|
||||||
|
\end{itemize}
|
||||||
|
QED.
|
||||||
|
|
||||||
|
\section{Using the above scheme to arrive at a probability }
|
||||||
|
|
||||||
|
Now, the problem with the above scheme is that if we use our set of turing machines to output a probability for the next bit
|
||||||
|
|
||||||
|
$$ P\Big(b_{m + 1} = 1 | (B_0, ..., B_m) \Big) := \frac{1}{n} \cdot \sum_0^n \Big(T_i(m+1) = 1\Big) $$
|
||||||
|
|
||||||
|
then our probabilities are going to be very janky. For example, at step $ j - 1 $, that scheme would output a 0\% probability to the correct value of the next bit.
|
||||||
|
|
||||||
|
To fix this, in step 2, we can require that there be not only one turing machine that predicts all past bits, but multiple of them. What multiple? Well, however many your compute and memory budgets allow. This would make your probabilities less janky.
|
||||||
|
|
||||||
|
Interestingly, that scheme also suggests that there is a tradeoff between arriving at the correct hypothesis as fast as possible—in which case we would just implement the first scheme at full speed—and producing accurate probabilities—in which case it seems like we would use the modification just outlined.
|
||||||
|
|
||||||
|
\section{A downside}
|
||||||
|
|
||||||
|
Note that a downside of the procedures outlined above is that at the point we arrive at the correct hypothesis, we don't know that this is the case.
|
||||||
|
|
||||||
|
\section{An irrelevant epicycle: dealing with Turing machines that take too long or do not halt.}
|
||||||
|
|
||||||
|
When thinking about Turing machines, one might consider one particular model, e.g., valid C programs. But in that case, it is easy to consider
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
void main(){
|
||||||
|
while(true){
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
|
||||||
|
\end{document}
|
||||||
|
|
||||||
|
\endinput
|
||||||
|
%%
|
||||||
|
%% End of file `elsarticle-template-harv.tex'.
|
1132
blog/2023/03/01/computable-solomoff/src/old.text
Normal file
1132
blog/2023/03/01/computable-solomoff/src/old.text
Normal file
File diff suppressed because it is too large
Load Diff
Loading…
Reference in New Issue
Block a user