# Probability spreadsheet

in-progressA *very* rough demo of the probability spreadsheet is available here.
The source code is available on GitHub.

I want to create a spreadsheet that shows the effect of income, voluntary repayments, and indexation rate on the time it takes to pay off a HELP loan. The motivator for this is the indexation of HELP loans to inflation, which is estimated to be between 6-7% this year.

A user should be able to enter income and indexation rate as a range, and the spreadsheet should show the time it takes to pay off the loan for each “combination”.

A similar problem exists when doing napkin maths for business planning. For example, a business might have “range estimates” for the following:

- monthly user growth (e.g. 10-20%)
- annual headcount growth
- salary increases
- churn

These “range estimates” can be described as a probability distribution, and the spreadsheet should be able to show the effect of each of these on the final result.

For example when trying to calculate the profit of a subscription-based business,

```
= subscriber_count * // subscriber_count is a probability distribution
income // price is a constant
price
=
expenses
( * // salary_increase is a probability distribution
salary_increase // salaries is a constant
salaries +
)
(* // headcount_growth is a probability distribution
headcount_growth * // headcount is a constant
headcount // average_salary is a constant
average_salary
)
= income - expenses // profit is a probability distribution profit
```

It would then be interesting to see the how each “uncertainty” affects the final result.

The core problem is estimation with uncertainty. I want to be able to create a Directed Acyclic Graph (DAG) of the relationships between variables, and then be able to propagate uncertainty through the graph.

This is a problem that I’ve encountered more than enough times now.

### Update

#### 19 April 2023

I’ve pivoted back to the original idea of a spreadsheet. I think it’s much easier
to suspend state and do “what-if” analysis in a spreadsheet. I’ve also decided
to *not* support natural language input. I’ve since discovered these wonderful
tools:

#### 13 April 2023

After thinking about this problem for while, I think a hybrid notepad/calculator (like Soulver) would be a better solution than a spreadsheet. Natural language input would be a nice-to-have, but not essential. It would be especially challenging to define a grammar for probability distributions.

An range estimate like “5 to 10 days” could be interpreted as a triangular distribution with a mean of 7.5 days. The calculator part would then show a sparkline of the distribution, with an option to change the distribution type and the parameters of the distribution.