ODDS AND PROBABILITIES EXPLANATION

This method calculates the odds, or probability, that teams will be selected for and win the Championship
Tournament. The method we use to calculate these probabilities, along with its assumptions and
limitations, is described below.

The short answer is that we use a “Monte Carlo” simulation to predict all outcomes. “Monte Carlo
methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on
repeated random sampling to obtain numerical results. The underlying concept is to use
randomness [...]. They are [...] most useful when it is difficult or impossible to use other
approaches.” (Wikipedia) So let’s apply this to lacrosse, and specifically predict the probability
that a team will be crowned champion.

What do we know? We know each team’s regular season schedule and the scores of games already played.
What do we not know? We don’t know the outcome of games yet to be played for the remainder
of the season, or of the postseason tournament games.

If we could predict the scores of games yet to be played, we could predict who will be champion! But
obviously we cannot predict the future, so where do we go from here? The answer is that we use some
basic assumptions and hypotheses that, if true, will predict a probable outcome when computed over a
large enough sample. So let me repeat this in sports lingo. Every team plays a schedule of games, and we
guess at the score of each game that has not yet been played for the remainder of the season. Based on
these scores, we predict who will get into the tournaments and how far each team will advance.

Now how do we predict the score of a game yet to be played? We use a random number to generate the
winner of each game, but we bias the score based on the strength of the two teams playing each other.
In this way, the outcome is semi-random but favors the better team. How do we know the strength of each
team? We use a rating based on the RPI, SOS and QWF of each team, which reveals which team is better
and by how many goals. What this means is that if team A is much better than team B, team A will win a
disproportionate number of the randomly generated games against team B. On the other hand, if teams A
and B are of equal strength, then each will win about the same number of games if the sample is large
enough. Thus, for a particular game a weak team will occasionally beat a stronger team, but if we were
to generate this game score many times, the better team would win more often.
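
The biased random draw described above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the actual rating code: the ratings are made-up numbers
in goals, the randomness is modeled as normally distributed noise, and the names simulate_game,
win_fraction and noise_sd are hypothetical.

```python
import random

def simulate_game(rating_a, rating_b, rng, noise_sd=4.0):
    """Return True if team A wins one simulated game.

    Assumption: rating_a - rating_b is the expected goal margin, and
    noise_sd (in goals) is a hypothetical amount of game-to-game luck.
    """
    margin = (rating_a - rating_b) + rng.gauss(0.0, noise_sd)
    return margin > 0

def win_fraction(rating_a, rating_b, n_games=20_000, seed=1):
    """Replay the same matchup many times and return team A's share of wins."""
    rng = random.Random(seed)
    wins = sum(simulate_game(rating_a, rating_b, rng) for _ in range(n_games))
    return wins / n_games

# Team A rated 5 goals better than team B: A wins most games, but not all.
strong_vs_weak = win_fraction(5.0, 0.0)
# Two equally rated teams: each wins about half the time.
even_match = win_fraction(2.0, 2.0)

print(f"Much stronger team wins {100 * strong_vs_weak:.1f}% of games")
print(f"Evenly matched team wins {100 * even_match:.1f}% of games")
```

A larger noise_sd makes upsets more common, and a smaller one makes the ratings more decisive.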

If we apply this technique to one entire season and predict the outcome of the remaining games, we
will get results, but they will be at the mercy of the random numbers selected and the final results
will not be accurate. But what if we applied this technique to 100,000 seasons, or simulations, where
each season is replayed with new random outcomes of game scores? Then the results no longer depend
on the random numbers but rather on the validity of the power ratings and other assumptions. In short,
we assume that the RPI, SOS and QWF accurately predict the strength of teams and that, by running
100,000 simulations, we collect results that satisfy other guidelines (e.g., the NCAA selection
criteria) to predict the final outcome. So it’s possible that a weak team can get lucky and defeat
stronger teams all the way to the championship. It is highly unlikely, though, and that team will
have a low or zero probability of being champion. On the other hand, no team is completely left out.
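
As a rough sketch of replaying a tournament many times, the Python below tallies champions for a
made-up four-team bracket. It is a simplification under stated assumptions: it skips the regular
season and the selection criteria entirely, the RATINGS values are invented, the luck is modeled as
normal noise, and every name in it is hypothetical.

```python
import random
from collections import Counter

# Hypothetical ratings in goals; these are not real team data.
RATINGS = {"A": 6.0, "B": 4.0, "C": 2.0, "D": 0.0}

def play(team_1, team_2, rng, noise_sd=4.0):
    """Return the winner of one simulated game (normal noise is an assumption)."""
    margin = RATINGS[team_1] - RATINGS[team_2] + rng.gauss(0.0, noise_sd)
    return team_1 if margin > 0 else team_2

def play_tournament(rng):
    """Four-team single-elimination bracket: A vs D, B vs C, then the final."""
    finalist_1 = play("A", "D", rng)
    finalist_2 = play("B", "C", rng)
    return play(finalist_1, finalist_2, rng)

def count_champions(n_simulations, seed=42):
    """Replay the bracket n_simulations times and count titles per team."""
    rng = random.Random(seed)
    return Counter(play_tournament(rng) for _ in range(n_simulations))

titles = count_champions(20_000)
for team, count in titles.most_common():
    print(f"Team {team}: {count} titles")
```

Even the lowest-rated team wins the occasional title in this sketch, which is the "lucky run"
described above.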

How do we get the final probabilities? Across all simulations, we count how many times each team
advances and wins the tournament. Then we divide these counts by the total number of simulations. As
an example, suppose team A wins the championship 10,000 times out of 100,000 simulations. Then team A
has a 10% chance ((10,000 / 100,000) * 100) of being champion.
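
The arithmetic in this example is small enough to write out directly. The function name
championship_percent is hypothetical and used only for illustration.

```python
def championship_percent(title_count, n_simulations):
    """Convert a count of simulated titles into a percentage chance."""
    if n_simulations <= 0:
        raise ValueError("n_simulations must be positive")
    return 100.0 * title_count / n_simulations

# The example from the text: 10,000 titles in 100,000 simulations.
team_a_percent = championship_percent(10_000, 100_000)
print(f"Team A: {team_a_percent:.1f}% chance of being champion")
```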

Where does this method break down? It fails if (1) the RPI, SOS and QWF do not accurately rate the
teams, and thus do not represent their true strength; (2) the sample size is not sufficient to reach
‘convergence’ (the point at which the results stop changing with an increased sample size); or
(3) other assumptions are poor.
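
One informal way to look for convergence is to repeat the estimate with larger and larger sample
sizes and watch how much it moves. The Python sketch below does this for a single hypothetical
matchup, assuming a 2-goal rating edge and normally distributed luck. The names and numbers are
illustrative assumptions, not the method's actual convergence test.

```python
import random

def estimate_win_probability(n_simulations, rating_edge=2.0, noise_sd=4.0, seed=7):
    """Estimate how often the better team wins, using n_simulations random games.

    rating_edge and noise_sd (both in goals) are hypothetical values.
    """
    rng = random.Random(seed)
    wins = sum(rating_edge + rng.gauss(0.0, noise_sd) > 0
               for _ in range(n_simulations))
    return wins / n_simulations

sample_sizes = [100, 1_000, 10_000, 100_000]
estimates = [estimate_win_probability(n) for n in sample_sizes]

# How far the estimate moves each time the sample size grows tenfold.
changes = [abs(later - earlier)
           for earlier, later in zip(estimates, estimates[1:])]

for n, p in zip(sample_sizes, estimates):
    print(f"{n:>7} simulations: {100 * p:.2f}%")
```

If the estimate still moves noticeably between the two largest sample sizes, the sample is
probably too small to trust.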