Show 257
Sandy



Joined: 03 Feb 2006
Posts: 673
Location: San Diego, CA

PostPosted: Mon Feb 22, 2010 12:34 am    Post subject: Show 257



The conversation continues here...
_________________
Cachin' with my sweetie...
Sandy of Team PodCacher


batsgonemad



Joined: 30 Jan 2008
Posts: 279
Location: Northern Ireland

PostPosted: Mon Feb 22, 2010 1:36 am    Post subject:



FTP, sweet, downloading show now, making coffee, then back to bed for a little while still

batsgonemad



Joined: 30 Jan 2008
Posts: 279
Location: Northern Ireland

PostPosted: Mon Feb 22, 2010 4:29 am    Post subject:



Ended up listening to the show on Stitcher

ePeterso2



Joined: 30 Jun 2009
Posts: 143
Location: 26.1ºN 80.1ºW 98.6ºF

PostPosted: Mon Feb 22, 2010 8:20 am    Post subject:



If anyone's interested in the math behind the One Million Caches forecast page, here's how it works ...

I started collecting data on Feb 8 after I first heard about the contest. I kept a record of the number of active caches each day, and I began plotting them on a graph (date on the horizontal axis, caches on the vertical axis). I figured that once I had enough data, I'd see if there was any kind of pattern.

When I had a week's worth of data, it seemed to be fairly linear - that is, the plot on the graph ended up pretty close to a straight line. But it's not exactly a straight line - there is a bit of variation from day to day. (The R^2 number is a measure of how well the straight line fits the data - the closer that value is to 1.0, the better the line matches the actuals.)

I used a mathematical technique called linear regression to find the straight line that most closely fits the observed data and minimizes the variation. I then project that line into the future to find the point where it crosses the 1,000,000 mark. That's how I arrive at the most likely date.
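For anyone who wants to try the fit-and-project step, here is a minimal Python sketch. It is not the code behind the forecast page: the daily counts are made up for illustration, and the names fit_line and crossing_date are hypothetical.

```python
import math
from datetime import date, timedelta

def fit_line(ys):
    """Ordinary least squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2: share of the day-to-day variation that the line accounts for
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

def crossing_date(start, slope, intercept, target=1_000_000):
    """First date on which the fitted line is at or above the target."""
    days = math.ceil((target - intercept) / slope)
    return start + timedelta(days=days)

# Hypothetical daily counts of active caches starting Feb 8 (NOT the real data).
start = date(2010, 2, 8)
counts = [986_500, 987_020, 987_510, 988_040, 988_530, 989_060, 989_550]

slope, intercept, r2 = fit_line(counts)
print(f"about {slope:.0f} new caches per day, R^2 = {r2:.4f}")
print("line reaches 1,000,000 on", crossing_date(start, slope, intercept))
```

The day index (0, 1, 2, ...) stands in for the date on the horizontal axis, so the crossing point comes out as a number of days after the first observation.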

Since the data seems to be linear, that means that the difference in the number of caches from one day to the next should be about the same. So I calculated the difference in the cache count each day, then used that to derive the standard deviation in the daily difference, which gives you an idea of how much the future data is likely to vary (assuming, of course, that the future data are measures of the same processes that generated the past data).

The upshot of all this is that the forecast date range is calculated by taking the linear regression line, then adding and subtracting 3 standard deviations of the daily difference data. I don't know if that's a perfectly solid statistical basis for the prediction, but it's probably a reasonable estimate for what the true outcome will be.
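The post does not spell out exactly how the 3 standard deviations are applied to the line, so the following Python sketch shows just one possible reading: shift the fitted line up and down by 3 sigma and see when each shifted line reaches 1,000,000. The counts, the slope and intercept values, and the function names are all hypothetical.

```python
import math
import statistics

def daily_differences(counts):
    """Change in the cache count from each day to the next."""
    return [b - a for a, b in zip(counts, counts[1:])]

def forecast_day_range(slope, intercept, sigma, target=1_000_000, k=3):
    """Day numbers at which the line shifted up / down by k * sigma reaches target.

    Assumption: the band is a vertical shift of the regression line.
    """
    earliest = math.ceil((target - (intercept + k * sigma)) / slope)
    latest = math.ceil((target - (intercept - k * sigma)) / slope)
    return earliest, latest

# Hypothetical daily counts (NOT the real data).
counts = [986_500, 987_020, 987_510, 988_040, 988_530, 989_060, 989_550]
diffs = daily_differences(counts)
sigma = statistics.stdev(diffs)  # sample standard deviation of the daily change

# Hypothetical regression results; in practice these come from the fitted line.
slope, intercept = 508.0, 986_500.0
earliest, latest = forecast_day_range(slope, intercept, sigma)
print(f"sigma = {sigma:.1f}, forecast window: day {earliest} to day {latest}")
```

Other readings are possible (for example, varying the daily growth rate instead of shifting the line), and they would give a much wider window.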

(No, I didn't run control charts or a Monte Carlo simulation. I figured that by the time I got the Perl code for that to work, the contest entry date would be long gone :) )

(FWIW - I ran an XmR chart on the data as of this posting and found that the data does display statistical control, passing all of the run tests. X-CL is 510, X-UCL is 1002, and X-LCL is 18 :) )
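For reference, the usual XmR limits are the mean of the individual values plus and minus 2.66 times the average moving range, which matches the symmetric limits quoted above (510 plus or minus 492). A minimal Python sketch, assuming the charted values are the daily differences (the post does not say) and using made-up numbers; the run tests are not implemented here:

```python
def xmr_limits(values):
    """Return (lower limit, center line, upper limit) for the X chart of an XmR pair."""
    center = sum(values) / len(values)
    # Moving range: absolute change between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 is the standard XmR scaling constant (3 / 1.128)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

# Hypothetical daily differences in the cache count (NOT the real data).
diffs = [520, 490, 530, 490, 530, 490]
lcl, cl, ucl = xmr_limits(diffs)
print(f"X-LCL = {lcl:.0f}, X-CL = {cl:.0f}, X-UCL = {ucl:.0f}")
```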
_________________
Check out Puzzlehead.org
Information, resources, stories and fun for puzzle solvers and creators
Join the Puzzleheads group on Facebook!


CoronaKid



Joined: 04 Aug 2006
Posts: 908
Location: Corona, CA

PostPosted: Mon Feb 22, 2010 8:38 am    Post subject:



ePeterso2 wrote:

The upshot of all this is that the forecast date range is calculated by taking the linear regression line, then adding and subtracting 3 standard deviations of the daily difference data. I don't know if that's a perfectly solid statistical basis for the prediction, but it's probably a reasonable estimate for what the true outcome will be.

(No, I didn't run control charts or a Monte Carlo simulation. I figured that by the time I got the Perl code for that to work, the contest entry date would be long gone :) )


I just KNEW statistics was good for SOMETHING other than telling you never to play the lottery. All those months of studying can finally pay off. :D

I also now understand why you love puzzles so much. Nice post.


ePeterso2



Joined: 30 Jun 2009
Posts: 143
Location: 26.1ºN 80.1ºW 98.6ºF

PostPosted: Mon Feb 22, 2010 9:20 am    Post subject:



Most puzzle-solving is all about pattern recognition. It's a very handy and underappreciated skill, one I wish were emphasized more in school ...
_________________
Check out Puzzlehead.org
Information, resources, stories and fun for puzzle solvers and creators
Join the Puzzleheads group on Facebook!


Sonny
Site Admin


Joined: 03 Aug 2006
Posts: 1375
Location: San Diego, California

PostPosted: Mon Feb 22, 2010 1:53 pm    Post subject:



ePeterso2 wrote:
If anyone's interested in the math behind the One Million Caches forecast page, here's how it works ...



:shock: :shock: :shock:

Wow. I'm impressed!
_________________
Have you found it yet?


ePeterso2



Joined: 30 Jun 2009
Posts: 143
Location: 26.1ºN 80.1ºW 98.6ºF

PostPosted: Mon Feb 22, 2010 7:19 pm    Post subject:



Okay, you talked me into it - I switched the app to use a Monte Carlo simulation instead. I wasn't comfortable with the results of the regression approach ... they didn't "feel" right ...

If you really want to know all of the details, Wikipedia has an excellent article on it. Here's the synopsis:

The web page takes the historical data, figures out the average growth rate as well as the variability (standard deviation), then runs a simulation of how growth will happen in the future. It picks growth numbers at random based upon the average and the variability, then remembers how long it took the simulation to reach 1,000,000 caches.

If you run just a single simulation, it doesn't tell you a whole lot. But if you run a lot of simulations and add up the results, a pattern begins to appear in the results. In the web app's case, it runs 10,000 simulations every time you load the page.

What it reports is a table that shows the frequency with which each date ended up as the result of the simulation. So if a particular date shows up as the solution 2,700 times out of 10,000, then that date is estimated to have a 27% chance of being the actual date.
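The synopsis above can be sketched in a few lines of Python. This is not the web app's code. It assumes daily growth is drawn from a normal distribution (the post only says "at random based upon the average and the variability"), and the starting count, mean, and standard deviation below are made up.

```python
import random
from collections import Counter
from datetime import date, timedelta

def simulate_once(current, mean, sigma, rng, target=1_000_000):
    """Number of simulated days until the cache count reaches the target."""
    days = 0
    while current < target:
        current += rng.gauss(mean, sigma)  # assumed: normally distributed growth
        days += 1
    return days

def forecast_table(today, current, mean, sigma, runs=10_000, seed=None):
    """Map each outcome date to the fraction of runs that ended on it."""
    if mean <= 0:
        raise ValueError("mean daily growth must be positive")
    rng = random.Random(seed)
    tally = Counter(simulate_once(current, mean, sigma, rng) for _ in range(runs))
    return {today + timedelta(days=d): n / runs for d, n in sorted(tally.items())}

# Hypothetical inputs (NOT the real data): today's count, mean and sigma of daily growth.
table = forecast_table(date(2010, 2, 22), 993_000, 510, 160, runs=10_000)
for day, share in table.items():
    print(day, f"{share:.1%}")
```

Leaving seed unset gives slightly different percentages on every run, and passing a fixed seed makes the table repeatable.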

The result is something that looks a lot more reasonable to me ... and will be more accurate as we get closer and closer to the actual date.

The simulation runs from scratch each time you load the page, which is why there's a slight delay, and your results will look a tad different each time. But there's not really any benefit to running it repeatedly - the pattern is basically the same (which is the entire point of a Monte Carlo simulation).

Percentages are rounded off, so dates with likelihoods shown as 0% are actually nonzero, just really really small. If a date doesn't appear, then it never came up as a result of the simulation run.

The page also shows the mode, or the most likely outcome. This date has the best chance of being the actual date. However, a date that has a 25% chance of being correct still has a 75% chance of being incorrect. As we get closer to the real date, the likelihood of the real date should increase more and more.
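To make the rounding and the mode concrete, here is a small Python sketch that reads a results table. The tally is invented for illustration and the function name summarize is hypothetical.

```python
def summarize(tally):
    """tally maps a date label to how many simulation runs ended on that date."""
    runs = sum(tally.values())
    mode = max(tally, key=tally.get)  # the date that came up most often
    percent = {day: round(100 * n / runs) for day, n in tally.items()}
    return mode, percent

# Hypothetical tally from 10,000 runs (made-up numbers).
tally = {"Mar 7": 2300, "Mar 8": 2700, "Mar 9": 2500, "Mar 10": 2497, "Mar 14": 3}

mode, percent = summarize(tally)
print(f"mode: {mode} at {percent[mode]}%, so {100 - percent[mode]}% chance it is another date")
print(f"Mar 14 is displayed as {percent['Mar 14']}% although it came up {tally['Mar 14']} times")
```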

http://www.epeterso2.com/geocaching/onemillioncaches/

(Yes, this actually is fun for me :) )

-eP


CoronaKid



Joined: 04 Aug 2006
Posts: 908
Location: Corona, CA

PostPosted: Tue Feb 23, 2010 8:28 am    Post subject:



Great show Triple S! Thanks for the nice interview with the German cachers. It's always interesting to get a global perspective of the sport. I'm glad to hear that Geothief was caught and I'm hopeful that it will help curtail such activity. I'm just worried that this slap on the wrist might only infuriate him and cause him to lash out even more. I guess time will tell.

BTW, am I the only one that has absolutely no idea what Sean said?? :lol:


batsgonemad



Joined: 30 Jan 2008
Posts: 279
Location: Northern Ireland

PostPosted: Tue Feb 23, 2010 12:18 pm    Post subject:



CoronaKid wrote:


BTW, am I the only one that has absolutely no idea what Sean said?? :lol:

uh no i have no idea either, maybe its cause we dont have our own kids

addisonbr



Joined: 01 Mar 2008
Posts: 82
Location: New York, New York

PostPosted: Tue Feb 23, 2010 2:10 pm    Post subject:



ePeterso2 wrote:
The result is something that looks a lot more reasonable to me ... and will be more accurate as we get closer and closer to the actual date.

It'll be interesting to see if human behavior mucks up the models. I know that when interesting waypoints roll around, a lot of people rapid-register a bunch of new caches to try to capture it... I'm wondering if people trying to publish the "1,000,000th Cache" will somehow cause new cache listing frequencies to deviate from the historical pattern.

It's not obvious to me how it would, only that... it's always the non-obvious things that break my models.


CoronaKid



Joined: 04 Aug 2006
Posts: 908
Location: Corona, CA

PostPosted: Tue Feb 23, 2010 2:19 pm    Post subject:



batsgonemad wrote:
CoronaKid wrote:


BTW, am I the only one that has absolutely no idea what Sean said?? :lol:

uh no i have no idea either, maybe its cause we dont have our own kids


Well, I have two young kids so I don't have that excuse. I do remember that my wife and I were the only ones that understood what my daughter was saying.

I'll be amazed if anyone gets all 5.


BuckeyeBeth



Joined: 27 Apr 2009
Posts: 8
Location: Columbus, Ohio

PostPosted: Tue Feb 23, 2010 7:03 pm    Post subject:



Quick, somebody loan me a toddler so I can figure out what on earth Sean is saying! :P


Sonny
Site Admin


Joined: 03 Aug 2006
Posts: 1375
Location: San Diego, California

PostPosted: Tue Feb 23, 2010 7:35 pm    Post subject:



BuckeyeBeth wrote:
Quick, somebody loan me a toddler so I can figure out what on earth Sean is saying! :P


We've gotten lots of responses from various sources that people can't figure out what Sean is saying. Yes, it might be a good idea to go find a toddler and play the show for them and ask for an interpretation!

Realize, this one is a toughie. Sandy and I have to really pay attention and even then our only hope is that we can see what he's talking about and it's in context.

Bonne chance, Buena suerte and Sana swertehin ka
_________________
Have you found it yet?


Sonny
Site Admin


Joined: 03 Aug 2006
Posts: 1375
Location: San Diego, California

PostPosted: Tue Feb 23, 2010 7:37 pm    Post subject:



batsgonemad wrote:
Ended up listening to the show on Stitcher


We were going to ask this on a show: How do you listen to the show? Some at computers, some on MP3 players, others ... ?
_________________
Have you found it yet?

All times are GMT - 7 Hours
Page 1 of 4

 