“I think back to this example when I was at HotelTonight where we built this beautiful statistical model to forecast hotel prices because we were trying to figure out what to price hotels out on different days. It was the heaviest data science I’ve done in my career from technical complexity, and it performed terribly. And we went and looked at it. We were just like. “Why is this doing so badly?”
And we found out that one of the main drivers of hotel room prices is when conferences are in town. I know that’s obvious. When Dreamforce happens, it affects San Francisco hotel prices. If that’s not in your forecast dataset, it’s pointless to build a nice model.
So what we actually needed to do was build a process to understand the levers that weren’t in our data set and make data so that we could use a more naive forecasting. So we ended up with a logistic regression, but we ended up with a new business process around flagging events.
That is data-driven to me. It’s not smarter. It’s applying the concepts better”