I write this in Singapore, city number three in a ten city voyage obtained via brute force enumeration of route permutations for a set of cities I desired to visit in Southeast Asia. I ended up purchasing 11 one-way tickets, starting and ending in San Francisco. Here I’ll describe my route and give a high level algorithm.
My route, which I generated algorithmically, is the following:
Depart from San Francisco, United States [Jul 30] (via Philippine Airlines)
- Tokyo, Japan for 1 night (layover) [Aug 01] (via Jetstar Asia)
- Manila, Philippines for 5 nights [Aug 05] (via Jetstar Asia)
- Singapore, Singapore for 7 nights [Aug 13] (via Jetstar Asia)
- Ho Chi Minh City, Vietnam for 4 nights [Aug 17] (via AirAsia)
- Bangkok, Thailand for 4 nights [Aug 21] (via AirAsia)
- Phuket, Thailand for 4 nights [Aug 26] (via AirAsia)
- Kuala Lumpur, Malaysia for 3 nights [Aug 30] (via AirAsia)
- Phnom Penh, Cambodia for 4 nights [Sep 03] (via AirAsia)
- Jakarta, Indonesia for 5 nights [Sep 08] (via AirAsia)
- Bali, Indonesia for 5 nights [Sep 14] (via Garuda Indonesia)
Arrive at San Francisco, United States [Sep 14]
The total cost of airfare was $1670 USD, a good majority of the cost coming from the two tickets out/into SFO (San Francisco). It is worth noting this cost isn’t particularly optimal as the tickets were purchased only 6 weeks in advice, and fare prices during July/August tend to skyrocket.
Consider the traveling salesman problem (TSP) for this scenario: given N=10 cities, find the optimal route that reaches all cities (without revisiting) which minimizes some cost function. In this case, the TSP search space can be amended to include a few more dimensions that influence cost. Most importantly, these dimensions are outbound date and time until departure (i.e. days in advance a ticket is purchased), as prices heavily fluctuate based on these factors. Using historical data, one can build a model and predict the fare prices as well as time to buy, however that’s not the focus here (although worth a separate discussion). Quite simply, pick your favorite regression to do this. Additionally I imposed a days-per-city constraint: I wanted to spend between a=4 and b=8 days per city, so this was included as a per-city dimension.
I implemented the above in Python using historical data of Adioso, whom have the best fare coverage over Southeast Asia–although as a data scientist at Adioso, I am biased. I generated a couple dozen candidate routes and programmatically validated the fares (prioritizing direct flights), and purchased the “best” route by hand.
Once my trip is farther along or complete, I’ll point out some pros and cons of this approach and describe two philosophies I have used for traveling: stochastic local search and global optimization (a la the traveling salesman problem).