Least cost paths have been used extensively in the archaeological study of ancient routeways. In this paper the principal interest is less in tracing detailed paths than in modelling long-distance travel through an extensive network over land and water. We present a novel, computationally-efficient method for avoiding the direction-dependent, positive biases in least cost paths encountered in standard algorithms. A methodology for generating networks of such paths is introduced based on a trade-off between building and travel costs, minimizing the total cost. We use the Peutinger Table, an illustrated

This study was prompted by an interest in how ancient journeys were planned and undertaken, with an emphasis on long distance travel taking days, weeks or even months. A number of questions arise, for example which itineraries travellers most likely followed, how long the journeys took, and whether it was more expedient to go by land or sea. In the early nineteenth century, antiquarians, guided by the ancient writings of the likes of Strabo and Pausanias, documented their own travels in Greece and Asia Minor, making notes on routeways, journey times, topographical features and ancient monuments. Among the better known are William Leake, William Gell and Edward Dodwell who put themselves in the shoes of their ancient guides, experiencing, as near as is possible, the paths and landscape described two millennia earlier.

Today it is possible to apply Geographic Information Systems (GIS) to model an optimal routeway and travel time between places through least cost path (LCP) analysis. However, simulating multi-stage journeys through an optimized network of paths over large distances is not easily addressed using GIS and for the following reasons we found it necessary to develop a bespoke methodology and software: (a) to allow the implementation of innovative optimization techniques; (b) to automate the process of handling, selecting and ordering large amounts of data; and (c) to speed up the intensive computation necessary to calculate the tens of thousands of LCPs over hundreds of thousands of kilometres, with over five million cost values.

The specific research aim was to create a tool to reconstruct the itineraries followed by groups of state-sponsored religious envoys travelling from Asia Minor to multiple cities across the Hellenistic world of the late third century BCE. To address this it became clear that separate optimizations must be applied at various stages and at different scales, from LCP modelling to large-scale network formulation and routing through a set of destinations on that network. In the spirit of Verhagen’s exhortation (

Computational techniques to improve our understanding of the ancient world have gained much traction in the last two decades. In the field of archaeology in particular, LCP and network analysis have been used with the aim of elucidating different types of connectivity, often in the form of physical, civic, economic, cultural and social interaction (

Following the ‘Principle of Least Effort’ (

The availability of tools to calculate LCPs via GIS means that they are easily accessible to the archaeologist or historian. However, incomplete understanding of how LCP applications work may lead the user to produce aesthetically-pleasing graphics despite erroneous inputs, in part due to them being treated as a ‘black box’. A number of publications have addressed the pitfalls associated with LCP analysis (

Generation of LCPs is often limited to those between a small number of places due either to computational costs, or a confined geographical area of interest. For example, McHugh (

On the other hand, a network stored as a matrix of values opens the door to various types of mathematical manipulation, e.g. the generation of metrics to characterize connectivity or complexity, the clustering of nodes or optimal routing through nodes. Whereas there is a clear optimization methodology for individual LCPs, the cost, or efficiency, metric for network optimization is less well defined. The construction of different least cost networks is addressed in Herzog (

Python is used to code the elements which extract and organize the data and to call a Fortran routine which performs the computationally-intensive, heavy lifting of LCP generation, since the latter language executes much faster. The use of Python also allows the Fortran to run in parallel on multiple processors, accelerating execution significantly and extending the overarching principle of cost minimization to our computational approach.

The methodology implemented can be divided into four main steps: (a) Minimize the cost of travel between known ancient settlements avoiding the direction-dependent biases in LCPs encountered in standard algorithms; (b) generate a

As in GIS software, Dijkstra’s algorithm (

Efficiency of a path is defined in terms of minimization of the cost function with which the accumulated cost of walking along different routes is calculated. Most commonly, time elapsed is the measure of cost, the program giving the options of using the Tobler hiking function (

The standard approach to creating an LCP is employed initially, briefly described as follows: Working outwards from the defined start point, multiple costs are calculated for each point on the grid, representing accumulated values for different routes. Any newly-calculated value replaces a pre-existing one if it is lower. Dijkstra’s algorithm makes this process efficient by greatly reducing the number of different routes which must be explored. When a lower cost to reach a given point is discovered, the direction taken in the final step to reach it is recorded in a separate,

A key consideration is how to define the edges connecting neighbouring points. The most common technique involves the so-called queen’s move pattern, in which each of the immediately adjacent eight points on the grid is connected. This restricts changes in direction in any LCP to 45 degrees with the result that the cost of reaching a destination lying between the eight compass points is artificially increased. A map of isochrones of time elapsed walking on a flat surface shows concentric octagons rather than the circles which would result from an analytical solution (e.g.

The calibration of network complexity covered later in this paper requires that LCPs be made as optimal and with as little bias as possible. Therefore a method has been derived of largely eliminating such artificial, direction-dependent cost penalties at little extra computational cost. Termed the

Derive the LCP using the queen’s move.

Break this path into sections – most experiments used a section length of 1 kilometre (A to B in

Create a separate high-resolution grid which encompasses each rotated section. This has a grid spacing

Populate each high-resolution grid with linearly-interpolated elevation values mapped from the main DEM grid.

Re-apply Dijkstra’s algorithm to derive a cost field, thence a high-resolution LCP from the start to end point on the high-resolution sub grid. The original LCP route section is discarded, having served its purpose in giving start and end points as well as defining the grid dimensions.

Construct a new path by taking every

Following this new path, recompute the cost from point to point, for the full journey. If the cost function minimized with Dijkstra is energy expended, there is the opportunity here of computing the time taken to follow this path using one of the time-based cost functions.

The rotation step solves the standard queen’s move distance bias for travel on flat surfaces, leading to straight paths and isochrones in concentric circles rather than polygons.

However, the deviations from straight-line paths still lead to errors, which are significantly lessened by the oversample step. The end result is a path which is constrained to follow the general route of the originally-derived LCP, but in which directional changes of about 45/

Rotate and oversample LCP (thick) compared with standard queen’s move LCP (thin).

There is a special problem associated with paths in domains in which there is perfectly uniform (or no) terrain slope. In this situation, with no variation in forcing from topography, there are many equally long queen’s move solutions to the shortest path and, as Herzog (

It should be noted that an LCP between two settlements varies according to which is nominated as the start point due to the anisotropic treatment of cost on slopes, with the journey in one direction occasionally differing substantially in its route from that in the opposite direction. In order to generate a plausible road network, we follow Herzog (

It is possible to reduce computational time considerably by applying what will be called an

One may question the degree to which real routes achieved the degree of optimization represented in LCPs. Even today, someone travelling by road from one town to another rarely takes an optimal route, instead being constrained to move through a network of roads which generally forces an approach to the destination which is indirect to some degree. The complexity of the network is a strong determinant of how direct longer routes or itineraries are, so to recreate the degree of connectedness of settlements across a wide region in the ancient world, a suitable network must be created. Networks lying between the two extremes of ‘least cost to the builder’ and ‘least cost to the user’ (

Here we introduce a methodology that addresses these notional costs, which we call

Consider three sites, A, B and C as in the schematic at

For the creation of a road to be cost-effective the total benefit should be positive, i.e.

We will use the term ‘build cost’ to include the initial construction cost as well as the ongoing maintenance and repair cost. Whilst construction might be thought to involve a one-off, capital cost, we assume that this is spread over an extended period of time, so that like travel costs, build costs are deemed to be continuous. The build cost of a road might at first be thought of as proportional to the geometric length of the road rather than time taken or energy expended to walk it. However, it seems logical that, like travel, road building is more challenging on difficult terrain, so for our purposes build cost per unit length of road has been treated as being proportional to time or energetic cost of travelling that length. The term

The formula could be made more elaborate by factoring in the traffic volumes on the road, which could be made proportional to the population of the sites A and C. However, in the absence of population data and to keep things as simple as possible, we reason that not only cost of travel, but also of road building is likely to be in some degree proportional to traffic levels, so that this factor would tend to cancel out in

Denoting the ratio

In order to make computation manageable direct links are limited to those which can be made within some defined cost termed the maximum temporal radius, usually set to 36 hours. This is tunable, and at first 12 hours was specified on the basis that this was the approximate limit of a day’s travel, and that travellers would not want to overnight between settlements. However, where settlements are sparsely distributed it left holes in some networks, and it was reasoned that there would likely be stations or small settlements of which we have no record. Two methods for implementing the cost minimization technique were coded,

The LOS method requires that

The GPO method replicates a central authority in decision making. It involves looking across the whole network and finding the road, the establishment of which would confer most benefit per unit distance to the two nodes at either end, using benefit =

By specifying varying values of

A weakness of the technique is the failure to re-use sections of existing roads in establishing new routes by defining junctions, as is done e.g. in the Steiner tree approach (

Ideally, known ancient roads and networks of roads would be put to this purpose, but their fragmentary nature makes this unviable. Instead we can turn to the ancient

We took 117 route distances in Roman miles from the TP for a region extending from Sicily to Cyprus, this region chosen to correspond with the simulated network for application to itineraries of the earlier Greek world. Network nodes were extracted from the Pleiades gazetteer of ancient places (

Different networks of least cost paths (LCPs) were derived across the region based on the

Ultimately, consideration of the quality of the TP data should be made in light of the fact that the comparisons here are not made against truth, which cannot be known, but against the LCP network, which doubtless has its own bias and noise. However, given the wide range of ratios and evident errors in TP values, with e.g. around 25% of the sample being shorter than geodesic (effectively straight-line) distances, it is considered that the large majority of the variance comes from them rather than the LCPnet values. The large spread in ratios and the presence of gross errors in the TP may seem to militate against extracting value from them; however, it is a fundamental tenet of statistics that poor quality of data can be compensated by quantity, and we contend that judicious use of diagnostics can yield a useful signal sufficient for our purpose. At the same time something about the nature of the errors in the TP can be inferred.

Errors in the TP values could have many different causes, such as transcription mistakes (

To test this, an iterative process was adopted in which incremental changes were made independently to ^{2} are retained, causing the composite modelled population to converge towards the actual population until no significant improvements are made. The distribution was found to be fitted best to a composite distribution consisting of the sum of two unbiased (^{2} = 0.981) gives confidence that the modelled distribution approximates the observed one well, indicating that 98.1% of the observed frequency within bins can be explained by the idealized distributions. Two of the values not accounted for in the Gaussian populations lie outside of the plotted range shown, with errors up to a factor of 5.

The network generated by

However, Pikoulas (

Whilst the TP depicted a system of roads on the grand scale, it is easy to imagine that some of the individual edges grew out of minor routes serving local networks, but which came to achieve a greater significance when repurposed and upgraded as arteries in an empire-wide network. The better fit provided by the LOS network generation method over GPO gives some support to this idea, GPO having 29 results within 5% of parity between TP and LCPnet, versus 33 for LOS, marginally suggesting that the network evolved principally from local rather than large-scale organization. This contrasts with the findings of Prignano

One might speculate that the Roman roads on which the TP distances are based could have been on average shorter than LCPs since Romans roads are known for following more direct courses than their predecessors, their surveyors realizing impressively rectilinear constructions, sometimes for tens of kilometres at a time (

Salway (

Verhagen, Polla & Frommer (

The applicability to earlier periods of a value of

A network of water routes (over sea and lakes) was constructed using

No account of prevailing winds or currents was taken, as is done e.g. in the ORBIS tool (

In view of these difficulties, a constant, adjustable speed was specified for the network, which could be tuned up or down when combined with the land network values to form a super network. A small random perturbation of between plus and minus 1% is added to the distances between grid points resulting in trajectories which are not quite straight. Largely cancelling in aggregate, these perturbations change journey times by much less than 1%. Clearly, tacking courses adopted when sailing into the wind are not represented, the velocity being interpreted as an effective velocity, or ‘velocity made good’ as defined by Whitewright (

In attempting to reconstruct the routes taken by ancient travellers, the relative cost of land and sea travel becomes an important consideration affecting decisions ranging from whether an individual leg of a journey is more likely to have been made by land or by sea, to the whole order of an itinerary.

In order to calibrate the relative costs of land and sea travel, we turn to the Delphic

This can be tackled as a variant of the classic travelling salesman problem (TSP), in which the most efficient (least cost) order in which to visit a set of locations is established, starting and finishing at the same point. Specifically, this is an asymmetric travelling salesman problem because costs between nodes vary with direction. The TSP can be solved for a small number of destinations (less than about 14) by using a brute force method whereby the cost of every ordering permutation is compared and the lowest selected. Since computation time is proportional to the factorial of the number of destinations, this method soon becomes impractical as the number increases. For example, going through permutations on a computer we were able to solve the TSP by comparing all possible routes for the nine locations in Cyprus (

For our purposes we depart from the pure TSP in two respects: in allowing the route to go through the same place twice if this produces a shorter overall journey, and in allowing itineraries in which start and end locations are not the same. The solution algorithm is presented with an anisotropic matrix of costs between the nodes representing the destinations. The algorithm orders these in the most efficient way, then the network backlink matrix is used to find the intermediate nodes from the full network and the journey can be reconstructed. These intermediate nodes themselves may coincide with destinations, in which case the route is allowed to pass through that destination more than once. If it is required to specify which is the last place visited, the cost matrix is altered so that the cost from the nominated last destination to the start point is very small (e.g. 0.1 seconds), whilst those to all the other destinations from the nominated last place are very large (e.g. 1000 years).

Just as one may question the extent to which real paths between two nodes of a network really were optimal enough to be considered LCPs, it is a matter of conjecture whether ancient travellers would even have the data to compare the efficiency of different itineraries, let alone be able to solve what can be a computationally challenging task. However, the DTL itineraries are considered more useful than an account of a one-off journey since they result from the experience of regular journeys over multiple years during which there will have been ample opportunity for fine tuning.

Mount Pachnes (2453 m) lies between the Araden/Anopolis area (numbers 8 and 9 in top map) and Kydonia (10), forcing an indirect route which goes through Aptera (11), resulting in a double visit here. The TSP solution based on a more complex network of

It is documented that there was an overland

In order to test the sensitivity to networks of differing complexity, the experiment was repeated with networks of

Threshold sea speeds (knots) at which TSP routes become similar to Delphic

2.5 | 2.5 | 2.5 | |

1.9 | 2.2 | 2.3 | |

Overall the TSP experiments provide some data points which suggest that a sea speed in the range of around 2.0 to 2.5 knots should be specified in a combined land/sea network based on time costs. In addition to fitting centrally in Whitewright’s (

Note that it is the ratio of land to sea speed which is important in this context, rather than absolute sea speed. Tobler’s hiking function (

This study has brought to bear three different quantitative optimization procedures on the study of ancient travel. The well-known least cost path technique has been extended by a

The same principles outlined in this paper could be applied to other geographic areas, such as the wider Roman empire, as well as to later or earlier periods. A source of data on node locations, such as Pleiades, is necessary along with a suitable DEM and some means of estimating an appropriate network complexity metric,

The authors are willing to make available the software used to generate networks of least cost paths upon reasonable request.

The technique relies firstly on establishing the

Two alternative algorithms have been coded for network construction. The first is termed

Create a new ^{10} was used in the experiments.

Calculate the matrix BENEFIT = COST/FGCOST – (1 + 1/

Run the local elements of the COST matrix (within MAXRAD hours of the nodes at either end of the route just established) through Dijkstra’s algorithm to update network distances in the light of the latest route added. Go back to step (2).

The GPO method has some similarities to the EE (

A second approach named

Set

Examine values of FGCOST(

Establish an LCP to the closest

For each subsequent destination

Increment

The LOS method differs from the GPO method in placing the decision-making for road building at the level of the individual settlement, with regard only to optimization of routes from each settlement separately. The order in which the start-point settlements is taken makes no difference to the final network. It has the important limitation of only allowing networks for which

When calculating the cost surfaces for the purposes of establishing LCPs, the most easily quantified costs are time or energetic output. In reality, a whole host of costs can simultaneously be attributed to travel including monetary expenditure, organizational effort, and, along with a direct energetic cost, the mental and physical stress of a journey. A second class of cost is the risk of mishap, such as falling prey to pirates or brigands, or of shipwreck. Whilst these eventualities will not materialize on most journeys, even a low probability that they might can be treated as a continuous cost.

It is not easy to represent these costs without introducing complexity, much less to quantify them accurately. However, we can do so approximately and with minimal complexity by introducing the concept of

Taking the example of time (

Generalizing to include other costs, we define a

where _{v}

where

When ancient travellers made journeys, it seems likely that their choice of route and mode of travel would be made not just on the basis of time or effort, but on an amalgam of costs as outlined above. It might be the case that a leg of a journey could be made more quickly by sea than by land, but that the traveller would opt to go by land because it was safer. Alternatively, it might be quicker to go by land than by sea but the traveller might chose the sea route because it involved less energy expenditure.

When using the Delphic

The authors have no competing interests to declare.