Strata, Scarabs and Synchronisms: A Framework for Synchronizing Strata and Artifacts

This paper presents a formal framework for synchronizing strata and datable artifacts in multi-layered sites. We first present a simple set of rules regarding the definition of safe termini post quem, taking into account cases of uncertain dating and/or uncertain stratigraphic attribution of the artifacts. We then propose a definition of chronologically optimal termini post quem, and a procedure to represent these optimal termini graphically by a step function in a two-dimensional graph. We also propose a definition of chronologically critical artifacts, as a minimal set of artifacts that determine all the optimal termini post quem of a stratigraphic sequence. Finally, we define a measure of the robustness of a terminus post quem, expressed in terms of the number of different artifacts supporting this terminus. We illustrate our approach through the case study of Egyptian scarabs from the site of Beth Shean (northern Israel), a well-known Bronze and Iron Age site that hosted an Egyptian garrison during the New Kingdom (ca. 1540–1070 B.C.E.). We also provide a software utility which assists users in applying our methodology.

explicitly addressed. Neither is the question of how to deal with uncertainties in the dating or stratigraphic affiliation of the artifacts. These questions will be addressed in Section 3. This paper will not address the question of termini ante quem, i.e. upper bounds on the dating of a layer, since these cannot be directly and safely deduced from the artifacts present within the layer.

Statistical methods
Research on stratified artifacts has also used statistical techniques to identify the presence of non-indigenous artifacts (mainly pottery sherds) among large sets of retrieved items (Gerrard 1993, see also Shot 2010 and references therein). Such studies use, among other techniques, statistical measures of the diversity and heterogeneity of ceramic assemblages and provide mean dates for the deposition of the associated layers. These measures are then compared with each other using statistical tests and confronted with the relative chronology provided by the Harris matrix in order to identify the probable presence of significant amounts of non-indigenous sherds in a deposit. These powerful techniques require relatively large sets of data for the statistical tests to be significant. The techniques presented in this paper, however, are non-statistical, and aim at a more rigorous definition, graphical representation and evaluation of the quality of TPQs, rather than at the identification of non-indigenous artifacts.

Seriation techniques
A classical way to provide relative chronology between artifacts, even in the absence of stratigraphic constraints provided by the Law of Superposition, is seriation (O'Brien and Lyman 1999, Lipo, Madsen and Dunnell 2015). In its variant called frequency seriation, the relative frequencies of each type of artifact found within the same layer are computed and presented in chart form. The relative order between these layers (possibly coming from different sites, thus not featuring a relative stratigraphic order with each other) can then be established using the hypothesis of unimodality of artifact production. This hypothesis states that production of a specific type of artifact begins in small numbers, then reaches a unique peak, and finally decreases, thus producing a bell-shaped curve. The notion of unimodality refers to the assumption that only one such peak is attained for each type of artifact. This enables a reordering of the different contexts in such a way as to make the relative frequencies of each type appear as a bell-shaped curve, and thus to obtain a relative dating for these contexts. The techniques presented in this paper will not assume unimodality of artifact production. They will only depend on the possibility of assigning an earliest possible production date to each artifact.

Radiocarbon dating
Another standard way of anchoring a relatively ordered stratigraphic sequence in absolute time is radiocarbon dating. Some might argue that artifact-based dating of archaeological contexts in the age of radiocarbon dating is obsolete (see discussion in O'Brien and Lyman 1999, p. 226). We think otherwise. First, radiocarbon samples from secure contexts are not always available, or not always present for each stratum of interest. Furthermore, it is important, whenever possible, to have another time scale for a site, independent of radiocarbon dating, for purposes of cross-checking. Discrepancies between the historical and the radiocarbon dates then point to a problem in (at least) one of these two bodies of evidence, thus warranting further research into detecting the source of the disparity. Finally, and more importantly, the derivation of secure TPQs is important as these can be included, as Bayesian priors, in the radiocarbon calibration process itself, in order to refine the dating intervals provided by the radiocarbon method (Bronk Ramsey 2009). In such cases, potentially large intervals can sometimes be significantly reduced thanks to the TPQs.

Temporal logic
The mathematical field of temporal logic offers a formal way to deal with time intervals and the relations between them. As such, the concepts, notations and algorithms developed in this field can find direct applications in archaeological dating. Yet, few of these techniques have penetrated into standard archaeological practice. The seminal works of Allen (1984, 1991) in the field of artificial intelligence have provided us with a set of temporal relations on intervals, and composition rules between them. Holst (2004) later showed the applicability of these concepts to archaeological chronology. More recently, Geeraerts, Levy and Pluquet (2017) showed how further temporal logic techniques can be applied to describe complex chronological models in archaeology, to compute bounds on dates and durations in an automated way, and to check the coherence of a set of temporal assumptions. Since the present paper uses only one type of temporal relation, namely the terminus post quem, we will refrain here from using formal mathematical notations. The other notions presented here, such as the step function and the robustness of a TPQ, although deeply rooted in mathematics, will rather be explained in plain English and illustrated by examples. In this we follow the philosophy of Harris's book (1989), which exposed notions deeply rooted in mathematics (such as directed acyclic graphs, transitive reduction and topological sorting), yet explained them in natural language, without mathematical notations or technical terms. This approach enabled generations of archaeologists with no advanced mathematical training to easily understand and apply these notions.

Outline of the paper
We first introduce our case study: Egyptian royal scarabs from the site of Beth Shean (Section 2). We then present a simple set of rules regarding the definition of safe TPQs, including specific cases of uncertain dating or stratigraphic attribution of the artifact (Section 3). We propose a definition of chronological optimality of TPQs, and a definition of chronological criticality of artifacts (Section 4). We also propose a measure of robustness of TPQs in order to quantify their strength (Section 5). We then propose a general discussion of the chronology of Beth Shean with regard to the TPQs derived from Egyptian scarabs (Section 6). We end by presenting the "TPQ Composer", a software utility that applies our methodology and produces a graphical representation of TPQs, computes their robustness and identifies critical artifacts (Section 7).
In this paper, the word "stratum" will not refer to single, low-level, stratigraphic units, or loci, but rather to larger chrono-stratigraphic units, otherwise often termed "phases" or "levels" (Harris 1989, p. 108-109). This use is in keeping with the practice of our case study (Pennsylvania Museum excavations of Beth Shean, see below), where the successive site-wide levels are termed "strata".

Case study: Egyptian scarabs from Beth Shean
Our case study concerns the site of Beth Shean (northern Israel), a well-known Bronze and Iron Age site that hosted an Egyptian garrison during the New Kingdom (ca. 1540-1070 B.C.E.). This site produced more inscribed Aegyptiaca than any other site in the southern Levant (Mazar 2011, Levy 2017), including a large number of scarabs. The Beth Shean layers we are interested in are Strata XB, XA, IXB, IXA, VIII, VII, VI and Late VI, covering the period from the Late Middle Bronze IIB to the Iron Age IB (see Table 1). For the sake of simplicity, we first restrict ourselves to scarabs bearing a readable royal name, i.e. we exclude scarabs whose dating is obtained only through stylistic considerations. The full corpus of Egyptian scarabs from Beth Shean has been masterfully gathered by Othmar Keel (2010). Going through his corpus, we have identified 25 items bearing a clearly readable royal name (Table 2), excluding surface finds, scarabs originating in tombs (for which no stratum is usually provided by Keel), and scarabs which Keel mentions as being intrusions. The choice of keeping only scarabs bearing a royal name has been made here for the sake of simplicity, since our goal is to illustrate the proposed methodology, rather than offer new results regarding Beth Shean. Royal names can provide secure TPQs (provided they originate in secure contexts), since a scarab bearing a monarch's name, although possibly being later than the monarch's reign, cannot antedate it.
One of our goals will be to provide secure TPQs even in cases where the precise stratum or precise reign is not certain. These cases (shown in bold in Table 2) include scarabs originating in "Stratum IX" (thus featuring an uncertainty as to whether their origin is Stratum IXB or IXA, see Table 2, no 1, 3, 15, 17) and in "Stratum VIII-VII" (no 9). They also include one scarab bearing the sole name "Ramesses" (no 4), which could correspond to several different pharaohs bearing that name (Ramesses I to Ramesses XI).

Termini Post Quem
The following set of rules enables us to define safe synchronisms between strata and datable artifacts:
Rule 1 (basic rule). An artifact from the reign of King K found in Stratum S is formalized as "Stratum S ends after the start of King K's reign".
Since a stratum cannot end before the accession date of a king whose artifact was found in it, the start of the reign provides a terminus post quem for the end of the stratum.
Notice that considering the stratum as contemporaneous with the given reign is a common mistake, still occasionally found in the literature. The only case where the given TPQ would actually provide the real date of the stratum's end is the (unlikely) case where the artifact was manufactured at the very start of the king's reign, reached the stratum immediately after its manufacture, and was immediately buried during that stratum's demise or destruction. The probability that these four events (king's accession, manufacture of the scarab, arrival of the scarab in the stratum, demise of the stratum) are synchronous is small. Most probably, the actual terminal date of the stratum is thus significantly later (i.e. by at least a few decades) than the given TPQ.
We now wish to extend Rule 1 to cases where a certain degree of uncertainty exists regarding the artifact's original stratum or associated reign. Such artifacts, as well as artifacts suspected of being heirlooms, are often simply discarded as unfit for chronological considerations. We argue otherwise, noting that such artifacts can indeed serve as chronological markers, provided the uncertainty is limited. Let us start by extending Rule 1 to cases of uncertain reign:
Rule 2 (uncertain reign). An artifact of unknown reign but having King K as earliest possible reign, found in Stratum S, is formalized as "Stratum S ends after the start of King K's reign".
Rule 2 is useful in particular for scarabs only datable to a specific dynasty or part of a dynasty, or when the royal name is only partly legible, making the identification of the precise king equivocal.
Rule 1 can also be extended to cases of uncertain stratum:
Rule 3 (uncertain stratum). An artifact of King K found in an unknown stratum, but with Stratum S as the latest possible stratum, is formalized as "Stratum S ends after the start of King K's reign".
Rule 3 is handy for cases where the excavator hesitates between several strata for an artifact (for example, in the case of scarabs found under the floor of a house).
Finally, the rule can be extended to cases of uncertain reign and stratum:
Rule 4 (uncertain reign and stratum). An artifact of unknown reign but having King K as earliest possible reign, found in an unknown stratum, but with Stratum S as latest possible stratum, is formalized as "Stratum S ends after the start of King K's reign".
Note that these TPQ rules are much more cautious than positing a simple contemporaneity relationship between the artifact and its stratum of discovery, as is sometimes mistakenly done.
In summary, the above rules (1 to 4) simply amount to building a TPQ that combines the latest possible stratum of the artifact with the earliest possible reign of the associated monarch. It can also be noted that our rules do not require knowledge of an earliest possible stratum nor of a latest possible reign for the artifact.
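As the summary above suggests, Rules 1 to 4 can be condensed into a single operation. The following Python sketch is only an illustration under stated assumptions: the names `STRATA`, `REIGNS` and `tpq` are hypothetical, the reign list is abbreviated, and the stratum used for the "Ramesses" example is a placeholder rather than a value taken from Table 2.

```python
# Strata from earliest to latest, as listed in Section 2.
STRATA = ["XB", "XA", "IXB", "IXA", "VIII", "VII", "VI", "Late VI"]

# Abbreviated list of reigns from earliest to latest (illustrative, not a full king list).
REIGNS = ["Neferhotep I", "Thutmosis III", "Thutmosis IV", "Amenophis III",
          "Ramesses I", "Ramesses II", "Ramesses III", "Ramesses IV"]


def tpq(possible_strata, possible_reigns):
    """Return (stratum, reign), read as 'stratum ends after the start of reign'.

    Rules 1 to 4 all reduce to: latest possible stratum, earliest possible reign.
    """
    stratum = max(possible_strata, key=STRATA.index)
    reign = min(possible_reigns, key=REIGNS.index)
    return stratum, reign


# Rule 1: certain stratum, certain reign.
rule1 = tpq(["VIII"], ["Amenophis III"])
# Rule 2: uncertain reign (a bare "Ramesses"); the stratum here is a placeholder.
rule2 = tpq(["VII"], ["Ramesses I", "Ramesses II", "Ramesses III", "Ramesses IV"])
# Rule 3: uncertain stratum ("Stratum IX" means IXB or IXA).
rule3 = tpq(["IXB", "IXA"], ["Thutmosis IV"])
# Rule 4: both uncertain (hypothetical combination of the two cases above).
rule4 = tpq(["VIII", "VII"], ["Ramesses II", "Ramesses I"])

print(rule1, rule2, rule3, rule4)
```

Representing each artifact by a list of candidate strata and a list of candidate reigns means that a certain attribution is simply a list of length one, so no special-casing of the four rules is needed.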
We argue that, whenever possible, earliest possible strata and latest possible reigns should be assigned when publishing such artifacts, as very often the excavator possesses this information, yet does not mention it in his or her reports. Note however that the rules discussed above hold true for a wide variety of cases, even cases often considered as rendering an artifact unfit for chronological modelling. A few such cases are listed below:
• Heirlooms: valuable artifacts are known to be sometimes kept for centuries, making them considerably older than the strata they were found in. Rules 1 to 4 still apply even in cases of heirlooms, since an heirloom, having by definition originated earlier than its stratum of discovery, does not contradict the determination of the latest possible stratum.
• Late manufacture of archaizing artifacts: artifacts related to a specific king are sometimes produced even long after his death. Among Egyptian scarabs, this phenomenon is exemplified by the famous Thutmosis III scarabs, which were still mass-produced and distributed centuries after his death (Jaeger 1982). In this case, our rules still hold since late production of scarabs bearing a king's name does not contradict the determination of the earliest possible reign estimate.
• Intrusions from below to above: an artifact which has moved, through bioturbation for example, from an earlier to a later stratum does not violate TPQs obtained by Rules 1 to 4 as, again, the knowledge of the latest possible stratum is not violated.
In fact, only the case of intrusions from later to earlier strata (for example items originating in pits, robber trenches or bioturbation, also known as infiltrated items) can render our TPQs invalid, since they would violate the presumed knowledge of the latest possible stratum. Hence artifacts known or suspected to be in such conditions need to be excluded from this type of modelling.
Table 3 illustrates the application of the above rules to Egyptian scarabs from Beth Shean. We see that in most cases, Rule 1 can directly apply. Rule 2 is useful for Scarab 5, which bears the sole name "Ramesses", a name borne by 11 different pharaohs in the New Kingdom. Following this rule, Ramesses I, the earliest of those monarchs, should be used in order to determine the TPQ. In the same way, Rule 3 was applied in cases where the excavators hesitate between two consecutive strata: "Stratum VIII-VII" (Scarab 9) and "Stratum IX" (i.e. IXB or IXA) (Scarabs 1, 3, 15, 17). In both cases, Rule 3 states that the latest stratum, hence VII and IXA respectively, must be retained. Cases where Rule 2 or 3 was applied are shown in bold in Table 3.

The artifacts graph
We define the "artifacts graph" (Figure 1) as a two-dimensional graph where each point represents one or several artifacts. The artifact's latest possible stratum determines the point's coordinate on the horizontal axis, and its earliest possible reign determines the coordinate on the vertical axis. We will refer to these coordinates as "reign" and "stratum", for the sake of conciseness. In cases where a point represents more than one artifact, the number of artifacts is written next to it. The artifacts graph provides a convenient and synthetic way to look at the chronological distribution of the artifacts, and to immediately spot which reigns are represented in which strata. For example, we see here at first glance that no scarabs with Ramesside names appear before Stratum VII, and that no named scarabs at all appear in Strata XA and IXB. For the sake of simplicity, in Figures 1-7 we use a uniform grid rather than placing pharaohs in their correct position on the time axis. Figure 8 shows the correct placement of the points in time.
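The construction of the graph's points can be sketched in a few lines of Python. This is a minimal illustration, assuming each artifact has already been reduced to a (stratum, reign) pair; the list below is a partial subset of scarabs discussed in this paper, not the full corpus of Table 2, and the variable names are hypothetical.

```python
from collections import Counter

# (latest possible stratum, earliest possible reign) for a few scarabs
# discussed in the paper. Partial data, for illustration only.
artifacts = [
    ("XB", "Neferhotep I"),
    ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"),
    ("IXA", "Thutmosis IV"),
    ("VIII", "Thutmosis III"),
    ("VIII", "Amenophis III"),
    ("VII", "Ramesses IV"),
]

# Each distinct (stratum, reign) pair is one point of the graph;
# the count is the number written next to the point.
points = Counter(artifacts)
for (stratum, reign), n in points.items():
    print(f"{stratum:>5} | {reign:<14} | {n}")
```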

The step function of optimal TPQs
In Section 3, we defined rules to derive a safe TPQ for each artifact. Each stratum is thus assigned several TPQs. Which one of them should be retained as the final TPQ of a given stratum? A naïve approach would be to assign to each stratum the TPQ of its latest artifact, i.e. the highest point in the graph of Figure 2. For example, the best TPQ for Stratum VII is clearly given by the highest point of this stratum (the Ramesses IV scarab), yielding the TPQ: "Stratum VII ends after the start of Ramesses IV". However, this simple rule does not always produce optimal TPQs. Stratum VI is an example of such a case: this stratum has Ramesses III as latest TPQ, yet a better TPQ for it is obtained through the Ramesses IV scarab found in Stratum VII. Indeed, a TPQ of an earlier stratum automatically applies as well to all later strata, hence the dismissal of Stratum VI's Ramesses III scarab as an optimal candidate. This observation yields the following definition of the optimal TPQ of a stratum:
Rule 5 (optimal TPQ of a stratum). The optimal TPQ of a stratum is the latest TPQ derived from artifacts of this stratum and all earlier strata.
Figure 1: Artifacts graph for Beth Shean scarabs. Each point represents one or several scarabs, with its associated latest possible stratum (horizontal axis) and earliest possible reign (vertical axis, with start of reign for each pharaoh). A number next to the point represents the number of scarabs in cases where more than one scarab is involved. The identical start year for the reigns of Thutmosis III and Hatshepsut is due to a coregency.
Applying Rule 5 repeatedly for each stratum shows that optimal TPQs for each stratum are obtained by building a step function, starting from the latest artifact of the earliest stratum (the Neferhotep I scarab of Stratum XB in our case) and repeatedly tracing a horizontal line towards the right (bypassing strata with no scarabs or with only earlier or contemporary scarabs) and extending it upwards towards the latest artifact of the next stratum. Figure 3 illustrates this procedure. The resulting step function provides the optimal TPQ of each stratum. We can see for example that the optimal TPQ for Stratum IXB is provided by Stratum XB's Neferhotep scarab, and that the optimal TPQ for Stratum Late VI is provided by Stratum VII's Ramesses IV scarab (see Table 4).
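In computational terms, the step function is a running maximum over the stratigraphic sequence. The Python sketch below illustrates this on a partial subset of the scarabs; it is not the implementation of the software utility mentioned in this paper. Reign start years are those quoted in the text, except for Ramesses III, whose value (1184 BCE) is inferred from the 1279-1184 gap mentioned in Section 6 and should be treated as an assumption. The function and variable names are hypothetical.

```python
STRATA = ["XB", "XA", "IXB", "IXA", "VIII", "VII", "VI", "Late VI"]

# Start of reign as a signed year (negative = BCE), so that "later" means "larger".
# The Ramesses III value is an assumption (see lead-in).
REIGN_START = {
    "Neferhotep I": -1749, "Thutmosis III": -1479, "Thutmosis IV": -1401,
    "Amenophis III": -1391, "Ramesses III": -1184, "Ramesses IV": -1153,
}

# Partial data: (latest possible stratum, earliest possible reign).
artifacts = [
    ("XB", "Neferhotep I"), ("IXA", "Thutmosis III"), ("IXA", "Thutmosis IV"),
    ("VIII", "Thutmosis III"), ("VIII", "Amenophis III"),
    ("VII", "Ramesses IV"), ("VI", "Ramesses III"),
]


def optimal_tpqs(artifacts):
    """Rule 5: for each stratum, the latest reign among this and all earlier strata."""
    result, best = {}, None
    for stratum in STRATA:  # walk the sequence from earliest to latest
        for s, reign in artifacts:
            if s == stratum and (best is None or REIGN_START[reign] > REIGN_START[best]):
                best = reign
        result[stratum] = best  # stays None until the first datable artifact appears
    return result


steps = optimal_tpqs(artifacts)
for stratum, reign in steps.items():
    print(f"Stratum {stratum} ends after the start of {reign}")
```

Strata without scarabs (XA, IXB) and strata whose scarabs are all earlier than the current step (VI) simply inherit the value carried over from the preceding stratum, which corresponds to the horizontal segments of the step function.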

Critical artifacts
Clearly, not all artifacts of our dataset have equal chronological value. For example, the artifacts of points situated below the step function do not contribute to the final TPQs (see for example the Thutmosis III scarab of Stratum VIII). In the same way, artifacts located in the middle of a step (imagine a scarab of Neferhotep I from Stratum XA, for example) are not indispensable, since the step is already determined by an artifact located in an earlier stratum (the Stratum XB scarab of Neferhotep I in our case). These observations can be used to define a restricted set of critical artifacts: Rule 6 (critical artifacts). The chronologically critical artifacts of a stratigraphic sequence are defined as the artifacts corresponding to the "corners" of the step function, i.e. to points that are strictly later (i.e. higher) than all other points of the same and preceding strata.
In our case, there are four scarabs corresponding to the above definition: the Neferhotep I scarab from Stratum XB (no 21), the Thutmosis IV scarab from Stratum IX (no 15), the Amenophis III scarab from Stratum VIII (no 7) and the Ramesses IV scarab from Stratum VII (no 19). These artifacts are circled in red in Figure 4. The critical scarabs are critical in the following sense:
• Sufficiency. The critical artifacts are sufficient for providing all of the optimal TPQs. In the Beth Shean example, all of the 21 non-critical scarabs can be discarded from the dataset, without changing the optimal TPQ of any stratum.
• Necessity. Removal of any corner of the step function immediately weakens at least one of the optimal TPQs. In other words, the corner points, and their associated critical artifacts, are indispensable in order to ensure chronological optimality.
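Rule 6 and the sufficiency property can be checked mechanically. The following Python sketch is a hedged illustration on a partial subset of the scarabs (the Ramesses III start year is an assumption inferred from the 1279-1184 gap mentioned in Section 6; all function names are hypothetical).

```python
STRATA = ["XB", "XA", "IXB", "IXA", "VIII", "VII", "VI", "Late VI"]
# Signed years (negative = BCE); Ramesses III is an assumed value.
REIGN_START = {
    "Neferhotep I": -1749, "Thutmosis III": -1479, "Thutmosis IV": -1401,
    "Amenophis III": -1391, "Ramesses III": -1184, "Ramesses IV": -1153,
}
artifacts = [
    ("XB", "Neferhotep I"), ("IXA", "Thutmosis III"), ("IXA", "Thutmosis IV"),
    ("VIII", "Thutmosis III"), ("VIII", "Amenophis III"),
    ("VII", "Ramesses IV"), ("VI", "Ramesses III"),
]


def critical_points(artifacts):
    """Rule 6: points strictly later than all other points of the same and preceding strata."""
    points = set(artifacts)  # several artifacts may share one point
    corners = []
    for stratum, reign in points:
        rivals = [r for s, r in points
                  if (s, r) != (stratum, reign)
                  and STRATA.index(s) <= STRATA.index(stratum)]
        if all(REIGN_START[reign] > REIGN_START[r] for r in rivals):
            corners.append((stratum, reign))
    return sorted(corners, key=lambda p: STRATA.index(p[0]))


def optimal_tpqs(artifacts):
    """Rule 5 as a running maximum of reign start years."""
    best, out = None, {}
    for stratum in STRATA:
        starts = [REIGN_START[r] for s, r in artifacts if s == stratum]
        if starts and (best is None or max(starts) > best):
            best = max(starts)
        out[stratum] = best
    return out


corners = critical_points(artifacts)
print(corners)
# Sufficiency: the corners alone reproduce every optimal TPQ.
print(optimal_tpqs(corners) == optimal_tpqs(artifacts))
```

On this subset the function returns the four corner points named above, and dropping all other points leaves the step function unchanged.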
Note that the number of critical artifacts can be significantly smaller than the number of strata: here only four critical scarabs provide the optimal TPQs for eight strata. Note also that a corner of the step function can correspond to more than one artifact (see Figure 7 for an example). In that case, none of the artifacts of the corner point is essential per se, and it is only the removal of all the artifacts of the corner point that would deteriorate an optimal TPQ.
We stress the importance of the proper identification of chronologically critical artifacts. Once they have been identified, great care should be taken to ensure they come from secure contexts, and that their historical dating is also secure, since discarding some of them could lead to degraded TPQs and an altered final chronology. Table 4 summarizes our data: optimal TPQ of each stratum and list of critical artifacts found in each stratum (if any).
Note that in cases where one step of the step function spans several strata (for example Strata XB, XA and IXB in our case), one can also improve the TPQs of the non-initial strata by adding a minimum stratum duration estimate. In our example, if we assume a 30-year minimum stratum duration for Strata XA and IXB, the TPQs of these strata would be 1719 BCE and 1689 BCE, respectively, rather than 1749 BCE (see Table 4).
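The minimum-duration refinement amounts to a simple propagation along the sequence: a stratum cannot end earlier than the end of the preceding stratum plus its own minimum duration. The sketch below illustrates this for the first four strata only; the function name `refine` and the data layout are hypothetical, and the 30-year figure is merely the example value used above.

```python
STRATA = ["XB", "XA", "IXB", "IXA"]

# Optimal TPQs as signed years (negative = BCE), taken from the text.
optimal = {"XB": -1749, "XA": -1749, "IXB": -1749, "IXA": -1401}

# Assumed minimum durations in years; strata not listed get no constraint.
min_duration = {"XA": 30, "IXB": 30}


def refine(optimal, min_duration):
    """Propagate minimum stratum durations along the stratigraphic sequence."""
    refined, previous = {}, None
    for stratum in STRATA:
        tpq = optimal[stratum]
        if previous is not None:
            # End of this stratum >= end of previous stratum + own minimum duration.
            tpq = max(tpq, previous + min_duration.get(stratum, 0))
        refined[stratum] = tpq
        previous = tpq
    return refined


refined = refine(optimal, min_duration)
for stratum, year in refined.items():
    print(f"Stratum {stratum}: TPQ {-year} BCE")
```

Taking the maximum with the artifact-based TPQ ensures that the refinement never weakens a TPQ that is already later, as for Stratum IXA here.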
The above discussion did not consider non-critical scarabs in the determination of optimal TPQs. The next section argues for the contribution of non-critical scarabs to determine the robustness of given TPQs.

Robustness
We now wish to quantify the robustness, or strength, of a TPQ, in order to distinguish between stronger and weaker TPQs:
Definition. The robustness of a TPQ is defined as the total number of artifacts implying this TPQ.
For example, the TPQ "Stratum IXA ends after the start of Thutmosis IV" has a robustness of 1, as it can be deduced from only one scarab (Scarab 15). On the other hand, the TPQ "Stratum IXA ends after the start of Thutmosis III" has a robustness of 4, as it can be deduced from four different scarabs: the Stratum IX Thutmosis IV scarab (Scarab 15) and three Stratum IX Thutmosis III scarabs (Scarabs 1, 3, 17) (see Figure 5). Hence, the Thutmosis III TPQ for the end of Stratum IXA is in a certain sense stronger, i.e. more robust, than the Thutmosis IV TPQ, since dismissing it would require rejecting four scarabs as being intrusive (or incorrectly identified), rather than one.
Note that the computation of the robustness of a given TPQ can sometimes require adding artifacts from several different strata, as in the case of the following TPQ: "Stratum VIII ends after the start of Thutmosis III". The robustness of this TPQ is 6, since it can be independently deduced from six different artifacts: two scarabs from Stratum VIII (Thutmosis III and Amenophis III) and four scarabs from Stratum IXA (Thutmosis III [three scarabs] and Thutmosis IV [one scarab]). Indeed, a TPQ obtained for an earlier stratum also holds for all later strata, as noted above (Rule 5). Hence, a TPQ with given reign for a given stratum depends on all artifacts from the given and later reigns, originating from the given and earlier strata.
A simple graphical procedure to determine the robustness of a TPQ for a given reign and stratum is the following: starting from the point corresponding to the reign and stratum, add to it all the artifacts situated in the north-western quadrant originating at this point, as shown in Figure 5 for the Stratum VIII-Thutmosis III TPQ referred to above.
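The quadrant count translates directly into code. The Python sketch below reproduces the three robustness values discussed in this section, using only the scarabs enumerated in the text for Strata XB, IXA and VIII; the function name `robustness` and the data layout are hypothetical.

```python
STRATA = ["XB", "XA", "IXB", "IXA", "VIII", "VII", "VI", "Late VI"]

# Signed reign start years (negative = BCE), as quoted in the text.
REIGN_START = {
    "Neferhotep I": -1749, "Thutmosis III": -1479,
    "Thutmosis IV": -1401, "Amenophis III": -1391,
}

# Scarabs enumerated in this section (partial data).
artifacts = [
    ("XB", "Neferhotep I"),
    ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"),
    ("IXA", "Thutmosis IV"),
    ("VIII", "Thutmosis III"), ("VIII", "Amenophis III"),
]


def robustness(stratum, reign, artifacts):
    """Count artifacts of the given or earlier strata and of the given or later reigns."""
    return sum(
        1 for s, r in artifacts
        if STRATA.index(s) <= STRATA.index(stratum)
        and REIGN_START[r] >= REIGN_START[reign]
    )


r_ixa_t4 = robustness("IXA", "Thutmosis IV", artifacts)
r_ixa_t3 = robustness("IXA", "Thutmosis III", artifacts)
r_viii_t3 = robustness("VIII", "Thutmosis III", artifacts)
print(r_ixa_t4, r_ixa_t3, r_viii_t3)
```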
Knowing the robustness of a given TPQ is important, and we advocate for the systematic mention of the robustness of TPQs in chronological discussions. The robustness values of the optimal TPQs obtained above are listed in Table 5.
It is easy to see, through the graphical procedure defined above, that in this case, each of the optimal TPQs has a robustness of only 1. Hence, were any of our four critical scarabs an intrusion from above to below, some of our optimal TPQs would no longer hold. One might therefore wish to also present a more robust set of TPQs for each stratum, by imposing for example that each TPQ be backed by at least two different artifacts, i.e. that each TPQ have a robustness of at least 2.

Beth Shean example with robustness of at least 2
Imposing a robustness of at least 2 in the Beth Shean example yields a new artifacts graph (Figure 6, with updated step function) and a new table of optimal TPQs (Table 6).
We can see in Figure 6 that the updated step function is lower (i.e. earlier) than the former one and that no TPQ with robustness of at least 2 is possible for Strata XB to IXB, because only one named scarab was found in these three cumulated strata. The step function also has the strange feature that its corners do not necessarily correspond to a given artifact. For example, the Stratum VIII-Thutmosis IV corner does not correspond to a given Thutmosis IV scarab associated with Stratum VIII, but rather to two different scarabs: the Thutmosis IV scarab associated with Stratum IXA and the Amenophis III scarab associated with Stratum VIII. In such cases, our former definition of critical artifacts, being defined by artifacts situated at the corners of the step function, cannot hold anymore, and will be replaced in Table 6 by a simple list of supporting artifacts for each TPQ.
Table 6 provides our updated TPQs with a robustness of at least 2. We see that in some cases, the actual robustness is greater than 2, since some of the induced points refer to more than one artifact. The table also provides the list of supporting artifacts for each TPQ. For Stratum IXA, our former Thutmosis IV TPQ (1401 BCE) deteriorates to a Thutmosis III TPQ (1479 BCE), attested by 4 scarabs. We thus see that increasing robustness yields a real deterioration of our TPQs, in some cases mild (Stratum VIII, VI, Late VI), in others more drastic (Stratum IXA, VII). The matter of how much one wants to trade chronological accuracy for the sake of increased robustness should of course be left to each individual researcher.
Figure 5: Illustration of the robustness principle: the north-western quadrant of the point (Stratum VIII, Thutmosis III) contains six artifacts, hence a robustness of 6 for the TPQ "Stratum VIII ends after start of Thutmosis III".
The exact robustness requirement can also vary from site to site, according to the excavator's confidence in the cleanness of the contexts from which the artifacts originated. Sites with more intrusions/less secure contexts, or having been excavated in the early days of archaeological research, might warrant the imposition of harsher robustness requirements than sites with clean and well-controlled stratigraphies. The robustness requirement can therefore serve as an adjustment variable, in order to improve the quality of the final TPQs derived from the set of unearthed artifacts.
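A step function with a robustness requirement of at least k can be sketched as follows: for each stratum, the latest TPQ supported by at least k artifacts is the start year of the k-th latest artifact among those of this and all earlier strata. The Python sketch below is restricted to Strata XB to VIII, for which the text enumerates the relevant scarabs; it is an illustration under these assumptions, and the function name `robust_tpqs` is hypothetical.

```python
STRATA = ["XB", "XA", "IXB", "IXA", "VIII"]

# Signed reign start years (negative = BCE), as quoted in the text.
REIGN_START = {
    "Neferhotep I": -1749, "Thutmosis III": -1479,
    "Thutmosis IV": -1401, "Amenophis III": -1391,
}

# Scarabs enumerated in Section 5 (partial data).
artifacts = [
    ("XB", "Neferhotep I"),
    ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"), ("IXA", "Thutmosis III"),
    ("IXA", "Thutmosis IV"),
    ("VIII", "Thutmosis III"), ("VIII", "Amenophis III"),
]


def robust_tpqs(artifacts, k):
    """For each stratum, the k-th latest reign start among this and earlier strata."""
    out, seen = {}, []
    for stratum in STRATA:
        seen += [REIGN_START[r] for s, r in artifacts if s == stratum]
        ranked = sorted(seen, reverse=True)  # latest first
        # Fewer than k artifacts so far: no TPQ with the required robustness exists.
        out[stratum] = ranked[k - 1] if len(ranked) >= k else None
    return out


standard = robust_tpqs(artifacts, 1)
robust = robust_tpqs(artifacts, 2)
for stratum in STRATA:
    print(stratum, standard[stratum], robust[stratum])
```

With k = 1 this reduces to the standard step function of Rule 5; with k = 2 it yields no TPQ for Strata XB to IXB, Thutmosis III (1479 BCE) for Stratum IXA and Thutmosis IV (1401 BCE) for Stratum VIII, in agreement with the discussion above.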

Analysis of the Beth Shean Artifacts Graph
The present section lies outside the scope of the strict exposition of our modelling techniques (Sections 3 to 5). It contains a site-specific discussion of the chronology of Beth Shean with regard to Egyptian scarabs. It also serves as an example of how the artifacts graph can help rest such discussions on a more rigorous footing, by providing a graphical way to examine the whole body of evidence.
Figure 6: Step function of optimal TPQs with a robustness requirement of at least 2 (bold line). The dotted line represents the standard step function (robustness at least 1), for comparison purposes.
As mentioned before, the artifacts graph enables the researcher to obtain a global view of the chronological distribution of all artifacts in a single glance. For example, a noticeable feature in Figure 3 is a steep ascent of the step function between Strata VIII and VII. Looking at Table 4, we can associate this ascent with a gap of almost 250 years between the optimal TPQ of Stratum VIII (Amenophis III, 1391 BCE) and of Stratum VII (Ramesses IV, 1153 BCE). This gap is noteworthy because we expect succeeding strata to be close in time, with the later stratum featuring scarabs only slightly later than those of the preceding one (as for Strata IXA and VIII for example). Our gap therefore needs to be analyzed and explained.
A first way to reduce the gap is by questioning the Ramesses IV TPQ of Stratum VII. Thus Weinstein (1993, p. 221), noting that Stratum VII produced otherwise no Ramesses III inscribed material whatsoever (as opposed to Stratum VI), and that this stratum is usually dated to well before Ramesses III, deduced that the Ramesses IV scarab is an intrusion from Stratum VI. In the same way, Mazar rejected the possibility that Stratum VII was still standing in the days of Ramesses IV, considering that Stratum VI was built during the days of Ramesses III (Mazar 2010, p. 163-164).
The second possibility to reduce the gap would be to re-examine the very early TPQ obtained for Stratum VIII (Amenophis III, 1391 BCE). Early Beth Shean excavation reports indeed accepted a very early dating of the stratum, in the second half of the 15th century (Rowe 1930, p. 7), based on the interpretation of the Amenophis III scarab of Stratum VIII as a foundation deposit related to Stratum VII constructions (Rowe 1930, p. 19). Soon however, Albright (1936, p. 76-77), based on ceramic evidence, showed that Stratum VIII cannot start before the end of the 14th century, setting a chronological scheme that has prevailed to this day.
The lack of Ramesside named scarabs from Stratum VIII therefore deserves an explanation. First, it should be noted that Stratum VIII is poorly known and poorly preserved (James F.W. and McGovern 1993, p. 4; Mazar 2010, p. 161). Second, turning to stylistically-dated Egyptian scarabs from Beth Shean, one can note that Keel dates three Stratum IX scarabs to the 19th-early 20th dynasties (Keel 2010, no. 112, 125, 134), but he notes that these are most likely intrusions, since Stratum IX is normally dated to the 18th dynasty (Mazar 2010, p. 159-160).
If we indeed accept them as intrusions, let us explore the implications of their having possibly originated in the stratum above, Stratum VIII. We can then apply our rules to obtain updated TPQs and redraw the artifacts graph. Assuming that these three scarabs originated in Stratum IX or VIII, and accepting their 19th-20th dynasty dating by Keel, we can use Rule 4 to obtain the following TPQ: "Stratum VIII ends after the start of Ramesses I". Note that this TPQ holds true even if only one of the three intrusive scarabs came from Stratum VIII. The new assumption is of course merely a working hypothesis, as the possibility remains that all three scarabs originated in later strata (i.e. Stratum VII and beyond).
In the same way, considering our Ramesses IV scarab from Stratum VII as a possible intrusion from Stratum VI (see Weinstein 1993, p. 221, as discussed above), Rule 3 yields the following TPQ: "Stratum VI ends after the start of Ramesses IV" (instead of Ramesses III), and the new TPQ for Stratum VII becomes "Stratum VII ends after the start of Ramesses II" (instead of Ramesses IV). Figure 7 shows the artifacts graph resulting from these new hypotheses. We can see that the large former gap between Strata VIII and VII is now considerably reduced, and that Strata VII and VI now also exhibit a regular progression in the step function, with the Stratum VI TPQ (Ramesses IV) now being later than (rather than equal to) the Stratum VII TPQ (Ramesses II). Globally, the step function has become more regular and each stratum (except Late VI) contains a later scarab than the preceding strata. The ensuing TPQs are given in Table 7.

Figure 7: Updated TPQ step function, with the addition of three stylistically-dated Ramesside scarabs from Stratum IX (Keel 2010, no. 112, 125, 134), considered as probable intrusions from Stratum VIII, and the reassignment of the Ramesses IV scarab from Stratum VII as a possible intrusion from Stratum VI.

Table 7: Updated TPQs, with the addition of three stylistically-dated Ramesside scarabs from Stratum IX (Keel 2010, no. 112, 125, 134), considered as probable intrusions from Stratum VIII, and the reassignment of the Ramesses IV scarab from Stratum VII as a possible intrusion from Stratum VI.

Figure 8 provides the artifacts graph as a function of time. This figure highlights the paucity of named Middle Kingdom royal scarabs (2 items), and the high concentration of New Kingdom royal scarabs, from the period of Thutmosis III/Hatshepsut to Ramesses IV, corresponding to the period of the Egyptian Empire in the Levant. The figure also highlights temporal gaps, such as the long gap (1749-1479) between the Middle Kingdom and New Kingdom scarabs, and the gap in 18th-dynasty scarabs from the Amarna and post-Amarna period (see also Weinstein 1981, p. 17). A second gap (1279-1184) appears after Ramesses II but is only apparent, due to the long 66-year reign of that monarch.

Figure 8: Same data as in Figure 7, but with the points placed on the real time-scale. The scarabs of Hatshepsut and Thutmosis III of Stratum VIII have been merged, since they yield the same TPQ date (1479). The figure highlights the periods of paucity and high density of named Egyptian scarabs.

Table 8 compares the TPQs obtained above to the dating of the strata proposed by the latest excavators of the site (Mazar 2006, p. 13), based mainly on pottery typology. We can see that all our TPQs are valid, as they are indeed earlier than the excavator's dating of each stratum. Note that the gap between the TPQ and the excavator's dating for the end of a stratum varies considerably from one stratum to another. It can be reasonably small (as in Stratum VI, ca. 50 years) or much wider (as in Stratum XB, ca. 150 years). Recall from Section 3 that our TPQs provide an exact date for a stratum's end only in the theoretical and highly unlikely case where the artifact reached the stratum at the very start of the king's reign and at the very end of the stratum's lifetime. In practical cases, however, a certain time-lapse is expected between the king's accession and the artifact's manufacture, as well as between the artifact's manufacture and its arrival in the stratum of discovery, all added to a final time-lapse between the artifact's arrival in the stratum and the stratum's final demise or destruction. The sum of these three time-lapses easily accounts for all the observed gaps. Note that these time-lapses can also include cases of manufacture of an artifact after a king's death (cf. the Thutmosis III example mentioned above), or arrival of an artifact in a stratum deeper than the one it was found in (heirlooms).
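The decomposition of the gap into three time-lapses can be written out explicitly. The symbols below are introduced here for illustration only and are not part of the paper's notation; dates are counted in years BCE, so a smaller number is a later date.

```latex
% T_K : start of the reign of king K (the TPQ), in years BCE
% E_S : actual end of stratum S, in years BCE
% \delta_1 : time between the king's accession and the artifact's manufacture
% \delta_2 : time between manufacture and arrival in the stratum
% \delta_3 : time between arrival and the end of the stratum
\[
  E_S \;=\; T_K - (\delta_1 + \delta_2 + \delta_3),
  \qquad \delta_1, \delta_2, \delta_3 \ge 0
  \quad\Longrightarrow\quad E_S \le T_K .
\]
```

The TPQ coincides with the true end of the stratum only when all three terms vanish, which is the theoretical limiting case described above. The observed difference between a TPQ and the excavator's date for the end of a stratum is then simply the sum of the three terms.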
Furthermore, in the case of a plateau in the step function (as in Strata XB-IXB and Strata VI-Late VI), the value of the gap is artificially augmented by the presence of intermediate strata having no critical artifact, hence no better TPQ of their own (see for example Strata XA and IXB, which have the same TPQ of 1749 BCE as Stratum XB). As mentioned in Section 4 above, such plateaus can be handled by adding a minimum stratum duration estimate for the intervening strata. In our case, using a 30-year minimum duration estimate for Strata XA and IXB would yield an improved TPQ of 1719 BCE for Stratum XA and of 1689 BCE for Stratum IXB.
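The arithmetic behind this plateau correction can be sketched as follows. This is one possible reading of the procedure, assuming that each stratum must end at least its minimum duration after the end of the preceding stratum; the function name `apply_minimum_durations` and the dictionary layout are hypothetical and are not taken from TPQ Composer.

```python
def apply_minimum_durations(strata, tpq_bce, min_duration):
    """Tighten plateau TPQs using minimum stratum durations.

    strata       : stratum names, ordered from earliest to latest
    tpq_bce      : stratum -> TPQ in years BCE (smaller number = later date)
    min_duration : stratum -> assumed minimum duration in years (default 0)
    """
    improved = {}
    previous = None
    for stratum in strata:
        current = tpq_bce[stratum]
        if previous is not None:
            # The stratum cannot end earlier than the previous stratum's TPQ
            # plus its own minimum duration (a subtraction in years BCE).
            bound = previous - min_duration.get(stratum, 0)
            current = min(current, bound)  # keep the later of the two dates
        improved[stratum] = current
        previous = current
    return improved


# Values quoted in the text: a plateau at 1749 BCE over Strata XB, XA and IXB,
# with a 30-year minimum duration assumed for XA and IXB.
plateau = apply_minimum_durations(
    ["XB", "XA", "IXB"],
    {"XB": 1749, "XA": 1749, "IXB": 1749},
    {"XA": 30, "IXB": 30},
)
print(plateau)
```

With these inputs the sketch gives 1749, 1719 and 1689 BCE for Strata XB, XA and IXB, the figures stated above. A stratum that already has a later TPQ of its own is left unchanged, since the later of the two dates is always kept.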
This short analysis of the distribution of Beth Shean scarabs was given here as an illustration of the potential benefits of the artifacts graph as a tool for analyzing the chronological distribution of artifacts in a site. By spotting gaps, we could formulate hypotheses that, although partly speculative, helped achieve a smoother step function, with TPQs more in line with the excavator's dating of the strata.

[Table caption, truncated: ..., and with the lower bounds of the radiocarbon ranges provided by the Hebrew University excavation reports (Mullins 2007, p. 718-721 and Panitz-Cohen and Mazar 2009, p. 25-27).]

Software
We provide a software implementation of the concepts presented in this paper, as a utility called "TPQ Composer". The software is freely available on GitHub (https://github.com/Eythan31/TPQ-Composer). It produces a full artifacts graph, with the step function of optimal TPQs (similar to Figure 7, which was created with this program). It also produces a table containing the list of optimal TPQs per stratum, the robustness of each optimal TPQ and the strata containing critical artifacts (similar to Table 7). The software was written in the Python programming language (version 3.7.4), with the matplotlib library for drawing graphs and the PyQt5 library for the graphical user interface. Executing the software requires a working Python 3 installation (available for free at https://www.python.org/downloads/), with the above-mentioned libraries. A graphical user interface is provided in order to ease data entry (see Figure 9), but a script version is also available for more advanced users.

Figure 9: The "TPQ Composer" software.

The artifacts graph is generated as an image (supported image formats include JPG, TIF, PNG, PDF and SVG), and the table of TPQs is generated as a comma-separated file (CSV, see Figure 10). Both are generated in the directory from which the program is run (current directory). The user must supply the list of strata, the list of kings and the start date of each king. The user must also provide the list of artifacts by clicking on the names of the relevant strata and kings. Additional parameters available to the user include the color of the points in the artifacts graph, the color of the step function and the image resolution (DPI) of the graph. The software is open-source and is distributed under the GNU General Public License v3.0.

Conclusion
This article presented a formal framework for synchronizing strata and datable artifacts in multi-layered sites. We have shown that each artifact, in order to be assigned a safe TPQ, needs to be provided with an earliest possible reign and a latest possible stratum. This TPQ holds regardless of the precise reign and original stratigraphic attribution of the artifact, provided the artifact is not an intrusion from above to below (infiltrated artifact). We thus advocate a systematic mention of the earliest possible reign and latest possible stratum for published artifacts, whenever possible. It must also be noted that our methodology is not restricted to artifacts bearing specific royal names. In fact, it can be used for any artifacts that can be related to an independent global timescale, such as ancient coins. Note also that our Beth Shean case study was purposefully limited to scarabs bearing royal names, as it only served illustrative purposes for the methodological discussion. The site of Beth Shean, however, also features other types of datable Egyptian material such as statues, stelae, inscribed plaques, hieratic inscriptions and stylistically-dated scarabs. A full chronological study of any site should of course include all types of such datable material.
We recommend the following methodology when publishing sets of chronologically relevant stratified artifacts from a given site: (1) providing a table of artifacts featuring the earliest possible reign (or time-frame) and latest possible stratum of each artifact (cf. our Table 3), (2) providing the artifacts graph with the step function of optimal TPQs (cf. Figure 3), (3) providing the table of optimal TPQs of each stratum, with its associated robustness and list of critical artifacts (cf. Table 5), (4) if needed, providing an updated version of the above elements with stronger robustness requirements (cf. Figure 6 and Table 6). Although such a methodology may sound somewhat rigid, we believe it is important in order to bring more rigor into chronological modelling. The artifacts graph is a powerful graphical tool for visualizing the presence and absence of specific reigns in specific strata, and for identifying gaps, as shown in Section 6. Also, the identification of critical artifacts and their robustness adds significant qualitative and quantitative insight into chronological debates, at little added cost when using the graphical procedures proposed in this paper. We provided a software utility, called "TPQ Composer", freely available online (see Section 7) for drawing an artifacts graph, identifying the critical artifacts and computing the robustness of each optimal TPQ.
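Steps (1) to (4) can be illustrated with a compact Python sketch. It is not the code of TPQ Composer, and it rests on an assumed reading of the definitions: the TPQ of a stratum is taken as the k-th latest reign start among all artifacts whose latest possible stratum is that stratum or an earlier one (k being the robustness requirement), and robustness is the number of such artifacts whose reign starts at or after the TPQ. The function name `tpq_table`, the field names and the artifact labels are hypothetical. The data are a reduced subset of dates quoted in this paper (Stratum IXA and most scarabs are omitted), so the output does not reproduce Tables 5 or 6.

```python
def tpq_table(strata, artifacts, min_robustness=1):
    """Compute a TPQ, its robustness and a step flag for each stratum.

    strata    : stratum names, ordered from earliest to latest
    artifacts : (label, latest_possible_stratum, earliest_reign_start_bce)
    Dates are in years BCE, so a smaller number is a later date.
    """
    position = {name: i for i, name in enumerate(strata)}
    table = {}
    previous_tpq = None
    for i, stratum in enumerate(strata):
        # Reign starts of all artifacts from this stratum or earlier ones,
        # sorted so that the latest date (smallest BCE year) comes first.
        dates = sorted(d for _, latest, d in artifacts if position[latest] <= i)
        if len(dates) < min_robustness:
            table[stratum] = None  # not enough artifacts to support a TPQ
            continue
        tpq = dates[min_robustness - 1]
        robustness = sum(1 for d in dates if d <= tpq)
        table[stratum] = {
            "tpq_bce": tpq,
            "robustness": robustness,
            # True where the step function rises, i.e. where at least one
            # critical artifact is attributed to this stratum.
            "new_step": tpq != previous_tpq,
        }
        previous_tpq = tpq
    return table


# Illustrative subset only (assumption: not the paper's full Table 3).
STRATA = ["XB", "XA", "IXB", "VIII", "VII"]
ARTIFACTS = [
    ("scarab from XB", "XB", 1749),
    ("Hatshepsut", "VIII", 1479),
    ("Thutmosis III", "VIII", 1479),
    ("Amenophis III", "VIII", 1391),
    ("Ramesses IV", "VII", 1153),
]

standard = tpq_table(STRATA, ARTIFACTS)                  # robustness >= 1
strict = tpq_table(STRATA, ARTIFACTS, min_robustness=2)  # robustness >= 2
for name in STRATA:
    print(name, standard[name], strict[name])
```

On this subset the standard table shows the plateau at 1749 BCE over Strata XB to IXB and the 238-year jump between Stratum VIII (1391 BCE) and Stratum VII (1153 BCE) discussed in Section 6. Raising the robustness requirement to 2 moves each TPQ back to the second-latest supporting date, which is the trade-off between precision and security that step (4) is meant to expose.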
One might argue that our methodology is of limited use in the age of radiocarbon dating. As noted in the introduction, we believe otherwise, for the following reasons: (1) it is often hard to obtain quality samples originating in safe stratigraphic contexts, and such might not be available for every stratum of a site, (2) we believe it is useful to have a second absolute time-scale at our disposal, one which is independent from radiocarbon and can therefore provide a source of comparison, and even of sanity check, for reported laboratory results, (3) deriving secure TPQs can be of importance to radiocarbon modelling itself, since these TPQs can serve as priors for Bayesian modelling in order to obtain enhanced radiometric ranges (Bronk Ramsey 2009). We would like to add here that radiocarbon does not always provide better TPQs than the historical approach advocated for here, as can be seen in Table 8 for the case of Beth Shean. Here, only in Stratum IXB was the historical TPQ much earlier than the 2σ radiocarbon TPQ, a case easily explained by the fact that this stratum did not have an optimal TPQ of its own, but rather inherited its TPQ from the much earlier Stratum XB. In the case of Strata XA and VII however, the historical TPQ is only 19 years earlier than the 2σ radiocarbon TPQ, and in the case of Strata IXA and VI, the historical TPQs actually improve the 2σ radiocarbon TPQs by over a century. Even for the 1σ radiocarbon range, the historical TPQs of Strata IXA and VI still significantly improve the radiocarbon TPQs (by 94 years for Str. IXA and 42 years for Str. VI). We believe that the historical and radiometric approaches, far from excluding each other, are rather complementary, and, whenever possible, should be conducted in parallel.
This paper presented our framework in the case of a sequence of stratigraphic phases (the Beth Shean stratigraphic sequence for example) rather than basic stratigraphic units (depositional layers). One could wish to apply our framework at a lower level, on the depositional layers themselves, rather than on phases. This however is only possible if the stratigraphic sequence of layers is unilinear, as opposed to multilinear (see Harris 1989, p. 128-129). Our framework indeed assumed unilinear sequences, in order to enable visualization of the data in a two-dimensional graph. We leave as future work the generalization of this framework to multilinear sequences.
This paper was intended to provide guidelines for a clean and rigorous approach to termini post quem derived from collections of stratified artifacts. The methodology can serve both for site-specific discussions, as presented here for the Beth Shean scarabs, and for regional studies integrating several sites. In the latter case, one could use archaeological periods (e.g. LB IA, LB IB, …) on the horizontal axis of our graphs, rather than strata, and thus include material (scarabs for example) from several sites, in order to increase the size of the data set, to improve robustness and to derive global TPQs for archaeological periods.

Notes
1 A Stratum "Late VII" usually also appears among the Beth Shean strata. It has been omitted from our subsequent tables and figures since, as noted by the latest excavators of the site, it "should not be regarded as an independent stratum, but rather as representing minor changes in the architecture of Level VII, while other buildings in the city, like the temple, continued to be in use." (Panitz-Cohen and Mazar 2009, p. 5). In any event, none of the named scarabs retained in this study originated in Stratum Late VII.
2 The University of Pennsylvania Museum excavations use the designation "level", while the Hebrew University of Jerusalem excavations use the term "stratum". This paper will use "stratum" throughout, for the sake of uniformity.
3 Note that optimal TPQs do not always have a robustness of 1. In some cases, a corner of the step function can correspond to more than one artifact, yielding directly a robustness of 2 or more. In other cases, two consecutive strata might have one artifact each that are situated on the same horizontal line of the step function, yielding a robustness of 2 for the later stratum.