Relativity and Acceleration
I wrote this piece around 1996. I adopted a slightly unusual approach which may be easier to understand for those with an aversion to the more mathematical explanations. I had planned to add a second article about the General Theory, just to see if I could go further with a similar approach, but this never materialised. I have recently found a reasonably understandable introduction 'The Meaning of Einstein's Equation' (a pdf file) by John Baez and Emory Bunn which I am sure is better than anything I would have written, so for now that is my best recommendation for further reading.
It is sometimes stated that acceleration can only be treated by the General Theory, and it may be a surprise to some readers that Special Relativity is perfectly adequate for this purpose. The General Theory alone, however, gives the relationship between gravity and matter in the equation G = 8 pi T.
(Physicists have a habit of defining their units to make equations look simpler, so it is common to find apparently different versions of an equation; for example, 'Einstein's Equation' can become G = T.)
Simple treatments of acceleration are hard to find. For example, in the book 'Gravitation' by Misner, Thorne and Wheeler there is something similar to the extended reference frame for an accelerating observer developed here, but it goes under the name of 'a Fermi-Walker transported tetrad' and occurs in chapter 6 of a 1279 page book written at a fairly advanced level.
This is an attempt to present an unusual treatment of the subject, avoiding the more common examples. Nowhere are there observers on trains watching flashes of light! Light is avoided almost completely to highlight the fact that the theory concerns the properties of space and time, and the properties of light are not of central importance.
First, here is a diagram of an extended reference frame which could in principle be constructed by an accelerating observer, drawn as seen by a non-accelerating observer. A full derivation of this can be found in standard text books such as 'Gravitation' but instead of this mathematical approach an explanation is attempted which gives some idea of the reason for this form of reference frame.
In general the effect of acceleration is zero when the accelerated object is observed from an inertial reference frame. It is only when trying to explain a situation from the point of view of an observer undergoing acceleration that its effects must be taken into account. An observer in an inertial reference frame can describe everything in terms of velocity using Special Relativity. In high energy physics Special Relativity is used to analyse particles in collisions involving accelerations of more than 10^30 earth gravities. It is nevertheless interesting and instructive to consider what the universe looks like to an observer undergoing acceleration, so this will be covered later.
Before we can make use of the transformation equations of relativity theory it is necessary to understand what is meant by a reference frame and how to construct one. As I write I am sitting in a rectangular room, and I have several clocks and a ruler. If I want to describe the location of an object I can conveniently define coordinate axes x, y and z to coincide with three edges of the walls, and define one corner of the room as position zero from which all other positions can be measured. There is nothing unique about the choice of axes. Any choice of positive x and y directions can be transformed into any other choice by rotation, but for any given x and y axes there are two possible positive z directions to choose from, and one can only be transformed into the other by a mirror reflection, not by a rotation. There is a convention as to which of the two possibilities is used, and this is a right-hand screw rule such that a rotation of a right-hand thread screw through 90 degrees from +x to +y will give motion of the screw in the +z direction. To put it another way, if I face a wall of my room, and the bottom left-hand corner is the point x = y = 0 and x is from left to right along the bottom edge of the wall, while y is upwards along the left hand side of the wall, then +z is the direction vertically out towards me from the bottom left corner.
It is important to stress that the transformation equations we are about to consider describe the relationship between two events, not between two objects. An object is something which exists more or less unchanged over a period of time. An event of the idealised sort being considered here happens at one point in space and at one point in time. An object may have a reference frame in which it is at rest. An event, however, can not be said to be at rest, or moving, or to be associated in any way with one reference frame more than any other. If I choose the origin of my reference frame to be the corner of my room, and place a clock there with a reading in seconds starting from zero at some instant, then I can define an event as being at the origin at the instant when the clock starts from zero, i.e. at x = y = z = t = 0.
We now consider another reference frame constructed by a different observer moving at velocity v relative to myself. For ease of analysis it is convenient to choose the same directions for x, y and z in the two frames, and to choose v to be in the +x direction, and also to use the same event as the origin of the second frame. This is illustrated at t = 0 in Fig.2a and at a later time in Fig.2b. Only the x and y directions are shown. The event is not shown in Fig.2b, because of course it does not occur at the later time. The time also is not shown.
It is tempting to write t = t' = T where T is the reading on my clock at the later time. In Newtonian, or pre-relativistic physics this would be valid, but it is based on an assumption that there is a universal time and that observers moving with different velocities would agree on the time.
There is some difficulty in even saying t = T. I mentioned only a clock at the origin of the space coordinates, i.e. at x = y = z = 0. How are we to measure time at other locations? I have another identical clock and I can place this at another location. The clocks must be designed so that they are not affected by local conditions such as temperature or atmospheric pressure which may differ at some locations.
Gravity is a more serious problem because it is a prediction of General Relativity that clocks at different gravitational potentials run at different rates, so we will assume that there are no gravitational effects to worry about in the region we are concerned with. For similar reasons we will assume for now there is no acceleration involved.
It could be objected that there are gravitational fields everywhere, so ignoring them is not realistic, but in any location we can find a reference frame with no gravitational field: this is just the frame where we are in 'free fall'. To experience such a frame we need to go out beyond the earth's atmosphere to avoid air resistance, and just fall, or more safely be in an orbiting space station. There are still some very small 'tidal forces' because the earth's field is not uniform (it converges towards the centre of mass of the planet), but these forces are already small and can be made smaller either by adopting a higher orbit or restricting our observations to a smaller volume of space. Having found one almost perfectly inertial frame we can then define as many as we wish because any frame moving at constant velocity relative to this frame is also inertial.
Having eliminated such effects there is still the question of how to synchronise the clocks. I could just look at the clocks and read the time, but there is then a time delay for light to travel from the clocks to my eye. If I know enough about the velocity of light I could compensate for this, but I will avoid making any assumptions about light velocity, and instead use a different method which makes a different assumption.
If I wish to synchronise clocks at locations A and B 10 metres apart, suppose I measure a point mid-way between these locations, and start with two clocks at this point. Both can be started from zero at the same time and it can be checked that they run at the same rate. Then both clocks can be transported symmetrically in opposite directions at identical speeds to the points A and B. There is at least one assumption made, that the relevant laws of physics are independent of direction, so transporting a clock in one direction at a certain speed will have the same effect as transporting it at the same speed in another direction.
So now I have two synchronised clocks at A and B. Or do I? Although the clocks were moved at the same speeds in my own reference frame, this is not so for all observers. Someone moving along with one of the clocks would say that this clock was stationary, and the other clock was moving. This observer could not then use the same reasoning to conclude that the clocks were still synchronised when they reached A and B. So, all I can say with any confidence is that the clocks at A and B are synchronised in my own frame.
Actually, it turns out that there is another method of synchronising clocks if Special Relativity is correct. The prediction is that if we transport a clock from A to B there will then be a discrepancy in the reading on this clock which depends on the velocity at which it is transported. The change in the rate at which it runs is proportional to v^2 at low velocities, while the duration of the journey is proportional to 1/v. It follows that the final discrepancy is proportional to v, and we can therefore reduce it to as low a value as we wish just by reducing the velocity of transport. Simply carrying our clocks to their final destination slowly enough is all we need do to keep them synchronised to any required degree of accuracy.
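The scaling argument for slow clock transport can be checked with a few lines of Python. This is only a sketch under stated assumptions: units are chosen so that c = 1, the clock moves at constant speed, and the function name transport_lag is my own invention for illustration. It uses the usual Special Relativity rate factor SQRT(1 - v^2/c^2) for the moving clock.

```python
import math

def transport_lag(distance, v, c=1.0):
    """Time by which a clock carried `distance` at speed v falls behind clocks at rest.

    Hypothetical helper: assumes constant speed and the rate factor
    SQRT(1 - v^2/c^2) for the moving clock.
    """
    travel_time = distance / v              # duration of the journey, proportional to 1/v
    rate = math.sqrt(1.0 - (v / c) ** 2)    # rate of the moving clock relative to clocks at rest
    return travel_time * (1.0 - rate)

# Carry the clock one light second at ever lower speeds.
for v in (0.1, 0.01, 0.001):
    lag = transport_lag(1.0, v)
    print(f"v = {v}: lag = {lag:.3e} sec, lag / v = {lag / v:.4f}")
```

The ratio lag / v stays close to 0.5 as v is reduced, so the discrepancy itself shrinks in proportion to v.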
There are situations in which reducing the velocity towards zero does not also reduce the final error towards zero. A clock carried around the perimeter on a rotating disc will have an error after returning to the starting point on the disc compared to a non-transported clock which reduces towards a non-zero value as the velocity of transport relative to the disc is reduced. The properties of rotating discs will be covered later.
So, now I can place synchronised clocks at any location in my reference frame and assign coordinates x and t to any event P. The origin, O is itself an event at x = t = 0, and the coordinates of P are then just the relationship between the two events, O and P, as measured in my frame.
Another observer moving at velocity v in the x direction, having followed a similar procedure to set up his own coordinate system using what he believes to be synchronised clocks and using his own rulers to measure distance, can use the same event O as his origin, and can then assign coordinates x' and t' to the event P.
In Newtonian physics the equations connecting these different sets of coordinates are:
x' = x - vt and t' = t
Looking again at Fig.2b:
The moving frame has moved distance vt at time t, so the event P is that distance closer to the origin, so that x' = x - vt. These equations are what our normal everyday experience tells us is the correct way to transform between the two reference frames. But there is no guarantee that this is still correct in situations outside our normal experience. Maybe at high velocities something different happens.
There are several requirements we can make for the correct equations to ensure there are no logical inconsistencies, and first of all, to avoid disagreement with experience they must agree with the Newtonian equations over the range of velocities and precision of measurement encountered in everyday life. Secondly, they must work in either direction, to transform from x and t to x' and t', or equally to transform back from x' and t' to x and t, with the only difference that the relative velocity v is of opposite sign for the reverse transformation. This follows from the sort of symmetry argument used earlier, as does the fact that observers in both frames agree on the magnitude of v. Imagine a third observer moving in such a way that he sees the other two moving symmetrically in opposite directions at velocities +u and -u. If either of these 'moving' observers required a different set of transformation equations, or measured a different magnitude of relative velocity, then this would provide a way of distinguishing one direction from another by observing the properties of space and time. I have started from the postulate that this is not possible. The same sort of symmetry argument leads to other requirements, e.g. that one point in one frame corresponds to only one point in the other frame.
One more constraint must be added before we arrive at the final result, and this is the central postulate of Special Relativity, that the fundamental laws of physics are the same in all inertial reference frames. The term 'inertial reference frame' simply means a frame with no gravitational or acceleration effects, as specified earlier.
The question of which laws are fundamental is crucial. If we choose Newton's laws of mechanics then x' = x - vt and t' = t leave these invariant, i.e. they are the same laws in both frames. If we choose Maxwell's equations of electromagnetism, then different transformation equations must be used. These are the Lorentz coordinate transformation equations:
x' = ( x - vt ) / SQRT( 1 - v^2/c^2 )
t' = ( t - vx/c^2 ) / SQRT( 1 - v^2/c^2 )
These equations do not leave Newton's laws invariant, but it was realised by Einstein that this could be rectified by adding another transformation equation for the mass, m:
m' = m / SQRT( 1 - v^2/c^2 )
c is a physical constant with the value 2.9979... x 10^8 m/sec, and is best known as the velocity of light in a vacuum. SQRT indicates the square root.
The Lorentz equations have the property of reducing to x' = x - vt and t' = t if v is small compared to c, and also work in reverse to transform back to x and t as required, and one point in one frame corresponds to just one point in another frame. All the requirements specified earlier are satisfied.
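These properties are easy to check numerically. The following is a minimal sketch, assuming units with c = 1 unless stated otherwise; the function name lorentz and the sample numbers are my own choices for illustration. It transforms an event, transforms it back with the sign of v reversed, and compares with the Newtonian equations at a velocity that is tiny compared to c.

```python
import math

def lorentz(x, t, v, c=1.0):
    """Lorentz transformation of an event (x, t) to a frame moving at velocity v along +x."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    x_prime = (x - v * t) / root
    t_prime = (t - v * x / c ** 2) / root
    return x_prime, t_prime

# Forward transformation, then the reverse transformation with velocity -v.
x, t, v = 3.0, 5.0, 0.6
x_prime, t_prime = lorentz(x, t, v)
x_back, t_back = lorentz(x_prime, t_prime, -v)
print("forward:", x_prime, t_prime)
print("back:   ", x_back, t_back)

# At a velocity of one ten-millionth of c the result is very close to x' = x - vt, t' = t.
slow = 1e-7
print("slow:     ", lorentz(100.0, 10.0, slow))
print("Newtonian:", (100.0 - slow * 10.0, 10.0))
```

The reverse transformation needs no separate formula, only the opposite sign of v, which is the symmetry requirement stated earlier.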
Since the early development of Special Relativity the emergence of quantum theory has changed opinions about what are the fundamental laws, but still Relativity survives. The ultimate test of whether it is true or false must be experiment, and this will not be considered here. I am concerned only to investigate the properties of space and time as predicted by the theory, in particular when acceleration is involved.
It is well known that General Relativity predicts curved space-time geometry in the presence of gravity, but less well known that Special Relativity makes the same prediction in the presence of acceleration. The way to deal with acceleration is to divide space and time into small blocks with dimensions dx, dy, dz and dt. Provided the acceleration is finite, then as dt is reduced towards zero the change in velocity in time dt also reduces towards zero, and the velocity can be treated as constant, and the Lorentz equations can be applied. We can then add, i.e. integrate, these blocks of space-time to try to determine what is happening to our accelerating observer over larger distance and time intervals.
Curved space-time is difficult to visualise, but a more familiar problem with some similarity is the mapping of the earth's surface. If we consider only a very small area of the earth's surface such as a small town then we can draw a flat, rectangular map with only very small errors resulting from the earth's curvature. We could map a strip 1 mile wide round the equator by adding together maps of 1 mile squares side by side and the overall error would still be small. If we try to extend this to draw a flat map of the whole of the earth's surface however, we run into serious trouble. The result is always distorted in some way, and several methods are used to draw world maps, with names like 'Mercator' and 'Mollweides interrupted homolographic'. In the Mercator system the poles appear as lines with the same length as the equator, while in the Mollweides system the poles appear as a number of separate points. If we are only interested in integrating along the path of an observer, or mapping a narrow strip of the surface of a sphere, then the overall geometry is less of a problem.
Next a number of examples will be given, with diagrams, starting with uniform velocity, then 'instantaneously' changing velocity, which is a useful stepping-stone to constant acceleration. For now here is a demonstration of the best known result of the Lorentz equations:
Consider a short pulse of light emitted from one point, called event O, and arriving at another point some time later, called event P. For simplicity use units of seconds for time and light seconds for distance, so that c = 1 light second per second. Event O will be used as the origin of both my own coordinate system, and another moving with velocity 0.866c relative to me. This gives a value for SQRT( 1 - v^2/c^2 ) of 0.5.
Suppose event P occurs at x = 1, t = 1 in my frame, then in the other frame we find by substitution in the Lorentz equations that x' = ( 1 - 0.866 )/0.5 = 0.268 and t' = ( 1 - 0.866 )/0.5 = 0.268. Velocity = x' / t' = 1. The light pulse travels a shorter distance in a shorter time, but its velocity is still the same.
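The same substitution can be repeated for other velocities. This is a small sketch, assuming c = 1 and a helper called lorentz that I have named myself; it transforms the event x = 1, t = 1 on the path of the light pulse and prints the velocity x' / t' seen in the moving frame.

```python
import math

def lorentz(x, t, v, c=1.0):
    """Lorentz transformation of an event (x, t) to a frame moving at velocity v along +x."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (x - v * t) / root, (t - v * x / c ** 2) / root

# The worked example: v = 0.866c is SQRT(3)/2, which makes the square root factor 0.5.
v = math.sqrt(3) / 2
x_p, t_p = lorentz(1.0, 1.0, v)
print(round(x_p, 3), round(t_p, 3), x_p / t_p)

# Other relative velocities change x' and t', but not their ratio.
for other_v in (0.1, 0.5, 0.99):
    xp, tp = lorentz(1.0, 1.0, other_v)
    print(other_v, round(xp, 3), round(tp, 3), xp / tp)
```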
At the end of the previous part an example was worked out for a Lorentz transformation with v = 0.866c giving 1 / SQRT( 1 - v^2/c^2 ) = 2. This is sometimes referred to as the 'time dilation factor', so it may seem strange that the result obtained was that events O and P, originally 1 sec. apart in one reference frame, are only 0.268 sec. apart in the other frame. Hardly a time dilation of 2. It will be shown later under what circumstances the time dilation factor is applicable. The result from the example in the previous part is shown in the following diagrams:
To investigate further the relationship between the two frames consider a small region of space and time near the origin in frame A with coordinates x and t extending from -2 to +2, and draw vertical and horizontal lines at 1 unit intervals as in Fig.4a. This square grid can be plotted in frame B by working out the corresponding coordinates x' and t' for events on the grid using the Lorentz equations. Continuing with units of seconds and light seconds to give c = 1 and with v = 0.866c:
x' = 2( x - 0.866t ) and t' = 2( t - 0.866x )
The grid can be plotted in the B frame to show how an observer in frame B sees the 'square' grid observed in frame A. The result is shown in Fig.4b.
Two events R and S are shown in both diagrams. The coordinates of S in the A frame are x = 2, t = 2, and in the B frame the same event is at x' = 0.536, t' = 0.536. This can be seen to be similar to the transformation between Fig.3a and Fig.3b (except with all coordinates multiplied by 2).
The observer in frame B can plot his own square grid from -2 to +2 in his own frame as in Fig.5a. The corresponding grid can be plotted in the A frame using the Lorentz equations again, but this time with relative velocity v = - 0.866c to give: x = 2( x' + 0.866t' ) and t = 2( t' + 0.866x' ). Plotting the result in frame A gives Fig.5b.
The symmetry of the transformations can be clearly seen in these diagrams. It is only the reversed sign of the relative velocity which reverses the diamond shaped grid in Fig.4b along either the x or the t axis to give Fig.5b.
An interesting property of these stretched versions of the square space-time grid is that the area enclosed is unchanged. The grid from -2 to +2 in Fig.4a has an area of 16 units. The stretched grid in Fig.4b or Fig.5b also has area 16. The y and z coordinates are the same in both frames, and it follows that any region of space-time of any shape or size also has the same space-time volume in any inertial reference frame. This appears to be one of the less well known invariances of relativity, or perhaps it is just not very useful.
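The area claim can be verified by transforming the four corners of the square grid and measuring the resulting diamond. This is a sketch only, with c = 1; the helper names lorentz and polygon_area are mine, and the area is computed with the standard 'shoelace' formula for a polygon.

```python
import math

def lorentz(x, t, v, c=1.0):
    """Lorentz transformation of an event (x, t) to a frame moving at velocity v along +x."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (x - v * t) / root, (t - v * x / c ** 2) / root

def polygon_area(points):
    """Shoelace formula for a simple polygon given as (x, t) vertices in order."""
    total = 0.0
    for (x1, t1), (x2, t2) in zip(points, points[1:] + points[:1]):
        total += x1 * t2 - x2 * t1
    return abs(total) / 2.0

# Corners of the grid from -2 to +2 in frame A, taken in order round the square.
square = [(-2.0, -2.0), (2.0, -2.0), (2.0, 2.0), (-2.0, 2.0)]
v = math.sqrt(3) / 2    # 0.866c
diamond = [lorentz(x, t, v) for x, t in square]

print("corners in frame B:", [(round(x, 3), round(t, 3)) for x, t in diamond])
print("area in frame A:", polygon_area(square))
print("area in frame B:", polygon_area(diamond))
```

Because the transformation is linear, checking the corners is enough: straight grid lines remain straight, and every cell of the grid is scaled by the same factor, which here is 1.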
Next we move on to consider objects rather than merely events. While events are represented by points in our space-time diagrams, points on objects are represented by lines. This is where we encounter the 'length contraction' and 'time dilation' effects. The most convenient objects to use are rulers and clocks. A point object can be represented by a line such as OR in Fig.6a (in frame B). This object is at rest in frame B at the location x' = 0, and so it remains at this value of x' for all values of t'. Another point object represented by ST is at rest 1 unit of distance away from the first object. If the objects remain at rest then the lines OR and ST extend to infinity. We could place a 1 unit ruler between O and S and then the two vertical lines represent the paths of the two ends of the ruler through time, measuring a constant unit of distance from O to S, from R to T, and so on. The ruler is shown as a thick line at t' = 0 and a dotted line 1 sec. later. The square ORTS can be transformed to frame A, exactly as done previously, now shown in Fig.6b.
If I am at rest in frame A, and I observe the ruler, then I see it moving past me in the +x direction. At t = 0 I see the two ends at O and P, 0.5 units apart, so I naturally claim that the ruler has a length of 0.5 units. In general, for relative velocity v, and a ruler of length L' in frame B, I will say the length in my A frame is L' x SQRT( 1 - v^2/c^2 ). An observer in frame B could object that I was measuring the location of one end of his ruler at one time, and the location of the other end at a different time, and furthermore he would say the ruler I was using for my measurements was not 1 unit in length. If we drew the reverse transformation for my own ruler, we would see that the observer in frame B would claim that it measured only 0.5 units. So, there is a combination of two effects, in time and space. If we simply stated that 'moving objects are contracted' then this would seem to be a paradox, because different observers disagree about which objects are moving, so we would get the result that ruler A is shorter than ruler B, but ruler B is shorter than ruler A. It is the two combined transformations of space and time which prevent this paradox.
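Both halves of this argument can be put into numbers. The sketch below assumes c = 1 and v = 0.866c, and the function names position_in_a and time_in_b are invented for illustration. It finds where the two ends of the B frame ruler are at t = 0 in frame A, and then asks what time the B frame observer assigns to each of those two measurements.

```python
import math

V = math.sqrt(3) / 2               # relative velocity 0.866c, with c = 1
ROOT = math.sqrt(1.0 - V ** 2)     # SQRT( 1 - v^2/c^2 ), which is 0.5 here

def position_in_a(x_prime, t):
    """Frame A position, at frame A time t, of a point at rest at x_prime in frame B."""
    # From x' = ( x - vt ) / SQRT( 1 - v^2/c^2 ), solved for x.
    return x_prime * ROOT + V * t

def time_in_b(x, t):
    """Frame B time of the event at (x, t) in frame A."""
    return (t - V * x) / ROOT

# Ends of the 1 unit ruler, at rest in frame B at x' = 0 and x' = 1, located at t = 0 in frame A.
left = position_in_a(0.0, 0.0)
right = position_in_a(1.0, 0.0)
print("length measured in frame A:", right - left)

# The B frame observer's objection: the two measurements were not made at the same time t'.
print("t' of left end measurement: ", time_in_b(left, 0.0))
print("t' of right end measurement:", time_in_b(right, 0.0))
```

The two measurements are simultaneous in frame A but separated by about 0.866 sec. in frame B, which is exactly the objection the B frame observer raises.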
Note that in Fig.6b, if I draw lines MN and OQ to represent the ends of a ruler 2 units long in my A frame, then it will match up with the 1 unit B frame ruler OS at t' = 0, and the B frame observer will therefore claim that my 2 unit ruler is only 1 unit long, i.e. 1 unit measures only 0.5 units. The length contraction factor of 0.5 from A to B, and from B to A are both therefore represented in the one diagram.
Time dilation is also represented in Fig.6b. If a clock is at rest at the left hand end of the ruler in frame B, then in Fig.6a it follows the path OR and reads 0 sec. at O and 1 sec. at R. In Fig.6b it reads 0 at O and 1 at R. Seen in frame A events along QR are simultaneous, so the time is 2 sec. when event R occurs. The observer in frame A therefore says that the clock, which according to him is travelling at 0.866c, is running slow. After 2 sec. it only reads 1 sec. According to the B frame observer however, it is events along the line RT which are simultaneous. Extending this line back to the t axis gives the point U at t = 0.5. A clock at rest at x = 0 in the A frame, reading 0 sec. at t = 0, would read 0.5 at U, but the B frame observer will say that 1 sec has passed because U occurs at the same time as R according to him, and so the A frame clock is running slow. Each observer therefore says that the other observer's clock is running slow by a factor of 0.5. We could start with a ruler at rest in frame A and draw the corresponding transformation to the B frame, but this adds nothing new which cannot be deduced from Figs 6a and 6b.
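The readings quoted for events R and U follow directly from the Lorentz equations. Here is a short sketch, again with c = 1 and v = 0.866c; the function names a_to_b and b_to_a are my own labels for the forward and reverse transformations.

```python
import math

V = math.sqrt(3) / 2                      # relative velocity 0.866c, with c = 1
GAMMA = 1.0 / math.sqrt(1.0 - V ** 2)     # the 'time dilation factor', 2 here

def a_to_b(x, t):
    """Frame A coordinates (x, t) to frame B coordinates (x', t')."""
    return GAMMA * (x - V * t), GAMMA * (t - V * x)

def b_to_a(x_prime, t_prime):
    """Frame B coordinates (x', t') to frame A coordinates (x, t): same equations with -v."""
    return GAMMA * (x_prime + V * t_prime), GAMMA * (t_prime + V * x_prime)

# Event R: the B frame clock at x' = 0 reads 1 sec.
x_r, t_r = b_to_a(0.0, 1.0)
print("R in frame A: x =", round(x_r, 3), " t =", round(t_r, 3))

# Event U: the A frame clock at x = 0 reads 0.5 sec.
x_u, t_u = a_to_b(0.0, 0.5)
print("U in frame B: x' =", round(x_u, 3), " t' =", round(t_u, 3))
```

R occurs at t = 2 in frame A although the B clock reads 1, and U occurs at t' = 1 in frame B although the A clock reads 0.5, so each observer finds the other's clock slow by the same factor.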
The next step along the road to our final goal, to describe acceleration, is to consider a change of velocity, without concerning ourselves too much with the actual change. Imagine a 1 unit ruler initially at rest in frame A until time t = 0, which is then accelerated more or less instantaneously up to velocity 0.866c. If the left hand end changes velocity at t = 0 then we would expect that the other end must accelerate at the same time. The ruler would then keep the same length as seen in the A frame. Unfortunately, in the rest frame of the ruler itself, its length would now be doubled, so it would have a real physical change in length and would have to stretch or break. To avoid this we must delay the acceleration of the right hand end, as shown in Fig.7, to the point E so that the length in frame A is reduced to half a unit. Clearly something unpleasant will again happen to the structure of the ruler during this process, but for now we are not concerned about this minor detail, we have the correct result before and after the acceleration.
If an observer travels along with the ruler, and places clocks at each end which are initially synchronised in frame A, then after the change in velocity he will say that events O and F are simultaneous. The clock at the right hand end has travelled an additional path D to E to F, while the left hand end has merely changed its velocity in practically zero time. The right hand clock therefore reads a later time after the acceleration, as seen by the accelerated observer. Consider now a change in velocity from -0.866c to +0.866c, and plot the paths of various points along the ruler as in Fig.8.
It is now only one small step to carry out the acceleration in a more realistic manner so that no real physical stretching or compression occurs and our ruler can survive the experience undamaged. Instead of changing the velocity in one step to 0.866c, now change it by a small step dv, and repeat this step over and over, at each step maintaining the length of the ruler at 1 unit in its own rest frame. Reducing the step size towards zero leads to a continuous acceleration. Looking back to Fig.1 at the start of this article, a diagram was shown for this sort of constant acceleration. The similarity to Fig.8 can be seen if the diagram is rotated 90 degrees anti-clockwise to match up the x and t axes. For constant acceleration the velocity will start at -c in the infinite past, and reach +c in the infinite future. The properties of this constant acceleration will be examined next, together with its application to the 'twins paradox'.
The Twins Paradox
We start this time with a closer look at two diagrams shown before:
Fig.9a represents a ruler OP at rest at time t=0, and accelerated in one step up to velocity 0.866c, having previously been brought to rest from a similar velocity in the opposite direction. Performed in this manner the velocity change will cause severe structural damage to the ruler, but the length is correct both before and after the change, as seen in the rest frames of the ruler. To avoid damage to the ruler the velocity change can be carried out in small steps dV, at each step keeping the ruler at its original length as seen in its own rest frame, then reducing dV towards zero to give a continuous acceleration as in Fig.9b, so that ruler OP becomes OQ, then OR as seen by an observer moving along with it. Placing one end of the ruler at O is unrealistic as this requires the end to undergo infinite acceleration, changing instantaneously from -c to +c as seen in the inertial reference frame of Fig.9b. In Fig.9a this sort of problem is not so important because this is not intended to be physically realisable, it is just a non-essential step in our analysis included to aid understanding of the derivation of Fig.9b. But Fig.9b is applicable to real physical situations so we must be careful to stick to what is possible. It is better therefore to use a ruler SP so that only finite accelerations need be used to maintain its correct length. The observer travelling with the ruler sees it as SP, then UQ, then WR. An observer at rest in the inertial reference frame will see the ruler at rest as SP, and later with a high velocity as NR.
In Fig.9a the end of the ruler at O changed velocity instantaneously, but the other end travelled the additional path from P to Q to R, and so a clock at that end would read a later time after the acceleration. This is true for each change of velocity dV in Fig.9b, so clocks at different positions along the ruler will run at different rates according to the observer travelling with the ruler. The clock at S will be seen to be running slower than the clock at P.
The curves in Fig.9b are hyperbolae, and each curve represents the path of an object with a different acceleration, and is given by the equation x^2 - t^2 = k^2. When t = 0, x = k, so k is the position where the corresponding curve crosses the x axis. Differentiating gives the velocity = t / x, and differentiating again gives the acceleration at t = 0 as 1 / k. The units of time are seconds, distance is in light seconds, and so acceleration is in light seconds per sec^2.
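For anyone who prefers to check the differentiation numerically, here is a minimal sketch. It assumes units with c = 1 and the branch of the curve with positive x; the function names and the sample values k = 4 and t = 3 are arbitrary choices of mine. Derivatives are estimated with simple finite differences, so the agreement is approximate.

```python
import math

def x_of_t(t, k):
    """Position on the curve x^2 - t^2 = k^2 (positive x branch)."""
    return math.sqrt(k * k + t * t)

def velocity(t, k, h=1e-6):
    """Central difference estimate of dx/dt."""
    return (x_of_t(t + h, k) - x_of_t(t - h, k)) / (2.0 * h)

def acceleration(t, k, h=1e-3):
    """Central difference estimate of the second derivative of x with respect to t."""
    return (x_of_t(t + h, k) - 2.0 * x_of_t(t, k) + x_of_t(t - h, k)) / (h * h)

k = 4.0
t = 3.0
print("velocity:", velocity(t, k), " compare t / x =", t / x_of_t(t, k))
print("acceleration at t = 0:", acceleration(0.0, k), " compare 1 / k =", 1.0 / k)

# The velocity t / x approaches, but never reaches, c = 1 at late times.
for late_t in (10.0, 100.0, 1000.0):
    print(late_t, late_t / x_of_t(late_t, k))
```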
To show what this means at a realistic acceleration consider an acceleration of one earth gravity, about 9.8 m. per sec^2. The corresponding value of k is about one light year, i.e. 10^16 metres. A practical result of this is that light emitted from an event more than a light year away will never catch up with us if we continue to accelerate away from it at one earth gravity. We could interpret this as the equivalent of a black hole one light year away from which no light can ever reach us. This is illustrated in Fig.9b where light emitted from point A never enters the accelerated reference frame. This is what we would expect from the principle of equivalence. Phenomena predicted for gravitational fields should generally also occur for an accelerating observer, so the equivalent of black holes, or 'event horizons' is not surprising. The acceleration corresponding to each curve is given by 1 / k and so is inversely proportional to the distance of the curve from the origin. Applying the principle of equivalence to this suggests that the accelerating frame is equivalent to a gravitational field inversely proportional to the distance from the point O. The fields of more or less spherical objects like stars or planets are inversely proportional to distance squared, so our extended frame is not a representation of these familiar fields. What it could represent is the field of a long cylindrical mass, which we could call a fibre, or perhaps a string. Bearing in mind the current string theories in which strings are the most fundamental objects in the universe, it does seem interesting that the use of the Lorentz transformation equations has led us to the gravitational field of a string as in some sense the most fundamental accelerated reference frame. Whether this is significant or just coincidental I have no idea.
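The figure of about one light year comes from converting acceleration = 1 / k back to ordinary units, where it reads acceleration = c^2 / k. This is a quick sketch of the arithmetic; the constant names are mine, and the year is taken as 365.25 days, which is an assumption about the definition of the light year.

```python
C = 299_792_458.0                        # velocity of light in m/sec
G_EARTH = 9.8                            # one earth gravity in m per sec^2
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # assumes a year of 365.25 days

# In ordinary units the acceleration on the curve through x = k is c^2 / k, so k = c^2 / a.
k_metres = C ** 2 / G_EARTH
light_year = C * SECONDS_PER_YEAR
k_light_years = k_metres / light_year

print(f"k = {k_metres:.3e} metres")
print(f"k = {k_light_years:.3f} light years")
```

The result is a little under one light year, in line with the round figure of 10^16 metres.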
What, you may ask, does all this have to do with the twins paradox? The important point about this 'paradox' is that one of the twins undergoes an acceleration, but the other does not. The understanding of the effects of acceleration is therefore crucial. We will consider a specific example, illustrated in Fig.10.
One twin, Paul, stays at rest in an inertial reference frame following the path OP, i.e. he remains at x = 0 for a period of 42 years, while his twin brother Quentin travels several light years at velocity 0.866c, returning to the point P to find his brother Paul 42 years older. Quentin, however, has only aged by 21.5 years. We have added half a year to allow for the period of deceleration and acceleration when changing direction at the far end of the journey, but apart from this Quentin spent most of the journey with a time dilation factor of 2 and aged at only half the rate of Paul. When Quentin was travelling at 0.866c he would claim that it was Paul who was moving relative to him at this speed, so Paul would be ageing at half Quentin's rate. This is the reason why this is often referred to as a paradox, because each twin should apparently claim that the other is travelling at high speed and ageing slower, and yet this is impossible. When they finally meet at the end of the journey only one of them can be younger.
Note that there is no need to complicate this picture by worrying about an acceleration and deceleration by Quentin at the start and end of his journey at O and P respectively. I have intentionally extended the paths beyond these points so that any such changes in velocity occur somewhere to the left of the diagram where we are not interested in what happens because we are not comparing any time measurements there. We only need the two observers to set their clocks to zero as they first pass each other, and to check the readings as they again pass at the end of the journey. It would perhaps have been better to just talk about 'two observers with clocks', but the problem is widely known as the 'twins paradox' so I kept this relationship. Getting the twins to the starting point O with the 0.866c relative velocity however could have already caused some differential ageing depending on how it was achieved, so we need to remember we are considering their ageing following event O rather than necessarily their ages since birth.
The only way the 'paradox' can be avoided is if the discrepancy is accounted for during the period of acceleration. From Quentin's point of view at his own location the acceleration can take up a negligible proportion of the journey, so this cannot be significant. From the point of view of Paul in his inertial frame acceleration has no effect, and everything is explainable in terms of velocity. The only remaining way for acceleration to be important is at the location of Paul, as described from the point of view of Quentin.
To see how this comes about we need only draw in the lines shown dotted in Fig.10, representing simultaneous events in the rest frame of Quentin. There are two different rest frames corresponding to the outward and return journeys at constant velocity, and an accelerating reference frame of the type developed earlier. It can be seen that throughout the outward journey there is perfect symmetry between the twins. The horizontal lines represent simultaneous events in Paul's rest frame, so when Paul says 10 years have passed, Quentin's clock only reads 5 years. When Quentin says 10 years have passed then in his rest frame at that time Paul's clock reads 5 years. Similarly on the return journey it is possible for each to claim that the other is ageing more slowly because of the different observations of simultaneity.
During the acceleration, however, Quentin ages by 1.5 years while Paul ages by 32 years, according to Quentin. According to Paul, Quentin ages by 1.5 years, but he, Paul, only ages by 2 years. (These figures are not exact, but are sufficiently close for our present illustration.) It is therefore here, during the period of acceleration, that the differential ageing really takes place. But the difference in ages at the end of the journey is only 20.5 years, and yet the difference introduced by the period of acceleration appears to be 30.5 years. Is something still wrong? Not really, because the 30.5 years difference is only from the point of view of the accelerated twin Quentin, who can in principle construct the extended accelerating reference frame to explain this difference, but then must subtract the 10 years which in his constant velocity frames are lost by Paul due to the time dilation effect (5 years on each leg, when Quentin ages 10 years and reckons that Paul ages only 5). From Paul's point of view there is no acceleration in his reference frame, and it is only the velocity of Quentin on his journey which gives the differential ageing.
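The bookkeeping above can be laid out as a short Python sketch. This is only an illustration using the approximate figures quoted in the text. The 10-year length of each constant velocity leg on Quentin's clock is my inference from his 21.5-year total, and the variable names are hypothetical.

```python
import math

def gamma(beta):
    """Time dilation factor for speed beta (as a fraction of c)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

BETA = 0.866
G = gamma(BETA)  # very close to 2

# Quentin's own ageing in years: outward leg, turnaround, return leg.
# The 10-year legs are inferred from the 21.5-year total quoted in the text.
quentin = [10.0, 1.5, 10.0]

# Paul's ageing over the same three stages, as reckoned by Paul himself.
paul_by_paul = [quentin[0] * G, 2.0, quentin[2] * G]

# Paul's ageing over the same three stages, as reckoned by Quentin.
paul_by_quentin = [quentin[0] / G, 32.0, quentin[2] / G]

total_quentin = sum(quentin)
total_paul_by_paul = sum(paul_by_paul)
total_paul_by_quentin = sum(paul_by_quentin)

print(f"gamma at 0.866c          : {G:.3f}")
print(f"Quentin's total          : {total_quentin:.1f} years")
print(f"Paul's total (Paul)      : {total_paul_by_paul:.1f} years")
print(f"Paul's total (Quentin)   : {total_paul_by_quentin:.1f} years")
```

Both ways of adding up Paul's ageing give about 42 years. The two accounts differ only in where along the journey the years are assigned.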
Quentin could, instead of being concerned with the properties of accelerating reference frames, interpret the acceleration as a gravitational field and work out the difference in ageing expected due to clocks at different gravitational potentials running at different rates, but to get the correct answer he would need to assume that the field was inversely proportional to the distance from some point, and there seems no good reason why he should do so. The field at his location tells him nothing about the field at a different location, so he has insufficient information to make calculations based on gravitational potential differences. To try to start from the general theory and the properties of gravity is therefore inadvisable in this case, and the route taken in this article, via the construction of an accelerating frame, may be the only solution to the twins paradox.
It is interesting to try to draw the journey as seen in Quentin's reference frame, as shown in Fig.11, rather than in Paul's frame.
Quentin has three different frames corresponding to the three stages of his journey. When we try to put these together we run into the sort of problems described earlier when describing curved space-time, and the problems of drawing maps of curved surfaces on a flat piece of paper. It will come as no surprise to see that this sort of problem occurs when trying to draw extended reference frames when acceleration is involved. Already we have encountered event horizons and differential ageing, and now we see other problems. If we try to extend even the two constant velocity sections too far they will overlap and we will find one event having two different sets of coordinates. Fortunately this is of no more physical significance than the multiple poles in the Mollweide mapping of the earth's surface. Any real path travelled by a physically realisable observer will not introduce any problems; it is only the choice of coordinate systems which gives strange results.
The above diagram may seem less problematical if we draw it on paper and cut out the shape. Then the curved section can be rolled into a cone, and then the two rectangular sections can be lined up to form an almost continuous plane with no overlap when extended.
In Fig. 11 Quentin's path is shown as a thick solid line, and the path taken by Paul is the dotted line. The years experienced by each twin are also marked along each path. Fig.11 is made up of three different reference frames fitted together, and is not therefore an overall picture as seen in any one frame.
The accelerating frame developed in this series was not derived mathematically, and the equation for the parabolas in Fig.9b was stated with no further explanation. Instead, it is hoped that the explanation of how it evolved from Fig.9a gives a better understanding of why the accelerating frame has the appearance shown. Anyone who prefers a mathematical approach can find a derivation in the book 'Gravitation' by Misner, Thorne and Wheeler, published by Freeman & Co., 1973, but be warned, this is written at an advanced level and only the mathematically adept will find it comprehensible. The parabolic equation is derived on pages 166 and 167. I only stated the acceleration at t = 0. In general the acceleration is 1 / x - t^2 / x^3, which becomes 1 / x at t = 0 and falls to zero at infinity. This is the acceleration as seen in the inertial reference frame in which the diagram is drawn, but the accelerating observer experiences a constant acceleration for all values of t. Our derivation only stipulated that the length of objects remain correct, and I believe Fig.9b is not a unique solution in this case, and so the specification of a constant acceleration experienced by the traveller is also needed.
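The general expression for the acceleration can be checked numerically. The following Python sketch is only an illustration. It assumes that each curve in Fig.9b satisfies x^2 - t^2 = k^2 in units where c = 1, which is the form consistent with the expression quoted above. The function names are my own.

```python
import math

def x_on_curve(t, k):
    """Position at inertial time t on the curve x^2 - t^2 = k^2 (assumed form, c = 1)."""
    return math.sqrt(k * k + t * t)

def numerical_acceleration(t, k, h=1e-4):
    """Central-difference estimate of d^2x/dt^2 in the inertial frame."""
    return (x_on_curve(t + h, k) - 2.0 * x_on_curve(t, k) + x_on_curve(t - h, k)) / h ** 2

def quoted_acceleration(t, k):
    """The expression quoted in the text: 1/x - t^2/x^3."""
    x = x_on_curve(t, k)
    return 1.0 / x - t ** 2 / x ** 3

# Compare the two for the curve with k = 1 at a few times.
for t in (0.0, 0.5, 1.0, 3.0):
    print(t, numerical_acceleration(t, 1.0), quoted_acceleration(t, 1.0))
```

At t = 0 the point on the curve is at x = k, so the quoted expression reduces to 1 / k, and at later times both columns fall towards zero together.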
Another topic seemingly plagued by paradox is rotation. To give an example of the problems involved imagine one observer, Jane, standing on a rotating disc, while another observer, Kate, stands on non-rotating solid ground in an inertial reference frame, close to the perimeter of the disc. The centre of the disc is at rest in this frame. Suppose Jane sets up clocks around the perimeter on the disc, and uses a suitable method to synchronise them in her own local reference frame, e.g. by starting with two clocks together and transporting them symmetrically in opposite directions to their final locations. From the symmetry of the disc we can easily see that all the clocks will run at the same rate, so once synchronised they will stay that way. Then Kate can set up her own non-rotating clocks around the perimeter using similar methods to synchronise them.
Then we compare the readings on the clocks. Kate says that Jane's clocks were moving and all ran slow. Jane says it was Kate who was moving, and it was her clocks which were running slow. But only one set of clocks can be running slow. After one complete rotation we can compare two clocks initially showing the same time, and we find it is the clocks on the disc which run slow. But why does Jane believe it is Kate's clocks running slow? If the disc has a very large radius, e.g. several light years, and we consider only two clocks very close together on the perimeter, say 10 cm apart, over a period of a small fraction of a second, then the rotation surely will have little local effect, and the clocks are in an almost perfectly inertial reference frame moving in an almost perfectly straight line at constant velocity. If the velocity is yet again 0.866c, then both observers are justified in saying that the other observer's clocks run slow by a factor of two according to observations in this small region of space and time.
Local observations give one result, while an overall observation involving a complete rotation gives a different result. The situation is illustrated in Fig.12.
We could make the disc billions of light years in diameter (this is just a 'thought experiment', so we can do anything we like), so that the path of the clocks on the disc would be indistinguishable from a straight line over a period of a few seconds. We can then ignore the curvature of the path over this sort of interval. For a given speed the centripetal acceleration (i.e. towards the centre of the disc) is inversely proportional to the radius, so by making the radius large enough we can make the acceleration arbitrarily small, and again over a short time interval the effect will be negligible. We are left with a simple application of the Lorentz transformation equations for observers with clocks moving at a constant relative velocity.
For convenience we will stick to a velocity, or more correctly speed, of 0.866c. Imagine that the observers in the two frames each set up a series of clocks synchronised according to their own observations, e.g. by starting with several clocks together at one point, leaving one behind, and transporting the other clocks to a series of other locations at very low speed. In Fig.12b these clocks are shown as A, B, C etc. in the non-rotating frame, and A', B', C' and so on up to Z' on the disc. It is important to remember that if we are transporting a clock round the disc very slowly, then the disc will complete maybe millions of revolutions while we are doing this, so even a very small effect may have a significant cumulative effect. Although we can say we have synchronised the clocks in a small local region of the disc, we can make no immediate assumptions about the relationship between the original clock A' and the final transported clock Z'. This is something we have to work out.
For our small local region of space and time in Fig.12a we can just apply the Lorentz equations. Suppose it happens that when A' and A are next to each other they both read 0, and when A' reaches B the reading on B is 1 sec. The Lorentz equations applied in the non-rotating frame tell us that at this time A' will read only 0.5 sec. Suppose also that when B' passes A the reading on B' is 1 sec. Applying the Lorentz equations to the local region of the rotating frame tells us that at this time A reads 0.5 sec. This is all perfectly symmetrical as we would expect. The distance between A and B, or A' and B', is not directly relevant to our present problem, but again there is symmetry, and both distances are 0.866 light seconds in their own reference frame.
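The two local readings can be reproduced with the Lorentz transformation directly. The sketch below is a minimal illustration in units where c = 1 (seconds and light seconds). The choice of axes is my own assumption: the non-rotating clock A sits at x = 0 with B at x = 0.866, and the disc clocks move in the +x direction, so that B' sits at x' = -0.866 in the local disc frame.

```python
import math

V = 0.866                              # relative speed as a fraction of c
GAMMA = 1.0 / math.sqrt(1.0 - V ** 2)  # very close to 2

def to_disc_frame(t, x):
    """Lorentz transform from the non-rotating frame to the local disc frame."""
    return GAMMA * (t - V * x), GAMMA * (x - V * t)

def to_ground_frame(t_p, x_p):
    """Inverse transform from the local disc frame back to the non-rotating frame."""
    return GAMMA * (t_p + V * x_p), GAMMA * (x_p + V * t_p)

# Event 1: A' reaches B. In the non-rotating frame this is t = 1, x = 0.866.
reading_on_a_prime, position_in_disc_frame = to_disc_frame(1.0, V)

# Event 2: B' passes A. In the disc frame this is t' = 1, x' = -0.866.
reading_on_a, position_in_ground_frame = to_ground_frame(1.0, -V)

print(f"A' reads {reading_on_a_prime:.3f} sec when it reaches B")
print(f"A  reads {reading_on_a:.3f} sec when B' passes it")
```

Each event lands at the origin of the other frame, where that frame's reference clock (A' or A) sits, and each transformed time comes out at about half a second.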
Turning to Fig.12b we look at what happens after one complete revolution of the disc. Clearly, if an observer on the disc says that clock A is moving at 0.866c and consequently runs at half the correct rate, while the non-rotating observer at A says it is A' which is moving and runs at half the correct rate, then each will predict that the other clock should indicate less time after the revolution, but in reality only one of the clocks can end up indicating less time than the other. It will come as no surprise that it is the rotating clock which shows less time. This is just a variation on the 'twins paradox', where one twin stays in an inertial reference frame while the other travels away and undergoes an acceleration, so we already know who ages the least, but some further explanation is possible in this example.
For disc radius R the distance round the circumference measured in the non-rotating frame is 2 pi R, and the time for a point on the disc to travel this distance at 0.866c is 2 pi R / 0.866c, and so this must be the reading on A when it is once more next to A'. But what is the reading on A'? This has travelled at 0.866c, so the 'time dilation effect' tells us it should read only half as much, i.e. pi R / 0.866c. But this does not seem to agree with the local observation of the observer on the disc, who says that as A passes each clock, B', C', D' and so on, round to Z', it reads only half as much as these clocks. When A is next to Z', then if Z' is immediately next to A' so that in effect they can be considered to be in the same location, then the reading on A is again 2 pi R / 0.866c after one revolution, but the reading on Z' is twice this, i.e. 4 pi R / 0.866c, so an observer at Z' says it is the clock at A which ran slow by a factor of two.
We now can see the solution to our problem. After one revolution A' reads pi R / 0.866c, but Z' reads 4 pi R / 0.866c. Although we have synchronised each clock on the disc with the one next to it, after carrying out this process right round the disc we end up with a discrepancy. A' and Z' both run at the same rate, but the readings differ by a constant value, which in our example is 3 pi R / 0.866c. This discrepancy does not increase after more revolutions, and is simply the difference between any two clocks if one is left at one location on the disc while the other is carried at a very low speed round the perimeter back to the start. The same magnitude of discrepancy will be found irrespective of which direction the 'moving' clock is transported round the disc, but it will be of opposite sign for opposite directions.
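The three readings and the resulting discrepancy can be worked through in a few lines of Python. This is only a sketch of the arithmetic above, in units where c = 1 and with the radius given in light seconds. The function names are hypothetical. The closed form in the last function is not stated in the text. It is what the difference of the two readings reduces to algebraically.

```python
import math

BETA = 0.866                              # perimeter speed as a fraction of c
GAMMA = 1.0 / math.sqrt(1.0 - BETA ** 2)  # very close to 2

def readings_after_one_revolution(radius):
    """Return the readings on A, A' and Z' (seconds) for a radius in light seconds."""
    reading_a = 2.0 * math.pi * radius / BETA   # time for one revolution on clock A
    reading_a_prime = reading_a / GAMMA         # A' has moved at BETA, so it runs slow
    reading_z_prime = reading_a * GAMMA         # disc observers say A reads half of Z'
    return reading_a, reading_a_prime, reading_z_prime

def synchronisation_gap(radius):
    """Constant offset between Z' and A' after synchronising right round the disc."""
    _, reading_a_prime, reading_z_prime = readings_after_one_revolution(radius)
    return reading_z_prime - reading_a_prime

def gap_closed_form(radius):
    """Same offset written as gamma * beta * circumference (with c = 1)."""
    return GAMMA * BETA * 2.0 * math.pi * radius

R = 1.0  # one light second, chosen only for illustration
a, a_prime, z_prime = readings_after_one_revolution(R)
print(f"A reads {a:.3f}, A' reads {a_prime:.3f}, Z' reads {z_prime:.3f}")
print(f"gap = {synchronisation_gap(R):.3f}, closed form = {gap_closed_form(R):.3f}")
```

With the factor of two rounded as in the text, the gap is 3 pi R / 0.866c, and it scales in proportion to the radius.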
Note that it makes no difference if we use a more complicated local method of synchronising the clocks, such as measuring out the mid point between two locations and moving a pair of clocks symmetrically out to these locations, but there are other non-local methods we could try. Suppose we started with several clocks at the centre of the disc and transported each of them out towards the perimeter at the same speed. The symmetry of this operation ensures that there is now no discontinuity at any point around the disc, so it seems we now have synchronised clocks around the perimeter, and no discontinuity to explain away the 'paradox' of each observer claiming it is the other's clock which runs slow. To see the error in this method consider two adjacent paths along which clocks A' and B' are transported, as in Fig.13.
There is one important factor we have neglected, and this is the 'Coriolis force'. This force is experienced by any object travelling in any direction on the surface of a rotating disc. This is in addition to the centrifugal force, and these two forces add. The Coriolis force is perpendicular to the direction of motion. Along the path from the centre to the perimeter the Coriolis force is in the direction shown in Fig.13. The force can be interpreted in two different ways by observers travelling with the clocks. They could take account of the force they must apply to keep their path straight, and interpret this as an accelerating force, and therefore say they are accelerating in the direction of rotation of the disc. This would be agreed by the non-rotating observer, who sees the speeds of the clocks increasing in his reference frame from zero at the centre up to 0.866c at the perimeter. As pointed out earlier, accelerating clocks run at different rates if they are at different locations in the direction of acceleration (although this was not demonstrated in the general case, only in a particular type of extended reference frame), so we would expect the clocks to no longer be synchronised when they reach the perimeter, and so this method of 'synchronisation' is ineffective. The observers moving with the clocks could interpret the Coriolis force as a gravitational field, and then they would say they were at different gravitational potentials, and again they would expect their clocks to run at different rates.
We could analyse the geometry of the disc, and discover that for an observer on the disc carrying out measurements the geometry is found to be non-Euclidean. Radial measurement will agree with the non-rotating frame, because the velocity is everywhere perpendicular to this direction, but measurement along the perimeter gives a greater value because of the apparent contraction along the direction of motion. The ruler on the disc will appear only half the length as seen by the non-rotating observer, so twice as many ruler lengths will fit into the circumference. The ratio of circumference to radius is therefore 4 pi instead of 2 pi for the observer on the disc with perimeter velocity 0.866c. Both time and space are therefore distorted in some way by the rotation, as expected.
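The ratio of circumference to radius measured on the disc can be written as a one-line calculation. The sketch below is only an illustration of the ruler-counting argument above. It assumes the rulers laid along the perimeter are contracted by the usual factor, and the function name is my own.

```python
import math

def circumference_to_radius_ratio(beta):
    """Ratio measured on the disc when the perimeter moves at speed beta (fraction of c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    # Radial rulers are unaffected; perimeter rulers are shortened by 1/gamma,
    # so gamma times as many of them fit round the circumference.
    return 2.0 * math.pi * gamma

ratio = circumference_to_radius_ratio(0.866)
print(f"ratio = {ratio / math.pi:.3f} pi")
```

For a disc at rest the function returns the Euclidean value 2 pi, and the ratio grows without limit as the perimeter speed approaches c.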
A more interesting problem is the question of how clocks on the surface of the earth behave, taking into account the combined effects of rotation and gravity. I believe there is experimental confirmation that two clocks at sea level run at the same rate if one is at the north pole and the other at the equator. It appears that they should run at different rates because the clock at the equator is equivalent to the clock on the edge of a rotating disc, while the clock at the pole is at rest in the non-rotating frame. The effects of gravity may be expected to be zero, because points at sea level are at the same gravitational potential. So why is no difference observed?
Consider observers equipped with clocks. P is at the north pole, and E is at the equator. The sea level is actually further from the centre of the earth at the equator than at the pole because of the centrifugal force resulting from the earth's rotation. This force is in opposition to gravity, so the sea level rises compared to the poles. This is shown exaggerated in Fig.14.
Imagine another observer, A, hovering just above the earth's surface at the equator, and travelling in the opposite direction to the rotation at the exact speed needed to cancel the rotation. A is then in a non-rotating reference frame, the same as P, but at a greater distance from the earth's centre, so at a higher gravitational potential. There is no rotation, no relative velocity, and no centrifugal force to worry about, so the gravitational field is the only effect involved. The higher potential means that the clock at A runs faster than the one at P.
Now compare A with E. As E passes A they are at the same location, so there can be no potential difference from any source, either gravity or centrifugal force. Potential difference is given by force times distance, so if the distance is zero so is the potential difference. The only effect now is the relative velocity, and we can apply the Lorentz transformation equations to the local region to show that in the non-rotating frame, according to A, the clock at E is moving past and runs slow. If the clocks are compared after every complete revolution of the earth, it will be the rotating clock, E, which is found to be slow, as for the rotating disc example examined earlier. So, E is slow compared to A because of the relative velocity, and P is slow compared to A because of the potential difference. It turns out that these two differences are equal, and so the clocks at E and P run at the same rate.
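To give an idea of the size of the effects being balanced here, the following Python sketch estimates how much slower E runs than A because of the equatorial speed alone. It is only a rough illustration. The earth's equatorial radius and the length of the sidereal day are standard rounded values which I have supplied, not figures from this article, and the low speed approximation v^2 / 2c^2 for the rate difference is an assumption.

```python
import math

C = 299_792_458.0          # speed of light in m/s
EQUATORIAL_RADIUS = 6.378e6  # metres (rounded standard value, not from the article)
SIDEREAL_DAY = 86_164.0      # seconds for one rotation relative to the stars (rounded)

def equatorial_speed():
    """Speed of E relative to the non-rotating observer A, in m/s."""
    return 2.0 * math.pi * EQUATORIAL_RADIUS / SIDEREAL_DAY

def fractional_slowing(speed):
    """Low speed approximation to the time dilation effect: v^2 / (2 c^2)."""
    return speed ** 2 / (2.0 * C ** 2)

v = equatorial_speed()
fraction = fractional_slowing(v)
nanoseconds_per_day = fraction * 86_400.0 * 1e9

print(f"equatorial speed  : {v:.1f} m/s")
print(f"fractional slowing: {fraction:.3e}")
print(f"lag of E behind A : {nanoseconds_per_day:.0f} ns per day")
```

If the argument above is right, the gravitational potential difference between A and P must make P slow compared to A by this same fraction, which is roughly a tenth of a microsecond per day.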
From the point of view of E and P, if they are unaware of the rotation or the shape of the earth, they can observe that they are a fixed distance apart, so there is no relative velocity, and the only other possible effect is the potential difference, which they agree is zero because sea level defines a constant potential surface, so there is no reason why they would expect their clocks to run at different rates.
Reality is of course rather more complicated than this idealised analysis. Sea level varies significantly throughout the day due to the gravitational field of the moon, and there are time delays to be expected for large masses of water to move around to change the level, giving different effects in different locations, so even a definition of sea level is problematical.
This particular example demonstrates a method of analysis which is often useful, and this is to introduce an additional, non-rotating, or non-accelerating frame, and analyse everything in this frame. The well known experiment where atomic clocks were flown round the world in opposite directions and then compared is most easily analysed in such a non-rotating frame.