What is Segment and how do we use it at Autobooks?
Published by Jordan Skole on 2/12/2022
Segment is a tool that follows a strong convention-over-configuration approach, which makes reporting state to third-party tools easier and more maintainable than configuring each tool independently.
Segment follows a common convention of reporting user state (the user and user attributes) and state transitions (events and event attributes).
Segment abstracts the logic required to map user state and state transitions to third-party tools, so that we don’t have to worry about configuring each tool independently.
The name “Segment” and the “immutable ledger of state” convention can be used interchangeably. Other tools follow this convention and can be implemented instead of Segment (like mParticle or BigPicture.io).
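To make the convention concrete, here is a minimal sketch of the two kinds of calls. The `analytics` object below is a hypothetical in-memory stand-in, not the real Segment client: its `identify` and `track` methods only mirror the general shape of the calls in Segment's browser library, and the user ID, traits, and event name are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for a Segment-style client (not the real library).
// Every call is appended to a ledger instead of being sent to a third-party tool.
const ledger = [];
let currentUserId = null;

const analytics = {
  // Report user state: who the user is, plus their attributes.
  identify(userId, traits = {}) {
    currentUserId = userId;
    ledger.push(Object.freeze({ type: "identify", userId, traits }));
  },
  // Report a state transition: an event, plus its attributes.
  track(event, properties = {}) {
    ledger.push(
      Object.freeze({ type: "track", userId: currentUserId, event, properties })
    );
  },
};

// Made-up user and event, for illustration only.
analytics.identify("user-123", { plan: "free", invoicingEnabled: false });
analytics.track("Invoice Sent", { amount: 250, currency: "USD" });

// The ledger now holds one user-state entry and one state-transition entry.
console.log(ledger.map((entry) => entry.type));
```

The point of the sketch is that the application only reports what happened. Mapping those entries to each downstream tool is the part that a tool like Segment takes over.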
Immutable Ledger of State?
From “What is State, Mutable State and Immutable State?” on Stack Overflow, with my additions in brackets:
You have state when you associate values (numbers, strings, complex data structures) to an identity and a point in time.
For example, the number 10 by itself does not represent any state: it is just a well-defined number and will always be itself: the natural number 10. As another example, the string "HELLO" is a sequence of five characters, and it is completely described by the characters it contains and the sequence in which they appear. In five million years from now, the string "HELLO" will still be the string "HELLO": a pure value.
In order to have state you have to consider a world in which these pure values are associated to some kind of entities that possess an identity. Identity is a primitive idea: it means you can distinguish two things regardless of any other properties they may have. For example, two cars of the same model, same colour, ... are two different cars.
const myCar = new Car();
const yourCar = new Car();
Given these things with identity, you can attach properties to them, described by pure values. E.g., my car has the property of being blue. You can describe this fact by associating the pair
myCar.color = "blue";
to my car. The pair ("colour", "blue") is a pure value describing the state of that particular car.
State is not only associated to a particular entity, but also to a particular point in time. So, you can say that today, my car has state
myCar.color = "blue";
Tomorrow I will have it repainted (state transition) in black and the new state will be
myCar.color = "black";
Note that the state of an entity can change through a state transition, but its identity does not change by definition. Well, as long as the entity exists, of course: a car may be created and destroyed, but it will keep its identity throughout its lifetime. It does not make sense to speak about the identity of something that does not exist yet / any more.
If the values of the properties attached to a given entity change over time, you say that the state of that entity is mutable. Otherwise, you say that the state is immutable.
The most common implementation is to store the state of an entity in some kind of variables (global variables, object member variables), i.e. to store the current snapshot of a state. Mutable state is then implemented using assignment: each assignment operation replaces the previous snapshot with a new one. This solution normally uses memory locations to store the current snapshot. Overwriting a memory location is a destructive operation that replaces a snapshot with a new one. (Here you can find an interesting talk about this place-oriented programming approach.)
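A minimal sketch of this place-oriented approach, reusing the car example from earlier (the `garageRecord` name is made up for illustration): a single assignment overwrites the only snapshot, and every reference to the object sees the new value.

```javascript
// Approach 1: the current snapshot lives in one place and is overwritten.
const myCar = { color: "blue" };

// A second reference to the same entity (hypothetical name, for illustration).
const garageRecord = myCar;

// State transition by assignment: the "blue" snapshot is replaced in place.
myCar.color = "black";

console.log(garageRecord.color); // prints: black
console.log(garageRecord === myCar); // prints: true (identity is unchanged)
```

After the assignment there is no way to ask what color the car used to be. That history was never stored.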
An alternative is to view the subsequent states (history) of an entity as a stream (possibly infinite sequence) of values, see e.g. Chapter 3 of SICP. In this case, each snapshot is stored at a different memory location, and the program can examine different snapshots at the same time. Unused snapshots can be garbage-collected when they are no longer needed.
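The stream-of-snapshots idea can be sketched as an append-only list. This is an illustration under assumptions, not a prescribed implementation: the `recordSnapshot` helper and the dates are made up, and each snapshot is frozen so that a recorded state cannot change afterwards.

```javascript
// Approach 2: keep every snapshot instead of overwriting the previous one.
const carHistory = [];

// Hypothetical helper: append a frozen copy of the state, tagged with a point in time.
function recordSnapshot(history, at, state) {
  history.push(Object.freeze({ at, state: Object.freeze({ ...state }) }));
}

// Made-up dates, for illustration only.
recordSnapshot(carHistory, "2022-02-12", { color: "blue" });
recordSnapshot(carHistory, "2022-02-13", { color: "black" });

// Both snapshots can be examined at the same time.
const [today, tomorrow] = carHistory;
console.log(today.state.color, tomorrow.state.color); // prints: blue black
```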
Advantages / disadvantages of the two approaches
Approach 1 (overwriting a single snapshot) consumes less memory and allows a new snapshot to be constructed more efficiently, since it involves no copying.
Approach 1 implicitly pushes the new state to all the parts of a program holding a reference to it; approach 2 would need some mechanism to push a snapshot to its observers, e.g. in the form of an event.
Approach 2 (keeping a stream of snapshots) can help to prevent inconsistent state errors (e.g. partial state updates): by defining an explicit function that produces a new state from an old one, it is easier to distinguish between snapshots produced at different points in time.
Approach 2 is more modular in that it makes it easy to produce views on the state that are independent of the state itself, e.g. using higher-order functions such as map and filter.
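The last two points can be sketched together. In this illustration the event names and the `applyEvent` function are hypothetical: state transitions are stored as plain values, an explicit function produces each new state from the old one, and views are derived with `map` and `filter` without touching the stored data.

```javascript
// State transitions stored as plain values (event names are made up).
const events = [
  { type: "created", color: "blue" },
  { type: "repainted", color: "black" },
  { type: "repainted", color: "red" },
];

// Explicit transition function: returns a new state and leaves the old one untouched.
function applyEvent(state, event) {
  switch (event.type) {
    case "created":
    case "repainted":
      return { ...state, color: event.color };
    default:
      return state;
  }
}

// Fold the events into one snapshot per event.
const snapshots = [];
events.reduce((state, event) => {
  const next = applyEvent(state, event);
  snapshots.push(next);
  return next;
}, {});

// Views derived with higher-order functions, independent of the stored events.
const colorsOverTime = snapshots.map((snapshot) => snapshot.color);
const repaintCount = events.filter((event) => event.type === "repainted").length;

console.log(colorsOverTime, repaintCount);
```

Because each snapshot is a separate object, a partially applied update never leaks into an earlier snapshot, which is the consistency benefit described above.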