Bohr's theory of the atom
In Chapter 5 we looked at going ‘beyond’ data to more comprehensive data that extends the boundaries of conventional laws and thinking. This data creates the Halo around the original data point.
There is more to learn and understand about Halo data from the discipline of physics. In the model of the atom that Niels Bohr presented in 1913, the most stable, lowest energy level is found in the innermost orbit. This first orbit forms a shell around the nucleus and is assigned a principal quantum number (n) of n=1. In our analogy, the metadata sits in this innermost orbit, so it is the most stable data we hold. In data science terms it has the least potential energy left to release, because it already has the greatest realised value to the business: it occupies the n=1 orbit and forms a shell around the central data point.
Additional orbital shells are assigned values n=2, n=3, n=4, etc.
As electrons move further away from the nucleus, they have more potential energy and are less tightly bound, so they become less stable. So, with our Halo data, as we move further away from the data point and further into assumption and unverified data, the data becomes less stable but the ‘potential energy’ of that data increases. For example, the political leanings of Peter may be unverified, a matter of assumption rather than fact, and so that piece of data may sit out in the n=6 orbit, but it may have huge potential energy if we can verify it as a fact. As a ‘fact’ at this stage, though, it is very unstable and very unassured: the confidence level is low. In this respect our Halo data fits Bohr's model of the atom.
To continue with Bohr's model, atoms with electrons in their lowest energy orbits are in a ‘ground’ state, and those with electrons in higher energy orbits are in an ‘excited’ state. When an electron moves from an outer orbit to an inner orbit, energy is released as light. So, data points (remember our Peter example) with only simple metadata associated with them are in a ‘ground state’. Data points with a Halo of data are in an ‘excited state’. As data professionals, as data scientists, we want data in an excited state: this is where the ‘potential’ exists.
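For readers who think in code, the analogy can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names (HaloAttribute, DataPoint), the field names and the sample values are all hypothetical, and the rule that a data point is ‘excited’ as soon as anything sits beyond n=1 is one possible reading of the ground and excited states described above, not a definitive implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HaloAttribute:
    """One piece of data orbiting a data point (hypothetical structure)."""
    name: str
    value: str
    orbit: int       # n=1 is the innermost, most stable orbit
    verified: bool   # True once the value has been confirmed as fact


@dataclass
class DataPoint:
    """A central data point, such as Peter, plus the data around it."""
    label: str
    attributes: List[HaloAttribute] = field(default_factory=list)

    def state(self) -> str:
        """Return 'ground' if everything sits at n=1, otherwise 'excited'."""
        if all(a.orbit == 1 for a in self.attributes):
            return "ground"
        return "excited"

    def outermost_unverified(self) -> Optional[HaloAttribute]:
        """Return the unverified attribute furthest from the data point.

        In the analogy this is the least stable piece of data and the one
        with the most 'potential energy' if it can be verified.
        """
        candidates = [a for a in self.attributes if not a.verified]
        return max(candidates, key=lambda a: a.orbit, default=None)


# Illustrative values only: Peter starts with simple, verified metadata.
peter = DataPoint("Peter", [
    HaloAttribute("record_source", "example metadata", orbit=1, verified=True),
])
print(peter.state())  # ground

# Add an assumed, unverified piece of Halo data out in the n=6 orbit.
peter.attributes.append(
    HaloAttribute("political_leaning", "assumed", orbit=6, verified=False)
)
print(peter.state())                      # excited
print(peter.outermost_unverified().name)  # political_leaning
```

Storing the orbit number and the verified flag separately keeps the two ideas in the analogy apart: how far a piece of data sits from the data point, and whether it has yet been confirmed as fact.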