I. The Why: A Legacy in the Making
Every data project needs a starting point, a reason to hit “Search” for the first time. For me, it wasn’t a school project or a casual curiosity. It was triggered 16 years ago after the birth of my son.
Holding him for the first time, I had a sudden, profound realization: one day, he is going to look at me and ask, “Where did we come from?” I realized then that I didn’t have a good enough answer. I knew the “who” of my immediate family, but the “where” and the “why” were lost in the fog of time. I set off trying to answer that question for him, only to find myself caught in the most addictive investigation of my life.
The 16-Year “Drip”: Genealogy as Iterative Processing
What started as a search for a few names turned into a 16-year obsession with iterative processing. Genealogy isn’t a weekend hobby; it’s a living document that I have audited, abandoned, and revisited dozens of times. In fact, I first started a separate family tree 18 years ago, soon learned that I had accepted too many erroneous leads too readily, and started again.
I might step away for a year, but the archive never sleeps. While I’m busy with life, Ancestry’s algorithms are busy indexing millions of new data points.
When I finally log back in, I’m not just looking at the same old names. I’m met with a raft of newly highlighted information. It feels like a cold case investigator returning to the evidence locker to find that technology has finally caught up with the crime scene.
The Chain Reaction of a Single Fact
The “addictive” nature of this journey comes from the Data Explosion. In genealogy, a single confirmed fact isn’t just a destination; it’s a key.
- You find one marriage certificate that confirms a mother’s maiden name.
- That maiden name acts as a “Primary Key” in a database search.
- Suddenly, a flurry of new discoveries opens up—five new siblings, a previously unknown migration, or a link to a completely different county.
The Steward’s Patience
Over nearly two decades, my perspective has shifted. I started as a Collector, hunting for the most “exotic” names I could find. But today, because of my son, I am a Steward.
The excitement isn’t just in the claim; it’s in the confirmation. It’s about working through that raft of new information with a sceptical eye, ensuring that every “Harrison” or “Woods” I add is verified and cited. I’m not just building a tree for myself; I’m cleaning a database so that when my son, my daughters, or any of their future descendants ask that question, the answer lies in the validated data.
II. The Two Halves of the Archive: Yorkshire vs. Ireland
Genealogy is rarely a uniform experience; for me, it has been a “Tale of Two Databases.” On one side, I have the stability of Northern England; on the other, the fragmented, often frustrating records of my Irish ancestors.

The Harrison Anchor: The “Viking” Stronghold
The Harrison line is my “Data Anchor.” When I began this 16-year journey, I knew the trail led back to West Yorkshire, specifically the Wakefield area. The Harrisons weren’t just passing through; they were rooted. This deep regional consistency is why my cousin’s DNA results came back as 100% Northern England.
Historically, this area was the heart of the Danelaw. The “Viking” phenotype, the big build and sturdy frame, is exactly what I see in my dad. In Wakefield, the records are largely intact, providing a clean, “High-Definition” view of generations of Harrisons who survived the Industrial Revolution in the same few square miles.
The Woods Migration: Chasing Shadows in the Irish Gap
The narrative shifts dramatically when I look at my mum’s side: the Woods lineage. This is where “Data Stewardship” becomes a real test of patience. Tracing the Woods family means tracking an Irish immigration story, and unlike the stationary Harrisons, this line involves movement, struggle, and a significant “Data Black Hole.”
The 1922 “Data Corruption”
In June 1922, during the Irish Civil War, an explosion and subsequent fire at the Public Record Office in the Four Courts, Dublin, destroyed nearly seven centuries of Irish historical records. There were no cloud back-ups in those days.
- The Loss: Census returns from 1821, 1831, 1841, and 1851 were reduced to ash.
- The Consequence: For the Woods line, this is the ultimate “Corrupted File.” You know the ancestors existed—you are the living proof—but the primary evidence that links them to a specific parish or parentage is gone.
The Steward’s Workaround
As a data steward, when the primary source is destroyed, you have to look for “Parity Data.” This means:
- GRONI & Civil Records: Scouring the General Register Office for Northern Ireland.
- Tithe Applotments: Using land tax records from the 1820s and 30s to find “potential” matches.
- The UK Census: Using the 1851 or 1861 English Census as a “bridge” to see where they claimed to be from before they crossed the Irish Sea.
This side of the tree requires more than just “clicking a leaf.” It requires weighing probabilities and accepting that some branches may never be fully “Validated” to the standards of the Harrison line. You aren’t just a researcher here; you’re a forensic analyst trying to reconstruct a document from the smoke it left behind.
III. The Sheffield Paradox: A City of “Newcomers”
Living in Sheffield today, it’s easy to view it as a permanent, bustling metropolis with deep, ancient roots. But as a data steward looking back through the 19th-century census records, you quickly realize a startling truth: almost nobody was actually “from” here.
Just a few generations ago, Sheffield was a collection of townships that exploded in population almost overnight. When I look at my tree, I see the gravity of the steel industry acting as a massive human vacuum, pulling my lineages together from across the map:
- The Harrisons: Drifting south from the stable, rural-industrial pockets of Wakefield.
- The Woods: Arriving from Ireland, fleeing the famine or seeking the “Steel City” promise of work.

The “Data Empty” City
If you go back to the early 1800s, Sheffield’s population was roughly 60,000. By 1900, it had surged to over 400,000. As a researcher, this means your “Sheffield” roots are often surprisingly shallow. You find your ancestors in the back-to-back terraces of the Don Valley, but the moment you look at the “Place of Birth” column in the 1881 or 1891 census, the data points scatter.
They point to Irish villages that were decimated by emigration or Yorkshire hamlets that haven’t changed in centuries. As a steward, you realize that Sheffield isn’t the origin of your data; it’s the merger point where these disparate strings finally synced up.
IV. The Mathematics of the “Invisible Crowd”
This geographic scattering is where the math of genealogy becomes truly staggering. While the “Sheffield” portion of the tree looks like a narrow trunk, the roots beneath it are expanding at an exponential rate.
The Power of Two
Genealogy is governed by a strict geometric progression. Every generation you travel back, the number of your direct ancestors doubles. It’s obvious when you think about it, but it quickly becomes a staggering amount of data.
| Generation | Relation / Era | Data Points (Direct Ancestors) |
| --- | --- | --- |
| 1 | Parents | 2 |
| 4 | GG-Grandparents | 16 |
| 10 | The Industrial Revolution (c. 1800) | 1,024 |
| 14 | The 1600s (Parish Records) | 16,384 |
By the time I reach the 1600s—the limit of most parish records—there are over 16,000 people who had to survive, meet, and migrate for me to be sitting in Sheffield today.
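The doubling in the table above is easy to check with a few lines of Python. This is only a minimal sketch: the function name `direct_ancestors` is my own label, and it assumes every ancestor slot is filled by a different person, which real trees break whenever distant cousins marry.

```python
def direct_ancestors(generations_back: int) -> int:
    """Number of direct-ancestor slots in one generation.

    Assumes no ancestor appears twice in the tree (an idealisation).
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or more")
    return 2 ** generations_back


# Reproduce the rows of the table above.
for generation in (1, 4, 10, 14):
    count = direct_ancestors(generation)
    print(f"Generation {generation:>2}: {count:,} direct ancestors")
```

Adding up every generation from 1 to 14 under the same assumption gives 32,766 slots in total, which is why the tree feels so much bigger than any single row suggests.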

The Sibling Multiplier
But as a steward, you aren’t just tracking those 16,000 direct lines. In the era of the Harrisons and the Woods, families of 8, 10, or 12 children were the norm.
- If we assume a conservative average of 5 children per household who lived to reproduce, by the time you go back just 4 or 5 generations, the “Cousin Pool” reaches into the thousands.
This explains why, in a city like Sheffield, you can walk past a stranger and there is a statistically high probability that you share a 19th-century ancestor. It’s not just a family tree; it’s a regional web. It’s also why my cousin in Australia feels that “Northern” pull so strongly—thousands of miles away, she is still tethered to that same exponential web that began in these Yorkshire hills.
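To put a rough number on the sibling multiplier, here is a hedged back-of-the-envelope sketch. The function `cousin_pool` and its model are my own assumptions for illustration: every couple has the same number of children who go on to have families (5, as above), every generation behaves the same way, and no lines intermarry. Real families are far messier, so treat the output as an order of magnitude, not a headcount.

```python
def cousin_pool(generations_back: int, children_per_couple: int = 5) -> int:
    """Estimate cousins in my own generation who share an ancestor couple
    exactly `generations_back` generations up (2 = grandparents).

    Illustrative model only: constant family size, no intermarriage.
    """
    if generations_back < 2:
        raise ValueError("a shared ancestor must be at least 2 generations back")
    # Ancestor couples at that level (half the number of direct ancestors).
    couples = 2 ** (generations_back - 1)
    # Children of each couple other than my own direct ancestor.
    side_branches = children_per_couple - 1
    # Each side branch multiplies once per generation on the way down to mine.
    descendants_per_branch = children_per_couple ** (generations_back - 1)
    return couples * side_branches * descendants_per_branch


for back in (2, 3, 4, 5):
    print(f"{back} generations back: about {cousin_pool(back):,} cousins")
```

Under these assumptions, four generations back already yields about 4,000 third cousins, which is consistent with the “into the thousands” figure.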
V. The “Gary Cahill” Moment: Data in the Flesh
While I was busy chasing the “Paper Trail” of the Woods line and trying to bridge the Irish “Data Black Hole,” a piece of verified family history literally walked into my living room. This is the moment every genealogist dreams of: when the abstract numbers on a screen manifest in the real world.
The Encounter: “Our Gary”
About five years into my research, I was visiting my grandmother. She wasn’t well, and the house was a revolving door of extended family and distant cousins. My son, who was seven at the time, was with me. We had just come from a match, and he was still proudly wearing his Kiveton Park FC kit.
One of my distant cousins looked at him, squinted, and said, “He really reminds me of ‘Our Gary’.”
I laughed and asked who “Our Gary” was. She looked at me with genuine Yorkshire surprise: “Surely you know our Gary? Gary Cahill?”
The Reveal: From Census to Captain
It turned out “Our Gary” was Gary Cahill—the Chelsea legend and England Captain.
As a football coach myself, this was a massive “Data Enrichment” moment. I hadn’t yet managed to trace that “Royal Thread” to Mary Queen of Scots, but I had just discovered an elite athlete in my own generation.

The Steward’s Validation
However, a good data steward doesn’t just accept hearsay at a family wake. I went straight back to my Ancestry files at the next opportunity. I didn’t just want it to be true; I wanted it to be proven.
- I cross-referenced the branches of the Woods and Cahill lines.
- I traced the common ancestors back to that same South Yorkshire/North Derbyshire border.
- I validated the connection: he is my 2nd cousin.
The “Blood” Connection
This discovery bridged the gap between dry records and living traits.
- The “Football Gene”: Whether it’s the Yorkshire “Viking” build from the Harrison side or the athleticism of the Woods line, seeing it manifest in an England Captain—and in my own children’s love of the game—proved that this data isn’t just on a screen.
- The Global Web: My cousin in Australia, despite being thousands of miles from the steel mills of Sheffield or the fields of Wakefield, is part of this same genetic lottery. For her, discovering we are “100% Northern” via her DNA kit is a way of claiming a territory she’s never lived in. Her urge to discover is a search for an anchor; mine is a never-ending search for discoveries.
The Flurry of Discovery
Confirming Gary wasn’t just a “cool fact”—it acted as a Validation Anchor. Because I could definitively link our families in the 20th century, it provided a “known good” data point that allowed me to look at the older, murkier Irish records with fresh eyes. In genealogy, one confirmed truth often triggers a flurry of others.
It also gave my football-loving kids some validation of their own, and as their coach I used it to ‘motivate’ them even more.
VI. The Golden Aims: Queens, Presidents, and the Longitude King
While the Gary Cahill discovery provided a modern validation, my 16-year project is driven by three “Golden Aims.” These are the high-stakes investigative goals that require the most rigorous data cleansing and keep me coming back to the screen.
The “Longitude” Connection: John Harrison
The most intriguing “Golden Aim” was sparked by, of all things, an episode of Only Fools and Horses. Watching Del Boy find a lost watch reminded me of John Harrison, the self-taught clockmaker who solved the “Longitude Problem” with his H4 chronometer.
John Harrison was born in Foulby, just a few miles from Wakefield. As a Harrison researcher with deep roots in that exact patch of West Yorkshire soil, I find the search for a link to the man who saved the British Navy irresistible, especially with my dad being ex-Royal Navy. It’s a pursuit of “Precision Data” in both name and spirit. Like John Harrison, I am obsessed with accuracy; a “close enough” link isn’t a link at all.
The Presidential Parallel
The second aim is the Harrison Presidents. While Benjamin and William Henry Harrison are icons of American history, their roots lead back to the “Great Migration” from Northern England.
As a steward, this is where the discipline kicks in. It would be easy to find a “John Harrison” in a 1640 Virginia record and assume he is the same man from a Wakefield register. But I won’t accept the “Presidential Badge” on my tree until I find the Chain of Evidence—the ship’s manifest or the probate record that proves the bridge was crossed.
The “Royal” Thread: Mary Queen of Scots
Finally, there is the bombshell from Australia. My cousin’s DNA results—that 100% Northern signal—came with a secondary discovery: a potential thread to Mary Queen of Scots.
I have been dipping in and out of this thread for years. To link a modern Harrison to a 16th-century monarch requires finding a Gateway Ancestor, a bridge between common records and noble pedigree. Until I have three independent sources confirming the link, Mary remains a Hypothesis, not a Fact. Still, it is a compelling reason to keep dipping back in.
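The three-source rule can be written down as a simple check. This is a sketch of how I think about it, not a feature of any genealogy tool: the `Link` class, the `REQUIRED_SOURCES` constant, and the source labels are all hypothetical, and “independent” is simplified here to mean “from a different repository or record set”.

```python
from dataclasses import dataclass, field

REQUIRED_SOURCES = 3  # the three-independent-sources threshold


@dataclass
class Link:
    """A claimed connection between two people in the tree."""
    description: str
    # Names of the repositories or record sets that support the link.
    # A set means two citations from the same place only count once.
    sources: set[str] = field(default_factory=set)

    def add_source(self, repository: str) -> None:
        # Normalise the label so trivial spelling differences don't inflate the count.
        self.sources.add(repository.strip().lower())

    @property
    def status(self) -> str:
        return "Fact" if len(self.sources) >= REQUIRED_SOURCES else "Hypothesis"


link = Link("Harrison line to a gateway ancestor")
link.add_source("Parish register")
link.add_source("parish register ")  # same record set, not counted twice
link.add_source("Probate record")
print(link.status)  # two independent sources so far
```

Keeping the status as a derived property, not a field I set by hand, means a link can never be promoted to “Fact” without the citations to back it.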

VII. Conclusion: The Living Archive
I started this journey 16 years ago because of my son. I wanted to give him an answer to “where we came from.” What I found was a lifelong addiction to the truth.
I am a Data Steward. I am cleaning the records, bridging the “Woods” Irish gaps, and securing the “Harrison” foundations. Whether we are descended from a Queen, a President, a genius clockmaker, or simply a long line of stubborn Yorkshire Vikings, the value is in the accuracy.
I might not have found the “H4” in my attic like Del Boy, but I’ve found a legacy that is just as valuable. I’m just the custodian of the archive, making sure that when my children take the keyboard, the data is clean, the links are verified, and the story is true.

The Sheffield Melting Pot: A Shared Legacy
My 16-year deep dive has taught me that no family tree is an island. While I’ve been chasing the Harrison and Woods lines, my research into my wife’s heritage has revealed a strikingly similar pattern of “Industrial Gravity.” Her ancestors followed a parallel path, congregating in Sheffield from the geographical edges of the UK: Cornwall, Wales, and Ireland. This was a family built on the back of the mining industry, with generations of men and women following the coal and tin from the depths of Cornish mines and Welsh valleys to the forge of the North.
Just like my own tree, her side comes with its own mix of high-stakes challenges and surprises. From the linguistic “sticking points” of Welsh records to the elusive paper trails of the Cornish tin miners, her heritage adds new layers of data to our shared archive. As a steward for our children, I’m not just preserving one side of the story; I’m documenting how these two disparate migrations, one from the Viking North and one from the Celtic fringes, combined in Sheffield to create their unique DNA. It’s a reminder that we are all, at our core, a collection of data points that have travelled hundreds of miles just to meet in the middle.
Explore more “Behind the Brand” stories in our Business Analytics Blog to see the obsessive attention to detail we bring to every project.
