High-powered hit-and-run data grabbing technique

You’ve invested a day off and a tank of gas to have a precious four hours at an ancestral county courthouse, state archive, or local library. You want to make the most of every minute. Here’s my high-powered hit-and-run data grabbing technique to get home with maximum ancestral data in minimum time.

Ready, set, …!


First, we’re assuming this is a wide-net data mining expedition, gathering sources on everyone with the right last name. You’ll examine their connection to you later. You’re not targeting one particular fact. (Of course, if you’re researching the name Smith in New York City, you will need something other than a wide-net search.)

Second, we’re assuming you’re a high-tech GEG. You’re going home with data, not paper.

Third, we’re assuming the courthouse you are visiting has not banned cell phones, laptops, etc. (If they have such bans, take paper, pen, and money for copies, with my sympathy.)

Plan and pack

Look at your existing tree data and figure out which last names you need to search in this county. Add a list of those names to your to-do list. (I use Zotero for my to-do list, as most of you know.) Make sure you’ve synced your online trees to your desktop tree, if you have both. If you’re going to a courthouse, you probably will not have Wi-Fi, so have essential information stored on your computer or tablet.

Make sure you have the basics:

  • laptop computer or tablet, fully charged
  • your list of family names to pursue
  • notetaking software (For me, it’s always Zotero.)
  • smartphone with a scanner app (I recommend ScannerPro for iPhones and CamScanner for Androids), or a portable scanner
  • all essential cables
  • an extension cord

Verify the details

You don’t want to waste a minute on nonessentials, so before you go:

  • Determine the location of the records by calling the probate court office.
  • Map it.
  • Figure out where you’ll park, and if you’ll need quarters.
  • Verify they will be open the day you’re going and determine the hours.
  • Make sure the courthouse allows your technology.

Set up your work station

When you’ve arrived and respectfully greeted the record office personnel, find a place to sit where you can plug in your laptop or tablet. Open your family name list and note-taking software. If it’s your first time at this location, take a walk around the room(s) to see where everything is located.

Choose your book

Marriages are a lucrative place to start your data grab, assuming they have an index. A data-grab expedition is no time to be mining unindexed books. But you’ll be getting information that will prepare you for later trips, when you can mine the tougher sources.


Step 1: Extract all occurrences of your family names from the index.

Wherever you take notes in your computer/tablet, identify the book, gathering all necessary source citation information.

From the index, type a list we’ll call your Index List. Include every occurrence of your family names, with the corresponding page numbers. Why bother, you ask? Because it’s painfully time-consuming and awkward to keep flipping from the index to each page you want to process. It also creates dreadful wear and tear on old and precious books.

More importantly, the terms you’re typing will become your guide when you get home to process all of your scans. Trust me. Type the index items — not just the page numbers. If you type a page number incorrectly, you won’t know how to find your way back to the index entry that created the problem without the index term.

I extract everything into a note attached to the bibliographic entry for this book. You’ll see in the center column that a subnote of “Choctaw County Marriages” is highlighted. In the right column, you see my Index Notes. You can use whatever you normally use to type notes.

If you choose a marriage book, you might opt to type only the name of the spouse who has your family name for each marriage. I prefer to capture what the indexer perceived to be the right transcription for each name — in case I have trouble reading the manuscript. It’s just a few seconds more of typing for high value.

Step 2: Create a list of the pages you need, in page order.

If you have only two or three names, or if the index happens to list all your people in page-number order, you can skip this step. Otherwise, take a minute to do it. For your convenience, and again for the best care of the old book, move through the book in page order.

I put this “Page-Order List” in a separate note, so I can have my Index List and my Page-Order List open side by side, quickly grabbing and sorting the page numbers. The Index List displayed in the above step generated this Page-Order List:

If more than one record can be found on a single page, make a notation, so you’ll remember to look for more than one. If I have three records on page 201, I put:

201 xxx
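If you’re comfortable with a little Python, the sorting can be automated. This is only a sketch, not part of my own routine: it assumes each Index List line ends with its page number, and the function name and the sample entries are invented for illustration.

```python
from collections import Counter

def page_order_list(index_lines):
    """Build Page-Order List lines from Index List lines that end in a page number."""
    pages = Counter()
    for line in index_lines:
        parts = line.split()
        # Assumption: the last word on each index line is the page number.
        if parts and parts[-1].isdigit():
            pages[int(parts[-1])] += 1
    # One "x" per record when a page holds more than one, as in "201 xxx".
    return [
        f"{page} {'x' * count}" if count > 1 else str(page)
        for page, count in sorted(pages.items())
    ]

# Invented sample entries, typed in the order the index listed them.
index_list = [
    "Surname, A. to B. Other 201",
    "Surname, C. to D. Other 31",
    "Other, E. to F. Surname 201",
    "Surname, G. to H. Other 150",
    "Other, I. to J. Surname 201",
]

for entry in page_order_list(index_list):
    print(entry)
```

Lines that do not end in a number are skipped, so check the result against your Index List before you trust it.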

Step 3: Scan each record, making note of anything of interest or concern in the Page-Order List from Step 2.

If you are extracting from a book with a title page, scan that page first. Also scan the index pages that list your family names. Then scan the record pages in page order. If your scanner app or software allows, gather all the images from a single book into one PDF. If you are forced to scan into individual image or PDF files, one per page, gather all of the files from one book into a single folder and name them by the page numbers.
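For one-file-per-page scans, the renaming can be planned with a short Python sketch. The assumptions here are mine: the scans were taken in the same order as your page list, the camera-style file names are made up, and the `p0031` naming pattern is just one option that keeps files sorted in page order.

```python
from pathlib import Path

def plan_renames(scan_files, pages, width=4):
    """Pair each scan with a page-numbered file name, without touching any files."""
    if len(scan_files) != len(pages):
        raise ValueError("need exactly one page number per scan")
    plan = []
    for scan, page in zip(scan_files, pages):
        old = Path(scan)
        # Zero-padding (p0031, p0100) keeps the files sorted in page order.
        new = old.with_name(f"p{page:0{width}d}{old.suffix.lower()}")
        plan.append((old, new))
    return plan

# Hypothetical file names, listed in the order the pages were scanned.
for old, new in plan_renames(["IMG_0007.JPG", "IMG_0008.JPG"], [31, 100]):
    print(old, "->", new)
```

The function only prints a plan. Once the pairs look right, calling `old.rename(new)` on each pair would do the actual renaming.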

Let’s say you notice that a date was likely written incorrectly on a record, because the records before and after it show a consistent sequence in time, but your record is out of sequence. Or you notice that the indexer transcribed something inaccurately. You can make a note of it after the page number in your Page-Order List. You can also mark each page as scanned. I write “got” beside the page number, so the list looks something like:

31 got
100 got
150 got — next marriage on page might be relative of bride–same last name
165 got
220 got — Indexer wrote wrong page; was actually 202
310 got
311 –missing
368 got

Step 4: Grab another book and return to Step 1, book after book, until time runs out.

If you get only partway through your last book, create a to-do item for yourself, wherever you create your To-Do’s. Your Page-Order List will show you where you left off.
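If your Page-Order List is plain text, a few lines of Python can pull out the pages still waiting. Treat this as a sketch under my own assumptions: each line starts with the page number, a finished page carries the word “got,” and a page you could not find carries the word “missing.” The function name is invented.

```python
import re

LINE = re.compile(r"^\s*(\d+)\b(.*)$")

def pages_remaining(page_order_lines):
    """Return the page numbers not yet marked 'got' or 'missing'."""
    remaining = []
    for line in page_order_lines:
        match = LINE.match(line)
        if not match:
            continue  # not a page line
        page, note = int(match.group(1)), match.group(2).lower()
        # "got" must be a whole word, so a note containing "forgot" does not count.
        if re.search(r"\bgot\b", note) or "missing" in note:
            continue
        remaining.append(page)
    return remaining

notes = [
    "31 got",
    "150 got, next marriage on page might be relative of bride",
    "311 missing",
    "368",
    "402 xx",
]
print(pages_remaining(notes))
```

The pages it returns are the ones to copy into your to-do item.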

Step 5: Label the scans at home.

The notes you took become your guide to the scans, eliminating the need to stop and add captions or other time-consuming details to each scan while you’re at the courthouse, archive, or library.  But here is the key:

Organize and label it all promptly after you get home!

When you get home, process your scans while the day is still fresh in your mind. Add a header or footer that includes the bibliographic information on each page. Add the original book’s page numbers to each page, if your scan did not pick them up. Make note of anything you found unusual.

If your scans are stored as PDFs, my next blog post will tell you how I add margins to the edges and add footers and other useful information. As a university employee, I get Adobe Acrobat DC for a few bucks, and it does everything I need. But I’m trying out some free PDF options to do the most essential things, and I’ll soon let you know what I found.

If your scanning software creates multiple images, you can keep them in separate files within a folder, if you prefer. But you might find them much easier to process and store if you paste the images into a word processing document, an image per page with very narrow margins. Then use your word processing software to add headers or footers throughout with the bibliographic information. Use text boxes to add the original source page numbers and any explanatory text.

I also create a page number that represents the sequence of the pages I’ve gathered, and put something in front of it like “PDF” or “DOC”.  When I’m citing data from these sources, I use both the original source page number and the PDF or DOC number, so I can quickly find it in the larger document.
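Here is one way to sketch that numbering in Python. The label format, the function name, and the `front_matter` count (the title and index pages scanned ahead of the records) are my assumptions for illustration, not a citation standard.

```python
def sequence_labels(pages, prefix="PDF", front_matter=0):
    """Map each original page number to its position in the gathered document."""
    return {
        page: f"{prefix} {position + front_matter}"
        for position, page in enumerate(pages, start=1)
    }

# Example: a title page and one index page were scanned before the records.
labels = sequence_labels([31, 100, 150], front_matter=2)
for page, label in labels.items():
    print(f"p. {page} ({label})")
```

Pass `prefix="DOC"` if you pasted the images into a word processing document instead of a PDF.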

Store the scans according to your established filing routine. I store a set of documents from a marriage book in a county courthouse in a main folder called “Places,” inside a subfolder named for the state and a sub-subfolder named for the county.

Create a to-do item that tells you they need processing and where to find them.

Step 6: Update your family tree

When you have time to do it properly, update your family tree to reflect your new finds.

If you want to pull out a single page as a source, you can print that page to its own PDF and save it with the person the document is about. This way you keep the full set, as you gathered it on your hit-and-run day, and you have individual source documents tied to the persons mentioned.

Last thoughts

This technique becomes habit in no time. You will get used to asking yourself, “What am I going to need to know when I get home?” You never want to trust your memory, so leave bread-crumb trails as you go. Let the two sets of notes you wrote — the Index List and the Page-Order List — become your key to piecing together everything you get home with.

And last, make sure you have done what it takes to keep yourself from accidentally repeating this work. Since Zotero is my to-do list and library of sources, I can always discover whether I’ve processed a particular document before. And thanks to the Index and Page-Order lists, I’ll know which family names I processed in full and where I stopped, if I didn’t finish.

Needless to say, the heaviest part of the work happens at Step 6, and you might have many months of processing ahead of you. But you have the personal satisfaction of knowing you made the most of those four hours in the courthouse.

6 thoughts on “High-powered hit-and-run data grabbing technique”

  1. All of our marriage records are online; none are hidden here. (Some folks come in and say, “It isn’t online, where is it?” If it isn’t online, it does not exist here, or it would be online.) Our early deeds (1820s to 1950s), some mortgages, all will books, some judgment books, and much more are also online: nearly half a million images, nearly 30,000 of them every-word searchable, because they have been transcribed by AWESOME online VOLUNTEERS.

    1. Hi, Jeanie. There are several apps available. If I’m away from my laptop, I usually prefer to just use the web version at zotero.org. You can log in and see your information that’s sync’d to their server. I haven’t tried linking attachments on an iPad. To keep from using up all my free Zotero space, I link to attachments stored on my own computer, but they’re backing up to the cloud. So even if I’m away from home without my laptop, I can see them in the cloud.
