MetaSource_DATE_DataPoints for genealogy & other microhistory projects, a pathway, directory/folder and file naming system
2017/03/11
I am of an age to begin genealogical projects. When I was younger the interest was the same as it is now, but not the urge to begin, even if there was more time then, or perhaps because there was more time.
Of course, in the course of time people die and data disappears. And contra, in modern times technology progresses and access to archives of data gets easier. Swings and roundabouts.
About seven years ago I started with some genealogy software and learnt about GEDCOM files. Played around a bit with various applications. Looked at some inherited documents. Scanned them. Saved them. Went online, searched some strings from 1980s typewritten family trees, and discovered ancestors on Monaropioneers.com. Learned some ancestors were actually born in New Zealand.
And then it was seven years later. Somehow.
At some time in those seven years I had discovered GRAMPS, opensource genealogy software, but had never actually used it. So a month ago I started using GRAMPS and looked at my mess of digital files, scans of old letters and documents, and downloads. Where to begin?
But I did begin.
Apparently some of my decisions were the usual ones for beginners. By family, by person, by official docs, by correspondence, by photo. Not mistakes but on going onto the GRAMPS email group and sharing my directory/folder structure, someone went yeahbut.
They mentioned the French way of archiving. And so I went off and learned about Respect des fonds, Original order, Provenienzprinzip and so on. Now, I already knew about provenance and how to cite, so my file naming was okay, I thought, and that didn’t change much. What did change was my folder structure, and the big picture.
And how file names and their directory/folder structure could contain implicate order.
In citing it doesn’t matter which copy of the book you actually used. Most citation systems assume that there are libraries you can go to and find some copy or other of a book or article to check out. (There are many styles of citation. I’ll just note Harvard.)
In archiving however, which actual physical copy becomes more important. History works with source material and so this information is vital to the work, not just what book, but which book where and when. And whose.
Genealogy is a microhistory, so we should learn its lessons. The history of the source material needs to be tracked and made available. I have family trees from the 1980s but some I do not know who made them, and none have supporting material.
The folder structure I had originally designed was organised according to how I was researching, not how the information might be useful later to someone else. I was focussing on my own methods of hunting down the next link in the chain, the next clue, completing the picture in my head, and not how that picture would appear to someone else. So at this point the question became: as my project was relying on the work of others in creating the chain in the first place, was my work going to help that chain go on? And if so, could it be done better?
These days a lot of genealogy is online. There are the commercial suppliers, which now include DNA matching. As well, there are the digitised and more recently computerised actual sources of information, e.g. the NSW Registry of Births, Deaths & Marriages.
I was looking at how to make my file names and directory/folder names more helpful to someone else looking at the data. This is important because while GRAMPS has excellent support for citation, source through to repository for events and places in a lifespan, it only links to what you put on the hard drive. You have to think about how to organise it.
This is a good thing.
And even if there were no GRAMPS (software disappears over time: orphaned, abandoned, scrapped, turned into useless subscription models which dumb down the use through time [yes I mean you Adobe]), the question would become: would that bunch of files and pathway names be good enough for someone else to work out what was going on?
What is interesting at this point in writing this very blog post is that the time taken so far to write it is a gazillion times longer than the time it took to work out the following:
It is important to point out here that a moment before I decided on this structure for the entire pathway (directory structure & file name) I decided against an abbreviation system, i.e. coding, e.g. “B” for birth certificate, as in B1980-ClaudeManning.pdf or something. I’d try to be as obvious as possible and use the word Birth, not “B”. Abbreviations are not banned, but their dismissal led to a eureka moment. Yes, I was in the bath.
(Abbreviations which have a recognised life out in the real world are admitted, e.g. NSW for ‘New South Wales’ and BDM for ‘Births, Deaths and Marriages’ (Registry).)
My pathways were going to include as much non-abbreviated information as possible, but be as succinct as possible. I also decided that it did not matter where the info was placed, in a many-layer pathway system with many folders, or a flatter system with fewer folders. Or more likely, both, so they could integrate depending on load.
Constraints on this are operating systems’ abilities with respect to pathway and file name length, as well as forbidden characters. This is a big issue as I want my system to be useful to other people across time and space and strange operating systems, across which information might be conveyed (without being looked at). But I’ll describe that detail at the end.
I’ll now break down the three elements of MetaSource_DATE_DataPoints.
The MetaSource is all the Respect des fonds, Provenienzprinzip and Provenance stuff, as informed by Original order. It comprises the elements:
① Repository (where the information/ material is located)
② Source (a registry, a letter, a book)
③ Authority (author)
Some MetaSources are one and the same, for example the NSW Registry of Births, Deaths and Marriages (NSWBDM) is ① ② and ③ all rolled into one. Clarifying bits may go in the pathway/filename structure after the…
The DATE is the pivot where we move from the big world to the micro details. It uses the ISO 8601 standard, e.g. 2017-02-14.
Having the ③ Authority immediately before the DATE acknowledges the Harvard AUTHOR-DATE citation style often used in the humanities.
The DataPoints element gets the item-level stuff. It is an arrangement of:
ⓐ event: birth, death, marriage, occupation, etc
ⓔ source pointer (page number, record ID)
⑤ Type: map, photo, scan, downloaded, screenshot, can include secondary DATE
⑥ MISC: original file name, original URI
Number ④ Subjects — one can list all or none.
Number ⑤ Type — one can list all or none possible types… for a photo of a scan or a printed screenshot.
If an element can appear twice it may do so, but usually just use the earlier placement.
Imagine a hard drive or some memory device or even somewhere in the cloud where there is an archive of digital and digitized documents. Here is the MetaSource_DATE_DataPoints schema applied to a birth certificate. The datapoint order need not be consistent, but it would help when one’s eyes scan down a directory’s listing.
SomeArchive/ NSWBDM/ NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg
“_” underscore separates elements
“-” appends or supplements an element with some qualification or further detail and resolution, as it does in a DATE such as 2017-02-14.
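As a rough Python sketch of how the two separators compose a name (the function `build_name` and its parameters are my own illustration, not part of GRAMPS or any other tool), the elements can be joined with underscores around an ISO 8601 date:

```python
from datetime import date

def build_name(metasource, when, datapoints, extension):
    """Join MetaSource, DATE and DataPoints with underscores (hypothetical helper)."""
    elements = [metasource, when.isoformat(), *datapoints]
    # The underscore is the element separator, so it may not appear inside an element.
    for element in elements:
        if "_" in element:
            raise ValueError(f"element contains the separator: {element!r}")
    return "_".join(elements) + "." + extension

name = build_name(
    "NSWBDM",
    date(1980, 6, 22),
    ["Birth", "ManningClaudeRobert", "Sydney", "1234-1980", "scan"],
    "jpg",
)
print(name)
```

Hyphens pass straight through, since they only qualify the element they sit in.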
One could introduce subfolders under NSWBDM each for Birth, Deaths and Marriages, but keep the file name as is, and don’t drop birth from the filename. Now just looking at the one example it looks very ugly, but with a dozen on screen in one repository folder an order does become apparent.
Also, I’ll admit that with digitisation and online resources the lines between a fonds and a provenance or even authority get real blurry, but the whole thing is in my metaFonds now so that’s the way I do it.
SomeArchive/ ManningFonds/ ManningFonds_LetterFrom_DaviesM_1963-08-25_toManningClaudeRobert_12AliceSt_scan_2017-03-01.pdf
SomeArchive/ Googlemaps/ Googlemaps_2017-02-14_1PoolSt-Otley_map_satellite_screenshot.jpg
SomeArchive/ ArchiwumMapZachodniejPolski/ ArchiwumMapZachodniejPolski_Messtischblatt_1940_Ostrowo-map_download_2017-02-21.jpg
A PROTIP or two
Anything you wanted to do by naming or organising directories according to Family Branch, Family, Person, types of documents, photos (as I did start) with the above MetaSource_DATE_DataPoint system in place — can be done with virtual folders (or saved searches or what Apple Mac OS calls smart folders) because all those datapoints are in the filename. Nothing is lost, a lot is gained.
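To make the saved-search idea concrete, here is a minimal Python sketch. The function `virtual_folder` is a hypothetical name of mine, and the listing simply reuses the example file names from this post. It does what a smart folder does: a plain text match on the file name.

```python
def virtual_folder(filenames, wanted):
    """Return, sorted, every file name that contains the wanted datapoint text."""
    return sorted(name for name in filenames if wanted in name)

listing = [
    "NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg",
    "Googlemaps_2017-02-14_1PoolSt-Otley_map_satellite_screenshot.jpg",
    "ManningFonds_LetterFrom_DaviesM_1963-08-25_toManningClaudeRobert_12AliceSt_scan_2017-03-01.pdf",
]

# A "by person" folder and a "by type" folder, without moving a single file.
by_person = virtual_folder(listing, "ManningClaudeRobert")
by_type = virtual_folder(listing, "_screenshot")
print(by_person)
print(by_type)
```

Because the match is on plain text, the same files can sit in as many of these virtual folders as there are datapoints in their names.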
Keep datapoint elements in the same order; this will help pseudo-folderise files on screen.
Thus, LastnameFirstname for organizing families.
Separate files for scans of the same document? Number the type element with a padded suffix:
(Scan the backs of your photos, they can contain more information than you realise.)
I will also add GRAMPS ID numbers to various elements, particularly for clarity and consistency.
Devil in the Detail : Characters
Remove characters which some computer operating systems find difficult or special:
/ \ .
& Windows in particular: from Naming Files, Paths, and Namespaces (Windows)
< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
Imagine you grab a USB drive and copy stuff to it and not look at it and then copy it elsewhere and realise data is lost because it was some old FAT formatted thing, and you don’t even have Windows in the house!
Windows also finds long file pathways (this includes both file name and nested folder/directory structure) very difficult. As these are going to be long file names, keep the folder nesting structure as simple as possible. (If burning to an optical disc, DVD or CD-ROM, it would be best to compress the entire structure first to .zip or similar, and burn that file. Optical discs do not support very deep pathways and lengthy filenames.)
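One way to keep an eye on this is to measure the pathways before copying anything. A small Python sketch follows, assuming the classic Windows limit of 260 characters as the worst case (the names `ASSUMED_LIMIT` and `overlong` are mine, and your target system may allow more or less):

```python
ASSUMED_LIMIT = 260  # classic Windows MAX_PATH, assumed here as the worst case

def overlong(paths, limit=ASSUMED_LIMIT):
    """Return (length, path) pairs for paths at or over the limit, longest first."""
    flagged = [(len(p), p) for p in paths if len(p) >= limit]
    return sorted(flagged, reverse=True)

sample = [
    "SomeArchive/NSWBDM/NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg",
    "SomeArchive/" + "Deep/" * 60 + "file.jpg",  # artificially deep nesting
]

for length, path in overlong(sample):
    print(length, path[:40] + "...")
```

Note the lengths here are of relative pathways; the drive or mount point in front of them counts too.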
Also remove spaces or hyphens from surnames or phrases and CamelCase them. And use CamelCase when removing spaces generally.
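A minimal Python sketch of that clean-up, assuming the reserved-character set from the Microsoft page cited above (the names `FORBIDDEN` and `camel_case` are hypothetical, mine only):

```python
import re

# Reserved characters per the Windows naming documentation cited above.
FORBIDDEN = '<>:"/\\|?*'

def camel_case(phrase):
    """CamelCase a phrase, dropping spaces, hyphens and forbidden characters."""
    words = re.split(r"[\s\-]+", phrase)
    cleaned = ("".join(ch for ch in word if ch not in FORBIDDEN) for word in words)
    # Capitalise only the first letter so existing capitals (e.g. McDonald) survive.
    return "".join(word[:1].upper() + word[1:] for word in cleaned if word)

print(camel_case("Manning Claude Robert"))
print(camel_case("12 Alice St"))
print(camel_case("smith-jones"))
```

Hyphens are removed here because the sketch is for surnames and phrases; a hyphen used as a qualifier between elements would be added afterwards.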
I use “~”, the tilde, to indicate the original (format). For example
I’ve also reserved “`”, the backtick, to indicate the primary date.
Maybe keep white space for original file names, but you will have to remove the dots from any original file names or URIs and leave only the one dot to separate the file extension.
nla.12345.D23-123.pdf to …_download_nla-12345-D23-123.pdf
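That conversion can be sketched in a few lines of Python (the function name `dots_to_hyphens` is my own): split off the last dot as the extension separator and turn every other dot into a hyphen.

```python
def dots_to_hyphens(original_name):
    """Replace every dot except the one before the file extension with a hyphen."""
    stem, dot, extension = original_name.rpartition(".")
    if not dot:  # no dot at all, so nothing to convert
        return original_name
    return stem.replace(".", "-") + "." + extension

converted = dots_to_hyphens("nla.12345.D23-123.pdf")
print(converted)
```

The sketch assumes the text after the last dot really is the extension; a name such as archive.tar.gz would need handling by eye.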
NOTE: File type extensions (e.g. .txt .jpg .odt) may not be included in the above descriptions. Obviously they are another type of DataPoint.
Thus we have all the:-
2012/12/09
Commuting home I cycle by a recently re-formed footpath near my home. It’s at a particularly steep little section and when tired I often push my bike up it. Doing so the other day I noticed the graffiti scratched into the then wet cement surface had captured a turning point, a shift in how we use dates, particularly the year, in relation to our names.
Foundation stones provide information about when a building or bridge is built. They may be more or less formal. The more grand and formal the building the more formal, and more informing to the reader, the foundation stone becomes. Particularly if someone very important lays a foundation stone at a ceremony. Like, say, the visiting Duke of Edinburgh, in Hobart, Tasmania in 1868.
(American usage still dominates the web, and Wikipedia, so their term “cornerstone” is to be found online with the sense in which I am using “Foundation Stone”, while “Foundation Stone” is restricted in USA usage to refer to a particular ethnic or religious site in the Middle East.)(God knows why.)
(And North American usage uses Sidewalk for Footpath while in Britain they just say Pavement.)
When people graffiti their name alongside a year, they often use the very year the graffiti is made, much like when a foundation stone is laid. I’d argue they are copying this formal ceremonial practice in a street style. Doing so, they are claiming the concrete in their own name, even though, like a foundation stone, the structure was actually someone else’s work. (No doubt the habit of painting the road on New Year’s Eve with the new year’s year supports this practice.)
Above you can see the name scrawlers SAPPHIRE, TYLAR, COREY & CRYStAL have used the traditional Foundation Stone approach, claiming the concrete as their work in 2012. (It is possible they are all by the same hand.) Right next to it on the same concrete pour is the following.
This SLiPKNOT Matt has not used the year of the concrete’s pour, but, most likely, the year of their birth, 2001. This follows the recent habit of distinguishing, somewhat, an email username or internet handle by year of birth from all the other Matts, all the other SLiPKNOT Matts, or, at least, from those born in other years. Otherwise you are just WarriorWomanNumber2314567@email.com and this works against the very idea of naming at all.
So this is a transitional piece and in the years to come the foundational style of scrawling one’s name in freshly poured concrete will fade away. It will just be an opportunity tag, and not a foundational, or cornerstone, mimicry.
Google Streetview (in Jan 2016) still doesn’t have the now 3 year old concrete. (Down to the right a bit, off screen.)
2012/05/15
At my Web 1.0 style personal homepage trying to pass itself off as a gallery, I’ve just worked through to a labelling of the current figures I am working on. I have this need to put them in sets, I do this by naming them.
For example Consorts to the Mountain Goddess.
The new set is Figures of Anticeptual Art. They will not get their own blog.
Now, the thing is, in realising the name Figures of Anticeptual Art I suddenly also recollected that the first of these figures was made two years ago. Thus #Swineflu is Born! (pewter, 2009, wallaby dung outer investment) is the first example of the process where naming is a conscious method of finishing the artwork.
It doesn’t start with an idea or concept, for the naming finishes it. The art is realised, not conceived.
I had just recovered from the misnamed swineflu, (I caught the #swineflu from a young woman who served me a hamburger as I transited through Melbourne back to Hobart from Weilmoringle.)(She did not look well and should not have been at work.) At this time I was wanting to send a piece to the Twitter Art Show, so as I broke open the wallaby dung and plaster it was obvious what the piece should be called. I stopped then and there. I did not even cut it off its cup to retrieve all that pewter.
It was finished in the moment I realised what its true name was.
Twitter hashtag and all.