MetaSource_DATE_DataPoints for genealogy & other microhistory projects, a pathway, directory/folder and file naming system

2017/03/11 § 2 Comments

I am of an age to begin genealogical projects. While younger the interest was the same as it is now, but not to begin, even if then there was more time, or because there was more time.

Of course, in the course of time people die, data disappears. And contra, in modern times technology progresses, access to archives of data gets easier. Swings and roundabouts.

About seven years ago I started with some genealogy software and learnt about GEDCOM files. Played around a bit with various applications. Looked at some inherited documents. Scanned them. Saved them. Went online, searched some strings from 1980s typewritten family trees, and discovered ancestors on Monaropioneers.com. Learned some ancestors were actually born in New Zealand.

kaiapoi firestation earthquake

And then it was seven years later. Somehow.

At some time in those seven years I had discovered GRAMPS, opensource genealogy software, but had never actually used it. So a month ago I started using GRAMPS and looked at my mess of digital files, scans of old letters and documents, and downloads. Where to begin?

But I did begin.

Apparently some of my decisions were the usual ones for beginners. By family, by person, by official docs, by correspondence, by photo. Not mistakes, but when I went onto the GRAMPS email group and shared my directory/folder structure, someone went yeahbut.

They mentioned the French way of archiving. And so I went off and learned about Respect des fonds, Original order, Provenienzprinzip and so on. Now I already knew about Provenance and how to cite so my file naming was okay, I thought, so that didn’t change much. What did change was my folder structure, and the big picture.

And how file names and their directory/folder structure could contain implicate order.

In citing it doesn’t matter which copy of the book you actually used. Most citation systems assume that there are libraries you can go to and find some copy or other of a book or article to check out. (There are many styles of citation. I’ll just note Harvard.)

In archiving however, which actual physical copy becomes more important. History works with source material and so this information is vital to the work, not just what book, but which book where and when. And whose.

Genealogy is a microhistory, so we should learn its lessons. The history of the source material needs to be tracked and made available. I have family trees from the 1980s but some I do not know who made them, and none have supporting material.

The folder structure I had originally designed was organised according to how I was researching, not how the information might be useful later to someone else. I was focussing on my own methods of hunting down the next link in the chain, the next clue, completing the picture in my head, and not how that picture would appear to someone else. So at this point the question became, as my project was relying on the work of others in creating the chain in the first place: was my work going to help that chain go on? And if so, could it be done better?

These days a lot of genealogy is online. There are the commercial suppliers, which now include DNA matching. As well there are the digitised and more recently computerised actual sources of information, e.g. the NSW Registry of Births, Deaths & Marriages.

I was looking at how to make my file names and directory/folder names more helpful to someone else looking at the data. This is important because while GRAMPS has excellent support for citation, source through to repository for events and places in a lifespan, it only links to what you put on the hard drive. You have to think about how to organise it.

This is a good thing.

And even if there were no GRAMPS (software disappears over time: orphaned, abandoned, scrapped, turned into useless subscription models which dumb down the use through time [yes I mean you Adobe]), the question would become: would that bunch of files and pathway names be good enough for someone else to work out what was going on?

Well?

What is interesting at this point in writing this very blog post is that the time taken so far to write it is a gazillion times longer than the time it took to work out the following:

MetaSource_DATE_DataPoints

It is important to point out here that a moment before I decided on this structure for the entire pathway (directory structure & file name) I decided against an abbreviation system, i.e. coding. E.g. “B” for birth certificate: B1980-ClaudeManning.pdf or something. I’d try to be as obvious as possible and use the word Birth not “B”. Abbreviations are not banned, but their dismissal led to a eureka moment. Yes, I was in the bath.

(Abbreviations which have a recognised life out in the real world are admitted, e.g. NSW for ‘New South Wales’ and BDM for ‘Births, Deaths and Marriages’ (Registry).)

My pathways were going to include as much non-abbreviated information as possible, but be as succinct as possible. I also decided that it did not matter where the info was placed, in a many layer pathway system with many folders, or a flatter system with fewer folders. Or more likely, both, so they could integrate depending on load.

Constraints on this are operating systems’ abilities with respect to pathway and file name length, as well as forbidden characters. This is a big issue as I want my system to be useful to other people across time and space and strange operating systems, across which information might be conveyed (without being looked at). But I’ll describe that detail at the end.

I’ll now break down the three elements of MetaSource_DATE_DataPoints.

MetaSource

This is all the Respect des fonds, Provenienzprinzip and Provenance stuff, as informed by Original order. The MetaSource is comprised of the elements:

① Repository (where the information/ material is located)
② Source (a registry, a letter, a book)
③ Authority (author)

Some MetaSources are one and the same, for example the NSW Registry of Births, Deaths and Marriages (NSWBDM) is ① ② and ③ all rolled into one. Clarifying bits may go in the pathway/filename structure after the…

DATE

This is the pivot where we move from the big world to the micro details. It uses the ISO 8601 standard, e.g. 2017-02-14.

Having the ③ Authority immediately before the DATE acknowledges the Harvard AUTHOR-DATE citation style often used in the humanities.

DataPoints

This gets the item level stuff. The DataPoints is an arrangement of:

④Subject:

ⓐevent: birth, death, marriage, occupation, etc
ⓑperson/s
ⓒplace
ⓓtitle

ⓔsource pointer (page number, record ID)

⑤Type: map, photo, scan, downloaded, screenshot, can include secondary DATE
⑥MISC: original file name, original URI

Number ④ Subjects: one can list all or none.

Number ⑤ Type: one can list all or none of the possible types… for a photo of a scan or a printed screenshot.

Substitution gives:

①②③DATE④ⓐⓑⓒⓓⓔ⑤⑥

Repository_Source_Authority_DATE_event_person_place_title_sourcepointer_type_Misc

If an element can appear twice it may do so, but usually just use the earlier placement.
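The substitution can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name build_name and its arguments are made up here, and it simply joins whatever elements are supplied, in the order given above, skipping any that are left empty.

```python
def build_name(metasource, date, datapoints, extension):
    """Join elements with "_" in MetaSource_DATE_DataPoints order.

    metasource and datapoints are lists of strings; empty strings are
    skipped, since any element may be left out.
    """
    parts = [p for p in [*metasource, date, *datapoints] if p]
    return "_".join(parts) + "." + extension


name = build_name(
    metasource=["NSWBDM"],   # repository, source and authority rolled into one
    date="1980-06-22",       # ISO 8601
    datapoints=["Birth", "ManningClaudeRobert", "Sydney", "1234-1980", "scan"],
    extension="jpg",
)
print(name)
```

Nothing in the sketch enforces which elements appear, which fits the idea that one can list all or none.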

 

Examples

Imagine a hard drive or some memory device or even somewhere in the cloud where there is an archive of digital and digitized documents. Here is the MetaSource_DATE_DataPoints schema applied to a birth certificate. The datapoint order need not be consistent, but keeping it so helps when one’s eyes scan down a directory’s listing.

SomeArchive/ 
NSWBDM/
NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg

“_” underscore separates elements

“-” appends or supplements an element with some qualification or further detail and resolution, as it does in a DATE 2017-02-14.
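Because “_” separates elements and the DATE is the pivot, a file name can be taken apart again by a machine as well as by eye. The Python below is a rough sketch, not a finished tool: split_name is a made-up name, and it assumes the first element that looks like an ISO 8601 date (year, year-month or year-month-day) is the pivot, so a MetaSource element that is a bare four-digit number would confuse it.

```python
import re

# 1940, 1940-06 or 1940-06-22
ISO_DATE = re.compile(r"^\d{4}(-\d{2}){0,2}$")


def split_name(filename):
    """Split a file name into (metasource, date, datapoints).

    The first element that looks like an ISO 8601 date is the pivot.
    """
    stem = filename.rsplit(".", 1)[0]   # drop the file extension
    elements = stem.split("_")
    for i, element in enumerate(elements):
        if ISO_DATE.match(element):
            return elements[:i], element, elements[i + 1:]
    return elements, None, []           # no DATE found


meta, date, points = split_name(
    "NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg"
)
print(meta, date, points)
```

Note that the record number 1234-1980 stays in one piece, since the hyphen only qualifies an element and never separates two.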

One could introduce subfolders under NSWBDM each for Birth, Deaths and Marriages, but keep the file name as is, and don’t drop birth from the filename.  Now just looking at the one example it looks very ugly, but with a dozen on screen in one repository folder an order does become apparent.

Also, I’ll admit that with digitisation and online resources the lines between a fonds and a provenance or even authority get real blurry, but the whole thing is in my metaFonds now so that’s the way I do it.

SomeArchive/ 
ManningFonds/
ManningFonds_LetterFrom_DaviesM_1963-08-25_toManningClaudeRobert_12AliceSt_scan_2017-03-01.pdf

 

SomeArchive/ 
Googlemaps/
Googlemaps_2017-02-14_1PoolSt-Otley_map_satellite_screenshot.jpg



SomeArchive/
ArchiwumMapZachodniejPolski/
ArchiwumMapZachodniejPolski_Messtischblatt_1940_Ostrowo-map_download_2017-02-21.jpg

 

 

A PROTIP or two

Anything you wanted to do by naming or organising directories according to Family Branch, Family, Person, types of documents, photos (as I did start) with the above MetaSource_DATE_DataPoint system in place — can be done with virtual folders (or saved searches or what Apple Mac OS calls smart folders) because all those datapoints are in the filename. Nothing is lost, a lot is gained.
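To show why nothing is lost, here is a small Python stand-in for a saved search or smart folder. It is a sketch under assumptions: virtual_folders is a made-up name, the listing is just the example file names from this post, and the matching is a plain substring test, which is roughly what a simple file-name search does.

```python
from collections import defaultdict


def virtual_folders(filenames, keywords):
    """Map each keyword to the file names that contain it."""
    folders = defaultdict(list)
    for name in filenames:
        for keyword in keywords:
            if keyword in name:
                folders[keyword].append(name)
    return dict(folders)


listing = [
    "NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg",
    "Googlemaps_2017-02-14_1PoolSt-Otley_map_satellite_screenshot.jpg",
    "ManningFonds_LetterFrom_DaviesM_1963-08-25_toManningClaudeRobert_12AliceSt_scan_2017-03-01.pdf",
]
folders = virtual_folders(listing, ["ManningClaudeRobert", "scan", "map"])
for keyword, names in folders.items():
    print(keyword, len(names))
```

The same file turns up under a person and under a type at once, which a single nested folder tree cannot do without copies.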

Keep datapoint elements in the same order, this will help pseudo-folderise files on screen.

Thus, LastnameFirstname for organizing families.

Separate scan files of the same document? Number the type element with a padded suffix:

…photo_scan01-front.png
…photo_scan02-rear.png

(Scan the backs of your photos, they can contain more information than you realise.)
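The padding matters because directory listings usually sort names as text. A small Python sketch, with numbered_types as a made-up helper name, shows both the numbering and what goes wrong without the leading zero:

```python
def numbered_types(base_type, labels):
    """Return type elements like scan01-front, scan02-rear."""
    return [
        f"{base_type}{i:02d}-{label}"
        for i, label in enumerate(labels, start=1)
    ]


print(numbered_types("scan", ["front", "rear"]))

# Sorted as text, an unpadded 10 lands before 2; padded numbers keep their order.
unpadded = sorted(["scan2-rear", "scan10-envelope"])
padded = sorted(["scan02-rear", "scan10-envelope"])
print(unpadded, padded)
```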

I will also add GRAMPS ID numbers to various elements, particularly for clarity and consistency.

ManningClaudeRobert-I0123
Googlemaps-R0123

Devil in the Detail : Characters

Remove characters which some computer operating systems find difficult or special:

/ \ . 

& Windows in particular: from Naming Files, Paths, and Namespaces (Windows)

< (less than)

> (greater than)

: (colon)

" (double quote)

/ (forward slash)

\ (backslash)

| (vertical bar or pipe)

? (question mark)

* (asterisk)

Why?

Imagine you grab a USB drive and copy stuff to it and not look at it and then copy it elsewhere and realise data is lost because it was some old FAT formatted thing, and you don’t even have Windows in the house!
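Stripping the characters in the Windows list above is easy to automate. A minimal Python sketch, assuming each element is cleaned before it is joined into a file name (strip_forbidden is a made-up name, and dots are left alone here because they need separate handling around the file extension):

```python
# The characters from the Windows list above.
FORBIDDEN = '<>:"/\\|?*'


def strip_forbidden(element):
    """Remove characters that Windows does not allow in file names."""
    return "".join(ch for ch in element if ch not in FORBIDDEN)


print(strip_forbidden('What? "Alice St": 1/2'))
```

This deletes rather than substitutes, so two different originals could clean to the same name; check for collisions before renaming anything in bulk.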

Windows also finds long file pathways (this includes both file name & nested folder/directory structure) very difficult. As these are going to be long file names, keep the folder nesting structure as simple as possible. (If burning to an optical disc, DVD or CDROM, it would be best to compress the entire structure first to .zip or similar, and burn that file. Optical discs do not support very deep pathways and lengthy filenames.)
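A quick way to find trouble before it happens is to measure the full pathways. The Python sketch below assumes the traditional Windows limit of 260 characters for a whole path; the function name too_long is made up, and the limit is a parameter because other systems and media have different ceilings.

```python
MAX_PATH = 260  # traditional Windows limit for a full path; adjust as needed


def too_long(paths, limit=MAX_PATH):
    """Return the paths whose total length reaches or exceeds the limit."""
    return [p for p in paths if len(p) >= limit]


paths = [
    "SomeArchive/NSWBDM/NSWBDM_1980-06-22_Birth_ManningClaudeRobert_Sydney_1234-1980_scan.jpg",
    "SomeArchive/" + "Nested/" * 40 + "file.jpg",
]
for p in too_long(paths):
    print(len(p), p[:40] + "...")
```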

Also remove spaces or hyphens from surnames or phrases and CamelCase them. And use CamelCase when removing spaces generally.
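CamelCasing can be sketched in Python as below. The name camel_case is made up for illustration, and the sketch only raises the first letter of each word, leaving the rest untouched so that capitals inside a surname survive.

```python
import re


def camel_case(phrase):
    """Remove spaces and hyphens, capitalising the start of each word."""
    words = re.split(r"[\s-]+", phrase.strip())
    return "".join(w[:1].upper() + w[1:] for w in words if w)


print(camel_case("12 Alice St"))
print(camel_case("Smith-Jones"))
```

Only apply this to surnames and phrases, not to a DATE or a whole file name, because it would also swallow the hyphens that qualify an element.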

I use “~” the tilde to indicate the original (format). For example

…PicassoP_1948-02-14_ManningClare_inLondon~drawing-portrait_photo_scan-2005-10-11.png

I’ve also reserved ‘`’, the backtick, to indicate primary date.

Maybe keep white space for original file names, but you will have to remove the dots from any original file names or URIs and leave only the one dot to separate the file extension.

nla.12345.D23-123.pdf
to
…_download_nla-12345-D23-123.pdf
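The dot replacement in that example can be done mechanically. A small Python sketch, with dedot as a made-up name, swaps every dot except the last for a hyphen so that only the file extension keeps its dot:

```python
def dedot(original_name):
    """Replace all dots with hyphens except the one before the extension."""
    stem, dot, ext = original_name.rpartition(".")
    if not dot:                      # no dot at all, nothing to do
        return original_name
    return stem.replace(".", "-") + "." + ext


print(dedot("nla.12345.D23-123.pdf"))
```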

NOTE: File type extensions (e.g. .txt .jpg .odt) may not be included in the above descriptions. Obviously they are another type of DataPoint.

Thus we have all the:-

MetaSource_DATE_DataPoints

What are the rules for art?

2017/02/14 § Leave a comment

 

When I walk into a gallery, and I look at an artwork, I wonder, how was it made?

This is always my first question.

Sometimes this is too obvious particularly in the case of flat art like oil painting and we drift immediately into curatorial and collectimaniacal discussions, perhaps of style, technique and brushstroke.

Other times the type of process is more technological than the personality in practice, technique and presentation, and I get caught up in lost wax, sprues and welding.

Interesting. I pace hither and thither, lost in thought and admiration. And so more questions, all sorts of questions.

I may venture into what the maker intended, but in any case, the last question I ask is —how was it marketed?

It is the last thing I ask not because of its import but because as soon as it floats into the periphery of notice — I walk away from the art and into the light.

Any other questions that might arise (subject matter, historical significance, the artist’s early death, the skill, the late success, the naive approach) all of them are subsumed by marketing, by the market, by the marketeers, the curators, the auctioneers, the gallerists and #bignames curating their own careers.

And my first question, in context of the marketplace, means nothing at all.

So I walk away.

Marketing stops my curiosity, marketing is a mind killer. It is the magic that must hide its power even as it consumes everything.

I try to walk away.

My first question makes me lonesome, and perhaps proud. The other questions make me a member of the market, an atom of nothing in a sea of commodities, a see of POVs of likes and dislikes, of subjective demographics. And I cannot walk away even when I think I have.

And the last question reminds me the rules for art are the rules of the marketplace. There is no alternative.

I crawl in circles.

My Journey through a Book of the Dead with the Three Jays and Then Some

2014/12/23 § Leave a comment

Matthew-Barney-2

On opening night J¹ said she felt sad. “I don’t want to make art. It all… I don’t feel like making any anymore.”

Industrial Egyptpunk

Numbed ghosts walk by lots of found objects touched by a lesser Midas. A gallery plonked with faux ready-mades from the factory floor. Technically brilliant foundry work. Lovely copper. I get bored with people saying they are underwhelmed.

Beautiful.

I could make all of this. I would make none of this. I am a year older.

The pitch: Mad Max versus Stargate

Norman Mailer as a car, a character in an adaption of his own novel, see… like… you know… c.f. Ka, Egyptian soul-double. Ha-ha. Haw-haw. Crow bars as was:- bull’s blown bits as magical scepters, jawbreakers. But there is no release, no transfiguration. So us psychopomps, like K, flatline  ___________________________

Homework

Please read Norman Mailer’s novel Ancient Evenings and produce a 6 tonne bronze by Thursday morning.

The Nile as an autobahn of progress, a physical series of tubes. Discuss.

Closing the book on his desk J² shakes his head, “If only Barney had joined the 27 club.” After dusk on Wednesday, J² pours petrol over his copy of The Cremaster Cycle, drops his joint and stands back but forgets to video it for youtube.

Satan’s Skin

Milton Moon covers the walls, “the devil gets the best tunes” we jest. It’s a shade, not a colour. Just wait until the fluoros flicker.

J³ crumples into a corner groaning. He wants to cower but there are no shadows here. No hidden depths.

Remote control

I like process. I like review. I like books. There are copies of Barney’s tomes. He says this show is a bit remote. He says he is not a theatrical filmmaker, it’s about the objects. J² says there is nothing there. In the book of interviews Barney says he doesn’t do interviews. Or catalogs. Pick two.

Tomorrow and Tomorrow and Tomorrow

Pyramids as immortality machines; a form of conspicuous consumption showering society with law and order.

Or poo machines of fertility.

Either way, ancient eternal lives for the rich and powerful, opera for the bored in spirit; over-laboured, groaning, constipated, inappropriating, and signifying nothing.

Get behind me

C said antithetical to Beuys but I can’t spell that in this light.

Beuys-Feldman-Gallery

This is a response to Matthew Barney - River of Fundament - MONA

Which own photographer are you? Selfie or POV screen-shottie?

2014/09/04 § 1 Comment

POV screen-shottie


I’m quite keen on screen shots.

 

And again it is the mobile cell phone that makes all the difference in how new formats move on in to popular culture. It is the technological format that is defining us.

Our Aesthete Brains Evolving to Desire Beauty but Relax Into Art

2014/08/26 § Leave a comment

I have been reading The Aesthetic brain : how we evolved to desire beauty and enjoy art by Anjan Chatterjee.

Key message is that the diversity of form is directly related to environmental and selective pressures.

Where there is strong selective evolutionary pressure then, as an example, birdsong will be as unchanging as Egyptian art over millennia. Or, when there is strongly repressive government then art will be restricted to pro-government propaganda in approved form and genre, and as unchanging as the wild birdsong.

Where conditions relax then there can be a survival in a diversity of form, as in the diverse songs of domesticated songbirds compared to their wild cousins.

The middle bit of the book surveys the recent writing in neuroaesthetics and a number of evolutionary arguments about “why art?”. Unsatisfied by the answers involving “art instinct” or “by-product” he argues for a third way involving that relaxation of selective pressures mentioned above.

maskofreposePHI

I still feel Ellen Dissanayake’s work is the best of “why art” in an evolutionary context, and I can see it fitting in with Anjan Chatterjee’s suggestions of relaxation to allow the diversity we see through time and across geographies. Both are at base material arguments, one for raising children, one for how they, and we, survive.

Suggestions of relaxed environments, if not attitudes, will probably work for any of Dissanayake’s “making special” activities covered by other modern words like ‘religion’.

“Art” after all is primarily a marketing category, a very modern form. And perhaps one not relaxed enough yet to be any good. Especially all that conceptual art that just looks like bad science fiction made for people who do not read science fiction.

Bombastic Distributor, or how I got published in one easy step

2014/05/19 § 2 Comments

Often I write to find out why I am writing. This precludes me from being published, at least in traditional human tradeable publishing. Or, until I have established a market for what I write-as-I-find-out-why-I-am-writing. Traditionally, publishing rewards those who write for humans.

This is no longer the case.

Now I can write to find out why I am writing and be published at that exact same moment.

Consider the appellation “bombastic distributor”.

My partner Mona was in conversation with Tony describing me as an “enthusiastic initiator” and herself as a “conscientious completer”.

Having heard all this before I butted in with, “Yes, but what about those ‘bombastic distributors’?”

I made this term up. It was a throw-away line. Days passed and the conversation was forgotten, but the phrase stuck in my mind.

I wondered if “bombastic distributor” was truly new, if it was actually a novel construction.

So, as you do, I put the search phrase “bombastic distributor” into a search engine.

bombasticdistributorsearch

I was hoping for a return less than a googlewhack.

There were no googlewhacks, but there were four results for the search string.

Of these four only one had the phrase “bombastic distributor” without punctuation between the words (e.g. “bombastic; distributor”).

And that one result actually led to the page where “bombastic distributor” could actually be seen; at the time all the other results returned pages that could not be found (404s).

bombastic distributor original search

It was part of some word salad generated content for a Korean link farm. Non-human produced writing, for the English on the page was mechanically made up. It was almost human but not quite. It was in an uncanny valley of meaning, if only creepy because of the serious-looking Korean guy in the banner advert.

Now it was done for humans should they happen upon the page, but only a little bit. It was more for humans indirectly, by conning other algorithms that it was a human created webpage, in order to direct other humans via pagerank to whatever page they want to boost with their link farming. It was obviously word salad.

It didn’t really have to fool humans, merely other machines, other algorithms. Currently this is quite a low bar.

So my phrase “Bombastic Distributor” was only nearly unique, almost novel. At least for humans, it was thus a partial googlewhack. A virtual googlewhack. It appears no human but me had previously put the two words together and documented it. It had been auto-published by some page where English was at best a second language (if it had been created by humans). But as it was most likely robot written without context, it was no ordinal numbered tongue at all.

Virtually virtual, literally. It had involved no conversation at all. Especially not with Mona, Tony, or me, or any human.

It was part of some fill to support the Korean banner ads for some serious but shonky looking Korean guy. Politician?

The banner advert was gone by the time I started writing this piece about it. Documenting it days after the search, a week after the conversation. I should have got a screen grab, such things can be so fleetingly available on the web. Bookmarking a favourite is not enough. He was gone at the time I was first writing this up.

Elsewhere, I’ve said that I no longer write for humans.

Now I am not saying I am also merely a wetware salad generator.

No, no, no. I am not the one with issues here.

When I searched for “bombastic distributor”, when I typed each letter of “bombastic distributor” and hit enter, sending the query away to the search engines, it was at that moment I published my writing.

The internet has issues. Not me.

When I, or you, or anyone hits an enter key to initiate a search, it becomes another bit of data produced by a living human. This is when the phrase “bombastic distributor” was first published. And those first results were a review of the first edition.

Published forever, or as long as the internet exists, at least within the data collected and collated by the meta-manipulation of the search engines’ queries. Certainly forever compared to the conversation between Mona & Tony & me in which “bombastic distributor” occurred. Unless documented, conversations are not taken as published. There is no record. But my search had been recorded the moment I made it, if not before with auto complete and search suggestions.

The search returns might be my reward for a query, but the query itself is the piece of data used to structure further involvement by all users of the engine … and then the data is on the internet somewhere, as are the queries’ results.

A search query is a published work.

Publishing is primarily an economic activity, search engines are massively profitable activities, thus using a search query is a form of publishing. Perhaps not completely open, but real libraries are rarely as available 24/7 as is a search engine.

Serving Suggestion: Have you written an unpublished novel? Well all you have to do to get it published now is to cut and paste the entire work into a search engine query box and hit return. Your novel is now published, for as a search query it is now part of the economic order. Now go and publish all the earlier versions. No, no one human will have read it, perhaps they never would have anyway, but this is one reason why I no longer write for humans.

Since I made those searches for “Bombastic Distributor” (and remember there were only four returns a week or so ago) the count has now blossomed into six, as spam results seek to attract my attention. Somehow they have accessed my previous query and now pretend to answer it, which they don’t, but they do prove that the search query “bombastic distributor” is a published work. My non-human readership is taking notice!

bombasticdistributorsearch1

The form of books (on paper)

2014/03/30 § Leave a comment

Wink reviews one remarkable paper book every weekday. “We take photos of the covers and the interior pages of the books to show you why we love them.”
