Tinderbox
  News:
IMPORTANT MESSAGE! This forum has now been replaced by a new forum at http://forum.eastgate.com and no further posting or member registration is allowed. The forum is still accessible via read-only access for reference purposes. If you wish to discuss content here, please use the new forum. N.B. - posting in the new forum requires a fresh registration in the new forum (sorry - member data can't be ported).
Data-transfer question, from Lotus Agenda (Read 11616 times)
J Fallows
Full Member
*
Offline



Posts: 418

Data-transfer question, from Lotus Agenda
Jun 13th, 2013, 5:57pm
 
Before there was Tinderbox, there was Zoot. I don't know if this was "before" in the corporate-history sense, but for me personally Zoot was the magic-organizer I started using in the mid 1990s long before I knew of Tinderbox. (See: http://www.theatlantic.com/past/docs/issues/97aug/zoot.htm)

And before there was Zoot there was Lotus Agenda, which I started using in the late 1980s and which, when sacrificed by the Lotus managers a few years later, came to symbolize the principle that "the beautiful die young," or something. (See: http://www.bobnewell.net/agenda/atlantic.txt.html )

I still have a lot of info preserved in Agenda files, even though on startup (in a DOS-only session, in the Windows VMware Fusion session on my Macs) the Agenda screen says "Copyright 1988, 1990." And the question is how I can export it for ingestion to Tinderbox.

Agenda does not support CSV export. It has its own idiosyncratic .STF format, for structured file, in which a typical item would be exported thus:

Quote:
Exported from C:\USERS\JAMESF~1\DROPBOX\LOGBOOK.AG, created 06/13/13 2:55pm
{STF}06/13/13;14:55:06;002
{d}1
{I}{T}Flight #77
{N}VOR tracking, unusual attitude recovery{S}
{C}\To/BFI\
{.}
{C}\From/Boeing Field;BFI\
{.}
{C}\Light;Lt/Day;D\
{.}
{C}\Aircraft/N16593\
{.}
{C}\Time#|1.1
{.}
{C}\Entry@|02%/20%/1999 15:52
{.}
{C}\PICTime#|1.1
{.}
{C}\DualInstrucTime#|1.1
{.}
{C}\SimInsTime#|0.8
{.}
{C}\T%/O#|1
{.}
{C}\Landings;Land#|1
{.}
{C}\When@|02%/20%/1999
{.}
{C}\GroundInstr#|0.5
{.}
{!}


Here's what this means:
  • The {I} marks the beginning of a new item;
  • The {T} introduces the title of the item, comparable to $Name;
  • The {N} and {S} are the beginning and end, respectively, of the text notes for the item;
  • Each {C} introduces a Category, comparable to Tbx $Attribute;
  • After the "\" comes the name of category, followed by "/" or "|" and the value for that category;
  • Various codes indicate the type of data each category contains, with @ for date categories, # for numeric, etc;
  • {.} appears to be an all-purpose "ending" marker;
  • The {.}{!} sequence marks the end of an item; the next one would begin {I}{T}
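As a rough illustration of the markers listed above, here is a minimal Python sketch that reads an .STF export into one dictionary per item. It is only a sketch based on the single item quoted in this post: the function names (`parse_stf`, `split_category`) are made up for the example, it assumes each {N}...{S} note sits on one line, and it keeps only the long form of a category name such as "Light;Lt".

```python
import re

# One item, abridged from the export quoted above.
SAMPLE = r"""Exported from C:\USERS\JAMESF~1\DROPBOX\LOGBOOK.AG, created 06/13/13 2:55pm
{STF}06/13/13;14:55:06;002
{d}1
{I}{T}Flight #77
{N}VOR tracking, unusual attitude recovery{S}
{C}\To/BFI\
{.}
{C}\Light;Lt/Day;D\
{.}
{C}\Time#|1.1
{.}
{C}\T%/O#|1
{.}
{C}\When@|02%/20%/1999
{.}
{!}
"""

# First '/' or '|' that is not part of an escaped '%/'.
SEPARATOR = re.compile(r"(?<!%)[/|]")


def split_category(body):
    """Split the text after {C} into (category name, value)."""
    body = body.strip("\\")                      # drop leading/trailing backslashes
    match = SEPARATOR.search(body)
    if match is None:                            # no value found: keep the name only
        return body.replace("%/", "/"), ""
    name, value = body[:match.start()], body[match.end():]
    name = name.rstrip("#@").split(";")[0]       # drop type marker and short aliases
    return name.replace("%/", "/"), value.replace("%/", "/")


def parse_stf(text):
    """Return a list of dicts, one per {I} item (assumes single-line notes)."""
    items, current = [], None
    for line in text.splitlines():
        if line.startswith("{I}"):
            current = {}
            items.append(current)
            line = line[3:]
        if current is None:
            continue                             # still in the export preamble
        if line.startswith("{T}"):
            current["Name"] = line[3:]
        elif line.startswith("{N}"):
            note = line[3:]
            current["Text"] = note[:-3] if note.endswith("{S}") else note
        elif line.startswith("{C}"):
            name, value = split_category(line[3:])
            current[name] = value
        # {.} and {!} lines carry no data and are skipped
    return items


items = parse_stf(SAMPLE)
print(items[0])
```

Values are left as strings here; converting the `#` and `@` categories to numbers and dates would be a separate step.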

I have a very large amount of material in this format -- in this case, they are entries from a piloting logbook. The regularity of the structure suggests that it "should" be fairly easy to set up an import routine that would take the text output of a .STF file, which could contain hundreds of notes on the pattern shown above. Would one of you Marks, A or B, point me to a section in TBRef, Help Files, or The Tinderbox Way that would give me step-by-steps in constructing an import file? Or anyone else who knows how? Thanks in advance.
« Last Edit: Jun 14th, 2013, 9:43am by J Fallows »  
Mark Bernstein
YaBB Administrator
*
Offline

designer of
Tinderbox

Posts: 2871
Eastgate Systems, Inc.
Re: Data-transfer question, from Lotus Agenda
Reply #1 - Jun 13th, 2013, 7:37pm
 
This is bound to be fun!

Are the fields in consistent order?  And are they all present in each item?  If so, we can turn this into CSV very easily.

Another route might be OPML.

How many of these files do you need to import?
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #2 - Jun 13th, 2013, 9:51pm
 
1) Good news: field/categories are in consistent order.

2) Bad news: not every item has the same fields. They are exported only when there is a value explicitly assigned to the field. For instance, if there is a field for "night flight time," that field is included only for items with a >0 value and just omitted for daytime flights. Agenda does not have a utility to convert missing values to 0 or null, so as to make sure there is a consistent sequence of fields in the exported file.

3) Scale: there is only one file where the transfer actually matters, but it has about 700 items, so I have thought that trying to automate the transfer would be worthwhile. (Can send you the STF file if interesting for research purposes.)
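The "bad news" in point 2 (fields omitted when they have no value) can be handled after parsing rather than inside Agenda. The sketch below assumes each item has already been read into a dictionary of field name to value; `collect_columns`, `to_row` and the "NightTime" field name are hypothetical names chosen for the example, and the sample values are illustrative only.

```python
def collect_columns(items):
    """Return every field name seen, in first-seen order.

    Note: first-seen order may differ from Agenda's own category order
    if an early item omits a field; pass an explicit list if order matters.
    """
    seen = {}
    for item in items:
        for key in item:
            seen.setdefault(key, None)
    return list(seen)


def to_row(item, columns, placeholder=""):
    """Build one table row, filling any missing field with a placeholder."""
    return [item.get(column, placeholder) for column in columns]


# Two illustrative items: a daytime flight with no night-time field,
# and a night flight that has one.
items = [
    {"Name": "Day flight", "Time": "1.1"},
    {"Name": "Night flight", "Time": "2.0", "NightTime": "0.7"},
]

columns = collect_columns(items)
rows = [to_row(item, columns, placeholder="0") for item in items]
print(columns)
print(rows)
```

Every row then has the same number of cells in the same order, which is what a tab-delimited import needs.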
Mark Bernstein
YaBB Administrator
*
Offline

designer of
Tinderbox

Posts: 2871
Eastgate Systems, Inc.
Re: Data-transfer question, from Lotus Agenda
Reply #3 - Jun 14th, 2013, 10:18am
 
I can see three routes here, all viable.

a) It might be possible to massage the data into a consistent table form. You'd use search-and-replace to turn field delimiters into tabs and record delimiters into returns, paste the result into a spreadsheet, and then manually move things so they line up in the correct columns. That’s brute force, but it sets an engineering baseline; one could almost certainly do one record a minute, so that’s about 12 hours of work.

Tedious, finicky work to be sure, but it's limited.  Even if our estimates are off significantly -- say it turns out to be 40 hours -- it's a useful baseline. (I bet there are a number of students, for example, who'd be very happy to have 40 hours of work this summer at, say, $50/hr.)

b) I bet this could, with relatively little trouble, be translated to OPML. Indeed, OPML was originally written for a similar product. Here, there are two approaches we could try: either replace the Agenda tags with XML-style equivalents, or simply parse the Agenda file in something like Ruby or perl, build the data structure, and then write the OPML.  

Last night, I would have bet this would prove the most effective approach. Right now, I'm not sure that method A wouldn't be faster and cheaper. Method B is a lot more interesting, of course, and gives you a tool you could use again.  I bet there are some other Agenda users out there.

c) We could build a special-purpose importer into Tinderbox that would read the Agenda files and create a new Tinderbox document. Importers are usually fairly easy to write. The timing isn't great just now, and of course people aren’t creating new Agenda files every day.  But this would not be terribly difficult -- especially since the format seems tractable and is (probably) decently documented. My impression is that people don't ask us for importers as often as they should.

I've discussed this at some length in the hope that future readers who are interested in other kinds of import will find this.
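For route (b), once the Agenda items have been parsed into a data structure, writing OPML is short work. The following is a minimal sketch in Python rather than Ruby or perl, using only the standard library. It assumes each item is a dictionary, that `text` carries the title and `_note` the note body (a common OPML convention), and that the remaining field names are already valid XML attribute names. Whether a given OPML importer maps the extra attributes to anything useful should be checked, not assumed.

```python
import xml.etree.ElementTree as ET


def items_to_opml(items, title="Agenda import"):
    """Serialize a list of item dicts as a flat OPML outline."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for item in items:
        attributes = {"text": item.get("Name", "")}
        if "Text" in item:
            attributes["_note"] = item["Text"]
        for key, value in item.items():
            if key not in ("Name", "Text"):
                # Keys must be XML-safe; a name like "T/O" has to be renamed first.
                attributes[key] = value
        ET.SubElement(body, "outline", attributes)
    return ET.tostring(opml, encoding="unicode")


# Values taken from the sample item earlier in the thread.
items = [
    {
        "Name": "Flight #77",
        "Text": "VOR tracking, unusual attitude recovery",
        "Aircraft": "N16593",
        "Time": "1.1",
    }
]
opml_text = items_to_opml(items)
print(opml_text)
```

ElementTree escapes attribute values but does not validate attribute names, so the renaming of awkward category names has to happen before this step.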
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Data-transfer question, from Lotus Agenda
Reply #4 - Jun 14th, 2013, 1:56pm
 
[sorry for late reply - been out all day] [Also, I think I'm describing method 'a' in the last post above]

Interesting. I post-date Agenda as a user but its successor  - Lotus Organizer - was my first PIM. Annoyingly this question comes just as I unplug and head off on leave for a week. Some thoughts…

I think you'll want to do a little massaging of the data in something like TextWrangler or BBEdit and that Tab-delimited text should be the way to go. OPML could be used but it needs more fiddling with mark-up.

Sidenote: I'm happy to stand corrected but I've seen no documentation that TB supports CSV import - though the term gets used interchangeably with Tab-delimited to refer to tabular text. A quick experiment with v5.12.1 shows tab-delimited text imports OK but comma-separated does not.

I'd start with a test. Make 2-3 records with every field complete and use this as your test vehicle. Go through the fields and, for each one, make an appropriately named Tinderbox attribute of the right data type (e.g. Number, Date, etc.) if not a String. Start your tests on a copy of this so you can update it for your testing and use it for the 'real' data import. Do check that the day-month order of exported dates is the same as your OS default (likely it will be).
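The day-month check can also be scripted before import. This is a small sketch, assuming the two date shapes seen in the sample export (month/day/year, with or without a 24-hour time); `parse_agenda_date` is a made-up helper name, and other exports may use other formats.

```python
from datetime import datetime

# Formats observed in the sample item: "02/20/1999 15:52" and "02/20/1999".
FORMATS = ("%m/%d/%Y %H:%M", "%m/%d/%Y")


def parse_agenda_date(value):
    """Parse a month-first date string, raising ValueError if it does not fit."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unexpected date format: {value!r}")


entry = parse_agenda_date("02/20/1999 15:52")
when = parse_agenda_date("02/20/1999")
print(entry.isoformat(), when.date().isoformat())
```

Running every exported date through a check like this flags any value that is not month-first before it reaches Tinderbox.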

Things to look out for:
  • Field names with characters not allowed in TB attribute names, e.g. 'T%/O', will want to become something like $TimeOn (Sorry, I've no idea what T/O means!).
  • % is clearly used to escape forward slashes. Once field names are correct (step above), use a regex to replace all instances of '%/' with just '/'. If there are any other escapes, use the same method - don't just delete all % as there may be valid use of it in note text.
  • Delete everything before the opening {I}
  • Replace {.}\n{!} with a line break.
  • If there are any line breaks between any {N} and the next {S}, wrap the enclosed text in straight double quotes.
  • To deal with missing fields:
    • EITHER: Insert a line with just {C} wherever there is a missing field.
    • OR: pre-populate fields pre-export with nonsense strings/numbers (e.g. @@@@@ and 99999) that can be stripped out after TB import.
  • Add a line at the start with the TB attribute names, with a tab between each, in the order they appear in the records. Do not add $ prefixes: Name      Text      To      From…etc.
    [For any multi-value source fields, ensure the per-value delimiter (i.e. separating tags/keywords) is a semi-colon.]
  • Find each sequence '{C}field_name-data_type/' and delete everything after the } up to and including the /. IOW, {C}\From/Boeing Field;BFI\ becomes {C}Boeing Field;BFI\
  • Replace all instances of '%/' with '/'
  • Delete all '\' at the end of lines.
  • Delete {S} at the end of a line.
  • Delete sequence {I}{T} from the start of a line.
  • Replace line break {N} with a tab
  • Replace each 'line break followed by {C}' with a tab
Congratulations, after all the up-front work you should have a valid tab-delimited file to import to TB. If you have BBEdit you can make a 'text factory' out of all the above steps, which are basically a set of regex find/replace operations. Test on your test data. If there are any errors, re-tune and re-test.

Then, run on the real data and import the result to TB. If you used placeholders for empty field values make agents as required to find and remove these value(s).
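The field-name tidying in the first two bullets can be done in one small function. This is only a sketch: `clean_name` and the `RENAMES` table are hypothetical, the rule that attribute names should contain only letters, digits and underscores is an assumption to check against aTbRef, and "Takeoffs" is used for T/O because that meaning is confirmed later in this thread.

```python
import re

# Hand-made renames for names that cannot be fixed mechanically (hypothetical table).
RENAMES = {"T/O": "Takeoffs"}


def clean_name(raw):
    """Turn an Agenda category name into a plain attribute-style name."""
    name = raw.replace("%/", "/")        # undo the '%/' escape
    name = name.rstrip("#@")             # drop the numeric/date type marker
    name = name.split(";")[0]            # keep the long form, drop short aliases
    name = RENAMES.get(name, name)       # apply any manual rename
    return re.sub(r"[^A-Za-z0-9_]", "", name)  # strip remaining unsafe characters


for raw in ("T%/O#", "Light;Lt", "Entry@", "Landings;Land#", "DualInstrucTime#"):
    print(raw, "->", clean_name(raw))
```

Applying the same function to every category name also produces the header line for the tab-delimited file.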

~~~~~~~~~~~
I'd say the hardest part is figuring out the above list of tasks. With a competent text (code) editing tool, implementing the process should be pretty easy, though you might need some help with some of the harder regular expressions (and this is why pre-testing is a must).

Only you can tell whether it's easy to fill blank values in Agenda or add extra markers in the exported file. If Agenda lets you query for all items with no value for field X, then I'd go the placeholder route.

I hope that helps. If it all hits your 'TL;DR' buffer, by all means contact me off list (or via Eastgate) and I'm sure I can help when I'm back on the 24th.
« Last Edit: Jun 14th, 2013, 2:00pm by Mark Anderson »  

--
Mark Anderson
TB user and Wiki Gardener
aTbRef v6
(TB consulting - email me)
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Data-transfer question, from Lotus Agenda
Reply #5 - Jun 14th, 2013, 2:18pm
 
Doing the above, your data looks like the following (tabs shown for clarity):

Code:
Name	Text	To	From	Light	Aircraft	Time	Entry	PICTime	DualInstrucTime	SimsInTime	TimeOn	Landings	When	GroundInstr
Flight #77	VOR tracking, unusual attitude recovery	BFI	Boeing Field;BFI	Day;D	N16593	1.1	02/20/1999 15:52	1.1	1.1	0.8	1	1	02/20/1999	0.5 



I tested this (above made by hand - I've not done all the regex) and had one glitch where $TimeOn and $Landings became Booleans. But, if I re-test with those attributes pre-created (as per my method above) then 1 and 0 values import correctly. Thus, the result of my quick & dirty test:

Made using this TXT file (again - the data was 'cleaned' manually and not via the actual process laid out in my last post).

I'm sure there are some more edge cases here. For instance, the 'Light;Lt' field name seems odd and points to some data structure as yet undescribed. Plus I've not tested $Text with line breaks in it or multi-value fields. But, I hope I've moved you forward a bit.
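For anyone generating the tab-delimited file from a script rather than by hand, Python's csv module will write the tabs and add double quotes around any value that contains a line break. This is a sketch only: `write_tsv` is a made-up helper, the line break in the sample text was added purely to show the quoting, and whether Tinderbox accepts a quoted multi-line cell is exactly the untested point mentioned above.

```python
import csv
import io


def write_tsv(columns, rows, stream):
    """Write a header line and data rows as tab-delimited text."""
    writer = csv.writer(
        stream,
        delimiter="\t",
        quoting=csv.QUOTE_MINIMAL,   # quote only cells that need it
        lineterminator="\n",
    )
    writer.writerow(columns)
    writer.writerows(rows)


columns = ["Name", "Text", "To"]
# The "\n" inside the text is artificial, to exercise the quoting.
rows = [["Flight #77", "VOR tracking,\nunusual attitude recovery", "BFI"]]

buffer = io.StringIO()
write_tsv(columns, rows, buffer)
tsv_text = buffer.getvalue()

# Read it back to confirm the table survives the round trip.
round_trip = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(tsv_text)
```

To write a real file, open it with `newline=""` and `encoding="utf-8"` and pass the file object as `stream`.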


Is T/O take off? As in the old joke about good pilots aiming to achieve equal numbers of take-offs and landings (in the same flight).
« Last Edit: Jun 14th, 2013, 4:31pm by Mark Anderson »  
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Data-transfer question, from Lotus Agenda
Reply #6 - Jun 14th, 2013, 5:34pm
 
I could see this would bug me if I didn't start to work it out. So, let's assume you've already done the following:
  • Exported a data set with no empty fields - i.e. put common placeholder text in any record where fields lack values (for deletion in TB when done)
  • You've tidied any bad field names like T/O and the ones like Landings;Land (latter construct isn't explained so I don't want to guess) to something TB-friendly.
Now, as tested in BBEdit v10.5.4 using the grep option in find/replace:
  • Strip the export preamble up to the first record. Find: \A[\s\S]*?{I} , replace: nothing . Note this leaves a blank line, which you use in the column-heads step below.
  • Strip title marker. Find: {T} , replace: nothing
  • Strip tail end markers. Find: {.}\n{!} , replace: nothing
  • Strip '%' escape characters. Find: %/ , replace: /
  • Replace number and date title/data boundary markers so they are the same as for strings. Find: #\||@\| , replace: /
  • Strip field names. Find: {C}.*?/ , replace: {C}
  • Strip line end backslashes. Find: \\$ , replace: nothing
  • Replace field delimiters with tabs. Find: \n{.}\n{C} , replace: \t
  • Replace text start marker with tab. Find: \n{N} , replace: \t
  • Replace text end marker with tab. Find: {S}\n{C} , replace: \t
  • Paste tab-delimited TB attribute names (column heads) into the empty line #1.
  • Save the file with a '.txt' extension as UTF-8.
  • Drag drop file onto Tinderbox.
Expectation management: I'm not a regex expert  (real experts please critique!) and I've only one record to go on, so there will be some more edge cases. I don't understand the significance of the double field names with semi-colon delimited values. If we know all of these and their TB values, a regex could be written for each to tidy them. Such regex should be run before all the above steps.
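The same find/replace sequence can be run outside BBEdit. Below is a sketch in Python that applies the patterns above in the same order, with the braces and dots escaped. Two things are additions for the example and not part of the steps listed: a final pattern that removes the {I} left at the start of second and later records, and the dropping of empty lines. The sample is an abridged copy of the one item shown in this thread, repeated twice to exercise the multi-record case.

```python
import re

PREAMBLE = r"""Exported from C:\USERS\JAMESF~1\DROPBOX\LOGBOOK.AG, created 06/13/13 2:55pm
{STF}06/13/13;14:55:06;002
{d}1
"""

ITEM = r"""{I}{T}Flight #77
{N}VOR tracking, unusual attitude recovery{S}
{C}\To/BFI\
{.}
{C}\Time#|1.1
{.}
{C}\When@|02%/20%/1999
{.}
{!}
"""

# (pattern, replacement) pairs, in the order given in the post.
STEPS = [
    (r"\A[\s\S]*?\{I\}", ""),        # strip preamble up to the first {I}
    (r"\{T\}", ""),                  # strip title marker
    (r"\{\.\}\n\{!\}", ""),          # strip tail end markers
    (r"%/", "/"),                    # strip '%' escape characters
    (r"#\||@\|", "/"),               # number/date boundary -> same as strings
    (r"\{C\}.*?/", "{C}"),           # strip field names
    (r"\\$", ""),                    # strip line-end backslashes
    (r"\n\{\.\}\n\{C\}", "\t"),      # field delimiters -> tab
    (r"\n\{N\}", "\t"),              # text start marker -> tab
    (r"\{S\}\n\{C\}", "\t"),         # text end marker -> tab
]

# Addition for this sketch: remove the {I} that starts later records.
EXTRA_STEPS = [(r"^\{I\}", "")]


def stf_to_rows(text):
    """Apply the substitutions, then split into rows of tab-separated cells."""
    for pattern, replacement in STEPS + EXTRA_STEPS:
        text = re.sub(pattern, replacement, text, flags=re.MULTILINE)
    # Addition for this sketch: skip the blank lines left between records.
    return [line.split("\t") for line in text.splitlines() if line]


rows = stf_to_rows(PREAMBLE + ITEM + ITEM)
for row in rows:
    print(row)
```

As in the BBEdit version, this assumes field names containing an escaped slash (such as T%/O) have been renamed first; otherwise the "strip field names" pattern stops at the wrong slash.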
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #7 - Jun 15th, 2013, 12:11am
 
To both Marks: Thank you! Have been off the grid all day, actually flying a little plane across the American heartland, so I will digest these tomorrow. Really appreciate it. jf
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #8 - Jun 16th, 2013, 10:46pm
 
Here is an update, again with sincere thanks to both Marks for their careful, systematic, yet also creative attempts to deal with this challenge. I've been off line (and in an airplane) most of the past 48 hours and thus have not yet tried these possible solutions.

Two relevant bits of extra data:

1) In reflecting back on the way Agenda works, I realize that there is a relatively easy way to assure that all items have a consistent set of Categories/Attributes. Without going into all the details, you can go category-by-category and:
  • filter to find all items with null/missing/no value for that category, and then
  • assign the appropriate 0/dummy value for each of those items

Then, when the file's values were exported to an Agenda "structured file," every item would have the same set of categories in the same order. Which would of course simplify all transfer steps.
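Once the dummy values are in place, it is worth confirming that the export really is uniform before running the transfer. A minimal sketch, assuming the items have been read into dictionaries that keep fields in export order; `find_inconsistent`, the one-letter names and the "NightTime" field are placeholders for the example.

```python
def find_inconsistent(items):
    """Return indexes of items whose field list differs from the first item's."""
    if not items:
        return []
    expected = list(items[0])            # field names, in export order
    return [
        index
        for index, item in enumerate(items)
        if list(item) != expected
    ]


# Placeholder items: the third one is missing a field.
items = [
    {"Name": "a", "Time": "1.1", "NightTime": "0"},
    {"Name": "b", "Time": "2.0", "NightTime": "0.7"},
    {"Name": "c", "Time": "0.5"},
]
bad = find_inconsistent(items)
print(bad)
```

An empty result means every item has the same categories in the same order, so the rows will line up under one header.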

2) I'll clarify an Agenda convention, in case it is of any use in future Tinderbox/etc design. Understandably, you ask about the confusion of category names -- what is the difference between "Light" and "Lt," or "Landings" and "Land," what does "T/O" mean, and so on. The Light/Lt difference arises from an Agenda feature that let you specify one or more short forms of a category's name. If  the full name of one category was "Landings," it would display that way when you had set the column-width to be large enough. But when you specified a narrower column, it could display a specified short version, eg "Land" or even "L."  

 To answer another question, T/O is indeed "Takeoffs." For most flights it would be 1, but for some training flights it could be a large number. (And, yes, you hope the number of Landings is the same.)

Bonus quiz: Air Force One is the only aircraft that, for reasons having nothing to do with accidents or crashes, has recorded 1 more takeoff in its history than its total landings. And the reason is .... ?

Again thanks for guidance, and after I put the Agenda file in some kind of order with step #1 above, I will put your suggestions into effect.
Ted Goranson
Full Member
*
Offline



Posts: 141
Virginia Beach VA
Re: Data-transfer question, from Lotus Agenda
Reply #9 - Jun 17th, 2013, 9:05am
 
Because any plane with POTUS is AF1.

GHW Bush jumped from a plane while POTUS.
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #10 - Jun 17th, 2013, 10:40am
 
Close, in that the explanation involved a former Republican president, though not either of the Bushes. GHWB jumped out of planes before (during WW II) and after his time in the White House, but not during.

To avoid thread-drift peril, I'll give the answer after I try out my Agenda solution.
Paul Walters
Full Member
*
Offline



Posts: 267

Re: Data-transfer question, from Lotus Agenda
Reply #11 - Jun 17th, 2013, 4:56pm
 
When Richard Nixon left DC for the last time on AF1 he was POTUS, while that flight was en route Gerald Ford was sworn in.
« Last Edit: Jun 17th, 2013, 4:56pm by Paul Walters »  
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #12 - Jun 17th, 2013, 6:41pm
 
Quote:
When Richard Nixon left DC for the last time on AF1 he was POTUS, while that flight was en route Gerald Ford was sworn in.

We have a winner!

And meanwhile, I am working on an Agenda-file transfer.
« Last Edit: Jun 17th, 2013, 6:42pm by J Fallows »  
J Fallows
Full Member
*
Offline



Posts: 418

Re: Data-transfer question, from Lotus Agenda
Reply #13 - Jun 19th, 2013, 2:11am
 
To follow up on the info-request I was making earlier, I tried the variety of approaches suggested by Mark A and Mark B, and ... it all worked!

Or at least it has worked on the small sample of notes I have prepared for import, so now I'm emboldened to do it on a whole file.

In detail:
  • I cleaned up the Agenda file by making sure that each item had an assigned value for each Category (aka $Attribute), even if that value was 0 or null. This meant that there would be a consistent set of category/value readings for all exported items;
  • I exported that to a "Structured File," and then:
  • I deleted characters TBox can't recognize, and removed something that had been a plus in Agenda but would be a minus for this process: the ability to give any category (or value) its long-version name and also several shortcut names;
  • I reduced it all to a tab-separated list of values, with a tab-separated list of category names at the beginning of it all; and
  • I dragged-and-dropped that into a Tinderbox Outline view. And, voila.

Since it turns out that there are 750+ items in my original Agenda file, it's worth going through the hassle of automating the import process, as opposed to re-entering the info. This is to close the loop and say that the steps recommended above actually work. Thanks!