Tinderbox
  News:
IMPORTANT MESSAGE! This forum has now been replaced by a new forum at http://forum.eastgate.com and no further posting or member registration is allowed. The forum is still accessible via read-only access for reference purposes. If you wish to discuss content here, please use the new forum. N.B. - posting in the new forum requires a fresh registration in the new forum (sorry - member data can't be ported).
Gettysburg: a TB textual analysis experiment (Read 62238 times)
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #30 - Jul 12th, 2009, 11:27am
 
@Jean (reply #28). One thing to remember is that code like yours that sets $Prototype won't work as expected if notes can have more than one code. I don't think that's the case here, but I can imagine that situation occurring.
--
Mark Anderson
TB user and Wiki Gardener
aTbRef v6
(TB consulting - email me)
Charles Turner
Full Member
*
Offline



Posts: 180
New York, USA
Re: Gettysburg: a TB textual analysis experiment
Reply #31 - Jul 12th, 2009, 2:02pm
 
Quote:
that's a pretty powerful demonstration of the value of layering!


Um, yeah... I find most of the maps/diagrams a bit hard to understand, perhaps because they're butting up against some "7 +- 2" limit with the number of spatial, temporal and vocal codes (which number ~11).

Also, they seem subject to some Albers' "interaction of color" phenomenon where it's hard to compare the spatial greens against the differing shades of vocal blue. That's what got me to throw some shapes into the last diagram, although without much care.

(I've been using color in my outlines, not so much as a way to read a code, but as a representation of genre type across a timeline. The variations in genre "density" are readily apparent that way.)

So the layers, aside from carrying the attribution category, help to reduce the complexity to where you can actually comprehend the figure/ground relationship, which seems necessary for this type of intense analysis of a small text. (As opposed to a "law of large numbers" view of a big corpus.)

Best, Charles
Mark Bernstein
YaBB Administrator
*
Offline

designer of
Tinderbox

Posts: 2871
Eastgate Systems, Inc.
Tinderbox Weekend?
Reply #32 - Jul 12th, 2009, 3:53pm
 
Aside: should we schedule a session on textual analysis at Tinderbox Weekend SF this year?
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #33 - Jul 12th, 2009, 3:56pm
 
Hey, Jean: I agree that using the CodeLink method is more flexible than directly assigning prototypes. When I posted my analysis, I'd pretty much concluded that I should switch to using it; and I was wondering whether I could link it with prototypes automatically. Thanks for showing me definitely how.

Jean: The Repetition and Synonymy links are the beginning of a cohesion analysis, along the lines of that specified by Halliday and Hasan.

And MarkA: Yes, the technique Jean shows needs to be used as follows:
1. With a conditional checking on prototypicality.
2. With the code note itself containing prototypicality along only a single dimension, so that clashes are guaranteed not to occur.

If no-one beats me to it, my next example will include that, as I expand my analysis from ideational to interpersonal and textual metafunctions.
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #34 - Jul 12th, 2009, 3:58pm
 
MarkB

I suspect you'll have a LOT of material. :)
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #35 - Jul 12th, 2009, 4:12pm
 
MarkA

I think you corrected the template that was working to my satisfaction.

Tell me what's wrong with the following?

/TEMPLATES2/•CC
Code:
<^get($Prototype)^> ^children(/TEMPLATES2/•Clause)^ </^get($Prototype)^> 



/TEMPLATES2/•Clause
Code:
<^get($Prototype)^> ^children(/TEMPLATES2/•Leaf)^ </^get($Prototype)^> 



/TEMPLATES2/•Leaf
Code:
<^get($Prototype)^> ^title^ </^get($Prototype)^> 



This is the output I want:
Code:
< ClauseComplex> <Clause> <Circumstantial-phrase> Fourscore and seven years ago </Circumstantial-phrase><Nominal:Actor> our fathers </Nominal:Actor><Verbal-phrase> brought forth </Verbal-phrase><Circumstance:Locative> on this continent </Circumstance:Locative><Nominal:Range> a new nation </Nominal:Range> </Clause><Clause:Dependent> <Verbal-phrase> conceived </Verbal-phrase><Circumstance:Manner> in liberty </Circumstance:Manner> </Clause><Clause:Dependent> <Conjunction> and </Conjunction><Verbal-phrase> dedicated to </Verbal-phrase><Nominal:Range> the proposition </Nominal:Range> </Clause><Clause:Dependent> <Nominal:Goal> that all men </Nominal:Goal><Verbal-phrase> are created </Verbal-phrase><Circumstance:Manner> equal </Circumstance:Manner> </Clause> </ClauseComplex> 



And this is the code I get: note all the extra tag-opens and tag-closes? Where are they coming from?
Code:
< ClauseComplex> <Clause> <Circumstantial-phrase> Fourscore and seven years ago </Circumstantial-phrase><Nominal:Actor> our fathers </Nominal:Actor><Verbal-phrase> brought forth </Verbal-phrase><Circumstance:Locative> on this continent </Circumstance:Locative><Nominal:Range> a new nation </Nominal:Range> </Clause><Clause:Dependent> <Verbal-phrase> conceived </Verbal-phrase><Circumstance:Manner> in liberty </Circumstance:Manner> </Clause><Clause:Dependent> <Conjunction> and </Conjunction><Verbal-phrase> dedicated to </Verbal-phrase><Nominal:Range> the proposition </Nominal:Range> </Clause><Clause:Dependent> <Nominal:Goal> that all men </Nominal:Goal><Verbal-phrase> are created </Verbal-phrase><Circumstance:Manner> equal </Circumstance:Manner> </Clause> </ClauseComplex> < Clause> <Circumstantial-phrase>  </Clause><Nominal:Actor>  </Clause><Verbal-phrase>  </Clause><Circumstance:Locative>  </Clause><Nominal:Range>  </Clause> </ClauseComplex> < Circumstantial-phrase>  </ClauseComplex> < Nominal:Actor>  </ClauseComplex> < Verbal-phrase>  </ClauseComplex> < Circumstance:Locative>  </ClauseComplex> < Nominal:Range>  </ClauseComplex> < Clause:Dependent> <Verbal-phrase>  </Clause><Circumstance:Manner>  </Clause> </ClauseComplex> < Verbal-phrase>  </ClauseComplex> < Circumstance:Manner>  </ClauseComplex> < Clause:Dependent> <Conjunction>  </Clause><Verbal-phrase>  </Clause><Nominal:Range>  </Clause> </ClauseComplex> < Conjunction>  </ClauseComplex> < Verbal-phrase>  </ClauseComplex> < Nominal:Range>  </ClauseComplex> < Clause:Dependent> <Nominal:Goal>  </Clause><Verbal-phrase>  </Clause><Circumstance:Manner>  </Clause> </ClauseComplex> < Nominal:Goal>  </ClauseComplex> < Verbal-phrase>  </ClauseComplex> < Circumstance:Manner>  </ClauseComplex> 

« Last Edit: Jul 12th, 2009, 4:16pm by Loryn »
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #36 - Jul 12th, 2009, 4:25pm
 
Jean

When you wrote, referring to Paul's method of coding text fragments using links:

Quote:
So I think we need to add your experiment as #5 on the list of "ways to use Tinderbox for textual analysis."


Which list of four were you referring to? Were you pointing to this article: http://loryn.me/journal/2009/6/21/textual-analysis-in-four-situational-contexts.... ?

Or referring to another list somewhere else?
« Last Edit: Jul 12th, 2009, 4:26pm by Loryn »
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #37 - Jul 12th, 2009, 6:05pm
 
Loryn, your example of good versus bad output is too long to figure out easily; can you send an example based on less output? It's not currently clear what is different. I don't see the same extra parts with bullet symbols in the Nakakoji view you specify - check your template code.

Also, in agent 'constructed', if you're testing the value of $Prototype, use '==' for a test of equality.

In your templates, ditch the $ prefixes for attribute names in ^get()^ calls. It does work, but only by grace. In export codes we don't use the $ prefix; in action code we do. See aTbRef for more.
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #38 - Jul 13th, 2009, 2:02am
 
MarkA

This is shorter. Here is the text in Nakakoji outline:
Quote:
                 8.3.3.1 The world will little note nor long remember what we say here, but it can never forget what they did here.
                       8.3.3.1.1 The world will little note nor long remember what we say here
                             8.3.3.1.1.1 The world
                       8.3.3.1.2 but it can never forget what they did here
                             8.3.3.1.2.1 but


And here is the output:
Code:
< ClauseComplex> <Clause:Independent> <Nominal:Senser> The world </Nominal:Senser> </Clause:Independent><Clause:Independent> <Conjunction> but </Conjunction> </Clause:Independent> </ClauseComplex > < Clause:Independent> <Nominal:Senser>  </Nominal:Senser> </Clause:Independent > < Nominal:Senser>  </Nominal:Senser > < Clause:Independent> <Conjunction>  </Conjunction> </Clause:Independent > < Conjunction>  </Conjunction > 



The output should finish after the first </ClauseComplex> tag.

The updated templates are as follows:
/TEMPLATES2/•CC
Code:
<^get(Prototype)^> ^children(/TEMPLATES2/•Clause)^ </^get(Prototype)^> 



/TEMPLATES2/•Clause
Code:
<^get(Prototype)^> ^children(/TEMPLATES2/•Leaf)^ </^get(Prototype)^> 



/TEMPLATES2/•Leaf
Code:
<^get(Prototype)^> ^title^ </^get(Prototype)^> 



I'm using Tinderbox 4.6.2 on Mac OS X 10.5.7.
« Last Edit: Jul 13th, 2009, 2:04am by Loryn »
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #39 - Jul 13th, 2009, 4:09am
 
Mystery Nakakoji text generation problem solved!

The problem in the above-demonstrated text is due to me generating text using the Selected notes and contents option, together with a set of Nakakoji templates that specify how to traverse the notes.

The upshot of that combination was to traverse the outline tree multiple times.

The solution is to generate text using the Selected notes option only.

(And that solution came to me while I was cooking dinner: no wonder Mark Bernstein pays so much attention to his cooking!)
« Last Edit: Jul 13th, 2009, 4:10am by Loryn »
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #40 - Jul 13th, 2009, 4:49am
 
[Post edit: Ah, whilst drafting this, Loryn found the cause - the one I guessed at. I'll leave part of my post as it might help others debugging a similar problem.]

Loryn, thanks. I can't replicate the problem. However...

One thing I have tried is to add markup to your templates:

CC: Code:
[Processing top: ^get(OutlineOrder)^]
<^get(Prototype)^>
 ^children(/TEMPLATES2/•Clause)^
</^get(Prototype)^>
[End top: ^get(OutlineOrder)^] 



Clause: Code:
[Processing clause: ^get(OutlineOrder)^]
<^get(Prototype)^>
 ^children(/TEMPLATES2/•Leaf)^
</^get(Prototype)^>
[End clause: ^get(OutlineOrder)^] 



Leaf: Code:
[Processing leaf: ^get(OutlineOrder)^]
<^get(Prototype)^> ^title^ </^get(Prototype)^>
[End leaf: ^get(OutlineOrder)^] 



Using note outline order 943 (Trial 2/Gettysburg Address by Sentence/Paragraph 3/Sentence 6/But in a larger sense.....) we get output like:
Code:
[Processing top: 943]
< ClauseComplex>
 [Processing clause: 944]
<Conjunction>  

</Conjunction>
[End clause: 944][Processing clause: 945]
<Circumstantial-phrase>
  
</Circumstantial-phrase>
[End clause: 945][Processing clause: 946]
<Clause:Independent>
 [Processing leaf: 947]
<Nominal:Actor> we </Nominal:Actor>
[End leaf: 947][Processing leaf: 948]
<Verbal-phrase> cannot dedicate </Verbal-phrase>
[End leaf: 948]
</Clause:Independent>
......[snip]...... 


This should allow you to track the source of your errant extra output. OutlineOrder isn't unchangeable, but if the TBX is in a pretty static state, it's no problem to make an agent at the bottom of the outline (i.e. at the end of the current outline order) and do a query for OutlineOrder==N, where N is the outline order number you're trying to find.

Your second set of templates only investigates 3 levels: CC calls children in Clause, Clause calls children in Leaf, and Leaf doesn't look at children. In such a construct, ensure you set the Nakakoji view at the top of your intended 3-level analysis and click the radio button for 'Selected notes' - this shows only your one top-level note; all the content from other notes comes via template includes and not via the Nakakoji view listing other notes. It may be you're leaving the view on the default radio button of 'Selected note and contents'. In the latter case you've got the view walking the hierarchy and your templates doing the same thing, giving a form of double listing.
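The double listing can be reproduced outside Tinderbox with a small Python sketch. This is a toy model, not Tinderbox code: the `Note` class, the three `render_*` functions and the two scope variables are hypothetical stand-ins for the notes, the CC/Clause/Leaf templates and the two Nakakoji scope options. It assumes the view applies the top (CC) template to every note it lists.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Note:
    prototype: str                      # stands in for the note's Prototype attribute
    title: str = ""
    children: List["Note"] = field(default_factory=list)


def render_leaf(note: Note) -> str:
    # Mirrors the Leaf template: open tag, title, close tag.
    return f"<{note.prototype}> {note.title} </{note.prototype}>"


def render_clause(note: Note) -> str:
    # Mirrors the Clause template: children are rendered with the Leaf template.
    inner = "".join(render_leaf(child) for child in note.children)
    return f"<{note.prototype}> {inner} </{note.prototype}>"


def render_cc(note: Note) -> str:
    # Mirrors the CC template: children are rendered with the Clause template.
    inner = "".join(render_clause(child) for child in note.children)
    return f"<{note.prototype}> {inner} </{note.prototype}>"


def walk(note: Note) -> Iterator[Note]:
    # Pre-order walk of the outline, as assumed for 'Selected note and contents'.
    yield note
    for child in note.children:
        yield from walk(child)


# Cut-down version of the sentence from reply #38.
cc = Note("ClauseComplex", children=[
    Note("Clause:Independent", children=[Note("Nominal:Senser", "The world")]),
    Note("Clause:Independent", children=[Note("Conjunction", "but")]),
])

# 'Selected notes': the view lists one note; the templates do all the walking.
selected_only = render_cc(cc)

# 'Selected note and contents': the view walks the tree AND the templates do too.
selected_and_contents = " ".join(render_cc(n) for n in walk(cc))

print(selected_only)
print(selected_and_contents)
```

Under those assumptions, `selected_only` stops at the closing ClauseComplex tag, while `selected_and_contents` goes on to repeat the clause and leaf notes as tags with empty bodies, which has the same shape as the extra output in reply #38.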

Re Nakakoji view
  • you can effectively only have one Nakakoji view open at a time - it will refresh as you change selected note(s) in your main view.
  • the default scope on opening a Nakakoji view is 'Selected note and contents' - it doesn't remember your last choice; if you need 'Selected notes' then you'll need to change that each time you open the view. [later: this was the actual problem Loryn was having]
  • the view always defaults to using the template specified in the app or doc preferences (latter takes priority if different).
« Last Edit: Jul 13th, 2009, 5:26am by Mark Anderson »
Loryn
Full Member
*
Offline



Posts: 97

Re: Gettysburg: a TB textual analysis experiment
Reply #41 - Jul 13th, 2009, 9:15am
 
I've begun to document a synthesis of our major discoveries on http://loryn.me

I'm arranging the articles thematically. Where multiple people have contributed, I'm doing my best to attribute both the origin and refinements of ideas as they have been documented in this thread.
« Last Edit: Jul 13th, 2009, 9:16am by Loryn »
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #42 - Jul 13th, 2009, 11:21am
 
Thanks for the write-up on your blog - I'm sure that will help those looking to experiment with this area. I think it might be worth adding a tail-end comment to your section on 'cleansing' just alerting users to the facts:
  • using lots of piped commands may create more load on your system than the few characters of code might imply
  • the latter will increase as the size of the source rises - so cleanse in small chunks at a time
  • remember the '|=' as a means for ensuring rules run only once


Some more ideas - which I've no time to build on today, but which might open a few more doors.

Remember HTMLExportCommand: this post-processes HTML output via the command line (or a script called via the command line). It could add some extra flex to what gets exported. See the link for more. If one doesn't want to lose the formal analysis mark-up Loryn's shown in his recent exploration of Nakakoji view, one could use the same templates (perhaps with a bit more HTML) and use a CL or CL-script to do something like turn tags like <Clause:Independent> into <h2 class="Clause_Independent"> or such. Note that HTML/XHTML 'id' and 'class' values are safest kept to 0-9, A-Z, a-z and underscore, and that names must begin with a letter; an 'id' value must also be unique for any given page. Still, that just implies a little extra manipulation.
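As a rough illustration of that tag rewrite, here is a Python sketch of the text transformation only. The function names, the choice of `span` as the default replacement element, and the rule that only tags starting with a capital letter count as analysis codes are all assumptions made for this example. How the text reaches the script from HTMLExportCommand is not modelled here.

```python
import re

# Assumption: analysis codes start with a capital letter (ClauseComplex,
# Clause:Independent, Verbal-phrase ...), so lowercase HTML tags are left alone.
# The pattern tolerates stray spaces such as "< ClauseComplex>".
ANALYSIS_TAG = re.compile(r"<\s*(/?)\s*([A-Z][A-Za-z0-9:_-]*)\s*>")


def to_class_name(code: str) -> str:
    # Keep 0-9, A-Z, a-z and underscore; everything else becomes an underscore.
    return re.sub(r"[^0-9A-Za-z_]", "_", code)


def convert(markup: str, element: str = "span") -> str:
    # Rewrite <Code> ... </Code> as <element class="Code"> ... </element>.
    def swap(match: re.Match) -> str:
        is_closing, code = match.group(1), match.group(2)
        if is_closing:
            return f"</{element}>"
        return f'<{element} class="{to_class_name(code)}">'

    return ANALYSIS_TAG.sub(swap, markup)


sample = "<Clause:Independent> <Nominal:Actor> we </Nominal:Actor> </Clause:Independent>"
converted = convert(sample)
print(converted)
```

Passing `element="h2"` gives the form mentioned above; `span` is used as the default here only because the analysis tags nest inside one another.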

Capturing maps in HTML - this wiki page.

Combine the above two and you might be able to export analysis maps that web users could view and drill down into in terms of looking at the actual analysis notes for a given phrase/word. By way of reality check: don't expect link lines (especially labelled ones) - it's a step too far on the HTML, or rather web browser, side** at present, and don't look to do many levels of drill down. Probably we need to mature the Gettysburg model a bit so there are more analysis notes with some specimen content in order to make sense of the idea.

** If you want Windows users, or more specifically IE users, to be able to play along. IE8 doesn't (natively) support <canvas> or <svg> at present. Firefox on Windows would be OK. However, lots of people have locked-down Windows desktops at work that only allow IE - often an old version too. Apparently, the still noticeable number of IE6 users is due to corporate IT being slow to change (in fairness it's not laziness; there are factors of cost, etc. to consider).
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #43 - Jul 14th, 2009, 1:04pm
 
One other tangent: I'm making slow progress looking at a method to export Tinderbox map data in XML in a form that OmniGraffle could import and then edit. It was for a different project but might be an idea to have tacked up on the corner of the board here. I should add that I'm not pursuing this 'feature' by changes to TB, but rather looking at whether I can get an XML data format that OG can read in as usable data. Like iWork, OG data is stored as (or can be read from) XML, but the format's undocumented as it's not really seen as a public interchange format - sadly!
Mark Anderson
YaBB Administrator
*
Offline

User - not staff!

Posts: 5689
Southsea, UK
Re: Gettysburg: a TB textual analysis experiment
Reply #44 - Jul 14th, 2009, 2:03pm
 
Re Loryn's post (#41), there's an error in your web article. Part way through the article on 'cleansing', we have this:
Code:
| sed 's/…/… /g'
| sed 's/—/— /g'
| sed 's/\\.\\.\\./\\.\\.\\. /g'
| sed 's/ / /g' 


It should in fact be:
Code:
| sed 's/…/… \t/g'
| sed 's/—/— \t/g'
| sed 's/\\.\\.\\./\\.\\.\\.\t/g'
| sed 's/ / \t/g' 


...otherwise the explanation following it will be a tad confusing! I suspect somewhere in the publishing process something has translated the '\t' into tab characters which in turn don't render in a web page. If this is happening, try replacing each '\t' with '&#092;t' in your HTML.
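A minimal Python sketch of that substitution, assuming the commands are pasted into the page source as plain text. The function name `escape_backslashes` is made up for this example, and it handles backslashes only, not other characters that need escaping in HTML.

```python
import html


def escape_backslashes(source: str) -> str:
    # Replace each literal backslash with its numeric character reference so a
    # publishing step cannot interpret "\t" as a tab before the page is rendered.
    return source.replace("\\", "&#092;")


command = r"| sed 's/ / \t/g'"          # raw string: backslash followed by 't'
escaped = escape_backslashes(command)    # what would go into the HTML source
round_trip = html.unescape(escaped)      # what a browser would display

print(escaped)
print(round_trip)
```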