abstract model violation — NOT! #2497

Open
sydb opened this issue Oct 23, 2023 · 3 comments

@sydb
Member
sydb commented Oct 23, 2023

[The following problem noticed by WWP encoder extraordinaire Grace O’Mara.]

The following (it seems to me) is a perfectly reasonable encoding:

<lg type="poem" subtype="stanzaic">
  <head>Ode II.</head>
  <head type="sub">The Mermaid.</head>
  <epigraph>
    <quote source="b:IT00863">
      <p>When at laſt they retired to reſt, <persName>Ajut</persName> went down to the beach, where
      <lb/>finding a fiſhing-boat, ſhe entered it without heſitation, and, telling thoſe … </p>
      <p>The fate of theſe lovers gave occaſion to various fictions and conjectures.
      …
      <lb/>her lover in the deſerts of the ſea.</p>
    </quote>
    <bibl><title ref="b:IT00863">Rambler</title>, N<g ref="#sup-o"/> 187.</bibl>
  </epigraph>
  <l>Blow on, ye death-fraught whirlwinds! blow,</l>
  <l>Around the rocks, and rifted caves;</l>

But the "abstractModel-structure-p-in-l-or-lg" constraint fires on each of those <p> elements, complaining that “Lines may not contain higher-level structural elements”. That happens because those <p> elements are descendants of <lg>, but are neither descendants of <floatingText>, nor children of <figure>, nor children of <note>.[1]
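To make the failure concrete, here is a minimal Python sketch that mirrors the test quoted in note [1] and runs it against a reduced version of the encoding above. It is only an illustration, not the Schematron the TEI actually ships: the sample is un-namespaced and trimmed for brevity, and the helper names (`ancestors`, `current_test_fires`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Reduced, un-namespaced stand-in for the encoding above (an assumption made
# for brevity; real TEI lives in the TEI namespace).
SAMPLE = """
<lg type="poem" subtype="stanzaic">
  <head>Ode II.</head>
  <epigraph>
    <quote><p>(prose paragraph)</p></quote>
  </epigraph>
  <l>Blow on, ye death-fraught whirlwinds! blow,</l>
</lg>
"""

def ancestors(elem, parent_of):
    """Yield the ancestors of elem, nearest first."""
    while elem in parent_of:
        elem = parent_of[elem]
        yield elem

def current_test_fires(p, parent_of):
    """Mirror of the test in note [1]; True means the error is reported."""
    names = [a.tag for a in ancestors(p, parent_of)]
    parent = names[0] if names else None
    in_verse = "l" in names or "lg" in names
    exempt = "floatingText" in names or parent in ("figure", "note")
    return in_verse and not exempt

root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}
fired = [current_test_fires(p, parent_of) for p in root.iter("p")]
print(fired)
```

The <p> inside <quote> has an <lg> ancestor and none of the three exemptions applies, so the sketch reports it, just as the constraint does.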

Possible solutions, in my preferred order:

  1. Change the test so that it checks only <p>s that are descendants of <l>, not of <lg>.
  2. Add another clause to the test to check that the <p> is also not a descendant of <epigraph>.
  3. Add another clause to the test to check that the <p> is also not a child of <quote>.
  4. Require the <epigraph> be inside a <floatingText>.
  5. Tell Grace that finally, at the end of her stellar 5-year career encoding for the WWP, she has found something that TEI simply cannot represent properly.

I like (1) the best, because I am not sure why we are testing <lg> here. While I have a gnawing suspicion that it is at least partly my fault that we are testing <lg>, I am having trouble figuring out what element a paragraph could occur in which should be an abstract model violation when it occurs in an <lg> but not outside. That is, I am (once again) complaining that

        <lg>
          <head><app><rdg><p/></rdg></app></head>
          <l/>
        </lg>

is an abstract model violation, but

        <head><app><rdg><p/></rdg></app></head>
        <lg>
          <l/>
        </lg>

is not. Makes no sense to me.[2] Further, if we use solution (1), because we are soon to be required to provide a @context,[3] we may as well use

      <sch:rule context="tei:l//tei:p">
        <sch:assert test="ancestor::tei:floatingText | parent::tei:figure | parent::tei:note">
          Abstract model violation: Lines may not …
        </sch:assert>
      </sch:rule>

which is a whole lot simpler than what we have now.
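As a rough check of what solution (1) would and would not flag, the following Python sketch emulates a rule whose context is `tei:l//tei:p` with the assertion shown above. It is a sketch under stated assumptions: the inputs are tiny un-namespaced fragments invented for illustration, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def ancestor_names(elem, parent_of):
    """Return the tag names of elem's ancestors, nearest first."""
    names = []
    while elem in parent_of:
        elem = parent_of[elem]
        names.append(elem.tag)
    return names

def proposed_rule_fires(p, parent_of):
    """True when the simplified rule would report an error for this <p>."""
    names = ancestor_names(p, parent_of)
    if "l" not in names:
        # The rule context is tei:l//tei:p, so other <p>s are never tested.
        return False
    assertion_holds = "floatingText" in names or names[0] in ("figure", "note")
    return not assertion_holds

def check(xml_text):
    """Run the emulated rule over every <p> in a small XML fragment."""
    root = ET.fromstring(xml_text)
    parent_of = {child: parent for parent in root.iter() for child in parent}
    return [proposed_rule_fires(p, parent_of) for p in root.iter("p")]

# Hypothetical, un-namespaced fragments:
epigraph_case = check("<lg><epigraph><quote><p/></quote></epigraph><l/></lg>")
line_case = check("<lg><l><app><rdg><p/></rdg></app></l></lg>")
note_case = check("<lg><l><note><p/></note></l></lg>")
print(epigraph_case, line_case, note_case)
```

Under these assumptions the epigraph paragraph is no longer reported, a <p> smuggled into an <l> through <rdg> still is, and a <p> that is a child of <note> stays exempt.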

Notes
[1] The test, which is fired for each and every <p>, is (ancestor::tei:l or ancestor::tei:lg) and not( ancestor::tei:floatingText |parent::tei:figure |parent::tei:note ); the error message is delivered if the test is true.
[2] Well, not exactly; my usual complaint is that they should both be abstract model violations.
[3] I know we (TEI Technical Council) have pretty much decided to require an <sch:rule context="…"> in every <constraintSpec scheme="schematron">, but I could not find a ticket for that.

@ebeshero
Member
ebeshero commented Oct 24, 2023

@sydb While I am usually pleased to question the abstract model, in this case I find myself perplexed on two points:

  1. Why is it reasonable to set a presumably introductory prose epigraph inside an <lg> designated to contain stanzaic poetry? Is the epigraph somehow attached to a particular cluster of lines within a poem? Grace's encoding suggests otherwise, that an <lg> is used to contain an entire poem. I am perplexed about that decision, because surely another container element would be more versatile for such a mixture of content. But it seems a deliberate decision to use <lg> this way for introductory material, presumably containing epigraph, some introductory lines, and I imagine some nested <lg> elements once you get into the stanzas. You seem perfectly happy with this.

  2. So is it time to reconsider the abstract model itself instead of elaborately undercutting it with more and more special-case Schematron rules based on specific contexts? Someone like me may come along in another ticket wanting to put a <note> in an <lg> because the note is attached to a specific stanza and happens to go on for paragraphs.

The abstract model itself may be the problem.

@lb42
Member
lb42 commented Oct 24, 2023

I was also very puzzled by this use of lg type=poem.... Why isn't it a div or indeed a text?
As to the alleged model violation -- surely the model should know that some elements (note, quote, floatingText, possibly app) are just private little worlds on their own and not fuss?

@sydb
Member Author
sydb commented Oct 25, 2023

@ebeshero — In my current state of mind I am not sure I am qualified to answer question (1), if I ever am. But I suggest it is entirely irrelevant: the TEI Guidelines explicitly permit <epigraph> as a child of <lg>, and have done so since at least P2.

As for (2), while I think there are definitely problems with the abstract model, I do not think this is one of them. I think the abstract model says it is perfectly OK to have lg/epigraph/p, and our Schematron is just wrong. As noted above, my preferred solution would be to simplify the Schematron so it no longer checks lg//p as a possible violation, on the theory that any <p> descendant of an <lg> is legitimate unless it is illegitimate because it is also a descendant of something like a <head> or an <l> (so those are the things we need worry about).

@lb42 — Yes, I think the model should know that, and (fortunately) in the cases of <note>, <quote>, and <floatingText> I think it handles things mostly correctly. But not for <epigraph> when inside <lg>.

Note that <app> is another story, as it is the source of all this horror. The content model of <rdg> (or <lem>) allows <p>, <ab>, <lg>, or even <div>. The point of all this problematic Schematron is to warn an encoder when he has put one of those big elements inside a smaller thing (like a <head>, <p>, <l>, or even <w>) by enclosing it in a <rdg> (or <lem>). I.e., to prevent

<p>This is a
  <app>
    <rdg><div><ab>bad</ab></div></rdg>
    <rdg><div><p>terrible</p></div></rdg>
  </app>
  idea.
</p>
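The intent described above can be sketched in Python as follows. This is a deliberately simplified illustration, not the TEI's actual constraint: the `BIG` and `SMALL` sets contain only the elements named in this comment, the exemptions for <note>, <floatingText>, and <figure> are left out, the input is un-namespaced, and `find_violations` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

BIG = {"div", "p", "ab", "lg"}    # elements the comment says <rdg>/<lem> may contain
SMALL = {"head", "p", "l", "w"}   # "smaller things" named in the comment

SAMPLE = """
<p>This is a
  <app>
    <rdg><div><ab>bad</ab></div></rdg>
    <rdg><div><p>terrible</p></div></rdg>
  </app>
  idea.
</p>
"""

def find_violations(root):
    """List (small ancestor, wrapper, big child) for each suspicious nesting."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    found = []
    for wrapper in root.iter():
        if wrapper.tag not in ("rdg", "lem"):
            continue
        # Walk upward to the nearest "small" ancestor, if there is one.
        node, small = wrapper, None
        while node in parent_of and small is None:
            node = parent_of[node]
            if node.tag in SMALL:
                small = node.tag
        if small is None:
            continue
        for child in wrapper:
            if child.tag in BIG:
                found.append((small, wrapper.tag, child.tag))
    return found

violations = find_violations(ET.fromstring(SAMPLE))
print(violations)
```

Both <rdg> elements in the example sit inside a <p> and each wraps a <div>, so both are reported.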

@trishaoconnor trishaoconnor added this to the Guidelines 4.8.0 milestone Nov 20, 2023
sydb added a commit that referenced this issue May 19, 2024
 * Updated Schematron as per issue
 * Added a test to detest.xml
 * Updated expected results to match new test file