W3C Annotations meeting 2014-04-02

Randall Leeds:

Annotations are additions - links, block quotes, inline - microformats and microdata; out of band webmention + pingback
we want annotations to be decentralized, so they need to interoperate with a data model and component models
We are going to need syndication and discovery for annotations
we want a more read/write web - we deserve the freedom to reference, not overload URI syntax forever
how do we go from out of band to inline? How do you show a remote annotation in the text you're reading
existing event targets are DOM elements, but we want to target the data - the text of the page. Idea: event.dataTarget
how about a selection pseudo element - can we describe a selection, style it and not modify the DOM - like a:visited
we need feed discovery - Activity Streams. Also we need to publish locally and use WebMention http://indiewebcamp.com/webmention
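
A minimal sketch of the WebMention notification mentioned above, assuming the receiver's endpoint has already been discovered. A WebMention is a form-encoded POST carrying a `source` (the page that contains the annotation) and a `target` (the page being annotated). The function name and all URLs here are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention_request(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) a WebMention notification as a form-encoded POST."""
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# All three URLs are hypothetical placeholders.
request = build_webmention_request(
    endpoint="https://example.org/webmention",
    source="https://alice.example/notes/annotation-1",
    target="https://example.org/article#paragraph-3",
)
```

Passing the request to `urllib.request.urlopen` would deliver it; endpoint discovery is left out of this sketch.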

Chris Gallello:

I'm PM on Office Online, working on accessibility - we have annotations in Office and Visual Studio already
red underlines in Word and breakpoints in Visual Studio are annotations from our point of view [machine generated]
we've been thinking about how to connect screen readers to annotations - to know there is one, to jump in and out
we can use an aria.annotationtype of "comment" and point to the element with the comment with aria.annotatedby

Anna Gerber:

Annotation is used throughout the research process, and for teaching, not just as part of publications
researchers need to annotate maps, 3D spaces, protein models, sensor data streams, textual variants
scholarly annotation requirements: citable; precise; of segments/regions; many data types; within dynamic web apps
we need to migrate annotations across multiple copies of identical resources, or modified ones
stand-off annotation is needed to maintain integrity of original resources
we use the Annotea and OA models, so we have a shared generic backend, but we need protocols and APIs too
the OA model is extremely flexible, but difficult to implement as the queries are hard to write

Sean Boisen:

Logos is a digital library for biblical studies, with very rich annotation and cross referencing tools
our app is not out on the web, but a desktop app you download.
Bible has book/chapter/verse and we add word, but they are different between some bibles
we care about cross-lingual word alignment for annotations, which requires extra standardizations

Nick Stenning:

data models are simple; protocols harder, but user interactions are hard enough that it'd take us years to agree
a bookmarklet is basically a cross-site scripting attack. A standard known as content security policy breaks them
Browser extensions are not standardised - every user agent has its own extension model
Does anyone know of proposals to allow user-trusted code to run in the DOM in a standard way? Nope
we either need to spec ALL the things, or build for pluralism so there are many ways to do it

James Williamson:

As a publisher, having notes you take on Kindle not translating to iBooks is really annoying
we get complaints from readers all the time about the "1000 people marked this page" stuff in Kindle
we do pop-ups of footnotes and publication history in our online journals
we have third party annotations from reference management and social media crawling
the only way to annotate our works is with the "contact us" button that emails the publisher - can take 2 years
we have multiple different annotations at different levels of the publishing process - paper/Word/PDF
we have extensive annotations in our new journals online product, and for assignments and quizzes
what happens to annotations when content has been deleted? Or out of print? [odd concept for web]

Frederick Hirsch:

people annotate books but also movies and sound too where you have to point at timestamps
teachers comment on student assignments and students can respond inline
provenance of annotations is important - this brings in identity issues. Iterating them is also hard
I don't think we should rework the Open Annotation data model, but we need RESTful search and JSON-LD output
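
A minimal sketch of what JSON-LD output for a timestamped annotation could look like. It assumes the Open Annotation namespace (`http://www.w3.org/ns/oa#`), the W3C Content in RDF namespace for an inline text body, and a Media Fragments style `#t=start,end` suffix to point at a time range. The function name and the media URL are hypothetical, and this is one possible serialization rather than the group's agreed output.

```python
import json


def make_annotation(body_text: str, media_url: str, start_s: float, end_s: float) -> dict:
    """Describe a comment on a time range of an audio or video resource."""
    # Temporal media fragment: seconds from the start of the resource.
    target = f"{media_url}#t={start_s:g},{end_s:g}"
    return {
        "@context": {
            "oa": "http://www.w3.org/ns/oa#",
            "cnt": "http://www.w3.org/2011/content#",
        },
        "@type": "oa:Annotation",
        "oa:hasBody": {"@type": "cnt:ContentAsText", "cnt:chars": body_text},
        "oa:hasTarget": {"@id": target},
    }


# Hypothetical media URL and comment.
annotation = make_annotation("Nice cut here", "https://example.org/talk.webm", 12.5, 20)
print(json.dumps(annotation, indent=2))
```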

Anna Gerber:

we have covered the data model but not the protocol.

Timothy Cole:

for scholarly text we need individual words and phrases as anchors, even when adjacent content is updated
correction of OCR and manual transcriptions, and of automated part-of-speech tagging
we need proposed corrections to be able to be reviewed and annotated themselves
for example we annotate the scan with the original OCR and the OCR with the corrections

Eric Aubourg:

STM is a small publisher with complex texts including Arabic, Hebrew and hieroglyphics
our policy is ePub first, no DRM, one purchase for all formats
we have ePub with interactive maps of the Karnak temple that link to images of the wall paintings
referencing other works is required for scholarly publications, but page number doesn't work across editions
for epub we need something user-readable and reader-processable - people like "page 23" not ids
we need it to be independent of paper, pdf, epub that can survive reflowing, and human readable
in epub you can mark the page boundaries of the original document in the HTML but doesn't go the other way
numbered paragraphs within chapters can work for finished works - easy to quote
we need readable shortcuts, not 64-character hashes - like link shorteners
we want the target reference to be human processable

Kevin Marks:

q: isn't a quotation from the work the most robust reference across paper + edocs? 10 words are unique?

Eric Aubourg:

yes, quotation is good and robust, and human readable, but it can be a bit long

Fred Chasen:

the interactions in the various different document viewers are cumbersome and inconsistent
creating the notes content first and then anchoring to the document makes more sense
robust note authoring shouldn't block what you are annotating. It should let you re-anchor after composition
we need to account for longer notes, that may be longer than the entire text
also define print styles for CSS so that users can print out the whole thing

Kristof Csillag:

at Hypothes.is we're working on an annotation system for the web and we have proposed solutions
annotating web documents is good, but we added PDF, Epub and would like to add Scribd and Google Docs too
we define a target with generic selectors: XPath, but also text position and text content selectors
by having multiple selectors we can use fuzzy matching to find the parts we want
there is a problem with dynamic sites, we need dynamic anchoring - comments can be orphaned
knowing what is actually the target across documents can be very hard
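
A minimal sketch of anchoring with more than one selector, assuming only a text position selector and a text quote selector (the XPath selector is left out). It tries the cheap position check first, then an exact search, then a fuzzy match using the standard library's `difflib`. The function name, the selector keys and the 0.8 threshold are assumptions for illustration, not Hypothes.is's actual implementation.

```python
from difflib import SequenceMatcher


def anchor(text: str, selectors: dict, threshold: float = 0.8):
    """Return (start, end) of the target in `text`, or None if the note is orphaned."""
    quote = selectors["quote"]
    start = selectors.get("start")
    # 1. Position selector: cheap, but only valid if the document is unchanged.
    if start is not None and text[start:start + len(quote)] == quote:
        return start, start + len(quote)
    # 2. Quote selector: exact match anywhere in the document.
    found = text.find(quote)
    if found != -1:
        return found, found + len(quote)
    # 3. Fuzzy fallback: best-scoring window of the same length as the quote.
    best_score, best_start = 0.0, None
    for i in range(max(1, len(text) - len(quote) + 1)):
        score = SequenceMatcher(None, quote, text[i:i + len(quote)]).ratio()
        if score > best_score:
            best_score, best_start = score, i
    if best_start is not None and best_score >= threshold:
        return best_start, best_start + len(quote)
    return None  # orphaned: no selector could find the target


selectors = {"quote": "brown fox", "start": 10}
print(anchor("The quick brown fox jumps over the lazy dog.", selectors))
print(anchor("Yesterday the quick brown fox jumped.", selectors))
print(anchor("Completely unrelated text here.", selectors))
```

The sliding-window fallback is quadratic and only suitable for short documents; it is here to show the order in which selectors are tried, and the point at which a comment becomes orphaned.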

Kevin Marks:

if you remove annotations when the document referenced is edited, can't unfavourable ones be removed?

Kristof Csillag:

yes, but we can keep orphaned annotations and possibly keep old versions

Anna Gerber:

how do you cope with copyright issues of keeping quotations?

Kristof Csillag:

right now we don't care about copyright - we are focused on making the annotations robust

Nick Stenning:

there are difficult problems around how you display disappeared content to the user, and how to reanchor

Tantek Çelik:

the anchoring problem is important - cool URLs don't change
question the assumption of annotation providers, how about self-hosted annotation, anchor and context too

Kevin Marks:

we should standardize multiple anchor formats - link, cite, text, image, audio snippet rather than how to be fuzzy

Anna Gerber:

when thinking about anchoring, focus on the content, not just the document format

Robert Casties:

At the Max Planck Institute for the History of Science, we have historical sources in many forms
we want to weave a web of knowledge as Jürgen Renn says
when you can collect all of the annotations on a source you get a semantic network of the source
we want to annotate images in a resolution independent way, we also want to show relationships and provenance
we want more complex co-ordinates and representations [why not use image maps? http://dev.w3.org/html5/spec-preview/image-maps.html]
we could use GeoJSON to point to images - how would we add this as selectors?
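
A minimal sketch of how a GeoJSON geometry might describe an image region in a resolution independent way. It assumes coordinates are fractions of the image width and height (0.0 to 1.0) rather than the longitude/latitude GeoJSON normally carries; that reinterpretation, and both function names, are assumptions for illustration, not an agreed selector format.

```python
def image_region(x: float, y: float, width: float, height: float) -> dict:
    """Rectangle given in fractions of the image size, as a GeoJSON Polygon."""
    ring = [
        [x, y],
        [x + width, y],
        [x + width, y + height],
        [x, y + height],
        [x, y],  # GeoJSON linear rings repeat the first position to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def to_pixels(region: dict, image_width: int, image_height: int) -> list:
    """Resolve the fractional region against one concrete rendition of the image."""
    ring = region["coordinates"][0]
    return [(round(px * image_width), round(py * image_height)) for px, py in ring]


region = image_region(0.25, 0.5, 0.5, 0.25)
print(to_pixels(region, 4000, 2000))  # full-resolution scan
print(to_pixels(region, 400, 200))    # thumbnail of the same image
```

The same stored region resolves to matching areas at any resolution, which is the property asked for above.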

Raquel Alegre:

at University of Reading, I work on CHARMe which annotates climate datasets
Scientists get huge numbers of options for data from data providers, but the annotations aren't linked
A climate dataset may be a table, a time series, a map, a 3D model or an animation
climate data users research timing, specific areas of the world, and comparing datasets
climate data comes in 2D, 3D and 4D formats - layers of images at different res or sensors
we need ways to point to space and time - we have geolocation and time units for most of this

Gregg Kellogg:

how can I use an API without coding for it? Define operations on classes and properties
annotations are the results of operations acting on entities

Jason Haag:

I'm from the IEEE Learning Technology Standards Committee and we are working on storage APIs
our Tin Can API is based on activitystrea.ms - Actor, Verb, Object
the xAPI records learning experiences using activities
IEEE wants to use EPUB3 as sustainable format for technical content, with action tracking by xAPI
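
A minimal sketch of the Actor, Verb, Object shape described above, written as an xAPI-style statement. The `actor`, `verb` and `object` keys follow the xAPI statement structure; the function name, the verb IRI, the e-mail address and the EPUB location are hypothetical values chosen for illustration.

```python
import json


def make_statement(actor_name: str, actor_email: str,
                   verb_id: str, verb_label: str, object_id: str) -> dict:
    """Build an Actor / Verb / Object statement as a plain dictionary."""
    return {
        "actor": {
            "objectType": "Agent",
            "name": actor_name,
            "mbox": f"mailto:{actor_email}",
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {"objectType": "Activity", "id": object_id},
    }


# Every value below is a hypothetical placeholder.
statement = make_statement(
    actor_name="Example Learner",
    actor_email="learner@example.org",
    verb_id="http://example.org/verbs/annotated",
    verb_label="annotated",
    object_id="http://example.org/books/manual.epub#chapter-2",
)
print(json.dumps(statement, indent=2))
```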

Jake Hartnell:

I'm a science fiction writer - here's a shameless plug for my book a 23rd century romance
in 2018, web annotation will be implemented in the browser - we can refer to anything
the annotation document needs to be stored somewhere - think of it as channels
the browser queries all the channels the user subscribes to, kinda like RSS feeds, and they load in a sidebar
Annotation is a kind of advanced linking
the browser should provide a space for these attached documents to live and be viewed
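
A minimal sketch of the channel idea described above, assuming each subscribed channel is simply a list of annotations and the browser filters them by the page being viewed. The `Annotation` class, the function name, the channel names and the URLs are all hypothetical; a real browser would fetch feeds over the network rather than hold them in memory.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    target: str  # URL of the annotated page, optionally with a #fragment
    author: str
    text: str


def annotations_for(page_url: str, channels: dict) -> list:
    """Collect (channel name, annotation) pairs that target `page_url`."""
    sidebar = []
    for channel_name, feed in channels.items():
        for annotation in feed:
            # Compare without the fragment so notes on any part of the page match.
            if annotation.target.split("#", 1)[0] == page_url:
                sidebar.append((channel_name, annotation))
    return sidebar


# Hypothetical subscriptions held in memory.
channels = {
    "friends": [Annotation("https://example.org/story#p2", "ana", "Great line")],
    "fact-check": [
        Annotation("https://example.org/story", "bo", "Source needed"),
        Annotation("https://example.org/other", "bo", "Unrelated page"),
    ],
}
for channel_name, annotation in annotations_for("https://example.org/story", channels):
    print(channel_name, annotation.author, annotation.text)
```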

Gerardo Capiel:

annotation is a powerful tool for accessibility of non-textual content when authors forget to put it in
images tend to lack proper descriptions and mathematics is often done as images, not MathML
Video description has even less support
today, blind and vision impaired students get support by others annotating video and images
we need unified standards for annotation so that the efforts of people who do accessible annotation can spread

Puneet Kishor:

Copyright is a rat's nest. I'm not a lawyer, I want to avoid unleashing the rats. I work at Creative Commons
our job at Creative Commons is to keep this unfettered by the law
annotations may not start out with enough original material to be copyrightable, but could grow into cliffs notes
every annotation should carry the information with it to determine legal status [presumably cc license]
people can assert what they want in the way of attribution and commercial use with CC license
our latest version of Creative Commons, CC4, will cover database licences too. Should be stable to use now
Creative Commons don't restrict people, they enable people. That's always the goal.
you can only licence what you create, not someone else's stuff. The snippets [anchors] should be fair use
copyright attaches to original authorship fixed in a tangible medium - CC lets you disclaim

Tantek Çelik:

both APA and MLA citation styles for tweets include the entire text of the tweet - they don't mention licensing
