Lecture: 09
Human-Computer Interaction
paper: printing and scanning
          print technology
 fonts, page description, WYSIWYG
           scanning, OCR
Printing
• image made from small dots
   • allows any character set or graphic to be printed
• critical features:
   • resolution
       • size and spacing of the dots
       • measured in dots per inch (dpi)
   • speed
       • usually measured in pages per minute
   • cost!!
Types of dot-based printers
• dot-matrix printers
    • use inked ribbon (like a typewriter)
    • line of pins that can strike the ribbon, dotting the paper.
    • typical resolution 80-120 dpi
• ink-jet and bubble-jet printers
    • tiny blobs/drops of ink sent from print head to paper
    • typically 300 dpi or better.
• laser printer
    • like photocopier: dots of electrostatic charge deposited
      on drum, which picks up toner (black powder form of ink)
      rolled onto paper which is then fixed with heat
    • typically 600 dpi or better.
Printing in the workplace
• shop tills
   • dot matrix
   • same print head used for several paper rolls
   • may also print cheques
• thermal printers
   • special heat-sensitive paper
   • paper heated by pins makes a dot
   • poor quality, but simple & low maintenance
   • used in some fax machines
Fonts
• Font – the particular style of text
                                Courier font
                               Helvetica font
                               Palatino font
                             Times Roman font
                      • §´  (special symbol)
• Size of a font measured in points (1 pt about 1/72”)
  related to its height
                                   This is ten point Helvetica
                                    This is twelve point
                                 This is fourteen point
                               This is eighteen point
                      and this is twenty-four point
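The point sizes above can be turned into physical sizes with a little arithmetic. The sketch below assumes exactly 72 points per inch (the slide says "about"); the helper name `points_to_dots` is made up for this illustration.

```python
POINTS_PER_INCH = 72  # assumption: 1 pt taken as exactly 1/72 inch


def points_to_dots(size_pt: float, dpi: int) -> float:
    """Return how many device dots a given point size spans at `dpi`."""
    return size_pt * dpi / POINTS_PER_INCH


# The five sizes shown on the slide, on a 72 dpi screen and a 600 dpi printer.
for size in (10, 12, 14, 18, 24):
    inches = size / POINTS_PER_INCH
    print(f"{size} pt = {inches:.3f} inch, "
          f"{points_to_dots(size, 72):.0f} dots at 72 dpi, "
          f"{points_to_dots(size, 600):.0f} dots at 600 dpi")
```

At 72 dpi one point maps onto one dot, which is why a 12 point font is roughly 12 pixels high on such a screen.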
Fonts (ctd)
Pitch
   • fixed-pitch – every character has the same width
        e.g. Courier
   • variable-pitched – some characters wider
        e.g. Times Roman – compare the ‘i’ and the ‘m’
Serif or Sans-serif
   • sans-serif – square-ended strokes
   (Modern style: printing, styles etc.)
        e.g. Helvetica
   • serif – with splayed ends to the strokes
   (Old style: paper, newspaper and magazine printing)
        e.g. Times Roman or Palatino
Readability of text
• lowercase
    • easy to read shape of words
• UPPERCASE
    • better for individual letters and non-words
       e.g. flight numbers: BA793 vs. ba793
• serif fonts
   • help your eye on long lines of printed text
   • but sans serif often better on screen
     Page Description Languages
    • Pages very complex
       • different fonts, bitmaps, lines, digitised photos, etc.
    • Can convert it all into a bitmap and send to the printer
          … but often huge !
    • Alternatively, use a page description language
       • a description of the page is sent to the printer
       • instructions for curves, lines, text in different styles, etc.
       • like a programming language for printing!
    • PostScript is the most common
    **PostScript is a graphics language invented by the people at Adobe Systems
    Incorporated. It is a simple stack language with a rich variety of functions.
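To get a feel for why a full-page bitmap is "often huge" compared with a page description, the sketch below compares the two for one simple page. The page size (8 x 11 inches), the 600 dpi resolution and the small PostScript program are all assumptions chosen for illustration; the helper `bitmap_bytes` is hypothetical.

```python
def bitmap_bytes(width_in: float, height_in: float, dpi: int,
                 bits_per_pixel: int) -> int:
    """Size in bytes of an uncompressed bitmap covering the whole page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# An illustrative page description: one line of text and one ruled line.
PAGE_DESCRIPTION = """%!PS
/Helvetica findfont 12 scalefont setfont
72 720 moveto (Hello, printer) show
72 700 moveto 540 700 lineto stroke
showpage
"""

bitmap = bitmap_bytes(8, 11, 600, 1)           # black-and-white, 1 bit per dot
description = len(PAGE_DESCRIPTION.encode("ascii"))

print("bitmap bytes:     ", bitmap)
print("description bytes:", description)
```

The bitmap is the same size however simple the page is, whereas the description grows only with the amount of content on the page.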
Screen and page
• WYSIWYG
    • what you see is what you get
    • aim of word processing, etc.
• but …
    • screen: 72 dpi, landscape image
    • print: 600+ dpi, portrait
• can try to make them similar
       but never quite the same
• so … need different designs, graphics etc, for screen and
  print
**A WYSIWYG (pronounced "wiz-ee-wig") editor or program
is one that allows a developer to see what the end result will
look like while the interface or document is being created.
Scanners
• Take paper and convert it into a bitmap
• Two sorts of scanner
   • flat-bed: paper placed on a glass plate, whole page
     converted into bitmap
   • hand-held: scanner passed over paper, digitising strip
     typically 3-4” wide
• Shine light at paper and note intensity of reflection
   • colour or greyscale
• Typical resolutions from 600–2400 dpi
**A bitmap (or raster graphic) is a digital image composed of a matrix of
dots. When viewed at 100%, each dot corresponds to an individual pixel
on a display.
Scanners (ctd)
Used in
    • desktop publishing for incorporating photographs and
      other images
    • document storage and retrieval systems, doing away
      with paper storage
    + special scanners for slides and photographic negatives
Optical character recognition
• OCR converts bitmap back into text
• different fonts
   • create problems for simple “template matching”
     algorithms
   • more complex systems segment text, decompose it into
     lines and arcs, and decipher characters that way
• page format
   • columns, pictures, headers and footers
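The "template matching" idea, and the trouble different fonts cause for it, can be sketched as follows. The 3x5 letter bitmaps and the function names are invented for this illustration; a real OCR system would work on much larger scanned bitmaps.

```python
# Hypothetical 3x5 templates: '1' is an inked dot, '0' is blank paper.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def mismatches(a, b):
    """Count the dots that differ between two equally sized bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognise(glyph):
    """Return the template letter with the fewest differing dots."""
    return min(TEMPLATES, key=lambda ch: mismatches(glyph, TEMPLATES[ch]))


# A 'T' with one stray dot of scanner noise is still matched correctly.
noisy_t = ("111", "010", "010", "110", "010")
print(recognise(noisy_t))

# An 'I' from a sans-serif font (a plain vertical bar) has no template of
# its own, so the nearest template wins even though it is the wrong letter.
sans_serif_i = ("010", "010", "010", "010", "010")
print(recognise(sans_serif_i))
```

The second case is matched to 'T' rather than 'I', which is the kind of font problem that pushes more complex systems towards decomposing characters into lines and arcs.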
Paper-based interaction
• paper usually regarded as output only
• can be input too – OCR, scanning, etc.
• Xerox PaperWorks
   • glyphs – small patterns of /\\//\\\
      • used to identify forms etc.
      • used with scanner and fax to control applications
     **a glyph is an elemental symbol within an agreed set of
     symbols, intended to represent a readable character for
     the purposes of writing.
more recently
   papers micro printed - like watermarks
       identify which sheet and where you are
   special ‘pen’ can read locations
       know where they are writing
       **A watermark is a faint design made in some paper during
       manufacture that is visible when held against the light and
       typically identifies the maker.
       **On web pages, a watermark is an image that has a fixed
       location and does not move along with other content. Watermarks
       are often used so that the site's logo or banner is always
       visible in the background.
        Memory
  short term and long term
speed, capacity, compression
       formats, access
Short-term Memory - RAM
• Random access memory (RAM)
   • on silicon chips
   • 100 nano-second access time
   • usually volatile (lose information if power turned off)
   • data transferred at around 100 Mbytes/sec
• Some non-volatile RAM used to store basic set-up
  information
• Typical desktop computers:
      64 to 256 Mbytes RAM
Long-term Memory - disks
• magnetic disks
   • floppy disks store around 1.4 Mbytes
   • hard disks typically 40 Gbytes to 100s of Gbytes
     access time ~10ms, transfer rate 100kbytes/s
• optical disks
   • use lasers to read and sometimes write
   • more robust than magnetic media
   • CD-ROM
      - same technology as home audio, ~ 600 Mbytes
   • DVD - for Audio Video (AV) applications, or very large files
Blurring boundaries
• PDAs
   • often use RAM for their main memory
• Flash-Memory
   • used in PDAs, cameras etc.
   • silicon based but persistent (keeps data without power)
   • plug-in USB devices for data transfer
speed and capacity
• what do the numbers mean?
• some sizes (all uncompressed) …
   • this book, text only ~ 320,000 words, 2 Mbytes
   • the Bible ~ 4.5 Mbytes
   • scanned page ~ 128 Mbytes
       • (11x8 inches, 1200 dpi, 8bit greyscale)
   • digital photo ~ 10 Mbytes
       • (2–4 mega pixels, 24 bit colour)
   • video ~ 10 Mbytes per second
       • (512x512, 12 bit colour, 25 frames per sec)
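The image and video figures above follow from pixels x bits per pixel. The sketch below reproduces them, taking 1 Mbyte as 1,000,000 bytes (an assumption; the slide does not say which convention it uses). The helper `raw_bytes` is a name made up for this example.

```python
def raw_bytes(pixels: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of an image with the given pixel count."""
    return pixels * bits_per_pixel // 8


MBYTE = 1_000_000  # assumption: decimal megabytes

scanned_page = raw_bytes((11 * 1200) * (8 * 1200), 8)   # 11x8 inch, 1200 dpi
photo_small = raw_bytes(2_000_000, 24)                   # 2 megapixels
photo_large = raw_bytes(4_000_000, 24)                   # 4 megapixels
video_per_second = raw_bytes(512 * 512, 12) * 25         # 25 frames per sec

print(f"scanned page: {scanned_page / MBYTE:.1f} Mbytes")
print(f"digital photo: {photo_small / MBYTE:.0f} to "
      f"{photo_large / MBYTE:.0f} Mbytes")
print(f"video: {video_per_second / MBYTE:.1f} Mbytes per second")
```

The results come out close to the rounded values on the slide: roughly 127 Mbytes for the page, 6 to 12 Mbytes for the photo and just under 10 Mbytes per second for the video.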
virtual memory
• Problem:
   • running lots of programs + each program large
   • not enough RAM
• Solution - Virtual memory :
   • store some programs temporarily on disk
   • makes RAM appear bigger
• But … swapping
   • program on disk needs to run again
   • copied from disk to RAM
   • slows t h i n g s   d o w n
Compression
• reduce amount of storage required
• lossless
   • recover exact text or image – e.g. GIF, ZIP
   • look for commonalities:
      • text: AAAAAAAAAABBBBBCCCCCCCC → 10A5B8C
      • video: compare successive frames and store change
• lossy
   • recover something like original – e.g. JPEG, MP3
   • exploit perception
      • JPEG: lose rapid changes and some colour
      • MP3: reduce accuracy of drowned out notes
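The lossless text example above (10A5B8C) is a form of run-length encoding. A minimal sketch follows, assuming the input text contains no digits, since digits are used for the counts. This illustrates the idea only; it is not how GIF or ZIP actually work internally.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated character with <count><character>."""
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Rebuild the exact original text from <count><character> pairs."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch            # counts may have several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


original = "A" * 10 + "B" * 5 + "C" * 8
packed = rle_encode(original)
print(packed)                           # 10A5B8C
print(rle_decode(packed) == original)   # True: nothing was lost
```

Text with few repeated runs would get longer under this scheme, which is one reason practical lossless formats look for other kinds of commonality as well.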
Storage formats - text
• ASCII - 7-bit binary code for each letter and character
• UTF-8 - variable-length encoding (8-bit units) of the Unicode character set
• RTF (rich text format)
      - text plus formatting and layout information
• SGML (Standard Generalized Markup Language)
      - documents regarded as structured objects
• XML (Extensible Markup Language)
      - simpler version of SGML for web applications
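The difference between the ASCII and UTF-8 encodings can be seen by encoding a few characters. UTF-8 uses between one and four 8-bit units per character, and characters in the ASCII range keep their single-byte ASCII code. The sample characters and the helper name `utf8_bytes` are choices made for this sketch.

```python
def utf8_bytes(text: str) -> bytes:
    """Return the UTF-8 encoding of `text`."""
    return text.encode("utf-8")


for ch in ("A", "é", "€"):
    encoded = utf8_bytes(ch)
    print(ch, "->", encoded.hex(" "), f"({len(encoded)} byte(s))")

# ASCII covers only codes 0 to 127, so accented letters cannot be encoded.
try:
    "é".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("'é' fits in ASCII:", ascii_ok)
```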
Storage formats - media
• Images:
   • many storage formats :
            (PostScript, GIF, JPEG, TIFF, PICT, etc.)
   • plus different compression techniques
            (to reduce their storage requirements)
• Audio/Video
   • again lots of formats :
             (QuickTime, MPEG, WAV, etc.)
   • compression even more important
   • also ‘streaming’ formats for network delivery
methods of access
• large information store
    • long time to search => use index
    • what you index -> what you can access
• simple index needs exact match
• forgiving systems:
    • Xerox “do what I mean” (DWIM)
    • SOUNDEX – McCloud ~ MacCleod
• access without structure …
    • free text indexing (all the words in a document)
    • needs lots of space!!
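A forgiving index of the SOUNDEX kind can be sketched as below. This follows the common American Soundex rules as an assumption (the slide does not specify a variant), and the tiny name index is invented for illustration.

```python
# Letter groups that share a Soundex digit; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex key for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue               # H and W do not separate equal codes
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code                # a vowel resets prev, allowing repeats
    return (letters[0] + "".join(digits) + "000")[:4]


# Index names by their Soundex key rather than by exact spelling.
index = {}
for name in ("McCloud", "Robert", "Smith"):
    index.setdefault(soundex(name), []).append(name)

print(soundex("McCloud"), soundex("MacCleod"))   # M243 M243
print(index.get(soundex("MacCleod"), []))        # ['McCloud']
```

Because the lookup uses the key instead of the spelling, a search for "MacCleod" finds the stored "McCloud", at the cost of occasionally matching names that merely sound similar.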