Generative Ink: Data-Driven Computational Models for Digital Ink

Emre Aksan &
Otmar Hilliges

Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Digital ink promises to combine the flexibility of pen-and-paper interaction with the versatility of digital devices. Computational models of digital ink often focus on recognizing the content with discriminative techniques such as classification, at the cost of ignoring or losing personalized style. In this chapter, we propose augmenting the digital ink framework with generative modeling to achieve a holistic understanding of the ink content. Our focus lies particularly in developing novel generative models that provide fine-grained control while preserving user style. To this end, we model the inking process and learn to create ink samples similar to those produced by users. First, we present how digital handwriting can be disentangled into style and content to implement editable digital ink, enabling content synthesis and editing. Second, we address the more complex setup of free-form sketching and propose a novel approach for modeling stroke-based data efficiently. Generative ink promises novel functionalities, leading to compelling applications that enhance the inking experience for users in an interactive and collaborative manner.

Author information

Authors and Affiliations

Department of Computer Science, ETH Zürich, Stampfenbachstrasse 48, 8092, Zürich, Switzerland
Emre Aksan & Otmar Hilliges

Authors

Emre Aksan
Otmar Hilliges

Corresponding author

Correspondence to Emre Aksan.

Editor information

Editors and Affiliations

Google Research (United States), Mountain View, CA, USA
Yang Li
Advanced Interactive Technologies Lab, ETH Zurich, Zurich, Switzerland
Otmar Hilliges

About this chapter

Cite this chapter

Aksan, E., Hilliges, O. (2021). Generative Ink: Data-Driven Computational Models for Digital Ink. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_13

DOI: https://doi.org/10.1007/978-3-030-82681-9_13
Published: 05 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82680-2
Online ISBN: 978-3-030-82681-9
eBook Packages: Computer Science, Computer Science (R0)
