@@ -24,16 +24,37 @@ \section{\module{codecs} ---
2424\begin {funcdesc }{register}{search_function}
2525Register a codec search function. Search functions are expected to
2626take one argument, the encoding name in all lower case letters, and
27- return a tuple of functions \code {(\var {encoder}, \var {decoder}, \var {stream_reader},
28- \var {stream_writer})} taking the following arguments:
27+ return a \class {CodecInfo} object having the following attributes:
28+
29+ \begin {itemize }
30+ \item \code {name} The name of the encoding;
31+ \item \code {encode} The stateless encoding function;
32+ \item \code {decode} The stateless decoding function;
33+ \item \code {incrementalencoder} An incremental encoder class or factory function;
34+ \item \code {incrementaldecoder} An incremental decoder class or factory function;
35+ \item \code {streamwriter} A stream writer class or factory function;
36+ \item \code {streamreader} A stream reader class or factory function.
37+ \end {itemize }
38+
39+ The various functions or classes take the following arguments:
2940
3041 \var {encoder} and \var {decoder}: These must be functions or methods
3142 which have the same interface as the
3243 \method {encode()}/\method {decode()} methods of Codec instances (see
3344 Codec Interface). The functions/methods are expected to work in a
3445 stateless mode.
3546
36- \var {stream_reader} and \var {stream_writer}: These have to be
47+ \var {incrementalencoder} and \var {incrementaldecoder}: These have to be
48+ factory functions providing the following interface:
49+
50+ \code {factory(\var {errors}='strict')}
51+
52+ The factory functions must return objects providing the interfaces
53+ defined by the base classes \class {IncrementalEncoder} and
54+ \class {IncrementalDecoder}, respectively. Incremental codecs can maintain
55+ state.
56+
57+ \var {streamreader} and \var {streamwriter}: These have to be
3758 factory functions providing the following interface:
3859
3960 \code {factory(\var {stream}, \var {errors}='strict')}
@@ -58,13 +79,13 @@ \section{\module{codecs} ---
5879\end {funcdesc }
5980
6081\begin {funcdesc }{lookup}{encoding}
61- Looks up a codec tuple in the Python codec registry and returns the
62- function tuple as defined above.
82+ Looks up the codec info in the Python codec registry and returns a
83+ \class {CodecInfo} object as defined above.
6384
6485Encodings are first looked up in the registry's cache. If not found,
65- the list of registered search functions is scanned. If no codecs tuple
66- is found, a \exception {LookupError} is raised. Otherwise, the codecs
67- tuple is stored in the cache and returned to the caller.
86+ the list of registered search functions is scanned. If no \class {CodecInfo}
87+ object is found, a \exception {LookupError} is raised. Otherwise, the
88+ \class {CodecInfo} object is stored in the cache and returned to the caller.
6889\end {funcdesc }
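A minimal sketch of what a lookup returns, assuming a current CPython interpreter and the built-in \code{utf-8} codec (the sample string is only illustrative). Note that in CPython's implementation the stateless functions are exposed as the attributes \code{encode} and \code{decode}:

```python
import codecs

# Look up the codec; the result is a CodecInfo object.
info = codecs.lookup("utf-8")

# The stateless functions return a (result, length_consumed) pair.
encoded, consumed = info.encode("h\u00e9llo")
decoded, _ = info.decode(encoded)

# The factories for the stateful variants live on the same object.
incremental_decoder = info.incrementaldecoder(errors="strict")

# An unknown encoding raises LookupError.
try:
    codecs.lookup("no-such-encoding")
    lookup_failed = False
except LookupError:
    lookup_failed = True
```

Because the \class{CodecInfo} object carries all the factories, callers rarely need more than one lookup per encoding.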
6990
7091To simplify access to the various codecs, the module provides these
@@ -85,6 +106,22 @@ \section{\module{codecs} ---
85106Raises a \exception {LookupError} in case the encoding cannot be found.
86107\end {funcdesc }
87108
109+ \begin {funcdesc }{getincrementalencoder}{encoding}
110+ Look up the codec for the given encoding and return its incremental encoder
111+ class or factory function.
112+
113+ Raises a \exception {LookupError} in case the encoding cannot be found or the
114+ codec doesn't support an incremental encoder.
115+ \end {funcdesc }
116+
117+ \begin {funcdesc }{getincrementaldecoder}{encoding}
118+ Look up the codec for the given encoding and return its incremental decoder
119+ class or factory function.
120+
121+ Raises a \exception {LookupError} in case the encoding cannot be found or the
122+ codec doesn't support an incremental decoder.
123+ \end {funcdesc }
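As an illustration, the following sketch decodes UTF-8 data that arrives in chunks which split a multi-byte character. It assumes a current CPython interpreter; the chunk boundaries are chosen only to show the buffering behaviour:

```python
import codecs

# Fetch the factory, then create a decoder instance from it.
decoder_factory = codecs.getincrementaldecoder("utf-8")
decoder = decoder_factory(errors="strict")

# The euro sign (3 bytes in UTF-8) is split across two chunks.
chunks = [b"\xe2\x82", b"\xac 5"]

parts = [decoder.decode(chunk) for chunk in chunks]
# Signal the end of input so that leftover bytes are reported as errors.
parts.append(decoder.decode(b"", final=True))
text = "".join(parts)
```

The first call returns an empty string because the decoder holds the incomplete byte sequence until the rest arrives.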
124+
88125\begin {funcdesc }{getreader}{encoding}
89126Look up the codec for the given encoding and return its StreamReader
90127class or factory function.
@@ -188,6 +225,18 @@ \section{\module{codecs} ---
188225an encoding error occurs.
189226\end {funcdesc }
190227
228+ \begin {funcdesc }{iterencode}{iterable, encoding\optional {, errors}}
229+ Uses an incremental encoder to iteratively encode the input provided by
230+ \var {iterable}. This function is a generator. \var {errors} (as well as
231+ any other keyword argument) is passed through to the incremental encoder.
232+ \end {funcdesc }
233+
234+ \begin {funcdesc }{iterdecode}{iterable, encoding\optional {, errors}}
235+ Uses an incremental decoder to iteratively decode the input provided by
236+ \var {iterable}. This function is a generator. \var {errors} (as well as
237+ any other keyword argument) is passed through to the incremental decoder.
238+ \end {funcdesc }
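A short sketch of both generators, assuming a current CPython interpreter; the input pieces and the split point are arbitrary choices for illustration:

```python
import codecs

# Encode an iterable of text pieces lazily; each item yields encoded bytes.
pieces = ["Gr", "\u00fc\u00dfe", "!"]
data = b"".join(codecs.iterencode(pieces, "utf-8"))

# Decode an iterable of byte chunks; the split falls inside a
# two-byte character to show that the decoder buffers across chunks.
byte_chunks = [data[:3], data[3:]]
text = "".join(codecs.iterdecode(byte_chunks, "utf-8"))
```

Both functions consume their input lazily, so they also work with generators that produce data on demand.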
239+
191240The module also provides the following constants which are useful
192241for reading and writing to platform dependent files:
193242
@@ -292,6 +341,109 @@ \subsubsection{Codec Objects \label{codec-objects}}
292341 empty object of the output object type in this situation.
293342\end {methoddesc }
294343
344+ The \class {IncrementalEncoder} and \class {IncrementalDecoder} classes provide
345+ the basic interface for incremental encoding and decoding. Encoding/decoding the
346+ input isn't done with one call to the stateless encoder/decoder function,
347+ but with multiple calls to the \method {encode}/\method {decode} method of the
348+ incremental encoder/decoder. The incremental encoder/decoder keeps track of
349+ the encoding/decoding process between method calls.
350+
351+ The joined output of calls to the \method {encode}/\method {decode} method is the
352+ same as if all the individual inputs were joined into one, and this input was
353+ encoded/decoded with the stateless encoder/decoder.
354+
355+
356+ \subsubsection {IncrementalEncoder Objects \label {incremental-encoder-objects } }
357+
358+ The \class {IncrementalEncoder} class is used for encoding an input in multiple
359+ steps. It defines the following methods which every incremental encoder must
360+ define in order to be compatible with the Python codec registry.
361+
362+ \begin {classdesc }{IncrementalEncoder}{\optional {errors}}
363+ Constructor for a \class {IncrementalEncoder} instance.
364+
365+ All incremental encoders must provide this constructor interface. They are
366+ free to add additional keyword arguments, but only the ones defined
367+ here are used by the Python codec registry.
368+
369+ The \class {IncrementalEncoder} may implement different error handling
370+ schemes by providing the \var {errors} keyword argument. These
371+ parameters are predefined:
372+
373+ \begin {itemize }
374+ \item \code {'strict'} Raise \exception {ValueError} (or a subclass);
375+ this is the default.
376+ \item \code {'ignore'} Ignore the character and continue with the next.
377+ \item \code {'replace'} Replace with a suitable replacement character.
378+ \item \code {'xmlcharrefreplace'} Replace with the appropriate XML
379+ character reference.
380+ \item \code {'backslashreplace'} Replace with backslashed escape sequences.
381+ \end {itemize }
382+
383+ The \var {errors} argument will be assigned to an attribute of the
384+ same name. Assigning to this attribute makes it possible to switch
385+ between different error handling strategies during the lifetime
386+ of the \class {IncrementalEncoder} object.
387+
388+ The set of allowed values for the \var {errors} argument can
389+ be extended with \function {register_error()}.
390+ \end {classdesc }
391+
392+ \begin {methoddesc }{encode}{object\optional {, final}}
393+ Encodes \var {object} (taking the current state of the encoder into account)
394+ and returns the resulting encoded object. If this is the last call to
395+ \method {encode}, \var {final} must be true (the default is false).
396+ \end {methoddesc }
397+
398+ \begin {methoddesc }{reset}{}
399+ Reset the encoder to the initial state.
400+ \end {methoddesc }
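The effect of encoder state and of \method{reset} can be sketched with the \code{utf-16} codec, which writes a byte order mark only at the start of its output. This assumes a current CPython interpreter; the inputs are illustrative:

```python
import codecs

encoder = codecs.getincrementalencoder("utf-16")()

# The first call emits the byte order mark; later calls do not.
first = encoder.encode("a")
second = encoder.encode("b", final=True)
joined = first + second

# After reset() the encoder behaves like a fresh instance again,
# so the byte order mark is emitted once more.
encoder.reset()
after_reset = encoder.encode("a", final=True)
```

The joined output matches what the stateless encoder produces for the concatenated input.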
401+
402+
403+ \subsubsection {IncrementalDecoder Objects \label {incremental-decoder-objects } }
404+
405+ The \class {IncrementalDecoder} class is used for decoding an input in multiple
406+ steps. It defines the following methods which every incremental decoder must
407+ define in order to be compatible with the Python codec registry.
408+
409+ \begin {classdesc }{IncrementalDecoder}{\optional {errors}}
410+ Constructor for a \class {IncrementalDecoder} instance.
411+
412+ All incremental decoders must provide this constructor interface. They are
413+ free to add additional keyword arguments, but only the ones defined
414+ here are used by the Python codec registry.
415+
416+ The \class {IncrementalDecoder} may implement different error handling
417+ schemes by providing the \var {errors} keyword argument. These
418+ parameters are predefined:
419+
420+ \begin {itemize }
421+ \item \code {'strict'} Raise \exception {ValueError} (or a subclass);
422+ this is the default.
423+ \item \code {'ignore'} Ignore the character and continue with the next.
424+ \item \code {'replace'} Replace with a suitable replacement character.
425+ \end {itemize }
426+
427+ The \var {errors} argument will be assigned to an attribute of the
428+ same name. Assigning to this attribute makes it possible to switch
429+ between different error handling strategies during the lifetime
430+ of the \class {IncrementalDecoder} object.
431+
432+ The set of allowed values for the \var {errors} argument can
433+ be extended with \function {register_error()}.
434+ \end {classdesc }
435+
436+ \begin {methoddesc }{decode}{object\optional {, final}}
437+ Decodes \var {object} (taking the current state of the decoder into account)
438+ and returns the resulting decoded object. If this is the last call to
439+ \method {decode}, \var {final} must be true (the default is false).
440+ \end {methoddesc }
441+
442+ \begin {methoddesc }{reset}{}
443+ Reset the decoder to the initial state.
444+ \end {methoddesc }
445+
446+
295447The \class {StreamWriter} and \class {StreamReader} classes provide
296448generic working interfaces which can be used to implement new
297449encodings submodules very easily. See \module {encodings.utf_8} for an