@@ -58,9 +58,9 @@ SELECT * FROM tab WHERE lower(col) = LOWER(?);
 The <type>citext</> data type allows you to eliminate calls
 to <function>lower</> in SQL queries, and allows a primary key to
 be case-insensitive. <type>citext</> is locale-aware, just
- like <type>text</>, which means that the comparison of upper case and
+ like <type>text</>, which means that the matching of upper case and
 lower case characters is dependent on the rules of
- the <literal>LC_CTYPE</> locale setting. Again, this behavior is
+ the database's <literal>LC_CTYPE</> setting. Again, this behavior is
 identical to the use of <function>lower</> in queries. But because it's
 done transparently by the data type, you don't have to remember to do
 anything special in your queries.
@@ -97,17 +97,25 @@ SELECT * FROM users WHERE nick = 'Larry';

 <sect2>
 <title>String Comparison Behavior</title>
+
+ <para>
+ <type>citext</> performs comparisons by converting each string to lower
+ case (as though <function>lower</> were called) and then comparing the
+ results normally. Thus, for example, two strings are considered equal
+ if <function>lower</> would produce identical results for them.
+ </para>
+
 <para>
 In order to emulate a case-insensitive collation as closely as possible,
- there are <type>citext</>-specific versions of a number of the comparison
+ there are <type>citext</>-specific versions of a number of string-processing
 operators and functions. So, for example, the regular expression
 operators <literal>~</> and <literal>~*</> exhibit the same behavior when
- applied to <type>citext</>: they both compare case-insensitively.
+ applied to <type>citext</>: they both match case-insensitively.
 The same is true
 for <literal>!~</> and <literal>!~*</>, as well as for the
 <literal>LIKE</> operators <literal>~~</> and <literal>~~*</>, and
 <literal>!~~</> and <literal>!~~*</>. If you'd like to match
- case-sensitively, you can always cast to <type>text</> before comparing.
+ case-sensitively, you can cast the operator's arguments to <type>text</>.
 </para>

 <para>
@@ -168,10 +176,10 @@ SELECT * FROM users WHERE nick = 'Larry';
 <itemizedlist>
 <listitem>
 <para>
- <type>citext</>'s behavior depends on
+ <type>citext</>'s case-folding behavior depends on
 the <literal>LC_CTYPE</> setting of your database. How it compares
- values is therefore determined when
- <application>initdb</> is run to create the cluster. It is not truly
+ values is therefore determined when the database is created.
+ It is not truly
 case-insensitive in the terms defined by the Unicode standard.
 Effectively, what this means is that, as long as you're happy with your
 collation, you should be happy with <type>citext</>'s comparisons. But
@@ -181,6 +189,20 @@ SELECT * FROM users WHERE nick = 'Larry';
 </para>
 </listitem>

+ <listitem>
+ <para>
+ As of <productname>PostgreSQL</> 9.1, you can attach a
+ <literal>COLLATE</> specification to <type>citext</> columns or data
+ values. Currently, <type>citext</> operators will honor a non-default
+ <literal>COLLATE</> specification while comparing case-folded strings,
+ but the initial folding to lower case is always done according to the
+ database's <literal>LC_CTYPE</> setting (that is, as though
+ <literal>COLLATE "default"</> were given). This may be changed in a
+ future release so that both steps follow the input <literal>COLLATE</>
+ specification.
+ </para>
+ </listitem>
+
 <listitem>
 <para>
 <type>citext</> is not as efficient as <type>text</> because the
@@ -198,20 +220,20 @@ SELECT * FROM users WHERE nick = 'Larry';
 contexts. The standard answer is to use the <type>text</> type and
 manually use the <function>lower</> function when you need to compare
 case-insensitively; this works all right if case-insensitive comparison
- is needed only infrequently. If you need case-insensitive most of
- the time and case-sensitive infrequently, consider storing the data
+ is needed only infrequently. If you need case-insensitive behavior most
+ of the time and case-sensitive infrequently, consider storing the data
 as <type>citext</> and explicitly casting the column to <type>text</>
- when you want case-sensitive comparison. In either situation, you
- will need two indexes if you want both types of searches to be fast.
+ when you want case-sensitive comparison. In either situation, you will
+ need two indexes if you want both types of searches to be fast.
 </para>
 </listitem>

 <listitem>
 <para>
 The schema containing the <type>citext</> operators must be
 in the current <varname>search_path</> (typically <literal>public</>);
- if it is not, a normal case-sensitive <type>text</> comparison
- is performed.
+ if it is not, the normal case-sensitive <type>text</> operators
+ will be invoked instead.
 </para>
 </listitem>
 </itemizedlist>
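
Note on the paragraph added under "String Comparison Behavior": the rule it states (fold both strings as though <function>lower</> were called, then compare the results normally) can be modeled in a few lines. The sketch below is only a toy model, not citext's implementation. The function names are hypothetical, and Python's `str.lower()` stands in for PostgreSQL's `lower()`, which in the real type follows the database's `LC_CTYPE` setting.

```python
# Toy model of the documented comparison rule; NOT citext's actual code.
# Assumption: str.lower() approximates PostgreSQL's lower(). The real
# function folds case according to the database's LC_CTYPE setting.


def citext_eq(a: str, b: str) -> bool:
    """Equal when the lower-cased forms of both strings are identical."""
    return a.lower() == b.lower()


def citext_cmp(a: str, b: str) -> int:
    """Order by the lower-cased forms; returns -1, 0, or 1."""
    fa, fb = a.lower(), b.lower()
    return (fa > fb) - (fa < fb)


def text_eq(a: str, b: str) -> bool:
    """Plain case-sensitive comparison, as after casting both sides to text."""
    return a == b


if __name__ == "__main__":
    print(citext_eq("Larry", "LARRY"))    # True
    print(text_eq("Larry", "LARRY"))      # False
    print(citext_cmp("apple", "Banana"))  # -1
```

The model also shows why the patch separates the two steps in the `COLLATE` item: folding and comparing are distinct operations, so each could in principle follow a different locale rule.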