Releases: ChatScript/ChatScript
ChatScript 14.1
- Normally :retry cannot be executed by a running script; it must
come from user input. There is, however, an input cheat that allows it.
"cheat retry xxx" given to the system will change the input to
"retry xxx" and then proceed along the debug path.
For Spanish you can say "cheat rever xxx".
13.4
- For users who have the TreeTagger add-in, the latest code supports reading a file which enumerates
supplemental foreign vocabulary. Write me for a copy of the library files. See Esoterica manual on TreeTagger.
13.3
- Support for Filipino
- Potential phone numbers are marked with these concepts:
~phonenumber, ~uslocal_phonenumber,
~usinternational, ~usinternal_phonenumber, ~ukinternal_phonenumber,
~ukinternational, ~deinternational,
~jpinternational, ~esinternational, ~mxinternational
13.2
Note: you will be forced to recompile your bot the first time you run this version.
Formats for compiled data have changed.
- ^load() removed
- JSON auto-indirects a get or assign expression of a variable whose value is itself a variable.
If $x = '$y' and $y points to a json structure, then $x is auto-indirected and $x.val means $y.val.
- Advanced replace substitution: you can name a pattern (which can extend over multiple lines)
that can conditionally change the
matched word into any other word, remove it, or do nothing.
Matching starts with _0 having been assigned to the location of the word/phrase to replace.
e.g.:
replace: bubble_tea ([
(is $$cs_replace:=2)
(has $$cs_replace:=null)
(@_0- *~2 my $$cs_replace:=1)
])
"bubble tea is" -> 2 is
"bubble tea has" -> has
"my green bubble tea loves" -> my green 1 loves
You cannot use concepts in these patterns, nor the canonical forms of words.
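To make the three cases above concrete, here is a rough Python model of how the example rule behaves on a tokenized input. It is only a sketch: the function name `apply_bubble_tea_rule`, the token-list representation, and the reading of `*~2` as "up to two intervening words" are assumptions made for illustration, not the engine's implementation.

```python
def apply_bubble_tea_rule(words):
    """Model the example replace: rule on a list of lowercase tokens (hypothetical helper)."""
    out = []
    i = 0
    while i < len(words):
        # _0 corresponds to the matched phrase "bubble tea" (two tokens).
        if words[i:i + 2] == ["bubble", "tea"]:
            nxt = words[i + 2] if i + 2 < len(words) else None
            # Looking backward from _0: "my" with at most two words in between.
            before = words[max(0, i - 3):i]
            if nxt == "is":
                out.append("2")              # $$cs_replace := 2
            elif nxt == "has":
                pass                         # $$cs_replace := null removes the phrase
            elif "my" in before:
                out.append("1")              # $$cs_replace := 1
            else:
                out.extend(words[i:i + 2])   # no pattern matched: do nothing
            i += 2
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)
```

Calling it on the three inputs listed above, split into words, reproduces the three results shown.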
13.1
- New command line parameters
a. legacymatch=0/1 (default if not given is legacymatch=1)
Redefines the match variable contents when matching a concept.
With legacymatch=0, canonical is the concept member, NOT the canonical form of the words,
and original is what the user typed (after corrections).
eg: concept: ~food ("baked potato") with input "baked potatoes"
legacymatch=1 : (_~food) => '_0 - baked_potatoes _0 - bake_potato
legacymatch=0 : (_~food) => '_0 - baked_potatoes _0 - baked_potato
If you don't have code that depends on the legacy behavior, you are better off adding legacymatch=0 to your init file.
b. filecache= allocates a buffer to cache max n files.
A server might want to cache common files across volleys and users.
eg. filecache=5000x100 means allocate a 5M buffer for caching 100 files.
c. nophrases - suppresses marking ~prep_phrase, ~verb_phrase, ~noun_phrase
this is a minor speedup if you don't have a use for them.
d. nopatterndata - disables pattern data gathering that supports ^MatchesCode
this is a minor speedup if you don't use ^MatchesCode
e. websocketbot=xxx - names the bot to be used for websocket communication
- New functions
a. ^jsonreusekill($jsonstruct)
Kills the facts of this struct and makes them immediately available for reuse as free facts
- Function changes
^pos(uppercase/capitalize xxx) no longer capitalizes every word in a phrase; it now follows title capitalization rules regarding short conjunctions, prepositions, and determiners
^pos(mixcase ... ) lowercases all letters except the first one and the first ones after a space, underscore, or hyphen
^setrejoinder - simplified use, no longer need to use "copy" in some circumstances
^pos (substitute xxx)
^stats - removed wordaccess and averagehashdepth
^readfile (aka ^jsonreadcsv) now supports using :trace all and :trace none as input lines so
you can observe the behavior of specific lines of the file. Also supported are :quit, :exit, :pause, and :resume.
Note that :trace and :quit are generally supported by any function that reads data, including :source, table compilation, and ^readfile.
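As a rough illustration of the ^pos(mixcase ...) rule in the list above, here is a small Python sketch. The function name `mixcase` is just a label for this example, and one point is an assumption: the note only says the first letter and the letters after a space, underscore, or hyphen are exempt from lowercasing, and the sketch assumes those letters are uppercased, as the name mixcase suggests.

```python
def mixcase(text):
    """Lowercase everything except the first letter and letters that follow
    a space, underscore, or hyphen; those are uppercased (assumed)."""
    result = []
    capitalize_next = True          # the very first character is exempt
    for ch in text:
        result.append(ch.upper() if capitalize_next else ch.lower())
        # The character after a separator starts a new "word".
        capitalize_next = ch in " _-"
    return "".join(result)
```

Under that assumption, an input such as "nEW yORK-city_hall" comes out as "New York-City_Hall".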
- New debug functions
:respondonly - takes the named serverlog and replicates it in tmp, but without serverpre or separator lines
:dumpcache - dumps data about files currently cached by the server
:splitfile - reads a huge file line by line, writing 20 MB pieces into TMP
:mergelines file {all} - now documented in Finalizing a bot.
It reads in lines and, for adjacent exact matches, outputs the line just once, with a count in front.
Useful for finding how often a rule triggers in a server log or for finding unique user inputs to
add to your bot.
- New marks
~verb_phrase, ~noun_phrase, ~prep_phrase enable easy grabs of sentence fragments
- Command line parameter removed:
nl_save - no longer cache nl analysis in testpattern
- New cs interchange variables
$cs_jid - number to start with when starting indexing of new json structure ids
$cs_directfromoob - if set to true, tells cs to convert any incoming oob directly into a json structure
- Additional engine hook functions
EndChat, PosTag, CheckRoles
MarkWord, SequenceMark, AdditionalMarks, BugLog,
DatabaseOperationStart (mongo), DatabaseOperationStop (mongo)
- Additional %httpresponse codes from jsonopen (see system variables manual for %httpresponse)
- Modified JSON behavior:
JSON is case sensitive in field names. CS is not. If you offer a field name in a new case and the structure already has a field
with the same name in a different case, it will use the preexisting one.
You can name fields to ignore when returning from ^jsonparse and ^jsonopen by passing in a concept set
of names. If you have given an ignore concept, then whenever a field name encountered in the incoming data
matches the name of a member of that concept set, that field and all data below it are discarded.
If you have given an underscore concept, then for any field name encountered in the incoming data
that matches a member of that concept, that field's string value will have any spaces in it
replaced with underscores.
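The effect of the ignore and underscore concepts can be pictured with a short Python sketch that works on already-parsed JSON. This is a model of the described behavior, not how the engine does it: `filter_json` and its `ignore` and `underscore` parameters are hypothetical names, the concept sets are stood in for by plain Python sets, and names are compared exactly (how the engine treats case when matching is not modeled).

```python
import json

def filter_json(value, ignore=frozenset(), underscore=frozenset()):
    """Drop ignored fields (with everything below them) and replace spaces
    with underscores in string values of the named fields."""
    if isinstance(value, dict):
        cleaned = {}
        for name, child in value.items():
            if name in ignore:
                continue  # the field and all data below it are discarded
            if name in underscore and isinstance(child, str):
                cleaned[name] = child.replace(" ", "_")
            else:
                cleaned[name] = filter_json(child, ignore, underscore)
        return cleaned
    if isinstance(value, list):
        return [filter_json(item, ignore, underscore) for item in value]
    return value

# Made-up sample data for illustration only.
raw = json.loads(
    '{"name": "bubble tea", "debug": {"trace": [1, 2]},'
    ' "items": [{"name": "green tea", "note": "keep spaces"}]}'
)
cleaned = filter_json(raw, ignore={"debug"}, underscore={"name"})
```

Here `cleaned` no longer has the "debug" field, both "name" values have underscores in place of spaces, and the "note" value is left alone.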
12.31
- Any nonjson loop limit given as a positive number will locally override whatever limits normally would
exist for a std non-json loop. eg Loop (20000) will make the limit be 20000 whereas Loop(^length(...))
would limit to whatever the current limit is, even if ^length returned a greater value.
- ^jsonopen optionally takes 2 additional arguments. The first is the name of a concept whose
members are names of fields which, if seen, should be discarded. The second is the name of a
concept whose members are names of fields which are presumed to be string values, and
all blanks in the string should be converted to underscores.
- ^jsonparse optionally takes 2 additional arguments (see ^jsonopen above).
- %httpresponse has been extended with additional negative code values specific to curl failing to connect.
- New void rule kind
v: LABEL ()
which is faster and clearer than writing s: LABEL (?).
It is used as the target of ^reuse or ^refine.
12.3
Version 12.3 October 16, 2022
WARNING- You will need to recompile your script files!
Erase TOPIC folder and then run CS and rebuild.
- Dictionary size can now be doubled, to allow 4M words instead of 2M.
This enables one to load english, french, spanish, and german all at once.
- With extra multi-language abilities, when you use the replace: command
be sure it is done under the correct language. You can do something like:
language: english
replace: "my word" mina_worda
Setting language lasts until you next change it.
- Documenting existing ^setresponse, which is merely a rename of ^reviseoutput.
It changes a queued message.
- You can synthesize a function name and then call it, eg
$_fn = ^join( ^testing _ en_us)
$_text = ^$_fn($_tmp)
Previously ^join would complain about a bad function call.
- CS has inbuilt support for spanish lemma and possible pos-tag detection, and manages words with and without their appropriate accenting.
It has concepts to represent object plurality and gender.
- CS now allows you to name 7 languages instead of 4 in cs_init files for what languages should be loaded.
In build files, you can designate what language is compiling what files, and include the language UNIVERSAL, which means a word
is visible to ALL languages and has the union of all its facts and properties from each language.
With an appropriate level 0 build file, you can load translated std cs concepts for multiple languages.
See Esoteric foreign language support LEVEL 0
- :ingestlog reads a cs log file, repeats its calls and reports differences in the results and outputs errors to tmp/ingesterr.txt
- The japanese mecab tagger and in-built nlp system will be applied to chinese if that is a named language.
Japanese requires installation of additional software and building the executable without DISCARD_JAPANESE
- ^jsonopen header value correlation-id: %s
If you have a variable $correlation_id, this value will be passed to remote call from jsonopen.
See Json manual.
- :splitlog takes a cs log file and creates tmp/log.csv, whose columns are:
input, output, whyname, botname. See ChatScript Analytics manual.
- ^incontext takes an optional 2nd arg, an integer giving how far from the volley it can be and not fail.
By default it returns how far from the volley we are; when this param is used, it returns how far or fails.
- :verifylist, :verifyrun, :verifymatch and associated #! VERIFY comments
A regression system that reads special comments in scripts, executes them always in the context
of coming from the top level initially outside of any topic. Arrival at the correct rule is a match
regardless of what the output text is. See Finalizing a Bot manual.
- :translatetop - Reads a chatscript source file and outputs a corresponding one in the language you
request, using microsoft translate. See Esoteric ChatScript Foreign languages.
- :fact - similar to :word, displays all facts with the given word or meaning or in the named fact set
- :word - now accepts an optional 2nd argument, which is the limit on the number of facts to display
- :language sets current language or if given no language returns the current language.
See also CS command line parameters language= parameter.
CS 11.6
Version 11.6 11/21/2021 THANKSGIVING SPECIAL
- Support for multiple language dictionaries at once (+ japanese)
a. command line parameter - eg - "language=english,german"
b. :language to change language in local mode during conversation
c. :word(word) lists what language dictionaries word is in
d. top level "language: german" to scriptcompiler
e. $cs_language when set will change to that language
f. ^Setlanguage(spanish) to change from script
g. API top level field "language" whose value is the language to use for ^testpattern, ^testoutput, ^compilepattern, ^compileoutput
h. Special spellchecking support exists for german, spanish, french
i. tokenization - Japanese major sentence end punctuation converts to US std
- topic flag TOPIC_SAFE documented (when topic revision is "safe" so any saved state of user in topic is not destroyed)
- :build flag echorule - displays what rule you are currently compiling (useful if you don't know where you were if you died)
- %curlversion %dbversion - version information for curl and current database connection
- %crosstalk2 and %crosstalk3 - additional crosstalk variables (used to communicate across users via server)
- :trace to scriptcompiler - see debugging manual "Tracing during compilation"
- serverlog entry shows linux pending requests queue size at time of entry into the queue when using evserver mode
- TOPRULE param for use when inside rejoinder areas (a:, b: etc) ^fail(TOPRULE) ^end(toprule) ^nofail(toprule)
- ^setposition modified to accept ^setposition ( _var1 _var2 original ) which sets range from original user input
- ^spellcheck accepts a 3rd argument, a tolerance for the number of changes that can be made or the % of changes
- ^jsonmerge optional 3rd argument
SUM - when the keys in the two objects are the same, the values are added together.
SUMIF will only do that if the key actually exists in both objects being merged.
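A small Python sketch may help picture the SUM option. It models json objects as plain dicts and combines the top-level merge described in the 11.4 notes (copy of the first object, plus fields from the second that the first lacks) with adding numeric values under shared keys. The name `merge_sum` is hypothetical, and the note does not spell out how SUMIF treats a key present in only one object, so SUMIF is not modeled here.

```python
def merge_sum(first, second):
    """Top-level merge of two dicts; numeric values under shared keys are added."""
    merged = dict(first)                      # start from a copy of the first object
    for key, value in second.items():
        if key not in merged:
            merged[key] = value               # field found only in the second object
        elif isinstance(merged[key], (int, float)) and isinstance(value, (int, float)):
            merged[key] = merged[key] + value # SUM: same key in both, values added
        # Non-numeric shared keys keep the first object's value (assumption).
    return merged
```

For example, merging {"a": 1, "b": 2} with {"b": 3, "c": 4} this way gives {"a": 1, "b": 5, "c": 4}.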
- :restart redo_boot will, at end of volley, unwind the boot layer and reexecute boot functions to repopulate the server
with new boot data visible to all bots and users.
- parselimit tells the system to not bother parsing, postagging or spellchecking inputs greater than this limit (speed optimization).
- command line params
a. stdlogging and noserverprelog removed
b. deployloggingdelay - enable serverlogging on deploy automatically for n minutes (when normally logging would be off)
c. nl_save=1 -- caches nl processing on $cs_nldata for api calls to ^testpattern to pass along to future calls. Saves processing time.
nl-save in user input will override to force nl saving (sets $bwnlsave to do the same thing)
d. file traceboot.txt to set param trace_boot dynamically on startup
- ^compilepattern/testpattern/testoutput support for script function definitions assigned to variables
see ^compilepattern "Compiling script functions"
- ^testpattern changes
a. style=tfdif in patterns for ^testpattern to match patterns in a manner to tf/dif
b. testpatterninput cheat cs info - returns variable $cs_info with when scripts and engine were compiled
c. if a ^testpattern pattern calls ^jsonopen and it fails (eg timeout or bad url), an extra field, jsonopen-status, will be returned
d. $$cs_sentencecount- the number of the current sentence being used in ^testpattern - can be used with ^restoresentence
e. ^testpattern accepts a label field on each pattern, and using ^getrule(~ label) will retrieve this
f. %testpattern-prescan in the first n patterns to ^testpattern will cause all input sentences to be run against those patterns
before running the rest of the patterns in unison against each sentence sequentially
g. %testpattern - the index of the current pattern being tested by ^testpattern
h. %originalinput - in testpattern it's the input to it, otherwise it's the non-oob input per normal
i. %testpattern-nosave in pattern will override any nlsave=1 command line parameter for this call to cs
CS 11.5
- Documented existing ^JsonKind(item) function, which returns object, array or fails, depending on what is passed to it.
- ^wordinfo(word) provides dictionary data about word properties, system flags, and substitution values.
- ^testpattern now accepts concepts named ~noun, ~verb, ~adjective and ~adverb,
allowing you to augment the dictionary remotely.
- If ^purgeboot is called to remove the contents of a json variable,
then if that variable is not a user variable, it will be allowed to be reassigned to
later, without it automatically reverting to pre-user values (which is normally what
happens to changes to bot variables).
- Command line param 'pseudoserver' tells cs that this DLL or sharedobject
version of CS is actually incorporated into a server and should require the same
authorizations as a server before allowing debug commands.
- :source now accepts a line :exit to abort executing more lines from it
And :source can accept a log file and will execute the log entry data
as appropriate (and skip non-useful lines).
- Command line parameter parselimit=n: if input is larger than n characters, disable intense spellchecking, pos-tagging, and parsing for speed
- privatecode handler: MongoGotDocument invoked if mongo successfully retrieves document
- In ^testpattern, %trace_on normally enables tracing data to be sent back that only does pattern trace,
but %trace_on all turns on full tracing.
- ESOTERIC-CHATSCRIPT/ChatScript-Foreign-languages doc updated to give info on
inbuilt support for Japanese, German, and Spanish.
CS 11.4
Customer Survey: If you are willing, please answer questions below and email your answers to gowilcox@gmail.com
- Are you using CS privately or commercially?
- How long have you been using CS?
- What human language are you using cs for?
- What additional features do you wish CS had?
- What are you using your chatbot for - (eg, ordinary conversation, help bot, faq bot, real estate bot, etc)?
- How did you hear about CS?
Version 11.4 6/6/2021
BEAR IN MIND THAT THE MAC version in BINARIES is obsolete
until someone supplies me with a new compiled Mac version.
- If $tokenControl has been set to include JSON_DIRECT_FROM_OOB,
then for a user input that contains the word "json" followed immediately by an
obvious json structure (starting with [ or { and ending with ] or }), the
tokenizer will convert all the json into a transient structure consisting of facts with
universal bot access, and all that json will instead become a single token: the JSON name
of the structure. This allows you to bypass the normal 254 token limit on an input
sentence and provide lots of data in a single sentence.
- if $cs_token has #TOKENIZE_BY_CHARACTER
good for ideographic languages (chinese, etc). Automatically sets canonical to same as original
?3. "exist" in fact
-
%trace_on and %trace_off - these can be used in a pattern of ^testpattern
or an output of ^testoutput to enable tracing data to be returned from the call
in the range between those tokens.
- command line param blockapitrace - disables any tracing in ^testpattern and ^testoutput that
might have accidentally been left in the code
- cheat cs info - if given in input to ^testpattern, it will return a new global variable
$cs_info, and any subsequent ^testoutput call will append that to its normal output.
- ^msqlinit($_params) ^msqlclose()
^msqlread($_username) ^msqlwrite($_username $_value)
These routines allow you to access a Microsoft SQL Db from script.
Init Params are the same as for using such a db for a filesystem.
- ^jsonmerge({transient/permanent} key/key-value $_arg1 $_arg2) takes two json structures and performs a top level merge.
The result is a copy of the first argument, with top level fields augmented with fields
from arg2 not found in arg1. Optional first argument is the standard one for
many Json creation functions.
- Incoming messages containing \uxxxx characters (utf16 representations)
are converted to utf8 (or regular ascii), except m-dashes and n-dashes, which are left unchanged.
- Some html & constants are converted to utf8, except underscores, since
CS often changes underscores to spaces on output.
- param noretrybackup disables cs saving prior volley data
for :retry when in standalone mode
- param traceboot - turns on tracing when cs boot starts up
- serverlog and userlog used to refer to logging done when
acting as a server or when running standalone. No more. Now they
refer to where the log is saved. Server logging is saved in the LOGS folder
in a file named by the port id (serverlog1024.txt). User logging is
saved in the USER folder in files named by user and bot involved.
- restart.txt - when added to the top level folder, will force cs to
reload itself on the next volley (and it erases the file)
- prelogging.txt - when added to the top level folder, will turn on
prelogging (message in log before cs begins processing). This is
usually only useful to show what input crashed a server and thus you
don't get a normal Respond log entry.
- PerformChatArguments engine hook function
- %forkcount - number of forks requested in linux evserver environment
- %servertype - parent or fork in linux evserver environment, server or null otherwise
- %dbparams - copy of the server params given to db used as fileserver (pg or mysql or mssql or mongo)
- %botid - bot id number in use