DSL experiments by korny · Pull Request #182 · rubychan/coderay · GitHub

DSL experiments #182

Open
korny wants to merge 68 commits into base: possible-speedups

Changes from 1 commit

68 commits
5e3df7f
experiment with JSON scanner
korny Oct 28, 2013
8dc6d8b
ws
korny Oct 28, 2013
41c211d
Merge branch 'master' into dsl
korny Mar 21, 2015
615ac96
add alternative JSON scanners
korny Mar 21, 2015
300ccd3
no need to modify file_extension
korny Mar 21, 2015
f4f0db4
add variant tasks like rake test:scanner:json:2
korny Mar 21, 2015
f1ea428
this seems obsolete
korny Mar 21, 2015
b01a3fb
add SKIP_UPDATE_SCANNER_SUITE switch
korny Mar 21, 2015
2499b1e
first version of RuleBasedScanner for JavaScript
korny Mar 21, 2015
647e9c0
more work on DSL scanner for JavaScript
korny Mar 22, 2015
63c9f26
finally, a version that is fast without eval!
korny Mar 24, 2015
e6753b9
cleanup .gitignore
korny Mar 24, 2015
6eaa589
use instance variable instead of class variable
korny Mar 25, 2015
5e954e2
add check_unless
korny Apr 3, 2015
0cd3e62
add DSL CSS scanner
korny Apr 3, 2015
e8bef10
move RuleBasedScanner into own file
korny Apr 3, 2015
235e01b
add push/pop state, working on C scanner
korny Apr 21, 2015
463ffa1
Merge branch 'master' into dsl
korny Dec 24, 2015
7dcbf8a
Debug encoder should count tokens for better inspection
korny Feb 12, 2016
e6d46f9
just show the array
korny Feb 12, 2016
61a9d96
scanner tweaks
korny Feb 12, 2016
c274a90
fix comment
korny Feb 12, 2016
36af5ca
add explicit pattern method; make pattern optional
korny Feb 12, 2016
40f1fa7
Push and Pop take optional group argument now
korny Feb 12, 2016
aa93af4
quick increment/decrement, yay!
korny Feb 12, 2016
3df8487
use explicit pattern method
korny Feb 12, 2016
7561e8d
warn about error tokens
korny Feb 12, 2016
a1a7b2c
add json scanner using RuleBasedScanner
korny Feb 12, 2016
4da772b
add generated Lua scanner
korny Feb 12, 2016
9f4af60
highlight generated C scanner (like the others)
korny Feb 12, 2016
aaa1705
ignore benchmark results
korny Feb 12, 2016
0d1c786
move comment to the top
korny Feb 13, 2016
dd1d779
move setup to superclass
korny Feb 13, 2016
42e2ca3
cleanup
korny Feb 13, 2016
f1bd833
use setup
korny Feb 13, 2016
ca3f15f
remove whitespace
korny Feb 13, 2016
ee9e840
cleanup
korny Feb 13, 2016
3c92a65
Merge branch 'master' into dsl
korny Feb 13, 2016
ae94f2f
Merge branch 'master' into dsl
korny Feb 13, 2016
f8cadd9
add line number to eval
korny Feb 14, 2016
13ac3fd
optional push state (return nil)
korny Feb 14, 2016
6b80e1e
remove obsolete flag, fix order of rules
korny Feb 14, 2016
dcf73a6
nicer debug output
korny Feb 14, 2016
9526bd8
generate scanner code automatically
korny Feb 14, 2016
14339bf
some more variables that are set by the scanner
korny Feb 14, 2016
90c5c91
remove templates, yay!
korny Feb 14, 2016
23e23f2
Merge branch 'master' into dsl
korny Mar 12, 2016
86b0bcf
Merge branch 'master' into dsl
korny Jun 4, 2016
a38b337
Merge branch 'possible-speedups' into dsl
korny Jun 4, 2016
545398f
update version; this will be CodeRay 2
korny Jun 4, 2016
e01ccf7
Merge branch 'possible-speedups' into dsl
korny Jun 12, 2016
26f915f
Merge branch 'possible-speedups' into dsl
korny Dec 28, 2016
548e2d0
default set(:flag) to true
korny Jan 15, 2017
7a02cdd
working towards DSL scanner
korny Apr 9, 2017
1bdaeef
starting with SimpleScanner
korny Apr 9, 2017
1f2367a
Merge branch 'master' into dsl
korny Nov 2, 2017
775255a
Merge branch 'possible-speedups' into dsl
korny Nov 2, 2017
e101dbe
normalize class names
korny Nov 2, 2017
bc580c5
Merge branch 'possible-speedups' into dsl
korny Nov 2, 2017
e69b878
sort .gitignore, add spec/example.txt
korny Nov 2, 2017
8d46c46
Merge branch 'possible-speedups' into dsl
korny Nov 2, 2017
579c00b
testing SingleStateRuleBasedScanner; not faster :(
korny Nov 5, 2017
738aae5
fix autoloading of DSL scanners
korny Nov 5, 2017
465e6c3
remove obsolete "protected"
korny Nov 5, 2017
a9e04e1
fix specs, update SimpleScannerDSL
korny Nov 5, 2017
ec89197
trying to fix tests for Ruby 2.4
korny Nov 5, 2017
3a703d2
Merge branch 'master' into dsl
korny Nov 18, 2017
434006c
testing rouge scanner
korny Nov 27, 2017

add alternative JSON scanners
korny committed Mar 21, 2015
commit 615ac9604cf9f37009fa38e4320552c8735b4386
34 changes: 16 additions & 18 deletions lib/coderay/scanners/json.rb
@@ -14,7 +14,7 @@ class JSON < Scanner

     ESCAPE = / [bfnrt\\"\/] /x # :nodoc:
     UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # :nodoc:
-    KEY = / (?> (?: [^\\"]+ | \\. )* ) " \s* : /mx
+    KEY = / (?> (?: [^\\"]+ | \\. )* ) " \s* : /x

   protected

@@ -37,41 +37,40 @@ def scan_tokens encoder, options
         when :initial
           if match = scan(/ \s+ /x)
             encoder.text_token match, :space
-          elsif match = scan(/ " (?=#{KEY}) /ox)
-            state = :key
-            encoder.begin_group :key
-            encoder.text_token match, :delimiter
-          elsif match = scan(/ " /x)
-            state = :string
-            encoder.begin_group :string
+          elsif match = scan(/"/)
+            state = check(/#{KEY}/o) ? :key : :string
+            encoder.begin_group state
             encoder.text_token match, :delimiter
           elsif match = scan(/ [:,\[{\]}] /x)
             encoder.text_token match, :operator
           elsif match = scan(/ true | false | null /x)
             encoder.text_token match, :value
-          elsif match = scan(/ -? (?: 0 | [1-9]\d* ) (?: \.\d+ (?: [eE][-+]? \d+ )? | [eE][-+]? \d+ ) /x)
-            encoder.text_token match, :float
           elsif match = scan(/ -? (?: 0 | [1-9]\d* ) /x)
-            encoder.text_token match, :integer
+            if scan(/ \.\d+ (?:[eE][-+]?\d+)? | [eE][-+]? \d+ /x)
+              match << matched
+              encoder.text_token match, :float
+            else
+              encoder.text_token match, :integer
+            end
           else
             encoder.text_token getch, :error
           end

         when :string, :key
-          if match = scan(/ [^\\"]+ /x)
+          if match = scan(/[^\\"]+/)
             encoder.text_token match, :content
-          elsif match = scan(/ " /x)
+          elsif match = scan(/"/)
             encoder.text_token match, :delimiter
             encoder.end_group state
             state = :initial
-          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /ox)
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
             encoder.text_token match, :char
-          elsif match = scan(/ \\. /mx)
+          elsif match = scan(/\\./m)
             encoder.text_token match, :content
-          elsif match = scan(/ \\ /x)
+          elsif match = scan(/ \\ | $ /x)
             encoder.end_group state
+            encoder.text_token match, :error unless match.empty?
             state = :initial
-            encoder.text_token match, :error
           else
             raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
@@ -80,7 +79,6 @@ def scan_tokens encoder, options
           raise_inspect 'Unknown state: %p' % [state], encoder

         end
-
       end

       if options[:keep_state]
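The two behavioural changes in the json.rb hunks (one `"` branch that uses `check` to pick `:key` or `:string`, and an integer scan that is extended to a float when a fraction or exponent follows) can be tried outside CodeRay with Ruby's standard `strscan` library. This is a sketch under assumptions, not code from the PR: `string_kind` and `scan_number` are hypothetical helper names, and they return plain values instead of calling an encoder.

```ruby
require 'strscan'

# Same pattern as the new KEY constant in json.rb (/x only, no /m).
KEY = / (?> (?: [^\\"]+ | \\. )* ) " \s* : /x

# Hypothetical helper: call right after the opening quote was consumed.
# check() looks ahead without moving the scan position.
def string_kind(scanner)
  scanner.check(KEY) ? :key : :string
end

# Hypothetical helper: scan the integer part first, then try to extend it
# with a fraction or exponent, as the rewritten :initial branch does.
def scan_number(scanner)
  match = scanner.scan(/ -? (?: 0 | [1-9]\d* ) /x)
  return nil unless match

  if scanner.scan(/ \.\d+ (?:[eE][-+]?\d+)? | [eE][-+]? \d+ /x)
    match << scanner.matched
    [match, :float]
  else
    [match, :integer]
  end
end

scanner = StringScanner.new('"name": 1')
scanner.scan(/"/)
p string_kind(scanner)                      # :key
p scan_number(StringScanner.new('-1.5e3'))  # ["-1.5e3", :float]
p scan_number(StringScanner.new('42'))      # ["42", :integer]
```

Scanning the integer prefix once and extending it avoids matching the same digits twice, which is presumably the motivation given that the base branch is named possible-speedups.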
100 changes: 100 additions & 0 deletions lib/coderay/scanners/json1.rb
@@ -0,0 +1,100 @@
+module CodeRay
+module Scanners
+
+  # Scanner for JSON (JavaScript Object Notation).
+  class JSON1 < Scanner
+
+    register_for :json1
+    file_extension 'json1'
+
+    KINDS_NOT_LOC = [
+      :float, :char, :content, :delimiter,
+      :error, :integer, :operator, :value,
+    ] # :nodoc:
+
+    ESCAPE = / [bfnrt\\"\/] /x # :nodoc:
+    UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # :nodoc:
+    KEY = / (?> (?: [^\\"]+ | \\. )* ) " \s* : /mx
+
+  protected
+
+    def setup
+      @state = :initial
+    end
+
+    # See http://json.org/ for a definition of the JSON lexic/grammar.
+    def scan_tokens encoder, options
+      state = options[:state] || @state
+
+      if [:string, :key].include? state
+        encoder.begin_group state
+      end
+
+      until eos?
+
+        case state
+
+        when :initial
+          if match = scan(/ \s+ /x)
+            encoder.text_token match, :space
+          elsif match = scan(/ " (?=#{KEY}) /ox)
+            state = :key
+            encoder.begin_group :key
+            encoder.text_token match, :delimiter
+          elsif match = scan(/ " /x)
+            state = :string
+            encoder.begin_group :string
+            encoder.text_token match, :delimiter
+          elsif match = scan(/ [:,\[{\]}] /x)
+            encoder.text_token match, :operator
+          elsif match = scan(/ true | false | null /x)
+            encoder.text_token match, :value
+          elsif match = scan(/ -? (?: 0 | [1-9]\d* ) (?: \.\d+ (?: [eE][-+]? \d+ )? | [eE][-+]? \d+ ) /x)
+            encoder.text_token match, :float
+          elsif match = scan(/ -? (?: 0 | [1-9]\d* ) /x)
+            encoder.text_token match, :integer
+          else
+            encoder.text_token getch, :error
+          end
+
+        when :string, :key
+          if match = scan(/ [^\\"]+ /x)
+            encoder.text_token match, :content
+          elsif match = scan(/ " /x)
+            encoder.text_token match, :delimiter
+            encoder.end_group state
+            state = :initial
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /ox)
+            encoder.text_token match, :char
+          elsif match = scan(/ \\. /mx)
+            encoder.text_token match, :content
+          elsif match = scan(/ \\ /x)
+            encoder.end_group state
+            state = :initial
+            encoder.text_token match, :error
+          else
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
+          end
+
+        else
+          raise_inspect 'Unknown state: %p' % [state], encoder
+
+        end
+
+      end
+
+      if options[:keep_state]
+        @state = state
+      end
+
+      if [:string, :key].include? state
+        encoder.end_group state
+      end
+
+      encoder
+    end
+
+  end
+
+end
+end
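json1.rb appears to preserve the pre-change layout of json.rb: one `case state` dispatch per loop iteration, with an if/elsif chain of `scan` calls inside each state. A reduced sketch of that layout, assuming plain `StringScanner` instead of CodeRay's `Scanner` and a hypothetical `tokenize_json_strings` method that returns `[text, kind]` pairs rather than driving an encoder. It only handles whitespace, punctuation, and strings.

```ruby
require 'strscan'

# Hypothetical helper, not part of CodeRay: a two-state hand-written tokenizer.
def tokenize_json_strings(source)
  scanner = StringScanner.new(source)
  state = :initial
  tokens = []

  until scanner.eos?
    case state
    when :initial
      if (match = scanner.scan(/\s+/))
        tokens << [match, :space]
      elsif (match = scanner.scan(/"/))
        state = :string
        tokens << [match, :delimiter]
      elsif (match = scanner.scan(/[:,\[{\]}]/))
        tokens << [match, :operator]
      else
        # Anything else is reported one character at a time.
        tokens << [scanner.getch, :error]
      end
    when :string
      if (match = scanner.scan(/[^\\"]+/))
        tokens << [match, :content]
      elsif (match = scanner.scan(/"/))
        tokens << [match, :delimiter]
        state = :initial
      elsif (match = scanner.scan(/\\./m))
        tokens << [match, :char]
      else
        # Only a lone backslash at the end of input can reach this branch.
        tokens << [scanner.getch, :error]
        state = :initial
      end
    end
  end

  tokens
end

p tokenize_json_strings('["a\n"]')
```

The order of the branches matters: the first pattern that matches wins, so more specific patterns have to come before more general ones.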
131 changes: 131 additions & 0 deletions lib/coderay/scanners/json2.rb
@@ -0,0 +1,131 @@
+module CodeRay
+module Scanners
+
+  class RuleBasedScanner2 < Scanner
+    class << self
+      attr_accessor :states
+
+      def state *names, &block
+        @@states ||= {}
+
+        @@rules = []
+
+        instance_eval(&block)
+
+        for name in names
+          @@states[name] = @@rules
+        end
+
+        @@rules = nil
+      end
+
+      def token pattern, *actions
+        @@rules << [pattern, *actions]
+      end
+
+      def push_group name
+        [:begin_group, name]
+      end
+
+      def pop_group
+        [:end_group]
+      end
+    end
+  end
+
+  # Scanner for JSON (JavaScript Object Notation).
+  class JSON2 < RuleBasedScanner2
+
+    register_for :json2
+    file_extension 'json2'
+
+    KINDS_NOT_LOC = [
+      :float, :char, :content, :delimiter,
+      :error, :integer, :operator, :value,
+    ] # :nodoc:
+
+    ESCAPE = / [bfnrt\\"\/] /x # :nodoc:
+    UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # :nodoc:
+    KEY = / (?> (?: [^\\"]+ | \\. )* ) " \s* : /mx
+
+    state :initial do
+      token %r/ \s+ /x, :space
+
+      token %r/ " (?=#{KEY}) /x, push_group(:key), :delimiter
+      token %r/ " /x, push_group(:string), :delimiter
+
+      token %r/ [:,\[{\]}] /x, :operator
+
+      token %r/ true | false | null /x, :value
+      token %r/ -? (?: 0 | [1-9]\d* ) (?: \.\d+ (?: [eE][-+]? \d+ )? | [eE][-+]? \d+ ) /x, :float
+      token %r/ -? (?: 0 | [1-9]\d* ) /x, :integer
+    end
+
+    state :string, :key do
+      token %r/ [^\\"]+ /x, :content
+
+      token %r/ " /x, :delimiter, pop_group
+
+      token %r/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /x, :char
+      token %r/ \\. /mx, :content
+      token %r/ \\ /x, pop_group, :error
+
+      # token %r/$/, end_group
+    end
+
+  protected
+
+    def setup
+      @state = :initial
+    end
+
+    # See http://json.org/ for a definition of the JSON lexic/grammar.
+    def scan_tokens encoder, options
+      state = options[:state] || @state
+
+      if [:string, :key].include? state
+        encoder.begin_group state
+      end
+
+      states = [state]
+
+      until eos?
+        for pattern, *actions in @@states[state]
+          if match = scan(pattern)
+            for action in actions
+              case action
+              when Symbol
+                encoder.text_token match, action
+              when Array
+                case action.first
+                when :begin_group
+                  encoder.begin_group action.last
+                  state = action.last
+                  states << state
+                when :end_group
+                  encoder.end_group states.pop
+                  state = states.last
+                end
+              end
+            end
+
+            break
+          end
+        end && encoder.text_token(getch, :error)
+      end
+
+      if options[:keep_state]
+        @state = state
+      end
+
+      if [:string, :key].include? state
+        encoder.end_group state
+      end
+
+      encoder
+    end
+
+  end
+
+end
+end
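json2.rb moves the patterns into a rule table built by the `state`/`token` class methods and interprets that table in a generic loop. Its `for ... end && encoder.text_token(getch, :error)` relies on `for` evaluating to the collection when it runs to completion and to nil after `break`, so the error token is only emitted when no rule matched. A standalone sketch of the same idea follows, with these assumptions: `TinyRuleScanner` and `TinyJSONStrings` are hypothetical names, the classes do not inherit from CodeRay's `Scanner`, tokens are collected in an array, and the table lives in a class-level instance variable instead of `@@states` (a later commit in this PR is titled "use instance variable instead of class variable").

```ruby
require 'strscan'

# Hypothetical stand-in for RuleBasedScanner2, independent of CodeRay.
class TinyRuleScanner
  class << self
    # Rule table per class, so subclasses do not share rules.
    def states
      @states ||= {}
    end

    def state(*names, &block)
      @rules = []
      instance_eval(&block)
      names.each { |name| states[name] = @rules }
      @rules = nil
    end

    def token(pattern, *actions)
      @rules << [pattern, *actions]
    end

    def push_group(name)
      [:begin_group, name]
    end

    def pop_group
      [:end_group]
    end
  end

  # Returns [text, kind] pairs plus [:begin_group, name] / [:end_group, name] markers.
  def tokenize(source)
    scanner = StringScanner.new(source)
    stack = [:initial]
    tokens = []

    until scanner.eos?
      # The first rule whose pattern matches at the current position wins.
      rule = self.class.states[stack.last].find { |pattern, *| scanner.scan(pattern) }
      unless rule
        tokens << [scanner.getch, :error]
        next
      end

      match = scanner.matched
      rule.drop(1).each do |action|
        case action
        when Symbol
          tokens << [match, action]
        when Array
          if action.first == :begin_group
            stack << action.last
            tokens << [:begin_group, action.last]
          else
            tokens << [:end_group, stack.pop]
          end
        end
      end
    end

    tokens
  end
end

class TinyJSONStrings < TinyRuleScanner
  state :initial do
    token(/\s+/, :space)
    token(/"/, push_group(:string), :delimiter)
    token(/[:,\[{\]}]/, :operator)
  end

  state :string do
    token(/[^\\"]+/, :content)
    token(/"/, :delimiter, pop_group)
    token(/\\./m, :char)
  end
end

p TinyJSONStrings.new.tokenize('["hi"]')
```

Actions run in the order they are listed, which is why the opening quote is emitted after `push_group` and the closing quote before `pop_group`: both delimiters end up inside the group.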