Factor source code-related facilities into a new package by robrix · Pull Request #269 · github/semantic · GitHub
This repository was archived by the owner on Apr 1, 2025. It is now read-only.

Factor source code-related facilities into a new package #269

Merged
merged 49 commits into from
Sep 20, 2019
974e2ca
Define a semantic-source package.
robrix Sep 20, 2019
a66f459
Merge branch 'master' into semantic-source
robrix Sep 20, 2019
5802d46
Move the ToJSONFields instance for Range into Data.JSON.Fields.
robrix Sep 20, 2019
10e4bbb
Move the ToJSONFields instance for Span into Data.JSON.Fields.
robrix Sep 20, 2019
57ab2f6
Link the doctests against the lib.
robrix Sep 20, 2019
2c99f09
Link the doctests against QuickCheck.
robrix Sep 20, 2019
a126e39
Use the right dir for the doctests.
robrix Sep 20, 2019
4e40108
Copy Range in.
robrix Sep 20, 2019
325e1f1
Derive a Hashable instance for Range.
robrix Sep 20, 2019
ddef713
Copy Span in.
robrix Sep 20, 2019
81f43c9
Move the ToJSONFields instance for Location into Data.JSON.Fields.
robrix Sep 20, 2019
2748529
Copy Location in as Loc.
robrix Sep 20, 2019
cc82051
Depend on semantic-source.
robrix Sep 20, 2019
1d5e150
Switch everything over to using Source.Range.
robrix Sep 20, 2019
17c61c1
Switch everything over to using Source.Span.
robrix Sep 20, 2019
0f8e69c
Switch everything over to using Source.Loc.
robrix Sep 20, 2019
f6e4864
Move the span/range stuff into CMark.
robrix Sep 20, 2019
b20dcf4
Copy Source in.
robrix Sep 20, 2019
8aae312
Rename the Source symbols and recommend importing it qualified.
robrix Sep 20, 2019
ca6a785
Flip lineRangesWithin.
robrix Sep 20, 2019
d929a8c
Make Data.Source reexport Source.Source.
Sep 20, 2019
948deb4
Fixup remaining test cases.
Sep 20, 2019
7b599a6
Use Source.Source instead of Data.Source.
Sep 20, 2019
a422061
Delete Data.Source.
Sep 20, 2019
f17a2e8
Remove Data.Source from the .cabal file.
Sep 20, 2019
f0567fd
De-suffix dropSource and takeSource.
Sep 20, 2019
86682d8
De-suffix sourceBytes.
Sep 20, 2019
74693f4
Bring in the Source tests.
robrix Sep 20, 2019
2ce8b51
:fire: Data.Source.Spec.
robrix Sep 20, 2019
c86186a
:fire: a redundant import.
robrix Sep 20, 2019
a00a78e
Merge branch 'master' into semantic-source
robrix Sep 20, 2019
bb20471
Define lenses for the starts/ends of Range.
robrix Sep 20, 2019
64ef37e
Rename the line/column lenses to line_/column_.
robrix Sep 20, 2019
1e6ebd2
Rename posLine/posColumn to line/column.
robrix Sep 20, 2019
57c385d
Rename the HasSpan start/end lenses to start_/end_.
robrix Sep 20, 2019
d59a44b
Rename the HasSpan span lens to span_.
robrix Sep 20, 2019
7d1567e
:fire: a bunch of redundant hidden imports.
robrix Sep 20, 2019
0312300
Rename the spanStart/spanEnd fields to start/end.
robrix Sep 20, 2019
e08a495
Define a point function for Range.
robrix Sep 20, 2019
935acb4
:memo: point.
robrix Sep 20, 2019
6356443
Define a point constructor for Span.
robrix Sep 20, 2019
e28e81b
:memo: point.
robrix Sep 20, 2019
9551742
Use point to define emptyTerm.
robrix Sep 20, 2019
52bc7e6
Rename locByteRange/locSpan to byteRange/span.
robrix Sep 20, 2019
4bc5491
Extract lens to the top level.
robrix Sep 20, 2019
909fa63
Define a byteRange_ lens for Loc.
robrix Sep 20, 2019
8df1345
Run semantic-source’s tests in CI.
robrix Sep 20, 2019
918bfb4
Apparently this should not exist.
robrix Sep 20, 2019
77ff50b
Run the doctests from the right place.
robrix Sep 20, 2019
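Several commits in the list above introduce a `point` helper for `Range` and `Span`. The commit titles do not show the definitions, so the following is only a rough sketch, assuming that `point` builds an empty range or span at a single position. The types, the `spanStart`/`spanEnd` field names, and the names `pointRange` and `pointSpan` are hypothetical stand-ins chosen so the example fits in one module; they are not the package's actual code.

```haskell
-- Hypothetical stand-ins for the types in Source.Range and Source.Span.
data Range = Range { start :: Int, end :: Int }
  deriving (Eq, Show)

data Pos = Pos { line :: Int, column :: Int }
  deriving (Eq, Show)

data Span = Span { spanStart :: Pos, spanEnd :: Pos }
  deriving (Eq, Show)

-- | Assumed behaviour: an empty range that starts and ends at the same offset.
pointRange :: Int -> Range
pointRange i = Range i i

-- | Assumed behaviour: an empty span that starts and ends at the same position.
pointSpan :: Pos -> Span
pointSpan p = Span p p

-- | Length of a range in bytes; zero for any point.
rangeLength :: Range -> Int
rangeLength r = end r - start r
```

Under these assumptions, an "empty" term location can be written as `pointRange 0` and `pointSpan (Pos 1 1)` instead of spelling out both endpoints.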
Make Data.Source reexport Source.Source.
Patrick Thomson committed Sep 20, 2019
commit d929a8c78af916b0a60cad8cd45528c3cc9129bf
2 changes: 1 addition & 1 deletion src/Analysis/PackageDef.hs
@@ -56,7 +56,7 @@ class CustomHasPackageDef syntax where
 instance CustomHasPackageDef Language.Go.Syntax.Package where
   customToPackageDef Blob{..} _ (Language.Go.Syntax.Package (Term (In fromAnn _), _) _)
     = Just $ PackageDef (getSource fromAnn)
-    where getSource = toText . flip Source.slice blobSource . locByteRange
+    where getSource = toText . Source.slice blobSource . locByteRange
 
 -- | Produce a 'PackageDef' for 'Sum's using the 'HasPackageDef' instance & therefore using a 'CustomHasPackageDef' instance when one exists & the type is listed in 'PackageDefStrategy'.
 instance Apply HasPackageDef fs => CustomHasPackageDef (Sum fs) where
2 changes: 1 addition & 1 deletion src/Assigning/Assignment.hs
@@ -265,7 +265,7 @@ runAssignment source = \ assignment state -> go assignment state >>= requireExha
         GetLocals -> yield stateLocals state
         PutLocals l -> yield () (state { stateLocals = l })
         CurrentNode -> yield (In node (() <$ f)) state
-        Source -> yield (Source.sourceBytes (Source.slice (nodeByteRange node) source)) (advanceState state)
+        Source -> yield (Source.sourceBytes (Source.slice source (nodeByteRange node))) (advanceState state)
         Children child -> do
           (a, state') <- go child state { stateNodes = toList f, stateCallSites = maybe id (:) (tracingCallSite t) stateCallSites } >>= requireExhaustive (tracingCallSite t)
           yield a (advanceState state' { stateNodes = stateNodes, stateCallSites = stateCallSites })
5 changes: 3 additions & 2 deletions src/Data/Blob.hs
@@ -31,7 +31,8 @@ import Data.Aeson
 import qualified Data.ByteString.Lazy as BL
 import Data.JSON.Fields
 import Data.Language
-import Data.Source as Source
+import Source.Source (Source)
+import qualified Source.Source as Source
 
 -- | A 'FilePath' paired with its corresponding 'Language'.
 -- Unpacked to have the same size overhead as (FilePath, Language).
@@ -70,7 +71,7 @@ instance FromJSON Blob where
     <*> b .: "language"
 
 nullBlob :: Blob -> Bool
-nullBlob Blob{..} = nullSource blobSource
+nullBlob Blob{..} = Source.null blobSource
 
 sourceBlob :: FilePath -> Language -> Source -> Blob
 sourceBlob filepath language source = makeBlob source filepath language mempty
13 changes: 7 additions & 6 deletions src/Data/Error.hs
@@ -18,10 +18,11 @@ import Data.Ix (inRange)
 import Data.List (intersperse, isSuffixOf)
 import System.Console.ANSI
 
-import Data.Blob
-import Data.Flag as Flag
-import Data.Source
-import Source.Span
+import           Data.Blob
+import           Data.Flag as Flag
+import           Source.Source (Source)
+import qualified Source.Source as Source
+import           Source.Span
 
 data LogPrintSource = LogPrintSource
 data Colourize = Colourize
@@ -61,8 +62,8 @@ showExcerpt colourize Span{..} Blob{..}
   = showString context . (if "\n" `isSuffixOf` context then id else showChar '\n')
   . showString (replicate (caretPaddingWidth + lineNumberDigits) ' ') . withSGRCode colourize [SetColor Foreground Vivid Green] (showString caret) . showChar '\n'
   where context = fold contextLines
-        contextLines = [ showLineNumber i <> ": " <> unpack (sourceBytes l)
-                       | (i, l) <- zip [1..] (sourceLines blobSource)
+        contextLines = [ showLineNumber i <> ": " <> unpack (Source.sourceBytes l)
+                       | (i, l) <- zip [1..] (Source.lines blobSource)
                        , inRange (posLine spanStart - 2, posLine spanStart) i
                        ]
         showLineNumber n = let s = show n in replicate (lineNumberDigits - length s) ' ' <> s
126 changes: 2 additions & 124 deletions src/Data/Source.hs
@@ -1,127 +1,5 @@
-{-# LANGUAGE GeneralizedNewtypeDeriving #-}
 module Data.Source
-( Source
-, sourceBytes
-, fromUTF8
--- Measurement
-, sourceLength
-, nullSource
-, totalRange
-, totalSpan
--- En/decoding
-, fromText
-, toText
--- Slicing
-, slice
-, dropSource
--- Splitting
-, sourceLines
-, sourceLineRanges
-, sourceLineRangesWithin
-, newlineIndices
+( module Source.Source
 ) where
 
-import Prologue
-
-import Data.Aeson (FromJSON (..), withText)
-import qualified Data.ByteString as B
-import Data.Char (ord)
-import Data.String (IsString (..))
-import qualified Data.Text as T
-import qualified Data.Text.Encoding as T
-import Source.Range
-import Source.Span hiding (HasSpan (..))
-
-
--- | The contents of a source file. This is represented as a UTF-8
--- 'ByteString' under the hood. Construct these with 'fromUTF8'; obviously,
--- passing 'fromUTF8' non-UTF8 bytes will cause crashes.
-newtype Source = Source { sourceBytes :: B.ByteString }
-  deriving (Eq, Semigroup, Monoid, IsString, Show, Generic)
-
-fromUTF8 :: B.ByteString -> Source
-fromUTF8 = Source
-
-instance FromJSON Source where
-  parseJSON = withText "Source" (pure . fromText)
-
--- Measurement
-
-sourceLength :: Source -> Int
-sourceLength = B.length . sourceBytes
-
-nullSource :: Source -> Bool
-nullSource = B.null . sourceBytes
-
--- | Return a 'Range' that covers the entire text.
-totalRange :: Source -> Range
-totalRange = Range 0 . B.length . sourceBytes
-
--- | Return a 'Span' that covers the entire text.
-totalSpan :: Source -> Span
-totalSpan source = Span (Pos 1 1) (Pos (length ranges) (succ (end lastRange - start lastRange)))
-  where ranges = sourceLineRanges source
-        lastRange = fromMaybe lowerBound (getLast (foldMap (Last . Just) ranges))
-
-
--- En/decoding
-
--- | Return a 'Source' from a 'Text'.
-fromText :: T.Text -> Source
-fromText = Source . T.encodeUtf8
-
--- | Return the Text contained in the 'Source'.
-toText :: Source -> T.Text
-toText = T.decodeUtf8 . sourceBytes
-
-
--- | Return a 'Source' that contains a slice of the given 'Source'.
-slice :: Range -> Source -> Source
-slice range = take . drop
-  where drop = dropSource (start range)
-        take = takeSource (rangeLength range)
-
-dropSource :: Int -> Source -> Source
-dropSource i = Source . drop . sourceBytes
-  where drop = B.drop i
-
-takeSource :: Int -> Source -> Source
-takeSource i = Source . take . sourceBytes
-  where take = B.take i
-
-
--- Splitting
-
--- | Split the contents of the source after newlines.
-sourceLines :: Source -> [Source]
-sourceLines source = (`slice` source) <$> sourceLineRanges source
-
--- | Compute the 'Range's of each line in a 'Source'.
-sourceLineRanges :: Source -> [Range]
-sourceLineRanges source = sourceLineRangesWithin (totalRange source) source
-
--- | Compute the 'Range's of each line in a 'Range' of a 'Source'.
-sourceLineRangesWithin :: Range -> Source -> [Range]
-sourceLineRangesWithin range = uncurry (zipWith Range)
-                             . ((start range:) &&& (<> [ end range ]))
-                             . fmap (+ succ (start range))
-                             . newlineIndices
-                             . sourceBytes
-                             . slice range
-
--- | Return all indices of newlines ('\n', '\r', and '\r\n') in the 'ByteString'.
-newlineIndices :: B.ByteString -> [Int]
-newlineIndices = go 0
-  where go n bs | B.null bs = []
-                | otherwise = case (searchCR bs, searchLF bs) of
-                    (Nothing, Nothing)  -> []
-                    (Just i, Nothing)   -> recur n i bs
-                    (Nothing, Just i)   -> recur n i bs
-                    (Just crI, Just lfI)
-                      | succ crI == lfI -> recur n lfI bs
-                      | otherwise       -> recur n (min crI lfI) bs
-        recur n i bs = let j = n + i in j : go (succ j) (B.drop (succ i) bs)
-        searchLF = B.elemIndex (toEnum (ord '\n'))
-        searchCR = B.elemIndex (toEnum (ord '\r'))
-
-{-# INLINE newlineIndices #-}
+import Source.Source
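The line-splitting logic removed from `Data.Source` above can be tried in isolation. The following is a minimal sketch that mirrors `newlineIndices` and the whole-source case of `sourceLineRanges`, under some simplifying assumptions: it works on a bare `ByteString` instead of `Source`, it defines its own stand-in `Range` type, and the function name `lineRanges` is used only for illustration. It is not the code that lives in `semantic-source`.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical stand-in for Source.Range.Range: a half-open byte range.
data Range = Range { start :: Int, end :: Int }
  deriving (Eq, Show)

-- | Indices of line terminators. A "\r\n" pair is reported once, at the
-- index of its '\n'; a lone '\r' or '\n' is reported at its own index.
newlineIndices :: BC.ByteString -> [Int]
newlineIndices = go 0
  where
    go n bs = case (BC.elemIndex '\r' bs, BC.elemIndex '\n' bs) of
      (Nothing, Nothing) -> []
      (Just i, Nothing)  -> recur n i bs
      (Nothing, Just i)  -> recur n i bs
      (Just cr, Just lf)
        | succ cr == lf  -> recur n lf bs
        | otherwise      -> recur n (min cr lf) bs
    -- n is the offset of bs within the original input.
    recur n i bs = let j = n + i in j : go (succ j) (BC.drop (succ i) bs)

-- | Ranges of each line, where every line includes its terminator.
lineRanges :: BC.ByteString -> [Range]
lineRanges bs = zipWith Range (0 : starts) (starts ++ [BC.length bs])
  where starts = map succ (newlineIndices bs)
```

For the input `"a\nbc\r\nd"`, `newlineIndices` yields `[1,5]` and `lineRanges` yields `[Range 0 2, Range 2 6, Range 6 7]`, so each line's range ends just after its terminator and the final range ends at the input length.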
65 changes: 33 additions & 32 deletions src/Parsing/CMark.hs
@@ -5,14 +5,15 @@ module Parsing.CMark
 , toGrammar
 ) where
 
-import CMarkGFM
-import Data.Array
+import           CMarkGFM
+import           Data.Array
 import qualified Data.AST as A
-import Data.Source
-import Data.Term
-import Source.Loc
-import Source.Span hiding (HasSpan(..))
-import TreeSitter.Language (Symbol(..), SymbolType(..))
+import           Data.Term
+import           Source.Loc
+import           Source.Source (Source)
+import qualified Source.Source as Source
+import           Source.Span hiding (HasSpan (..))
+import           TreeSitter.Language (Symbol (..), SymbolType (..))
 
 data Grammar
   = Document
@@ -50,7 +51,7 @@ exts = [
   ]
 
 cmarkParser :: Source -> A.AST (TermF [] NodeType) Grammar
-cmarkParser source = toTerm (totalRange source) (totalSpan source) $ commonmarkToNode [ optSourcePos ] exts (toText source)
+cmarkParser source = toTerm (Source.totalRange source) (Source.totalSpan source) $ commonmarkToNode [ optSourcePos ] exts (Source.toText source)
   where toTerm :: Range -> Span -> Node -> A.AST (TermF [] NodeType) Grammar
         toTerm within withinSpan (Node position t children) =
           let range = maybe within (spanToRangeInLineRanges lineRanges . toSpan) position
@@ -62,30 +63,30 @@ cmarkParser source = toTerm (totalRange source) (totalSpan source) $ commonmarkT
         lineRanges = sourceLineRangesByLineNumber source
 
 toGrammar :: NodeType -> Grammar
-toGrammar DOCUMENT{} = Document
+toGrammar DOCUMENT{}       = Document
 toGrammar THEMATIC_BREAK{} = ThematicBreak
-toGrammar PARAGRAPH{} = Paragraph
-toGrammar BLOCK_QUOTE{} = BlockQuote
-toGrammar HTML_BLOCK{} = HTMLBlock
-toGrammar CUSTOM_BLOCK{} = CustomBlock
-toGrammar CODE_BLOCK{} = CodeBlock
-toGrammar HEADING{} = Heading
-toGrammar LIST{} = List
-toGrammar ITEM{} = Item
-toGrammar TEXT{} = Text
-toGrammar SOFTBREAK{} = SoftBreak
-toGrammar LINEBREAK{} = LineBreak
-toGrammar HTML_INLINE{} = HTMLInline
-toGrammar CUSTOM_INLINE{} = CustomInline
-toGrammar CODE{} = Code
-toGrammar EMPH{} = Emphasis
-toGrammar STRONG{} = Strong
-toGrammar LINK{} = Link
-toGrammar IMAGE{} = Image
-toGrammar STRIKETHROUGH{} = Strikethrough
-toGrammar TABLE{} = Table
-toGrammar TABLE_ROW{} = TableRow
-toGrammar TABLE_CELL{} = TableCell
+toGrammar PARAGRAPH{}      = Paragraph
+toGrammar BLOCK_QUOTE{}    = BlockQuote
+toGrammar HTML_BLOCK{}     = HTMLBlock
+toGrammar CUSTOM_BLOCK{}   = CustomBlock
+toGrammar CODE_BLOCK{}     = CodeBlock
+toGrammar HEADING{}        = Heading
+toGrammar LIST{}           = List
+toGrammar ITEM{}           = Item
+toGrammar TEXT{}           = Text
+toGrammar SOFTBREAK{}      = SoftBreak
+toGrammar LINEBREAK{}      = LineBreak
+toGrammar HTML_INLINE{}    = HTMLInline
+toGrammar CUSTOM_INLINE{}  = CustomInline
+toGrammar CODE{}           = Code
+toGrammar EMPH{}           = Emphasis
+toGrammar STRONG{}         = Strong
+toGrammar LINK{}           = Link
+toGrammar IMAGE{}          = Image
+toGrammar STRIKETHROUGH{}  = Strikethrough
+toGrammar TABLE{}          = Table
+toGrammar TABLE_ROW{}      = TableRow
+toGrammar TABLE_CELL{}     = TableCell
 
 
 instance Symbol Grammar where
@@ -99,4 +100,4 @@ spanToRangeInLineRanges lineRanges Span{..} = Range
 
 sourceLineRangesByLineNumber :: Source -> Array Int Range
 sourceLineRangesByLineNumber source = listArray (1, length lineRanges) lineRanges
-  where lineRanges = sourceLineRanges source
+  where lineRanges = Source.lineRanges source
Review comment from the pull request author:

These were only ever being used in this module, and so moving them here prevented having to have semantic-source depend on array.
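To illustrate why these helpers need `array` at all, here is a small sketch of looking up line ranges by 1-based line number and converting a line/column span into a byte range. The diff only shows the hunk header for `spanToRangeInLineRanges`, so the body of `spanToRange` below is an assumption about how such a conversion could work, and `Range`, `Pos`, `Span`, `lineRangesByLineNumber`, and `spanToRange` are hypothetical stand-ins rather than the definitions in `Parsing.CMark`.

```haskell
import Data.Array (Array, listArray, (!))

-- Hypothetical stand-ins for the semantic-source types.
data Range = Range { start :: Int, end :: Int }
  deriving (Eq, Show)

data Pos = Pos { line :: Int, column :: Int }
  deriving (Eq, Show)

data Span = Span { spanStart :: Pos, spanEnd :: Pos }
  deriving (Eq, Show)

-- | Index line ranges by 1-based line number for constant-time lookup.
lineRangesByLineNumber :: [Range] -> Array Int Range
lineRangesByLineNumber ranges = listArray (1, length ranges) ranges

-- | Assumed conversion: a position's byte offset is the start of its line
-- plus its 1-based column minus one.
spanToRange :: Array Int Range -> Span -> Range
spanToRange lineRanges (Span s e) = Range (offset s) (offset e)
  where offset (Pos l c) = start (lineRanges ! l) + pred c
```

With line ranges `[Range 0 2, Range 2 6, Range 6 7]`, the span from line 2, column 1 to line 2, column 3 maps to `Range 2 4`. Keeping this array-backed lookup next to its only caller is what lets the new package avoid the extra dependency.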

2 changes: 1 addition & 1 deletion src/Reprinting/Tokenize.hs
@@ -169,7 +169,7 @@ descend t = do
       let delimiter = Range crs (start r)
       unless (delimiter == Range 0 0) $ do
         log ("slicing: " <> show delimiter)
-        chunk (slice delimiter src)
+        chunk (slice src delimiter)
       move (start r)
       tokenize (fmap (withStrategy PrettyPrinting . into) t)
       move (end r)
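Several hunks in this commit swap the argument order of `slice`, from range-first in the old `Data.Source` to source-first at the updated call sites. The sketch below shows both orders side by side using stand-in types. `Range`, `Source`, its `bytes` field, and the name `sliceOld` are hypothetical and exist only for this example; the new definition in `Source.Source` is not shown in this diff, so writing `slice` as a flip of the old one is an assumption about its behaviour.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical stand-ins for Source.Range.Range and Source.Source.Source.
data Range = Range { start :: Int, end :: Int }
  deriving (Eq, Show)

newtype Source = Source { bytes :: BC.ByteString }
  deriving (Eq, Show)

-- | Old argument order, as in the removed Data.Source: range first.
sliceOld :: Range -> Source -> Source
sliceOld range = Source . BC.take (end range - start range) . BC.drop (start range) . bytes

-- | New argument order used at the updated call sites: source first.
slice :: Source -> Range -> Source
slice = flip sliceOld

-- | With the source first, partially applying to a fixed source needs no 'flip'.
sliceOf :: Source -> Range -> BC.ByteString
sliceOf source = bytes . slice source
```

This mirrors the `PackageDef` change, where `flip Source.slice blobSource` becomes `Source.slice blobSource`: call sites that slice one source at many ranges read more directly with the source as the first argument.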