Commit 0124d259, authored 1 year ago by Alice Brenon
Fix Tree indexer script + add 2 textometry scripts
parent 9ab20ad2
Showing 3 changed files with 88 additions and 3 deletions:
scripts/textometry/index-syntax-trees.hs (+3 −3)
scripts/textometry/size.hs (+38 −0)
scripts/textometry/txm-results.hs (+47 −0)
scripts/textometry/indexSyntaxTrees.hs → scripts/textometry/index-syntax-trees.hs (+3 −3)
@@ -8,13 +8,13 @@ import Control.Monad.IO.Class (MonadIO(..))
 import Control.Monad.Reader (MonadReader, asks, runReaderT)
 import Data.ByteString as ByteString (writeFile)
 import Data.Serialize (encode)
-import GEODE.Metadata (ArticleRecord, Record(..), readNamedTsv)
+import GEODE.Metadata (ArticleRecord, Document(..), Record(..), ReadTSV(..))
 import Options.Applicative
   ( Parser, execParser, fullDesc, help, helper, info, long, metavar, progDesc
   , short, strOption )
 import System.Directory (createDirectoryIfMissing)
 import System.FilePath ((</>), takeDirectory)
-import System.Script (try, warn)
+import System.Script (warn)
 
 data Config = Config
   { inputRoot :: FilePath
@@ -55,4 +55,4 @@ toTree articleRecord = do
 main :: IO ()
 main = getConfig >>= runReaderT chain
   where
-    chain = try (asks inputTsv >>= liftIO . readNamedTsv) >>= mapM_ toTree
+    chain = asks inputTsv >>= liftIO . readTSV >>= mapM_ toTree . rows
scripts/textometry/size.hs 0 → 100755 (+38 −0)
+#!/usr/bin/env -S runhaskell --ghc-arg="-Wall" --ghc-arg="-i lib/haskell" --ghc-arg="-i ../ghc-geode/lib"
+{-# LANGUAGE ExplicitNamespaces #-}
+import Conllu.Tree (IndexedDocument(..))
+import Data.ByteString as ByteString (readFile)
+import Data.Csv (DefaultOrdered(..), ToNamedRecord(..))
+import Data.Serialize (decode)
+import GEODE.Metadata
+  ( type (@)(..), ArticleRecord(..), Document(..), ReadTSV(..), Record(..)
+  , WriteTSV(..), for, getHeader, glue )
+import GHC.Generics (Generic)
+import System.Environment (getArgs)
+import System.FilePath ((</>))
+import System.Script (syntax, warn)
+
+newtype Size = Size
+  { size :: Int } deriving (Eq, Generic, Ord, Show)
+
+instance DefaultOrdered Size
+instance ToNamedRecord Size
+
+type Result = ArticleRecord @ Size
+
+measureIn :: FilePath -> ArticleRecord -> IO ()
+measureIn inputRoot article =
+  ByteString.readFile path >>= either warn measure . decode
+  where
+    path = inputRoot </> relativePath article "tree"
+    --skipAndWarn msg = Nothing <$ warn msg
+    measure (IndexedDocument {_total}) = writeTSV () [glue article $ Size _total]
+
+main :: IO ()
+main = getArgs >>= run
+  where
+    run [inputRoot] = do
+      Document {rows} <- readTSV ()
+      writeTSV () [getHeader (for :: Result)]
+      mapM_ (measureIn inputRoot) rows
+    run _ = syntax "INPUT_ROOT"
scripts/textometry/txm-results.hs 0 → 100755 (+47 −0)
+#!/usr/bin/env -S runhaskell --ghc-arg="-Wall" --ghc-arg="-ilib/haskell" --ghc-arg="-i/home/alice/Logiciel/ghc-geode/lib/GEODE/Metadata"
+{-# LANGUAGE ExplicitNamespaces, OverloadedStrings #-}
+import Control.Monad (foldM)
+import Data.Csv (DefaultOrdered(..), ToNamedRecord(..))
+import Data.Text as Text (Text, split, unpack)
+import Data.Text.IO as Text (readFile)
+import Data.Vector as Vector (fromList)
+import GEODE.Metadata (type (@)(..), ArticleRecord(..), Document(..), Record(..), WriteTSV(..), for, getHeader)
+import GHC.Generics (Generic)
+import System.Environment (getArgs)
+import System.Script (syntax, warn)
+import Text.Filter (Editable(..))
+import Text.Printf (printf)
+
+newtype Result = Result
+  { result :: Text } deriving (Eq, Generic, Ord, Show)
+
+instance DefaultOrdered Result
+instance ToNamedRecord Result
+
+type Row = ArticleRecord @ Result
+
+getResults :: [Text] -> IO (Document Row)
+getResults = fmap buildDoc . foldM parseRow [] . drop 1
+  where
+    buildDoc stack = Document
+      { header = getHeader (for :: Row)
+      , rows = Vector.fromList $ reverse stack }
+
+parseRow :: [Row] -> Text -> IO [Row]
+parseRow rows row = pickColumns (Text.split (== '\t') row)
+  where
+    pickColumns (uid:_:pivot:_) =
+      buildRow (onlyTarget pivot) . fromUID $ Text.unpack uid
+    pickColumns _ = rows <$ warn (printf "Found less than 3 columns in row: %s" row)
+    buildRow _ (Left errorMessage) = rows <$ warn errorMessage
+    buildRow result (Right article) = pure $ (article :@: Result {result}) : rows
+    onlyTarget t =
+      let l = Text.split (`elem` ['[', ']']) t in
+      if length l > 2 then l !! 1 else t
+
+main :: IO ()
+main = getArgs >>= run
+  where
+    run [input] = Text.readFile input >>= getResults . enter >>= writeTSV ()
+    run _ = syntax "INPUT_TSV"
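The row handling in txm-results.hs can be tried in isolation with the rough sketch below. It is not the script itself: it assumes only `base`, swaps `Data.Text` for `String`, replaces `System.Script.warn` with a plain write to stderr, and uses a hypothetical `(uid, result)` pair instead of the GEODE `ArticleRecord @ Result` row, so `fromUID` is not exercised. The helper `splitWhen` and the sample rows are made up for illustration.

```haskell
import Control.Monad (foldM)
import System.IO (hPutStrLn, stderr)

-- | Split on every character matching the predicate, keeping empty fields.
-- Hypothetical base-only stand-in for Data.Text.split.
splitWhen :: (Char -> Bool) -> String -> [String]
splitWhen p s = case break p s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitWhen p rest

-- | Keep the text between the first pair of square brackets, or the whole
-- string when splitting on brackets yields fewer than three pieces.
onlyTarget :: String -> String
onlyTarget t = case splitWhen (`elem` "[]") t of
  (_ : target : _ : _) -> target
  _                    -> t

-- | Hypothetical simplified row: (uid, result).
type Row = (String, String)

-- | Push one parsed row on the stack, or warn and leave the stack unchanged
-- when the line has fewer than three tab-separated columns.
parseRow :: [Row] -> String -> IO [Row]
parseRow acc row = case splitWhen (== '\t') row of
  (uid : _ : pivot : _) -> pure ((uid, onlyTarget pivot) : acc)
  _ -> acc <$ hPutStrLn stderr ("Found less than 3 columns in row: " ++ row)

-- | Skip the header line, fold over the rest, restore the input order.
getResults :: [String] -> IO [Row]
getResults = fmap reverse . foldM parseRow [] . drop 1

main :: IO ()
main = getResults sample >>= mapM_ print
  where
    -- Made-up input lines; the real column names and uid format are unknown.
    sample =
      [ "uid\tleft\tpivot\tright"
      , "a1\t...\tle [fleuve] coule\t..."
      , "broken row"
      , "a2\t...\tsans crochets\t..."
      ]
```

Because each row is prepended to the accumulator, the final `reverse` is what restores input order, and a malformed line only produces a warning instead of aborting the whole run.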