Classical Language Toolkit
@CLTKorg
Followers: 348 | Following: 120 | Media: 6 | Statuses: 114
NLP for Historical Languages
Joined December 2018
The CLTK has released a new major version (1.0). For a quick introduction to our API: https://t.co/Ffom8EZoYA Docs for new code at: https://t.co/YajuGekJFp Old docs:
#ML4AL starts tomorrow @aclmeeting!💫 We thank our sponsors @GoogleDeepMind @scrollprize @athenaRICinfo, all the authors presenting, our Program+Organising Committees for their outstanding work. We received the highest number of workshop submissions at ACL2024! 💎 @iassael & John
Working on a way to merge the best parts of @CLTKorg, @spacy_io, and GLiNER right now. I nearly have a working prototype for Latin. The benefit of this is flexibility. You can drop in your own spaCy or GLiNER models easily. For the spaCy pipeline, I'm using @diyclassics 's
Each day there is a new development for Latin machine learning, it seems. Check out this LLM pretrained exclusively on Latin. @CLTKorg
the first LLM pretrained exclusively on Latin! @WilliamGao1729 and i are merging modernity with antiquity in a series of experiments we found a dataset of ~500 billion Latin tokens and couldn't resist we pretrained using the GPT2 architecture and an H100 from @akashnet_
One of our maintainers will attend #LT4HALA2024 @LrecColing 🖐️
I'm going to attend #LT4HALA2024 online on Saturday @LrecColing. If you want to talk about @CLTKorg or other tools usable for ancient languages, I'll be available!
A new release, CLTK v1.3.0, is available. The OdyCy model for Ancient Greek can now be called through CLTK.
The maintenance of CLTK is quite slow because we don't have enough volunteers to code, to review code, to analyze issues, etc. @clemsciences made the new pre-release in his free time but it is not enough. Who wants to join?
A new alpha release is out https://t.co/m2mH82mcVQ. CLTK is now available from Python 3.9 to Python 3.12. Tests as well as contributions are welcome!
github.com
Versions of Python supported: 3.9 to 3.12. Updated stanza and spacy. Fixed dataclasses structure. Fixed model paths for Old French and Latin. Replaced python-Levenshtein by rapidfuzz. Refactored so...
#johdnews The #CfP is now open for our new special collection "Representing the Ancient World through data" 📢 We invite submissions of #datapapers describing your work on Ancient World data📚📊 🗓️Deadline 1 September 2023 Full Call for Papers at 👉 https://t.co/Kf2KPbMBuY 1/2
LatinCy is an amazing new @spacy_io pipeline for parsing Latin texts natively with the spaCy Python Library. It was created by Patrick J. Burns (@diyclassics). LatinCy: https://t.co/jVDR5okXP4 Here is a quick video on it: https://t.co/7rACvGkXNP
If you want to help improve LatinCy, it's here 👇
The @CLTKorg package is used by digital philologists. I sketched a way to search for patterns inside texts. The idea is described here:
github.com
Several months ago, I sketched a Query class (see master...clemsciences:cltk:doc-query).
>>> from cltk import NLP
>>> non_nlp = NLP("non", suppress_banner=True)
>>...
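The Query class itself is not shown in the preview above, so the following is only a rough illustration of what searching for patterns in annotated text could look like. It does not use CLTK at all: tokens are plain dictionaries, and `find_pattern` together with the keys `string`, `lemma`, and `upos` are hypothetical names chosen for this sketch.

```python
from typing import Dict, List, Sequence, Tuple

# A token is a plain dict of annotations; the keys are assumptions of this sketch.
Token = Dict[str, str]


def find_pattern(
    tokens: Sequence[Token], pattern: Sequence[Token]
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs where every constraint in `pattern` matches.

    Each pattern element is a dict of attribute constraints that the token at the
    same position in the window must satisfy exactly. `end` is exclusive.
    """
    matches: List[Tuple[int, int]] = []
    width = len(pattern)
    if width == 0:
        return matches
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        if all(
            all(tok.get(key) == value for key, value in constraints.items())
            for tok, constraints in zip(window, pattern)
        ):
            matches.append((start, start + width))
    return matches


# Hand-annotated example tokens (not output from any real pipeline).
tokens = [
    {"string": "arma", "lemma": "arma", "upos": "NOUN"},
    {"string": "virum", "lemma": "vir", "upos": "NOUN"},
    {"string": "cano", "lemma": "cano", "upos": "VERB"},
]

# Find a noun immediately followed by a verb.
noun_verb = find_pattern(tokens, [{"upos": "NOUN"}, {"upos": "VERB"}])
```

In a real setting the dictionaries would be replaced by whatever word objects the NLP pipeline returns, and the matcher would likely need wildcards and optional elements.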
A @spacy_io wrapper is being developed for the @CLTKorg package. Do you have ideas to improve it? See the pull request:
github.com
I added the spaCy process with a custom wrapper that translates a spaCy Token into a CLTK Word. The aim is to be able to use trained models provided by spaCy with CLTK.
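The pull request's actual code is not shown here, so this is only a minimal sketch of the Token-to-Word translation idea. To stay runnable without spaCy or CLTK installed, it uses two stand-in dataclasses: `SpacyLikeToken` mirrors a few attributes of a spaCy token (`i`, `text`, `lemma_`, `pos_`, `is_stop`), and `WordSketch` is a hypothetical simplification of a CLTK-style word record whose field names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SpacyLikeToken:
    """Stand-in for a spaCy token; only the attributes used below."""
    i: int          # position of the token in the document
    text: str       # surface form
    lemma_: str     # lemma, empty string if the pipeline has no lemmatizer
    pos_: str       # coarse part-of-speech tag, empty string if untagged
    is_stop: bool   # whether the token is a stop word


@dataclass
class WordSketch:
    """Hypothetical, simplified word record (field names are assumptions)."""
    index_token: int
    string: str
    lemma: Optional[str] = None
    upos: Optional[str] = None
    stop: Optional[bool] = None


def token_to_word(token: SpacyLikeToken) -> WordSketch:
    """Copy the annotations of one token into a word record.

    Empty strings are mapped to None so that a missing annotation is
    distinguishable from a real value.
    """
    return WordSketch(
        index_token=token.i,
        string=token.text,
        lemma=token.lemma_ or None,
        upos=token.pos_ or None,
        stop=token.is_stop,
    )


def tokens_to_words(tokens: Sequence[SpacyLikeToken]) -> List[WordSketch]:
    """Translate a whole sequence of tokens, preserving order."""
    return [token_to_word(token) for token in tokens]


# Hand-built example tokens (not produced by a real model).
example_tokens = [
    SpacyLikeToken(0, "arma", "arma", "NOUN", False),
    SpacyLikeToken(1, "et", "", "", True),
]
example_words = tokens_to_words(example_tokens)
```

With a real spaCy pipeline, the tokens of a processed document could be passed to a function like `token_to_word` in the same way, since they expose attributes with these names; which fields the CLTK side expects is something the pull request itself would define.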