explosion/spaCy: v2.2.2: Multiprocessing, future APIs, Luxembourgish base support & simpler GPU install

Authors: Matthew Honnibal
Ines Montani
Sofie Van Landeghem
Henning Peters
Maxim Samsonov
Jim Geovedi
adrianeboyd
Jim Regan
György Orosz
Paul O'Leary McCann
Søren Lind Kristiansen
Duygu Altinok
Roman
Grégory Howard
Wannaphong Phatthiyaphaibun
Sam Bozek
Explosion Bot
Björn Böing
Mark Amery
Leif Uwe Vogelsang
Pradeep Kumar Tippa
jeannefukumaru
GregDubbin
Vadim Mazaev
Ramanan Balakrishnan
Jens Dahl Møllerhøj
wbwseeker
Magnus Burton
Avadh Patel
Document type: other
Publication date: 2019
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-26746357
Data source: BASE; original catalog
Powered By: BASE
Link(s): https://zenodo.org/record/3524402

✨ New features and improvements

- NEW: Support multiprocessing in nlp.pipe via the n_process argument (Python 3 only).
- Base language support for Luxembourgish.
- Add noun chunks iterator for Swedish.
- Retrained models for Greek, Norwegian Bokmål and Lithuanian that now correctly support parser-based sentence segmentation.
- Repackaged models for Greek and German with improved lookup tables via spacy-lookups-data.
- Add warning in debug-data for low sentences per doc ratio.
- Improve checks and errors related to ill-formed IOB input in convert and debug-data CLI.
- Support training dict format as JSONL.
- Make EntityRuler ID resolution 2× faster and support "id" in patterns to set Token.ent_id.
- Improve rendering of named entity spans in displacy for RTL languages.
- Update Thinc to ditch thinc_gpu_ops for simpler GPU install.
- Support Mish activation in spacy pretrain.
- Add backwards-compatible support for new Language.disable_pipes API, which will become the default in the future. The method can now also take a list of component names as its first argument (instead of a variable number of arguments).

      - disabled = nlp.disable_pipes("tagger", "parser")
      + disabled = nlp.disable_pipes(["tagger", "parser"])

- Add backwards-compatible support for new Matcher.add and PhraseMatcher.add API, which will become the default in the future. The patterns are now the second argument and a list (instead of a variable number of arguments). The on_match callback becomes an optional keyword argument.

      patterns = [[
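
The release-note example for the new Matcher.add call is cut off in this record. As an illustration only, and not a reconstruction of the missing snippet, here is a minimal sketch of the new-style call as the notes describe it: the patterns are passed as a list in the second position and the callback is given as the on_match keyword. The rule name "GREETING", the sample patterns, the sample sentence, and the helper names `seen` and `record_match` are all assumptions made for this example.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline is sufficient: the patterns below only
# rely on lowercase token text, so no trained model is needed.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

seen = []  # hypothetical container for the matched text


def record_match(matcher, doc, i, matches):
    """Hypothetical callback: store the text of the i-th match."""
    match_id, start, end = matches[i]
    seen.append(doc[start:end].text)


# New-style API: a list of patterns, where each pattern is itself
# a list of token-attribute dicts.
patterns = [
    [{"LOWER": "hello"}, {"LOWER": "world"}],
    [{"LOWER": "goodbye"}],
]

# Patterns are the second argument; the callback is an optional keyword.
matcher.add("GREETING", patterns, on_match=record_match)

doc = nlp("Hello world and goodbye")
matches = matcher(doc)  # list of (match_id, start, end) tuples
```

Because the callback is a keyword argument, it can simply be left out when only the returned match tuples are needed, for example `matcher.add("GREETING", patterns)`.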