huggingface/transformers: FlauBERT, MMBT, Dutch model, improved documentation, training from scratch, clean Python code


Author(s): Thomas Wolf
Lysandre Debut
Victor SANH
Julien Chaumond
Aymeric Augustin
Rémi Louf
Funtowicz Morgan
Stefan Schweter
Denis
erenup
Matt
Piero Molino
Grégory Châtel
Patrick von Platen
Tim Rault
MOI Anthony
Catalin Voss
Bilal Khan
Bram Vanroy
Fei Wang
Julien Plu
Malte Pietsch
Louis Martin
Davide Fiocco
dhanajitb
Jinoo
Ananya Harsh Jha
Juha Kiili
Guillem García Subies
Clement
Document type: other
Publication date: 2020
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-27466214
Data source: BASE; original catalog
Link(s): https://zenodo.org/record/3633003

FlauBERT, MMBT

MMBT was added to the list of available models, as the first multi-modal model to make it into the library. It combines a transformer model with a computer vision model in order to classify images and text. The MMBT model is from Supervised Multimodal Bitransformers for Classifying Images and Text by Douwe Kiela, Suvrat Bhooshan, Hamed Firooz and Davide Testuggine (https://github.com/facebookresearch/mmbt/). Added by @suvrat96. (A usage sketch follows at the end of these notes.)

A new Dutch BERT model was added under the wietsedv/bert-base-dutch-cased identifier. Added by @wietsedv. Model page

A new French model was added: FlauBERT, based on XLM. The FlauBERT model is from FlauBERT: Unsupervised Language Model Pre-training for French (https://github.com/getalp/Flaubert). Four checkpoints are added: small, base uncased, base cased and large. Model page (Loading sketches for the Dutch and French models also follow at the end of these notes.)

New TF architectures (@jplu)

TensorFlow XLM-RoBERTa was added (@jplu). TensorFlow CamemBERT was added (@jplu). (See the loading sketch at the end of these notes.)

Python best practices (@aaugustin)

Greatly improved the quality of the source code by leveraging black, isort and flake8. A new test, check_code_quality, checks that contributions respect the contribution guidelines related to those tools. Similarly, optional imports are better handled and raise more precise errors. Several requirements files were cleaned up, the contribution guidelines were updated, and setup.py now declares the necessary dev dependencies. You can clean up your code for a PR with "make style" and "make quality" (more details in CONTRIBUTING.md).

Documentation (@LysandreJik)

The documentation was made uniform and better guidelines have been defined. This work is part of an ongoing effort to make transformers accessible to a larger audience. A glossary has been added, with definitions of the most frequently used inputs. Furthermore, each model's documentation page now gives model-specific tips. The code samples are now tested on a weekly basis alongside other slow tests.

Improved repository structure (@aaugustin)

The source code was moved from ./transformers to ...
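To make the MMBT addition concrete, here is a minimal sketch of wiring a BERT text encoder and a torchvision ResNet-152 image encoder into MMBTForClassification. The ImageEncoder helper, the bert-base-uncased checkpoint, the pooling size, and the random image tensor are illustrative assumptions loosely modelled on the multimodal example in the repository, not the only supported setup.

```python
import torch
import torch.nn as nn
import torchvision
from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer,
    MMBTConfig,
    MMBTForClassification,
)


class ImageEncoder(nn.Module):
    """Illustrative helper: pools ResNet-152 feature maps into a short
    sequence of 2048-d image embeddings, as MMBT expects."""

    def __init__(self, num_image_embeds=3):
        super().__init__()
        resnet = torchvision.models.resnet152(pretrained=True)
        # Drop the final average pool and classification head.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d((num_image_embeds, 1))

    def forward(self, images):
        features = self.backbone(images)   # (batch, 2048, 7, 7) for 224x224 input
        pooled = self.pool(features)       # (batch, 2048, num_image_embeds, 1)
        # MMBT expects (batch, sequence, modal_hidden_size).
        return pooled.flatten(start_dim=2).transpose(1, 2)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
transformer = AutoModel.from_pretrained("bert-base-uncased")
config = MMBTConfig(AutoConfig.from_pretrained("bert-base-uncased"), num_labels=2)
model = MMBTForClassification(config, transformer, ImageEncoder())

input_ids = tokenizer.encode("a caption for the image", return_tensors="pt")
image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image batch
logits = model(input_modal=image, input_ids=input_ids)[0]
```

The image encoder only has to return a (batch, sequence, modal_hidden_size) tensor; MMBT projects those embeddings into the transformer's input space so the image "tokens" are attended to alongside the text tokens.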
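The Dutch model loads like any other community checkpoint via its identifier; a short sketch (the example sentence is arbitrary):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("wietsedv/bert-base-dutch-cased")
model = AutoModel.from_pretrained("wietsedv/bert-base-dutch-cased")

input_ids = tokenizer.encode("Hij ging met de fiets naar het werk.", return_tensors="pt")
with torch.no_grad():
    last_hidden_state = model(input_ids)[0]  # (batch, seq_len, hidden_size)
```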
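FlauBERT ships with dedicated classes; a loading sketch, assuming the namespaced flaubert/flaubert_base_cased identifier on the model hub (the exact checkpoint name is an assumption; the release lists small, base uncased, base cased and large variants):

```python
import torch
from transformers import FlaubertModel, FlaubertTokenizer

tokenizer = FlaubertTokenizer.from_pretrained("flaubert/flaubert_base_cased")
model = FlaubertModel.from_pretrained("flaubert/flaubert_base_cased")

input_ids = tokenizer.encode("Le chat mange une pomme.", return_tensors="pt")
with torch.no_grad():
    last_hidden_state = model(input_ids)[0]  # (batch, seq_len, hidden_size)
```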
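The new TensorFlow architectures mirror their PyTorch counterparts. A loading sketch, assuming the canonical camembert-base and xlm-roberta-base checkpoints expose TensorFlow weights (if a checkpoint only ships PyTorch weights, from_pretrained accepts from_pt=True to convert them):

```python
from transformers import CamembertTokenizer, TFCamembertModel, TFXLMRobertaModel

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
model = TFCamembertModel.from_pretrained("camembert-base")

input_ids = tokenizer.encode("J'aime le camembert !", return_tensors="tf")
last_hidden_state = model(input_ids)[0]  # (batch, seq_len, hidden_size)

# TensorFlow XLM-RoBERTa loads the same way:
xlm_roberta = TFXLMRobertaModel.from_pretrained("xlm-roberta-base")
```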