An Approach to Geotag a Web Sized Corpus of Documents with Addresses in Randstad, Netherlands

This paper describes a cluster computing workflow for geotagging a web-sized corpus of documents (3.6 × 10^9 documents, 260 TiB of data) and shows how semantic similarities between documents geotagged to the same address could be used to verify these tags.

Author:	Czech, Alexander
Document type:	conferenceObject
Publication date:	2018
Publisher/Editor:	ETH Zurich
Keywords:	Geotagging / Data Science / Data Mining / Natural Language Processing
Language:	English
Permalink:	https://search.fid-benelux.de/Record/base-29174225
Data source:	BASE; original catalog
Powered by:	BASE
Link(s):	https://hdl.handle.net/20.500.11850/225615
