ScrImmo: A Real-time Web Scraper Monitoring the Belgian Real Estate Market


Authors: Barzin, Félix
Yernaux, Gonzague
Vanhoof, Wim
Document type: contributionToPeriodical
Publication date: 2023
Keywords: Data analysis / Data extraction / Data gathering / Web crawling / Web scraping
Language: English
Permalink: https://search.fid-benelux.de/Record/base-28880505
Data source: BASE; original catalog
Link(s): https://researchportal.unamur.be/en/publications/18367766-da44-49e9-8f1c-bee722b65a77

Web scraping (or Web crawling), a technique for automated data extraction from websites, has emerged as a valuable tool for scientific research and data analysis. This paper presents a comprehensive exploration of Web scraping, its methodologies, and its challenges. The discussion revolves around a concrete application, namely the automatic extraction of data concerning the Belgian real estate market. We introduce a real-time Web scraper called ScrImmo, tailored to collect data from websites containing real estate classified ads. The tool is developed in a continuous iterative process and is based on an innovative cloud architecture. The paper also briefly addresses the ethical aspects of Web scraping. By integrating insights from previous research and ethical guidelines, this study provides researchers with a comprehensive understanding of Web scraping and its potential benefits, while promoting responsible and ethical practices in data collection and analysis.