Publication

Data extraction with a state-of-the-art NLP model trained on synthetically generated German rental contracts

Abstract

The flow of information in the real estate market is accelerating rapidly, and real estate investment companies are actively seeking automation that streamlines transactions and minimizes missed investment opportunities. As a complementary product feature for zoolo, a Property Technology start-up based in Germany, an extraction model was deployed to automatically derive information from scanned leases. Because legal constraints make real lease data scarce, synthetic leases were constructed to fine-tune a spaCy v3.4 model. This paper shows that three algorithmically generated synthetic lease paragraphs provide a suitable basis for training and applying the spaCy NLP model using Named Entity Recognition and token classification models.
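To make the idea of algorithmically generated training paragraphs concrete, the following is a minimal sketch of template-based synthetic data generation for Named Entity Recognition. It is not the paper's actual generator: the clause template, the entity labels (RENT, START_DATE, LANDLORD), the landlord names, and the helper functions are all hypothetical and chosen only for illustration. The sketch fills placeholders in a German lease sentence with random values and records the character offsets of each inserted value.

```python
import random
import re

# Hypothetical clause template; each placeholder name doubles as an entity label.
TEMPLATE = (
    "Die monatliche Miete beträgt {RENT} und ist ab dem {START_DATE} "
    "an {LANDLORD} zu zahlen."
)

# Hypothetical landlord names, invented for this illustration.
LANDLORDS = ["Müller Immobilien GmbH", "Schmidt Hausverwaltung AG"]


def random_values(rng):
    """Draw one random value per placeholder."""
    return {
        "RENT": f"{rng.randint(400, 2500)},00 EUR",
        "START_DATE": (
            f"{rng.randint(1, 28):02d}.{rng.randint(1, 12):02d}."
            f"{rng.randint(2015, 2023)}"
        ),
        "LANDLORD": rng.choice(LANDLORDS),
    }


def fill_template(template, values):
    """Fill placeholders and record (start, end, label) character offsets."""
    text = ""
    entities = []
    pos = 0
    for match in re.finditer(r"\{(\w+)\}", template):
        text += template[pos:match.start()]  # literal text before the placeholder
        label = match.group(1)
        start = len(text)
        text += values[label]
        entities.append((start, len(text), label))
        pos = match.end()
    text += template[pos:]  # literal text after the last placeholder
    return text, {"entities": entities}


def generate_examples(n, seed=0):
    """Generate n annotated synthetic sentences, reproducibly."""
    rng = random.Random(seed)
    return [fill_template(TEMPLATE, random_values(rng)) for _ in range(n)]


if __name__ == "__main__":
    for text, annotations in generate_examples(2):
        print(text)
        print(annotations)
```

Each generated item is a pair of text and an annotation dictionary with character-offset entities, which is the shape that spaCy v3's `Example.from_dict` accepts for NER training data. Because the offsets are computed while the text is assembled, the labels stay aligned with the inserted values without any manual annotation.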

Keywords

Machine learning; Natural language programming; NLP; Named entity recognition; NER; Synthetic data generation; Data scarcity; PropTech; Real estate 4.0; spaCy
