Data aggregated across 30k+ websites for this commercial and property intelligence provider

Background and business need

The customer is a leading provider of digital marketplaces that connect several million consumers with property information and new or upcoming commercial and residential properties through its network of websites and applications. They needed to aggregate construction tender information from government and public sources across the world.

Their database covers rental properties, apartment rentals, apartment guides, upcoming properties for sale and rental, new projects, industry news, trends, and forecasts from around the globe, with unmatched search capability amplified by constantly refined tools and industry insights.

They identified X-tract.io as an ideal solution for their data aggregation needs. They needed technical expertise to build a robust database that would deliver construction project information to customers, and to track and aggregate thousands of trusted sources in real time.

Challenges faced by the customer

The customer needed a very large volume of data and wanted it deduplicated as well as aggregated. The following complexities in the process led them to partner with X-tract.io:

  • Data had to be fetched from more than 30K websites across different geographies (the US, Canada, the UK, and Europe), in varying data formats

  • Data was required to be aggregated in near real-time

  • Data aggregated from these 30K websites had to be deduplicated
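
The deduplication requirement can be illustrated with a minimal sketch. The case study does not describe how X-tract.io actually deduplicated records, so this is only one common approach: build a fingerprint from a few normalized fields and keep the first record seen for each fingerprint. The field names (title, issuer, country) are hypothetical.

```python
import hashlib
import re

# Hypothetical fields assumed to identify a tender; not taken from the case study.
KEY_FIELDS = ("title", "issuer", "country")


def record_key(record: dict) -> str:
    """Return a fingerprint built from lowercased, punctuation-free key fields."""
    parts = []
    for field in KEY_FIELDS:
        value = str(record.get(field, "")).lower()
        # Collapse runs of punctuation, underscores and whitespace into one space.
        value = re.sub(r"[\W_]+", " ", value).strip()
        parts.append(value)
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def deduplicate(records: list) -> list:
    """Keep the first record seen for each fingerprint, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Exact-match fingerprints only catch duplicates that differ in case, spacing, or punctuation. Records that describe the same tender with different wording would need fuzzy matching, which this sketch does not attempt.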

X-tract.io Solutions and analysis

The data experts at X-tract.io analyzed the challenges and implemented a step-by-step solution.

1

A single solution framework for aggregating data across five regions

A custom technology architecture was developed on our workflow platform, Worxtream, to track websites and download property and real-estate-related documents in the form of PDFs, feeds, and other formats.

2

Normalization of data aggregated from multiple online platforms

The gathered data came in multiple formats. It had to be standardized and normalized to the client's system-specific conventions and process information.
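
As a rough illustration of what such normalization can involve, the sketch below maps differently named source fields onto one schema and converts dates to ISO 8601. The alias table, the target field names, and the accepted date formats are all assumptions made for the example; the client's actual conventions are not given in this case study.

```python
from datetime import datetime

# Hypothetical mapping from source column names to a common schema.
FIELD_ALIASES = {
    "project name": "title",
    "title": "title",
    "closing date": "deadline",
    "due date": "deadline",
    "deadline": "deadline",
}

# Assumed date layouts. Slash dates are read day-first here; a real pipeline
# would have to choose the order per source (for example US vs. UK sites).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def normalize_date(text: str):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_record(raw: dict) -> dict:
    """Rename known fields, trim strings, and drop fields with no alias."""
    out = {}
    for key, value in raw.items():
        target = FIELD_ALIASES.get(key.strip().lower())
        if target is None:
            continue
        out[target] = value.strip() if isinstance(value, str) else value
    if "deadline" in out:
        out["deadline"] = normalize_date(str(out["deadline"]))
    return out
```

Returning None for an unrecognized date, rather than guessing, leaves the decision about unparseable values to a later review step.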

3

Unique database for different time zones

The data sources were based in different geolocations, each needing its own time window for data extraction. We set up a robust back-end database to handle the huge volume and scale of data, and managed time zones through specific crawl schedules.
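
One way to picture a time-zone-specific crawl schedule is a per-zone window checked against the source's local time. The sketch below is an assumption for illustration only, not Worxtream's scheduler; the zones and window hours are invented. It uses Python's standard zoneinfo module, which relies on a time-zone database being available on the system.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

# Hypothetical crawl windows in each source's local time (start inclusive, end exclusive).
CRAWL_WINDOWS = {
    "America/New_York": (time(1, 0), time(4, 0)),
    "Europe/London": (time(2, 0), time(5, 0)),
}


def in_crawl_window(source_tz: str, now_utc: datetime) -> bool:
    """Return True if the source's local time falls inside its crawl window."""
    start, end = CRAWL_WINDOWS[source_tz]
    local = now_utc.astimezone(ZoneInfo(source_tz))
    return start <= local.time() < end


# Example: 07:30 UTC on 15 January 2020 is 02:30 in New York and 07:30 in London.
now = datetime(2020, 1, 15, 7, 30, tzinfo=timezone.utc)
due = [tz for tz in CRAWL_WINDOWS if in_crawl_window(tz, now)]
```

Keeping a single UTC clock and converting per source means daylight-saving changes are handled by the time-zone database rather than by hand-maintained offsets. Windows that cross midnight would need an extra check that this sketch omits.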

4

Data safety and security

Worxtream provides secure login and role-based privileges to ensure that your data is secure and only available to appropriate users for operation.

Results and outcomes

The digitized output provided by X-tract.io had an accuracy of 99.56%, compared with the contractual standard of 95%.

X-tract.io delivered 47 digitized records per hour against the customer's standard of 39 digitized records per hour. That is 8 more records per hour on a baseline of 39, an overall productivity increase of about 21%.


© 2020 X-tract.io | All Rights Reserved.