Real estate portals are the number one victim of malicious web scraping, with a 300 percent increase in bad bot activity from 2014 to 2015, according to a new report from Distil Networks, a specialist in bot detection and mitigation.
The 2016 Economics of Web Scraping report revealed companies lose millions each year as a result of web scraping. The report defines web scraping as a computer software technique to extract information from the internet, usually transforming unstructured data on the web into structured data that can be stored and analysed in a central database.
The practice has been around for as long as the internet and, according to the report, is increasing exponentially, particularly in the real estate sector.
The most common type is content scraping (38 percent), followed by research: 26 percent of companies hire web scrapers to monitor consumer opinions about products and companies. A further 19 percent of companies use web scraping for contact scraping to gain access to customers’ emails.
Charlie Minesinger, director of solution sales at Distil Networks, told Property Portal Watch bad bots are very nimble and can steal all manner of information from property portals.
“Bots can steal whatever data they’re programmed to scrape,” he said. “Property data and photos are highly valuable and relatively easy to steal using malicious web scraping bots. Any real estate website or portal with an inventory of listing data is a primary target.”
Minesinger said the huge jump in scraping for real estate sites was the result of new players in the market.
“According to our 2016 Bad Bot Landscape Report, bad bot activity in the real estate industry has gone up 300 percent in the past year,” he said. “This is likely due to the recent explosion of real estate startups, which may be taking a page out of the travel metasite playbook by scraping and aggregating data to get their businesses off the ground.
“Why license the data when you can scrape it for free, until your business model proves itself out?”
He suggested the rise in mobile phone use had most likely also contributed.
“Mobile and APIs present new challenges as well,” he said. “As in many other industries, mobile is becoming an increasingly important component of the real estate marketing mix.
“Unfortunately, the same characteristics that make a mobile optimised site easy to navigate for humans also make it a prime target for bad bots, because mobile sites provide structured access to data, making it easy to scrape. Wherever humans go online, bots are not far behind.”
“Aggressive web scraping bots steal listing data for distribution on competing sites, pulling customers away from your site,” he said. “Web scraping is not syndication. When bots steal your data, photos and copyrights, your portal loses traffic while your competitor’s portal gains.”
Minesinger warned that once listing data has been scraped, a portal will lose control over the presentation of the property, the quality of the data, and the ads and agents that are displayed adjacent to the listing.
“Rogue websites do not update listing data frequently, and potential buyers may call the agent or broker to inquire about a listing which is either inaccurate or out-of-date,” he said.
“The resulting frustration creates a lack of trust, reflecting poorly on your brand.”
There’s also a significant danger of bad bots stealing information about agents who advertise on your portal.
“This data is used by your competitors to steal the advertising dollars those agents spend on your portal,” Minesinger said. “Your site can become a lead generator to sell competing ad space to your realtor and broker clients.”
Web scraping can affect the bottom line, as bad bots commit form spam and click fraud by filling in fraudulent registration forms and clicking on ads to mimic human behaviour.
“Agents, lenders, developers and other portal advertisers see the ROI of their advertising spend decrease as ads are clicked by bots,” he said. “Advertisers want, and will demand, certified human traffic, not bots.”
Worryingly, the practice is very cheap. The report reveals web scraping services can cost as little as $3.33 per hour, while the average web scraper makes $58,000 per annum.
“Real estate data is gathered tediously from multiple sources and historical files which often require fees,” he said. “When web scraping bots steal your listing data, they’re essentially stealing the value of your investment in collecting and compiling such data, and this affects your margins.”
“Trawling through server logs, identifying patterns, tracing IP addresses and rewriting rules on your web application firewall (WAF) works for a few minutes,” he said.
“Then the bad guys are back, having cycled through another set of IP addresses and anonymous proxies.”
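To illustrate why that kind of manual log analysis is short-lived, here is a minimal Python sketch. It is not Distil Networks' method; the log format (Common Log Format), the threshold, the sample addresses and the helper names `flag_busy_ips` and `make_line` are all assumptions made for the example. It counts requests per IP address and flags the busiest ones, then shows the same volume of requests slipping through once it is spread across many addresses.

```python
import re
from collections import Counter

# Assumption: logs are in Common Log Format, where the client IP leads each line.
LOG_LINE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}) ")


def flag_busy_ips(log_lines, threshold):
    """Return the set of IPs that made more than `threshold` requests."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return {ip for ip, n in counts.items() if n > threshold}


def make_line(ip, path):
    """Build a hypothetical log line for the demonstration."""
    return f'{ip} - - [01/Jan/2016:00:00:00 +0000] "GET {path} HTTP/1.1" 200 512'


# A scraper fetching 50 listings from one address is easy to spot...
single_ip = [make_line("203.0.113.7", f"/listing/{i}") for i in range(50)]
# ...but the same 50 requests spread over 50 addresses stay under the threshold.
rotating = [make_line(f"198.51.100.{i}", f"/listing/{i}") for i in range(50)]

print(flag_busy_ips(single_ip, threshold=20))  # flags 203.0.113.7
print(flag_busy_ips(rotating, threshold=20))   # flags nothing
```

A per-address rule like this catches only the scraper that stays on one IP, which is the limitation Minesinger describes.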
He advised property portals to investigate a solution that incorporates bot detection techniques, such as fingerprinting of existing bots and machine learning to understand anticipated site visitor behaviour, and that analyses traffic and lets the portal decide who’s welcome on its site and who isn’t.
And keep an eye on your security.
“You’ve heard it a million times before, but it’s always worth repeating, get – and stay – on top of patches and site security,” he advised.
“Fewer than 50 percent of enterprises patch quickly enough to block the bad guys, and fewer than 30 percent of corporate websites use SSL.”