Apologies to all Australian customers. Of the 100+ databases I have made, the Australia Car Database was the MOST DIFFICULT of all my web scraping projects. Keeping it updated required me to pay third-party programmers (some of whom turned out to be untrustworthy) and took far more time and effort than the databases I built myself. Given Australia's small population, the effort no longer paid off in sales. I abandoned the Australia Car Database to focus on increasing the update frequency of the European and American car databases, which have higher sales volume.
In addition, on 16 July 2020 I received a letter from Redbook demanding that I CEASE & DESIST selling the Australian car database. If you want an Australian car database, you need to contact Redbook for a quote instead of buying from me. Beware: buying legally from Redbook may cost a 4-digit sum. I imagine a startup company cannot afford that price, so you may want to pay a few hundred dollars to freelancers on Upwork for scraping. I was doing the same, not to steal customers from Redbook, but to help startup companies that could not afford to buy from Redbook anyway (as they grew, they could become direct Redbook customers and get real-time updates). Unfortunately, the Redbook TOS do not allow this practice, and Redbook may take legal action against people who scrape its data. Quote from the Redbook TOS:
Personal and Non Commercial Use Only
(a) Use of the RedBook Website is for your personal and non-commercial use only. Except for the Material held in your computers cache or a single permanent copy of the Material for your personal use you must not without our prior written approval:
– use any automated process of any sort to query, access, retrieve, scrape, data-mine or copy any Material on the RedBook Website or generate or compile any document, index or database based on the Material published on the RedBook Website;
I wrote this page in 2016 as a possible future project, inviting people to contact me if they were interested in a car database for Australia and to suggest websites to scrape data from, other than Redbook, because its TOS forbid scraping. I informed visitors that Redbook sells data, so it would be ILLEGAL for me to scrape data from a seller and resell it myself, and I asked them to suggest other websites to scrape. But once a few visitors insisted that I scrape Redbook because it is the BEST, I felt I had no other choice but to help them… I created the Australia car database in June 2017 using Octoparse.com, a FREE scraper that was slow and buggy. I had to run it in small batches of 1000-2000 cars, and the project took 2 weeks to complete. The result had 96 columns, without colors and features, which were not scrapable with Octoparse. I updated it every 3-4 months, with each update taking 2-3 days.
In February 2019 I worked with an Australian student who wrote a Python script that took only 2 days to scrape all cars. He gave me a BETA .py scraper that had errors and was missing Colors and Optional Features. I reported these problems and he fixed them afterwards, but instead of giving me the final scraper, he set it up to run and update monthly on his own server; the agreement was that he would give me login details to download the CSV. Everything looked perfect, and his database also included Colors and Optional Features, but in April 2019 he became uncontactable and the server was shut down, leaving me and my customers with no updates.
In July 2019 I hired another Python programmer, from India, who impressed me by saying that he had created over 80 databases in Python. I gave him the BETA scraper from the Australian student so he could fix its errors. He turned out to be a liar and very inexperienced: his way of fixing errors in the Australian student's scraper caused more and more errors, and I wasted 4 months testing his scraper and reporting errors. He took advantage of my lack of Python knowledge and my lack of time to check his code carefully, and kept demanding extra money over the price we had initially agreed (NO legitimate programmer would charge extra for fixing his own errors).
In September he offered to sell me his portfolio of 80+ databases for 100 EUR, which I paid. In October, when I was less busy and checked them, I realized that at least 90% of them were open data available for free download on various sites (he was LYING that he had created them himself). After providing 2 temporary updates with 2019 models in September and November, I started a fresh scrape of all 1960-2019 models in late November and published the final version on 17 December 2019. Due to coding typos, 11 columns do not have data and the car names were initially wrong (fixed in January). I fixed all the errors in his code myself and was planning to run the scraper again on all 100,000 cars, but never got the chance because in January 2020 the source website added a CAPTCHA. He kept making false promises, so I tried to hire other programmers to add a cookie feature to his Python script to bypass the CAPTCHA, but… got this reply:
Sat, May 30, 8:02 PM
We looked at this in detail, and discovered that Redbook are using a sophisticated AI-based protection system called DataDome. It's not just the cookie that it is using to protect the data, it's a load of other metrics, like access frequency, coverage and other technical metrics.
You can read about it here: https://datadome.co
We decided after some experiments that it would not be viable to bypass this system.