Download FREE sample (one make):
Car make, model, version, no specs (5 columns)
Car make, model, version, basic specs (28 columns)
Car make, model, version, full specs & features (188 columns)
Alternate formats: CSV and SQL (full specs & features)
Buy FULL database (all makes) + FREE updates every month:
Coverage: the oldest car in the database with a year indicated is a Daewoo from 1994, the FIRST foreign car manufacturer to enter the Indian market, followed by Ford and Opel (1996), Fiat (1997), Honda and Hyundai (1998). The database may include pre-1994 models from Indian domestic brands, but no year is indicated for them.
Makes included: Ashok Leyland, Aston Martin, Audi, Bentley, BMW, Bugatti, Caterham, Chevrolet, Chrysler, Datsun, Daewoo, DC, Eicher Polaris, Ferrari, Fiat, Force Motors, Ford, Honda, Hindustan Motors, Humber, Hyundai, ICML, Isuzu, Jaguar, Jeep, Kia, Lamborghini, Land Rover, Lexus, Mahindra, Mahindra-Renault, Maini, Maruti Suzuki, Maserati, Maybach, Mercedes-Benz, MG, Mini, Mitsubishi, Opel, Nissan, Porsche, Premier, Renault, Rolls-Royce, San, Skoda, Ssangyong, Tata, Toyota, Volkswagen, Volvo, Willys.
Download FREE sample (one make):
Bike make, model, version, full specs (93 columns)
Buy FULL database (all makes) + FREE updates when someone asks for an update:
Coverage: since NO bike has production years indicated, I cannot say which is the oldest bike included in the database, but I assume it covers at least the 2000s to the present.
Makes included: Aprilia, Ather, Avan Motors, Avanturaa Choppers, Bajaj, Benelli, BMW, Cleveland CycleWerks, Ducati, F.B Mondial, Harley-Davidson, Hero, Hero Electric, Hero Honda, Honda, Hyosung, Indian, Jawa, Kawasaki, KTM, LML, Mahindra, Moto Guzzi, MV Agusta, Norton, Okinawa, Royal Enfield, Suzuki, SWM, Tork, Triumph, TVS, UM, Vespa, Yamaha, Yo.
Description & sources of data
India Car Database is the first project I did using automated software to extract data from websites, as opposed to the European car databases that I have been making since 2003 by entering data manually from books and magazines.
I made it in August 2015, after finishing a new phase of the European databases. I looked online for web scraping software and, after a week of learning and experiments, I managed to grab all data from the www.carwale.com website, for both new and old cars. I was not aware that the Carwale website had just been redesigned between 14 and 18 August 2015, according to www.archive.org.
I made it out of personal interest, because many people from India were contacting me to ask if I have or can make a database for their country, and I thought that there was big potential. I was WRONG: just a small percentage of Indians are willing to pay for data.
One of the people who contacted me during that week (and purchased India Car Database afterwards) wrongly understood that I had created India Car Database “just for him” and asked me to make a 2-wheeler database too. I declined, because my interest was only in cars, but once I had mastered data scraping and reduced the effort to create a database to less than 1 day, I decided to offer web scraping as a service. In January 2016 another customer wanted bike specifications, and this was the moment I made a scraper for www.bikewale.com too.
After I did 3 updates in 8 months, the database had been purchased by 7 people, and that was enough to offer monthly updates on the 1st day of each month, since May 2016. See list of updates.
Note about update inconsistency
Between 2015 and 2017 I ran the scraper on make pages to get model URLs, then on every model URL to get version URLs, then on every version URL to get specifications. I then removed all data from the previous update and put in the new data. All cars got updated (including prices) to the current month.
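The three-level crawl described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the scraper I actually used: the function names, the in-memory FAKE_SITE and all URLs in it are made up, and real page fetching and parsing are replaced by callables passed in by the caller.

```python
from typing import Callable, Dict, List

Specs = Dict[str, str]


def crawl(
    make_urls: List[str],
    list_models: Callable[[str], List[str]],
    list_versions: Callable[[str], List[str]],
    get_specs: Callable[[str], Specs],
) -> Dict[str, Specs]:
    """Walk make pages -> model URLs -> version URLs -> specifications.

    A fresh dictionary is built on every run, which mirrors replacing
    all data from the previous update with newly scraped data.
    """
    database: Dict[str, Specs] = {}
    for make_url in make_urls:
        for model_url in list_models(make_url):
            for version_url in list_versions(model_url):
                database[version_url] = get_specs(version_url)
    return database


# Hypothetical in-memory stand-in for a website (all URLs are made up).
FAKE_SITE = {
    "/make-a/": ["/make-a/model-1/", "/make-a/model-2/"],
    "/make-a/model-1/": ["/make-a/model-1/base/", "/make-a/model-1/top/"],
    "/make-a/model-2/": ["/make-a/model-2/base/"],
}


def fake_links(url: str) -> List[str]:
    """Return the links found on a page of the fake site."""
    return FAKE_SITE.get(url, [])


def fake_specs(url: str) -> Specs:
    """Return placeholder specifications for a version page."""
    return {"url": url}


database = crawl(["/make-a/"], fake_links, fake_links, fake_specs)
print(len(database))  # number of version pages reached
```

Passing the fetch functions as parameters keeps the traversal independent of any particular website layout.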
In February 2017 the Carwale website removed (hid) the URLs leading to discontinued models. So my database contains valuable data that you cannot get yourself from Carwale anymore.
I kept updating the database by getting version URLs of new cars only, adding those URLs to the existing data, comparing the unique ID number from each URL, deleting duplicates, then scraping all version URLs (new and discontinued) to get specifications, including the current price, or the last recorded price for discontinued cars.
In November 2017 Carwale removed the unique ID from each URL, which was the ONLY way to distinguish multiple cars with exactly the same name. All car URLs were changed and redirected: in 10 cases the old version URL redirects to 404 Not Found, and in 197 cases the old version URL redirects to the wrong car (multiple old URLs redirect to the same new URL). This makes it impossible for me to re-scrape old cars for updates without risking the loss of model versions.
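The two failure cases above (dead links, and several old URLs landing on the same new URL) can be detected from a table of old URL to final URL. The sketch below is a minimal illustration, assuming the redirects have already been resolved into a dictionary where None stands for a 404; the function name and the example URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def find_redirect_problems(
    redirects: Dict[str, Optional[str]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split an old-URL -> new-URL mapping into dead and ambiguous cases.

    `redirects` maps each old URL to the URL it finally lands on,
    or to None when the old URL ends in 404 Not Found.
    """
    dead = [old for old, new in redirects.items() if new is None]

    # Group old URLs by the new URL they land on.
    by_target: Dict[str, List[str]] = defaultdict(list)
    for old, new in redirects.items():
        if new is not None:
            by_target[new].append(old)

    # A target shared by several old URLs means distinct cars were merged.
    ambiguous = {new: olds for new, olds in by_target.items() if len(olds) > 1}
    return dead, ambiguous


# Hypothetical example data (URLs are made up).
example = {
    "/car-x-101/": "/car-x/",
    "/car-x-102/": "/car-x/",   # second old car redirected to the same page
    "/car-y-103/": "/car-y/",
    "/car-z-104/": None,        # 404 Not Found
}
dead, ambiguous = find_redirect_problems(example)
print(dead)
print(ambiguous)
```

Any old URL that appears in either result cannot be safely re-scraped, because the page it now leads to is missing or may belong to a different car.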
The only way to update the database now is to run the scraper on new cars only, add the data to the New & Old cars database, then use an Excel formula to identify duplicate URLs and delete them. I assume the remaining URLs are cars launched in the last month, and I add them at the bottom of the database. I add new cars each month, but cannot update older cars anymore (for example their price, which changes often). This is not 100% reliable: if Carwale changes or corrects a model name, the change is reflected in a different URL and I will add it to the database as a new model; and if a model is discontinued and replaced by a new model with the same name and URL, the new model will not be included.
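For readers who prefer code to spreadsheets, the duplicate check can be expressed in Python instead of an Excel formula. This is a sketch of the same idea, not the formula I use; the function name and the URLs are hypothetical.

```python
from typing import Iterable, List


def find_new_urls(existing_urls: Iterable[str], scraped_urls: Iterable[str]) -> List[str]:
    """Return scraped URLs that are not yet in the database.

    Order is preserved and repeated URLs are reported only once,
    so the result can be appended at the bottom of the database.
    """
    known = set(existing_urls)
    new_urls: List[str] = []
    for url in scraped_urls:
        if url not in known:
            known.add(url)
            new_urls.append(url)
    return new_urls


# Hypothetical example (URLs are made up).
existing = ["/make-a/model-1/base/", "/make-a/model-2/base/"]
scraped = [
    "/make-a/model-1/base/",     # already known, skipped
    "/make-a/model-3/base/",     # treated as launched since the last update
    "/make-a/model-two/base/",   # a renamed model also looks "new"
    "/make-a/model-3/base/",     # repeated in the scrape, reported once
]
print(find_new_urls(existing, scraped))
```

The third URL shows the weakness described above: a corrected model name produces a different URL, so a URL-only comparison cannot tell a renamed model from a genuinely new one.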
In 2019 Carwale chose to concatenate multiple specifications into a single field (such as cubic centimetres, cylinders, valves and camshaft), causing inconsistencies in my database between old and new cars. Since old cars no longer show on the Carwale website, I cannot scrape data again for ALL cars as I did in 2015-2017, so my database’s quality is at risk if Carwale continues to make changes to its website (if you purchase the “new cars only” database, don’t worry, it is consistent).
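If you need the merged engine field split back into separate columns, something like the following Python sketch may help. The exact text format of the merged field is an assumption here (the sample string is made up for illustration), so check the patterns against the real values in the SAMPLE file before relying on them.

```python
import re
from typing import Dict, Optional

# Assumed patterns for a merged engine field; adjust to the real data.
PATTERNS = {
    "displacement_cc": r"(\d+)\s*cc",
    "cylinders": r"(\d+)\s*cylinders?",
    "valves_per_cylinder": r"(\d+)\s*valves?",
    "camshaft": r"\b(DOHC|SOHC)\b",
}


def split_engine_field(text: str) -> Dict[str, Optional[str]]:
    """Extract individual values from a merged engine description.

    Values that cannot be found are returned as None rather than guessed.
    """
    result: Dict[str, Optional[str]] = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[name] = match.group(1) if match else None
    return result


# Hypothetical merged value, not copied from Carwale.
sample = "1197 cc, 4 Cylinders Inline, 4 Valves/Cylinder, DOHC"
print(split_engine_field(sample))
```

Returning None for missing parts keeps partially filled rows usable, which matters because not every car lists all four values.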
In April 2019 I made a new scraper for Bikewale, adding individual versions to the Indian bikes database (in previous editions, if a bike had multiple versions, the database contained only the base version).
Car data fields included
Naming: ID, Make, Model, Version, Status 100%.
Price: Production cars 30.65%, Discontinued cars (last recorded price) 69.35%.
Body: Length (mm) 99.70%, Width (mm) 99.70%, Height (mm) 99.67%, Wheelbase (mm) 99.54%, Ground clearance (mm) 59.27%, Kerb weight (kg) 61.11%, Bootspace (litres) 48.75%, No of doors 99.54%, Seating capacity 99.57%, No of seating rows 72.36%.
Engine: Displacement (cc) 99.21%, Max power (bhp) 99.43%, Max power (rpm) 99.21%, Max torque (Nm) 99.43%, Max torque (rpm) 99.73%, Transmission type 99.78%, No of gears 97.15%, Drivetrain 86.74%, Engine type 87.39%, Cylinders 72.17%, Bore x Stroke (mm) 13.18%, Compression ratio 9.16%, Valves per cylinder 69.73%, Dual clutch 60.92%, Sport mode 61.74%, Fuel system 31.33%, Turbocharger/supercharger 50.60%, Turbocharge type 50.16%, Driving modes 51.01%, Manual shifting for automatic 50.03%, Engine start-stop 49.78%.
Fuel: Fuel type 99.67%, Alternate fuel type 63.53%, Mileage (kmpl) 87.31%, Fuel tank capacity (litres) 96.30%.
Drivetrain: Suspension front 91.30%, Suspension rear 90.68%, Brake type front 98.48%, Brake type rear 98.23%, Steering type 50.87%, Turning radius (m) 80.14%, Wheels 50.95%, Spare wheel 68.53%, Tyres front 71.28%, Tyres rear 71.22%.
Others: Colour names 93.89%, Colour RGB 93.89%, Image URL 87.85% (you can use Tab Save extension for Chrome to download image files).
Features: 131 columns, see the SAMPLE file. I do not list them here, to avoid overloading the page with too much text.
Bonus: Car class, Body style 100.00% (these 2 columns are NOT sourced from Carwale, but added manually from my personal experience, available ONLY in new+old cars package).
Percentages as of 1 January 2017 (3680 cars).
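The fill percentages listed for the body, engine and other fields are simply the share of cars with a non-empty value in each column. A minimal Python sketch for computing them from the CSV format is shown below; the function name is hypothetical, and the tiny CSV text uses placeholder rows, not real data from the database.

```python
import csv
import io
from typing import Dict


def fill_rates(csv_text: str) -> Dict[str, float]:
    """Return the percentage of non-empty values for every column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            # Treat missing or whitespace-only cells as empty.
            if (row.get(name) or "").strip():
                counts[name] += 1
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


# Placeholder rows for illustration only (values are made up).
sample_csv = """Make,Model,Kerb weight (kg)
Make A,Model 1,900
Make A,Model 2,
Make B,Model 3,
Make B,Model 4,1100
"""
print(fill_rates(sample_csv))
```

To run it on a downloaded sample, read the file into a string first and pass that string to fill_rates.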
Note: Car class and Body style are NOT available in the new cars only package, because it would take a lot of time to re-add them for 1200 cars at every monthly update, so I add them only in the new+old cars package, for the cars added each month (about 20-50 cars per monthly update). Cars launched after June 2019 have some engine columns merged into 1 column due to changes in the Carwale website.
Car engine codes database
Database made at the request of someone who wanted production years for every model and engine, plus engine codes, and posted here for other people who may be interested in the same thing.
Download FREE samples:
India Car Models Engines Database SAMPLE.xls
Buy FULL database:
Trucks and buses databases
Source of data: CarDekho.com. I made these 2 databases in September 2016 at the request of a customer who just wasted my time and never purchased them. I made the first sale in January 2018, so I updated them for the first time then. I updated them 1 more time in April 2018, then in August 2018 I noticed that CarDekho had made each version URL redirect to the main model URL, effectively making it impossible for me to scrape specifications of versions other than the base version. Poor sales made me abandon them.
Sometime in late 2019 or early 2020 CarDekho changed its code again, making scraping specifications by version feasible again. I updated the databases again in May 2020. Due to low sales volumes, I will do updates on request rather than monthly as in the case of India Car Database.
Buy FULL database + FREE updates when someone asks for an update:
Bikes and cars DEALERS database
Several people asked me to scrape dealer information from Carwale and Bikewale. Here is the database containing dealer name, street address, email and phone number.
Buy FULL database + FREE updates when someone asks for an update:
DO NOT ask me about car owners database!
A number of people have trouble understanding what I am selling, or don’t bother to check the samples (a database of car MODELS with specifications and features), and ask me outright to sell them a database of car OWNERS with registration number, name, address, profession, phone, email, insurance expiry date, etc. Strangely, I do not get such questions from Europe and America, but ONLY from India.
I DO NOT have registration / owner data, and the companies that do have it (car dealerships and insurance companies) must follow personal data protection laws and DO NOT share their customers’ data with third parties.
If you do a Google search for “car owners database” you see at least 10 sites selling personal data illegally, all of them from India, the only country in the world where people have no respect for personal data and email/SMS spamming is a national sport. But I am skeptical about how real this data is, how it was obtained and how up to date it is, considering that the vehicle registration authority does not keep records of emails and phone numbers, but only of residence addresses. Furthermore, most drivers do not even use email! These may be databases of users registered on some website that was hacked, or a corrupt employee may be trying to make extra money by selling an internal database.
The only way to get car registration data legally and up to date is to apply at vahan.nic.in: “The Ministry has decided to offer the services to different stake holders like Banks, Insurance Companies etc on payment basis.” If you do apply, please let me know what their prices are.
News: in April 2019 I was contacted by a strange person claiming to have a database of customers from all new car dealerships across India, 1.5 million records. He spoke very bad English mixed with Hindi and we didn’t understand each other. When I asked for a SAMPLE with a few rows to see what details he has about each car buyer/owner, he said “bulk orders only” and asked me to do a bank transfer. How can I pay without knowing what I get?
I have his phone and email, and he lives in Mumbai. I am not from India so I cannot meet him. Is there anyone from Mumbai who can help me by meeting him in person, checking the database before paying, then sharing the database with me (I can pay you half of the price of the database), so that I can then sell it to other interested people?