Computer brands included
Desktop brands included: Acer, Alienware, ASUS, bluechip, DELL, Fujitsu, Hiditec, HP, iiyama, Intel, ISY, Lenovo, MEDION, MSI, Samsung, Shuttle, Zoostorm.
Laptop brands included: Acer, Alienware, ASUS, Canon, DELL, Dynabook, Fujitsu, Gigabyte, HP, Huawei, Lenovo, LG, MEDION, Microsoft, MSI, Samsung, Sony, Thomson, Toshiba.
TV brands included: Beko, Denver, Grundig, LG, Medion, Philips, Salora, Samsung, Sharp, Sony, TCL, Thomson, Toshiba.
The earliest laptop included is from 2006, but only 1178 models are included for 2006-2016, compared with 6046 models released in 2017, 16557 in 2018, 26511 in 2019, 14929 in 2020, and so on. The desktop and TV databases don’t indicate release dates, so their historical coverage cannot be determined.
Computer specifications included
The laptop database has 247 specification fields, grouped into 21 categories following the source website: Naming (7 fields), Main specs (7 fields), Design (12 fields), Processor (15 fields), Memory (9 fields), Display (16 fields), Graphics (16 fields), Storage (15 fields), Optical drive (25 fields), Networking (12 fields), Ports & interfaces (19 fields), Audio (9 fields), Camera (11 fields), Keyboard (9 fields), Performance (6 fields), Software (7 fields), Battery (8 fields), Weight and dimensions (8 fields), Power (6 fields), Security (6 fields), Others (24 fields).
The desktop database has 266 specification fields. The source website did not have any grouping, so I tried to group them myself in a similar fashion to the laptop database.
Both source websites may have additional specification fields that are available for just a small number of laptops and desktops. I will check them more thoroughly when I am less busy and add them the next time I update the databases.
This is the difficulty: while each phone model has fixed specifications, and websites like GSM Arena list all phone models from the most popular manufacturers from 1994 to the present, laptop manufacturers produce many different specifications under the same case and allow you to customize them further before ordering. Laptops can also be customized aftermarket by anyone.
For example, I was looking for an older-generation laptop that supports Windows XP but is not too old to be powerful and has a Full HD screen. I realized that the HP EliteBook series was among the last laptops with Windows XP support, up to the xx70 generation launched in 2012. While looking on OLX I found multiple sellers and had to ask each one for specifications. I bought an HP EliteBook 8770w, which can be found with 3 screen options (1600×900 TN, 1920×1080 TN, 1920×1080 DreamColor IPS), multiple processor options (i5 and i7), RAM from 8 to 32 GB, and a default HDD of 500 GB, although the one I bought had a 1 TB HDD, and so on. There are probably a few hundred different combinations of components under the SAME name: EliteBook 8770w.
I could not find a reliable and complete website to source data from, and I was afraid that regardless of where I sourced laptop data, the resulting database risked having poor accuracy (not matching the specifications of laptops available for sale on other websites).
For a moment I wondered whether it would be more useful to make a database of individual computer components rather than whole computers.
Examples (made in 2019):
In November 2020 a customer sent me this message:
Offline Message left on 26 Nov 2020, 05:21 AM (GMT+0)
Hi I would like to scrap desktop and laptop data from https://www.laptoparena.net/ and https://desktopfind.com/ Do let me if thats possible and the price . Thanks
On 8 December 2020 I published a desktop database with 266 columns and a laptop database with 247 columns. I opened a few laptops with many data fields, wrote XPath expressions for all of them, then ran the scraper on all laptops and used an Excel formula to find the products with the most data fields and check them for possible additional data fields. I ran the scraper 3 times. I cannot guarantee that I included all possible fields, as I had already spent 3 days on this project and had many other projects on my to-do list.
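The "find the products with the most data fields" step can also be done outside Excel. Below is a minimal Python sketch, not the scraper actually used: the markup in PAGES is a hypothetical, simplified stand-in for the source pages (real HTML is rarely well-formed XML and would need a proper HTML parser), and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real source pages are not reproduced here.
PAGES = {
    "Laptop A": "<table><tr><th>CPU</th><td>i5</td></tr>"
                "<tr><th>RAM</th><td></td></tr></table>",
    "Laptop B": "<table><tr><th>CPU</th><td>i7</td></tr>"
                "<tr><th>RAM</th><td>16 GB</td></tr>"
                "<tr><th>Webcam</th><td>720p</td></tr></table>",
}

def extract_specs(markup):
    """Return {label: value} for every table row, using a simple XPath query."""
    root = ET.fromstring(markup)
    specs = {}
    for row in root.findall(".//tr"):
        label = (row.findtext("th") or "").strip()
        value = (row.findtext("td") or "").strip()
        if label:
            specs[label] = value
    return specs

def rank_by_filled_fields(pages):
    """Sort products by number of non-empty fields, most complete first."""
    counts = {
        name: sum(1 for value in extract_specs(markup).values() if value)
        for name, markup in pages.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_filled_fields(PAGES)
for name, filled in ranking:
    print(name, filled)
```

The products at the top of the ranking are the ones worth opening by hand to look for labels that do not yet have an XPath expression.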
In March 2021 a friend gave me an idea for combining an Excel formula with scraping, which helped me find missing data fields faster. I realized that for desktops there are at least 580 unique field names, and for TVs 657 unique field names. For laptops I could not count them because Excel crashes due to the file size.
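Counting unique field names does not need a spreadsheet at all, which avoids the file size problem. The sketch below assumes the scraper can dump one line per (product, field, value) triple into a CSV file. That file layout, the column names and the sample rows are assumptions for illustration, not the real export.

```python
import csv
import io
from collections import Counter

# Hypothetical scraper output: one line per (product, field, value) triple.
SAMPLE = """product,field,value
EliteBook 8770w,Processor family,Intel Core i7
EliteBook 8770w,Display resolution,1920 x 1080
Aspire 5,Processor family,Intel Core i5
Aspire 5,Fingerprint reader,Yes
"""

def field_usage(lines):
    """Count how many rows use each field label, reading one row at a time."""
    usage = Counter()
    for row in csv.DictReader(lines):
        usage[row["field"].strip()] += 1
    return usage

# With a real file, pass open("laptops_long.csv", newline="") instead of StringIO.
usage = field_usage(io.StringIO(SAMPLE))
print(len(usage), "unique field names")
for label, rows in usage.most_common():
    print(label, rows)
```

Because rows are read one at a time, memory use depends on the number of distinct labels rather than on the size of the file.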
What I can do now:
1. change the database format from 1 row per product to 1 row per specification (about 100 rows per product on average), so all specifications are shown in only 2 columns: field label and value
2. keep manually adding columns (XPath) for each available specification (this takes the most effort, especially to make sure columns are displayed in the same order as on the source website, and to maintain that order at each future update)
3. pay an extra $100-200 to my programmer partner (for each website: laptops, desktops, TVs) to make a custom scraper that automatically identifies the available specifications and creates a column for each (this means columns risk being shifted each time I re-run the scraper in the future to update the database; would this be a problem for you?)
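To make the trade-off in the list above more concrete, here is a minimal Python sketch of the label/value layout and of one possible way to keep automatically discovered columns from shifting between runs: keep the previous column order and append new labels at the end. The function names and sample data are hypothetical, and this is not the scraper that was actually built.

```python
def to_long_rows(products):
    """Label/value layout: one (product, label, value) row per specification."""
    return [
        (name, label, value)
        for name, specs in products.items()
        for label, value in specs.items()
    ]

def merge_columns(previous_columns, products):
    """Keep earlier columns in place and append newly seen labels at the end."""
    columns = list(previous_columns)
    known = set(columns)
    for specs in products.values():
        for label in specs:
            if label not in known:
                known.add(label)
                columns.append(label)
    return columns

def to_wide_rows(products, columns):
    """One row per product; specifications a product lacks are left empty."""
    return [
        [name] + [specs.get(column, "") for column in columns]
        for name, specs in products.items()
    ]

# Hypothetical results of two scraper runs.
first_run = {"Laptop A": {"CPU": "i5", "RAM": "8 GB"}}
second_run = {
    "Laptop A": {"Webcam": "720p", "CPU": "i5", "RAM": "8 GB"},
    "Laptop B": {"CPU": "i7", "GPU": "Quadro"},
}

columns_v1 = merge_columns([], first_run)
columns_v2 = merge_columns(columns_v1, second_run)
print(columns_v2)
print(to_wide_rows(second_run, columns_v2))
print(to_long_rows(first_run))
```

With this approach existing columns never move, but new columns appear in discovery order rather than in the order used by the source website, so it does not replace manual ordering if that order matters.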
I chose option 3 for the moment, with the rest of the fields to be gradually added in future updates:
April 2021: 8296 desktops (310 columns), 65961 laptops (289 columns), 4401 TVs (168 columns).
25 June 2021: 8715 desktops (356 columns), 71478 laptops (289 columns).
13 January 2022: 78709 laptops (355 columns), 6337 TVs. The desktop database cannot be updated right now because the source website is offline.
20 January 2022: 6373 TVs (190 columns).
1 May 2022: 83761 laptops (355 columns), 6429 TVs (190 columns).
13 September 2022: 89661 laptops (355 columns), 6428 TVs (190 columns).
2 August 2023: 102005 laptops (373 columns), 6429 TVs (193 columns).