
– Index of databases –

Here is the list of 70+ database projects I have made: some from personal interest, distributed freely (small ones) or sold for professional use (big ones), and others made at the request of one or more customers and published on the website so that other customers can purchase them too.

Not included in the list below are databases created as one-time projects for single customers outside my fields of interest, a few of them made under non-disclosure agreements.

I offer web scraping services, making custom databases according to your requirements.

Project | File type | Size (KB) | Pages / rows | How was made | Date made / maintained
World geography
Solar System (Word version, DELETED) | DOC | | | manual | 2000 ?
Solar System (Excel version) | XLS | 45 | | manual | 2014
World countries & facts (Word, based on old atlas, DELETED) | DOC | | | manual | 1998-2000
World countries & facts (based on Encarta 2002) | XLS | 81 | ~200 rows | manual | 2004
World countries & facts (based on The World Factbook) | XLSX | 2280 | 268 rows | scraping | 2017
World cities population (original version, DELETED) | DOC | 727 | 130 pages | manual | 2003-2005
World cities population (Word simple) | DOC | 477 | 50 pages | manual | 2016-present
World cities population (Word detailed) | DOC | 781 | 150 pages | manual | 2016-present
World cities population (Excel detailed) | XLS | 1075 | 7500+ rows | manual | 2016-present
World tallest buildings database | XLS | 5425 | 15000+ rows | scraping | 2015
United States all buildings database | XLS | 91627 | 160000+ rows | scraping | 2016
Singapore real estate
Database of HDB Blocks | XLS | 6000+ | 14000+ rows | manual | 2009-present
Database of HDB Resale Flat Prices (2008-2013) | XLS | 24307 | 160000+ rows | copy-paste | 2009-present
Database of HDB Resale Flat Prices (1990-2018) | XLSX | 60000+ | 160000+ rows | copy | 2017-present
List of BTO & DBSS projects | XLS | 103 | 300+ rows | manual | 2009-present
List of BTO prices | XLS | 183 | 700+ rows | manual | 2015-present
List of SERS sites | XLS | 49 | 80 rows | manual | 2009-present
List of HUDC estates | XLS | 64 | 24 rows | manual | 2011-present
List of Executive Condominiums | XLS | 132 | 66 rows | manual | 2011-present
Condo Database SingaporeExpats | XLS | 2981 | 3100+ rows | scraping | 2015-present
Condo Database PropertyGuru | XLS | 2334 | 3200+ rows | scraping | 2016-present
Managing Agent Database | XLS | 1185 | 4200+ rows | scraping | 2017-present
Hong Kong real estate
Hong Kong Public & Private Housing Estates | XLS | 592 | 1000+ rows | manual | 2011
Hong Kong Housing Database Estates (Centadata) | XLS | 806 | 3200+ rows | scraping | 2016-present
Hong Kong Housing Database Buildings (Centadata) | XLS | 4093 | 24370 rows | scraping | 2018-present
Hong Kong Housing Authority | XLS | 420 | 476 rows | scraping | 2018-present
Romania geography – download from www.teoalida.ro/geografie
Geografia Romaniei (geographic features) | DOC | 198 | 23 pages | manual | 1998-2006
Drumuri nationale (list of roads) | DOC | 72 | 6 pages | manual | 1999-2006
Cai ferate linii (list of railways) | DOC | 74 | 6 pages | manual | 1999-2006
Cai ferate linii & statii (list of railways with stations) | DOC | 347 | 51 pages | manual | 2000-2006
Impartirea administrativa interbelica (1925-1940) (interwar administrative division) | DOC | 166 | 11 pages | manual | 2006 ?
Impartirea administrativa in regiuni (1950-1968) (administrative division into regions) | DOC | 81 | 8 pages | manual | 1999-2006
Impartirea administrativa actuala (1968-prezent) (current administrative division, 1968-present) | DOC | 105 | 10 pages | manual | 1998-2006
Populatia judetelor si oraselor (city population) | XLS | 213 | 320 rows | manual | 2004-2006
Automobile research – download from www.teoalida.com/cardatabase
Car Models List | DOC | 403 | 59 pages | manual | 2003-2015
Car Models List | XLS | 1545 | 4500+ rows | manual | 2012-present
Car Nameplates List | XLS | 524 | 3000+ rows | manual | 2016-present
Car Models Timeline | XLS | 584 | | manual | 2003-2015
Car Models Encyclopedia | DOC | 1680 | 360 pages | manual | 2005-2013
Car Models Database | XLS | 1346 | 3500+ rows | manual | 2005-present
Car Models & Engines Database | XLS | 13532 | 18000+ rows | manual | 2003-present
American Year-Make-Model | XLS | 1567 | 14000+ rows | manual | 2013-present
American Year-Make-Model-Trim-Specs | XLS | 74720 | 50000+ rows | scraping | 2014-present
German Car Database | XLS | 250178 | 100000+ rows | scraping | 2015-present
Indian Car Database | XLS | 17152 | 3000+ rows | scraping | 2015-present
Middle East Car Database | XLS | | 10000+ rows | scraping | 2016-present
Australian Car Database | XLS | 96467 | 90000+ rows | scraping | 2017-present
Motorcycles Database | XLS | 24569 | 30000+ rows | scraping | 2016-present
AutoKatalog Statistics | XLS | 212 | | manual | 2012-present
Automobile Production | XLS | 130 | | copy-paste | 2012-present
Automobile Sales Figures | XLS | 833 | | scraping | 2017-present
Computer & technology
Screen size calculator | XLS | 50.5 | | manual | 2014
Screen resolution statistics | XLS | 326 | | copy-paste | 2014-present
Screen resolution statistics by country | XLS | 2380 | | copy-paste | 2014-present
Mobile Phones Database | XLS | 8968 | 8000+ rows | scraping | 2016-present
Digital Cameras Database | XLS | 2778 | 3000+ rows | scraping | 2017-present
Gaming stuff – see www.teoalida.com/games
Age of Empires | DOC | 208 | | manual | 2001 ?
Age of Empires (base game) (table of military units) | XLS | 106 | | manual | remade 2007?
Age of Empires: The Rise of Rome (table of military units) | XLS | 112 | | manual | remade 2007?
Beetle Crazy Cup (vehicle stats) | XLS | 71 | | manual | 2002?
GTA Vice City (mission tree, vehicle stats) | XLS | 117 | | manual | 2005
GTA San Andreas (mission tree, vehicle stats) | XLS | 134 | | manual | 2007
Commander Keen 4 – Second game I ever played! (statistics of levels and items) | XLS | 117 | | manual | ~2000? DOC, remade 2007
Midtown Madness 1 (vehicle stats) | XLS | 40 | | manual | 2003
Midtown Madness 2 (vehicle stats) | XLS | 35 | | manual | 2005
Need For Speed: Porsche Unleashed (vehicle stats, personal records on every track) | XLS | 217 | | manual | 2003-2006
Need For Speed: Hot Pursuit 2 (career levels, vehicle top speed stats) | XLS | 58 | | manual | 2003
Need For Speed: Underground (career events, vehicle performance, tuning items) | XLS | 161 | | manual | 2006
Quake 3 maps statistics (not yet public due to messy work) | XLS | ? | | manual | 2000-2007
The Sims 2/3 List of houses made by me | XLS | 24 | | manual | 2013
The Sims 2 Career Tracks | XLS | 219 | | copy-paste | 2012
The Sims 2 List of Items | XLS | 216 | | manual | 2012
The Sims 3 Worlds and List of Lots | XLS | 398 | | manual | 2013
Miscellaneous works
Music Database | XLS | 2200+ | 6000+ rows | manual | 2005-present
Stadiums Database | XLS | | 3000+ rows | scraping | 2018

Since childhood I have loved writing books, doing research, and making databases and statistics. I started using computers in 1997, when my dad taught me to use Word, but around 2003 I started using Excel more than Word. I analyze data and make tables and charts about everything I encounter in my life! For example, in racing games I measure the speed of each car, write the numbers in an Excel spreadsheet, then make a chart.

Early works were pure hobbies, made from personal interest, with no plans to commercialize them.

The Hong Kong Housing Database was also made from personal interest in analyzing public housing estates. A few months later, an insurance company asked me if I could expand it to private housing estates. It became my FIRST large project made for a customer.

The Car Database also started as a hobby for my personal use, but it unexpectedly turned into a business after 2012, when I realized that many companies in the automotive industry pay big $$ for a complete, accurate, and frequently updated database. I went through an extensive transformation to make my hobby databases suitable for professional use.

Seeing the success of the car database, I decided to transform other hobby databases and sell them for professional use, such as the HDB Database and World Cities Database.

In 2015 I learned web scraping. Scraping usually means running software that visits a list of given pages, extracts specific data and puts it in a database automatically. This allows me to create very large databases with little effort, spending ~30 min writing code and leaving the scraping software to run in the background for a few hours or days.

Between 2015 and 2017 I created over 50 databases via web scraping, some from personal interest to sell to multiple people (India, Middle East and Australia car databases, mobile phones database, etc.) and others for single customers who requested them (web scraping services).

Keeping all databases regularly updated is a huge workload. Scraping more websites, if they take too much time, creates additional workload and delays everyone’s updates. In 2018 I decided to STOP updating databases with fewer than 5 sales per year so I can focus on the ~20 best-selling databases that produce 80% of my income.

So… unless you come with a GREAT IDEA for a database that can be sold to multiple customers, I reserve the right NOT to do your web scraping project if it takes more than ~2 hours of manual work and more than ~50 hours of running the scraper in the background.

Some projects (for example the Database of HDB Resale Flat Prices) involve copying the data from a source website and pasting it in Excel, then making a few visual adjustments to make it beautiful, so it takes just a few hours to create a 20 MB Excel database.

Other projects (for example the Database of HDB Blocks) involve compiling data from various sources and manual data entry in Excel, taking hundreds of hours of work!

All my projects are made primarily for viewing in Excel, but some (especially the Car Database) are often used by professionals who convert the Excel spreadsheet to CSV or MySQL for use in web design and mobile app development.
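
As a rough illustration of that conversion step, here is a minimal Python sketch that loads a CSV export of a spreadsheet into a SQL table. It uses SQLite from the standard library as a stand-in for MySQL, and the column names and sample rows are made up for illustration, not taken from the real Car Database.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export of a spreadsheet (columns and rows are made up).
SAMPLE_CSV = """Make,Model,Year
Dacia,Logan,2004
Volkswagen,Golf,1974
"""

def load_csv_into_table(conn, csv_text, table):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_csv_into_table(conn, SAMPLE_CSV, "cars")
rows = conn.execute(
    'SELECT "Make", "Model" FROM cars WHERE "Make" = ?', ("Dacia",)
).fetchall()
print(rows)
```

Every column is stored as text here to keep the sketch short; a real import would normally give numeric columns a numeric type.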

Working style

My parents encouraged me to work in Microsoft Word, and told me to finish and print the work because “what is not finished has ZERO value”. I never understood why they wanted to print… some works, for example the Car Database, should NOT be printed and cannot be “finished”, because they need to be updated constantly with newly launched cars. They promised to help me publish a book… but this never happened (it is possible that they encouraged me just to give me a solitary occupation at the computer so I would not disturb them, instead of letting me have a social life).

My dad even set rules for how a book should be written: Arial font, 12pt body text, 16-20pt titles, all titles centered, bold and underlined. However, since my writings were not really books but lists of… something, the rules imposed by dad created excessive bold and centered text.

Around 2001-2003 I got fascinated by Notepad and its fixed-width font, and I used full-page-width lines of --- and === signs to enhance titles.

In 2003 I broke away from dad’s rules and started using Excel more than Word. New works in Word were optimized for on-screen display instead of printing, often using non-standard page sizes to fit exactly 1 page for each subject. I write with 10pt font for body text, 20pt and 15pt for titles, 10pt and 5pt for empty spaces.

Since 2003, in both Word and Excel works, the titles have been white text on a blue background across the full page width, which does not look very good on paper. I used the same style when I created my website in 2009.

Since 2010, one of the distinctive features of all my Excel works has been the coloured columns; older works were recoloured to this format too. I combine the Excel databases with my graphic design hobby, making Excel files artistic too!

In 2015 I changed the standard for Word files, removing the full-width coloured title backgrounds and using full-width horizontal lines instead (similar to what I was doing in 2001-2003 in Notepad). This is better for printing (even if I assume that nobody will print my works). I also changed my website design to this format.

One of my friends said that my website looks like it was made by an expert in typography rather than by a web designer!

Example of styling in my books: 2012, 2013, 2015 editions of Car Models Encyclopedia
  

I also ran a kind of competition between my works, a race to create the biggest database in Excel or the biggest book in Word, in terms of pages and file size, under certain standards (10pt font, no duplicate stuff, no large open spaces, NO bullshit but useful content, etc.). I kept track of file sizes in an Excel table similar to the table above.

Mobile phones database

If you are looking for a database of mobile phone specifications in Excel format to create a website, use in a GSM shop or anything similar, I have created an Excel database for you, using scraping software to extract data from www.gsmarena.com.

Download SAMPLE: Mobile Phones Database.xls.
The LITE version includes 7 columns: Brand, Phone, Notes, Image URL, Technology, Announced, Status. Instant download (example email).

Buy FULL database + 1 year of FREE monthly updates:

Buy & download

DO NOT ask for a phone numbers database. I do not support telemarketing / SMS spamming. Please check the sample Excel file above and watch the video below to understand what I sell before wasting my time asking for things that I don’t sell (like this guy).

Looking for digital cameras, laptops, TVs, and other electronics? Suggest websites from which I can extract data and create databases!

Mobile phone database coverage

The mobile phone specification database includes classic phones and smartphones, tablets and smartwatches, from the most popular phone brands in the western world. Some brands may be missing, especially from China, which has a lot of brands totally unknown outside their domestic market.

The earliest phones included are Ericsson models launched in 1994, but I guess that some early phone models are missing, as the number of gadgets launched per year is significant only after 2003. See changelog.

114 mobile phone brands included

Acer, alcatel, Allview, Amazon, Amoi, Apple, Archos, Asus, AT&T, Benefon, BenQ, BenQ-Siemens, Bird, BlackBerry, Blackview, BLU, Bosch, BQ, Casio, Cat, Celkon, Chea, Coolpad, Dell, Emporia, Energizer, Ericsson, Eten, Fujitsu Siemens, Garmin-Asus, Gigabyte, Gionee, Google, Haier, Honor, HP, HTC, Huawei, Icemobile, i-mate, i-mobile, Infinix, Innostream, iNQ, Intex, Jolla, Karbonn, Kyocera, Lava, LeEco, Lenovo, LG, Maxon, Maxwest, Meizu, Micromax, Microsoft, Mitac, Mitsubishi, Modu, Motorola, MWg, NEC, Neonode, NIU, Nokia, Nvidia, O2, OnePlus, Oppo, Orange, Palm, Panasonic, Pantech, Parla, Philips, Plum, Posh, Prestigio, QMobile, Qtek, Razer, Realme, Sagem, Samsung, Sendo, Sewon, Sharp, Siemens, Sonim, Sony, Sony Ericsson, Spice, TECNO, Tel.Me., Telit, Thuraya, T-Mobile, Toshiba, Unnecto, Vertu, verykool, vivo, VK Mobile, Vodafone, Wiko, WND, XCute, Xiaomi, XOLO, Yezz, Yota, YU, ZTE.

Mobile phone specifications included

The database includes 85 columns, with the following completion percentages (as of the 1 August 2019 update):

Naming: ID 100.00%, Brand 100.00%, Phone 100.00%, Notes 34.57%;

Images: Picture URL small 100.00%, Picture URL big 80.02%, Fans 100.00%, Hits 100.00%, Hits % 100.00%;

Network: Technology 100.00%, 2G bands 100.00%, 3G bands 60.73%, 4G bands 26.41%, 5G bands 5.73%, Speed 61.01%, GPRS 41.49%, Edge 41.57%;

Launch: Announced 99.84%, Status 100.00%;

Body: Dimensions 99.78%, Weight 99.06%, Build 7.72%, Keyboard 7.26%, SIM 99.99%, Other 12.49%;

Display: Type 99.96%, Size 87.46%, Resolution 87.46%, Multitouch 0.00%, Protection 18.62%, Other 29.37%;

Platform: OS 61.37%, Chipset 48.45%, CPU 60.66%, GPU 47.28%;

Memory: Card slot 100.00%, Phonebook 37.94%, Internal 81.92%, Call records 37.18%;

Main Camera: Single 81.98%, Dual 4.30%, Triple 0.94%, Four 0.15%, Features 52.61%, Video 86.65%;

Selfie Camera: Single 51.74%, Dual 0.79%, Triple 0.02%, Four 0.00%, Features 9.09%, Video 12.34%;

Sound: Alert types 25.37%, Loudspeaker 100.00%, 3.5mm jack 100.00%, unnamed 26.50%;

Comms: WLAN 99.97%, Bluetooth 99.98%, GPS 99.69%, NFC 11.40%, Radio 99.09%, USB 90.66%;

Features: Sensors 58.04%, Messaging 38.62%, Browser 37.95%, Clock 5.02%, Alarm 5.02%, Games 37.96%, Languages 2.86%, Java 38.06%, Other 58.36%;

Battery: Battery 99.97%, Charging 8.85%, Stand by 70.40%, Talk time 73.43%, Music play 7.76%;

Misc: Colors 93.77%, SAR US 21.46%, SAR EU 25.00%, Price group 60.42%;

Tests: Performance 60.42%, Display 6.49%, Camera 7.95%, Loudspeaker 9.37%, Audio quality 8.54%, Battery life 6.26%.
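
A completion percentage here means the share of rows in which a column is not empty. A minimal Python sketch of how such figures could be computed is shown below; the sample rows are made up, and the real figures above were not necessarily produced this way.

```python
def completion_percentages(rows):
    """Return {column: percent of rows with a non-empty value}."""
    if not rows:
        return {}
    filled = {column: 0 for column in rows[0]}
    for row in rows:
        for column in filled:
            value = row.get(column)
            # Count a cell as filled only if it holds something besides spaces.
            if value is not None and str(value).strip():
                filled[column] += 1
    return {c: round(100.0 * n / len(rows), 2) for c, n in filled.items()}

# Made-up sample rows, not taken from the real database.
phones = [
    {"Brand": "Nokia", "Phone": "3310", "NFC": ""},
    {"Brand": "Apple", "Phone": "iPhone 7", "NFC": "Yes"},
    {"Brand": "Samsung", "Phone": "Galaxy S8", "NFC": "Yes"},
    {"Brand": "Ericsson", "Phone": "GH337", "NFC": ""},
]
percentages = completion_percentages(phones)
print(percentages)
```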

Indian mobile phones

This page has been getting significant traffic from India, with many people contacting me but declining to buy the worldwide mobile phones database above and asking if I can create another mobile phones database with only the models available in the Indian market. Job done!

I sourced the data from 91mobiles.com. This website has 10 categories of products, such as Tablets, Cameras, TVs, Home Theaters, Smartwatches, Washing machines, Air conditioners, Refrigerators and Microwave ovens, but I do not intend to scrape all categories and create a database for each, because it would take a lot of time to update each of them regularly compared with their sales volume. They would be products sellable only in India, where customers pay little. People from the western world usually contact me only if they are able to make a purchase, while many people from India contact me regardless of whether they are willing to pay, and just a few leads actually convert to sales.

Download SAMPLE: India Mobile Phones Database.xls.

Buy FULL database + 1 year of FREE updates:

Buy & download

How the Excel phone database can be used

Filter the mobile phone database and get a list of mobile phones having specific features.
Filter the mobile phone table by year and make list of features introduced in each year of history.
Analyze the data and make statistics on the best phones in each price range.
Convert Excel spreadsheet to CSV or MySQL and create your own phone comparison website or mobile app.
Etc…
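
The first two uses in the list above can be sketched in a few lines of Python, working on a CSV export of the spreadsheet. The sample data is made up, and the sketch assumes the Announced column holds a plain year, which may not match the real file.

```python
import csv
import io

# Made-up CSV export; the real file has 85 columns.
SAMPLE = """Brand,Phone,Announced,NFC
Nokia,3310,2000,
Apple,iPhone 7,2016,Yes
Samsung,Galaxy S8,2017,Yes
"""

def filter_phones(csv_text, **criteria):
    """Return rows whose columns equal the given values, e.g. NFC="Yes"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if all(row.get(col) == val for col, val in criteria.items())]

def phones_by_year(csv_text):
    """Group phone names by the value of the Announced column."""
    by_year = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_year.setdefault(row["Announced"], []).append(row["Phone"])
    return by_year

nfc_phones = [row["Phone"] for row in filter_phones(SAMPLE, NFC="Yes")]
print(nfc_phones)
print(phones_by_year(SAMPLE))
```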

Actually I do not recommend using this database for making websites. Feel free to use it for research and analysis purposes. I am NOT responsible for any copyright troubles you may face if you use their data commercially, for making your own website, etc. This is NOT an original database “Made by Teoalida”, but rather a database showcasing the abilities of data scraping, and the data belongs to www.gsmarena.com. Please consider the price a fee for the service of scraping data from the website, not a payment to the author of the data.


Car database

My hobby for cars started in 1999, but only in 2003 did I decide to start making an Excel database of all cars. The research was done independently of the internet world (I connected to the internet in 2005), sourcing data from AutoKatalog books (a German publication), making an original compilation that you cannot find anywhere else online (except on websites that purchased the database from me).

I published the car databases on my website only in 2011, intending to share my research with other car buyers, hobbyists, car experts, etc., without expecting to be visited by various companies (auto insurance, auto parts shops, car shipping services, etc.), programmers, web designers and mobile app developers, or that I could make a business from this! Most of these visitors have zero experience with cars and often make mistakes such as buying the wrong database, buying an American car database while they do business in Europe, or buying from other data providers selling a bad-quality database just because it is cheaper or has a higher number of model variations.

The first sale was made in May 2012. I had to make some changes to make it appealing to this unexpected audience, both in data structure and in website presentation. The rising flow of customers gave me a REAL motivation to dedicate time to updating the car database constantly. Since late 2012, NO month has passed without adding or changing something, creating the MOST UPDATED car database ever found on the internet.

Besides the original European car database manually compiled from AutoKatalog books, since 2013 I have been creating a database for the American market, and seeing the success, I created additional databases for India, the Middle East and Australia, as well as real estate databases, mobile phones, etc., via web scraping.

Sales have been growing, and in 2015 I exceeded 100 databases sold, producing 80% of my income, which made me quit my job in AutoCAD and architectural design and dedicate my life to the data providing industry!

Enter Car Database sub-website


Web scraping services

I offer data mining and web scraping services. “Scraping” usually means coding a bot that visits a list of given pages, copies specific data from each page and puts it in an Excel / CSV file automatically, at a rate of a few pages per second. Watch the video!
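
To make the idea concrete, here is a minimal Python sketch of such a bot: it visits a list of URLs, pulls one value per column out of each page, and writes the rows as CSV. This is only an illustration, not the scraper actually used for these databases; the URLs, the page markup and the regular expressions are invented, and the demo reads fake in-memory pages so it runs without a network connection.

```python
import csv
import io
import re
from urllib.request import urlopen

def fetch(url):
    """Download a page as text (real network call; not used in the demo below)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def scrape(urls, patterns, fetch_page=fetch):
    """Visit each URL and extract one value per named regex pattern."""
    rows = []
    for url in urls:
        html = fetch_page(url)
        row = {"url": url}
        for column, pattern in patterns.items():
            match = re.search(pattern, html, re.S)
            row[column] = match.group(1).strip() if match else ""
        rows.append(row)
    return rows

def to_csv(rows):
    """Serialize the rows to CSV text, one line per scraped page."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Fake pages so the sketch runs offline; URLs and markup are invented.
FAKE_SITE = {
    "https://example.com/car/1": "<h1>Golf</h1><span class='hp'>110</span>",
    "https://example.com/car/2": "<h1>Logan</h1><span class='hp'>75</span>",
}
PATTERNS = {"model": r"<h1>(.*?)</h1>", "hp": r"class='hp'>(.*?)</span>"}
rows = scrape(list(FAKE_SITE), PATTERNS, fetch_page=FAKE_SITE.__getitem__)
print(to_csv(rows))
```

Regular expressions are used only to keep the sketch short; for real pages an HTML parser is usually more robust.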

If you are building a website or a mobile app, or just require specific data but cannot find it in a usable format, just give me a link to a website that has the required data, and I will make a scraper and turn the website into an Excel database, for you and for future customers.

Note: I have a LIMITED amount of time and I love creating useful databases that I can sell via the website to as many people as possible. For this reason I may reject projects that take more than a few hours if the data collected is of no use to anyone but yourself, or I may pass your project to my partners. See examples of projects done and their price.

For many years, manual data entry in Excel (sourcing from books, as seen in this video) or manual copy-pasting from websites was the only way I created databases. It was a slow process which limited the size of the databases I could make. Even with this slow process I made about 40 databases in my fields of personal interest: automobiles, geography, real estate, computers, gaming, etc., purely as a hobby.

I started web scraping in August 2015 when I found import.io (a free scraper for simple HTML websites), and in November 2015 I teamed up with a programmer to create custom scrapers for more complex websites. This allowed me to create large databases with minimum effort, in a matter of hours.

import.io turned into a paid service in April 2016 and suspended my free account. New sign-ups were limited to 500 pages/month on the free plan, and paid plan prices were increased in 2017 to $299/month to scrape up to 5000 pages. This gave my programmer the idea to develop his own “universal” scraping software in Visual Studio, comparable with the tools available online, but with no limit on the number of pages or simultaneous projects. This allows me to scrape any simple website at a lower price than you could do it yourself.

Once I mastered my scraping skills, in early 2016 I wrote this article to offer freelance scraping services. In 2016 and 2017 I was doing every project that was technically feasible… until I overloaded myself with the responsibility to provide regular updates for about 50 projects. In 2018 I abandoned databases with fewer than 5 sales per year so I can focus on the ~20 best-selling databases that produce 80% of my income.

Note: you CAN scrape yourself using tools like import.io. Some are free but slow and limited in functionality, limited to one project at a time and limited in the number of pages you can extract, unless you upgrade to a paid subscription. Although you can scrape a small number of pages yourself for free, it may take a few days to learn to use the tools efficiently. Most people do not have time to learn or cannot pay an expensive monthly subscription. I can help you!

Simple data scraping service

This applies to websites with a distinct URL for each page and all data in HTML code. Data can be extracted with our “universal” scraper; analyzing the website and writing the code that indicates what data to extract usually takes 10 min – 1 hour. Indicative prices:

  • Number of pages to be extracted: 1,000 pages = $50, 10,000 pages = $100, 100,000 pages = $300 (average speed 1 second per page; if the website is slower I may charge more). Since each website has a different number of pages, these prices are only indicative and the final price will be given on request.
  • Number of columns to be extracted = 50 cents per column.
  • Multi-level scraping = $10 per level. Many car websites require a scraper for the makes pages to get model URLs, a second scraper for the model pages to get version URLs, and a third scraper for the version pages to extract the car details, which is what you need. Infinite scrolling, pagination and entering data in search boxes also add a few $.
  • Cleaning data after scraping = extra $ if raw data from the scraper includes annoying spaces and line breaks, or unwanted characters such as a unit of measurement after the value, which need to be removed with Excel find-replace.
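
The cleaning step in the last item is done with Excel find-replace; as an alternative illustration, the same kind of cleanup can be sketched in Python. The function names are hypothetical, and the sketch assumes commas are thousands separators, which is not true for every website.

```python
import re

def clean_text(value):
    """Collapse line breaks and runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", value).strip()

def strip_unit(value):
    """Keep the leading number and drop a trailing unit, e.g. '1,450 kg' -> '1450'.

    Assumes commas are thousands separators; values that do not start
    with a number are returned as cleaned text.
    """
    match = re.match(r"\s*(-?\d[\d,]*(?:\.\d+)?)", value)
    return match.group(1).replace(",", "") if match else clean_text(value)

print(clean_text("  Ford \n  Focus  "))
print(strip_unit("1,450 kg"))
```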

You NEED to provide the website URL and I will quote a price in your preferred currency (USD, EUR, GBP, AUD, SGD, etc.). For example, for scraping Parkers.co.uk as seen in the demonstration video (1-level scraping, 101 pages to be extracted, 4 columns, no cleaning needed), I charged only €23.66, which matches the number of rows (2366 rows, i.e. 1 euro cent per row).
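
For a rough idea of how the indicative price list adds up, here is a small Python estimator. It is only a sketch under stated assumptions: the page price is looked up by tier (no interpolation between tiers), the first scraping level is assumed to be included, and the cleaning fee is whatever is agreed. Real quotes can differ a lot, as the Parkers.co.uk example shows.

```python
# Indicative tiers from the price list: (maximum pages, base price in USD).
PAGE_TIERS = [(1_000, 50), (10_000, 100), (100_000, 300)]

def estimate_price(pages, columns, levels=1, cleaning_fee=0.0):
    """Rough quote in USD; tier lookup and level handling are assumptions."""
    for max_pages, base in PAGE_TIERS:
        if pages <= max_pages:
            break
    else:
        raise ValueError("over 100,000 pages: no indicative price listed")
    extra_levels = max(levels - 1, 0)  # assumption: the first level is included
    return base + 0.50 * columns + 10 * extra_levels + cleaning_fee

print(estimate_price(pages=5_000, columns=20, levels=3))
```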


Complex data scraping service

This applies to websites that have drop-down lists, search boxes or JSON data, that require a login to access data, where selecting various items does not produce a different URL, etc. In this case online scraping tools do not work and my friend’s universal scraping software does not work either, so he needs to make a custom scraper in Visual Studio just for that particular website. This may take a few days, depending on his available time.

News: in 2019 two more people, from India and Australia, joined to take on web scraping projects that are too complex for me.

Price: usually in the $200 to $500 range, which I share with my partner. The price varies depending on the complexity of the website rather than the number of pages to be extracted.

For fewer than 200 records it may be faster to copy-paste manually than to code a custom scraper.

Complex scraping services sometimes require screenshots (as below) so that my programmer can tell the bot where to click and what data to extract.

Complex scraping service

What cannot be scraped

Theoretically I can scrape data from any website, but only websites that have the required data in a consistent structure from page to page can produce a good, usable database. An example of an inconsistent website is Wikipedia.

Some websites look simple to scrape, but after starting the job I get IP-blocked, hit a CAPTCHA page, etc.: anti-scraping features made to prevent copying data or to prevent DDoS attacks. If you ask for a price before I start the job, you should be prepared for price changes if I find anti-scraping features, such as captchas that require a human to sit at the computer all the time and solve them, the need to change IP, or the need for manual data entry, which can make the project too costly for the value of the data we can get.

Do not get angry at me if I fail to scrape data from one website; just give me another website and I may succeed with it.

I know how useful a phone or email database is, for example if you are a car insurance company wanting to email car owners who post listings on classifieds websites. But most classifieds websites protect the seller’s phone number and contact email from being scraped and spammed with unsolicited emails, by using a Contact button, by requiring a click to reveal the email, or by showing the email as an image rather than text. In this case the job can only be done via manual data entry, a job more suitable for a child than for us, busy skilled programmers.

Advantages of my service and future updates

The main advantage of working with me is that once I create a database I can post it on the website to be purchased by multiple people, and offer everyone FREE updates for one year. When a new customer pays for a database and requires an update, I run the scraper again and offer the updated database to previous customers too, free of charge (this is valid for databases in my fields of personal interest: cars worldwide, real estate of Singapore, and a few more).

But if you ask me to scrape a website “just for you” outside my fields of interest, you need to pay each time you want an update, 20-50% of the price you paid for the initial database creation. While I usually give data only, if you need frequent updates I can give you the scraper (an EXE file) to run on your own computer, at 2x-3x the price of one-time scraping.

Can you make a database for our exclusive use, and not sell to anyone else?

I like web scraping jobs because I can publish the databases on the website for other people who are interested in purchasing them too, generating a lifetime income with a small amount of time needed for regular updates.

BUT if you want to be the exclusive user of a database, I will no longer like this job. You need to pay a price equivalent to the estimated sales for one year… and answer a big question: what should I do if someone else asks me for the SAME data? It would be rude to say NO to the second customer just because a previous customer requested exclusivity (how can I “compile” it again? I would just sell the database I compiled for you). Even if I say NO, he will pay another freelancer and obtain the data I sold you exclusively anyway.

In conclusion: exclusive use is nearly impossible. I can avoid publishing a database on the website, but I reserve the right to sell it to other people if they ask for the same kind of database.

Legal issues of web scraping

Scraping data from a website is usually LEGAL, but using scraped data in another website is usually ILLEGAL.

It depends… if the data is added by volunteers, or by sellers on classifieds websites, scraping is most likely legal. But if the authors of the website worked hard to compile data from sources like car brochures or manufacturer websites, scraping is most likely illegal, especially if you use their data to make your own website or for another commercial purpose. Although the data is freely available, the compilation can be copyrighted. Most websites contain dummy data (example: a bunch of cars having +/- 1 horsepower compared with the official value) and if you use data copied from them, they can prove that you copied their data compilation and file a lawsuit against you. BEWARE!

For a moment I became concerned that my European Car Models & Engines Database sourced from AutoKatalog books might be a copyright violation, but I came to the conclusion that it is fine, because my database is an original compilation with a different data structure than the book, and it targets an online audience, while the AutoKatalog is a book sold in shops targeting car hobbyists. I make over 100 sales each year without a single person worrying about copyright.

In the case of America, Year-Make-Model is my original compilation sourced from Wikipedia and 3 more websites, while Year-Make-Model-Trim-Specs is scraped from the Edmunds.com website, which also offers an API and thus allows other websites to use its data, so again it is legal.

But since I created the India car database in 2015, sourcing data from Carwale.com, I have been concerned that what I am doing may be illegal.

Country matters: I had many customers in India asking me to scrape data from various websites. However, when someone from Europe or America asks me for certain data that I do not have and I propose scraping it from a website, some people bring up the legal issues of web scraping.

Funny case: someone offered to sell me a car database that he claimed to have created by working for 4 months, 8 hours per day, copy-pasting data from a website, with rights to resell it on my website. From a copyright point of view it does NOT matter whether you extracted data using automatic software or typed every letter manually; as long as you copied data from a website, your work is not original. He was probably not aware of scraping software. If you wasted a few months doing something that could have been done in a few hours using scraping software, you are an IDIOT (I was an idiot too, doing such jobs before 2015 without being aware of scraping software, but small jobs only). I am still doing manual entry for the European database because I source data from books (offline sources), making an original product on the web.

Example of data extraction / scraping projects done and their price

All scraping software save data in CSV format, but if I decide to publish on website, I make XLS files with borders, colors, headers and other visual features to match the style of other products “Made by Teoalida” that give impression of work done with care.

India Car Database – source: www.carwale.com – Made in August 2015 from personal interest because of numerous people asking me about indian car database. Being my first scraping project, it took initially 8 days to figure out how to use import.io and do it, once my programmer partner made own universal scraper, time required to do each update was reduced to 3 hours. Over 3000 rows and 188 columns. Sold in 3 different packages 30, 60, 120 euro depending by number of columns. During first year it has been purchased by 8 people, I also made a FREE “make & model only” package, hoping to encourage customers to make a free purchase before paying for big database, but contrary happened. Once removing the free package, number of sales increased.

India Bike Database – source: www.bikewale.com – Made in January 2016 after a second person requested a database of bikes sold in India. One of the easiest projects, having no drop-down boxes but plain links to each bike page. 250 records, price: 25 euro.

CarWale On-Road Prices – source: www.carwale.com – Made in January 2016 for a customer. A difficult project, taking about 20 hours of coding in Visual Studio to make an application that sends JavaScript requests to the CarWale website to get the price of each car in each city. We agreed on $300, of which $200 was paid to my programmer. The scraper did 2 requests per second, so 3100 cars × 510 cities = 1,581,000 requests, or about 220 hours, were needed to get all on-road prices, RTO tax and insurance. I had to keep the scraper running for a month. In early 2016 this was OK, but once more customers started to come I couldn’t do this anymore. I agreed with the customer to reduce the number of cities to 47, so scraping time was reduced to 4 days, at $50 per update. Due to GST, in July 2017 the customer said that this project was no longer required.
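The time estimate for this project is simple arithmetic and can be sketched in Python. This is only an illustration: the function name is made up here, and it counts pure request time (one request per car per city), ignoring pauses, retries and crashes, which is probably why the real runs took longer.

```python
def scrape_hours(items, cities, requests_per_second):
    """Estimate scraping time in hours, assuming one request per item per city."""
    total_requests = items * cities
    seconds = total_requests / requests_per_second
    return seconds / 3600


# Figures quoted for the CarWale project: 3100 cars, 2 requests per second.
full = scrape_hours(3100, 510, 2)     # all 510 cities
reduced = scrape_hours(3100, 47, 2)   # after reducing to 47 cities

print(f"510 cities: about {full:.0f} hours of pure request time")
print(f"47 cities: about {reduced:.0f} hours of pure request time")
```

With the quoted figures the full run comes out close to the estimate given for this project, and cutting the city list to 47 shrinks the work by roughly a factor of ten.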

Skyscrapers Buildings Database – source: www.emporis.com – Made in November 2015 out of personal interest and put up for sale for $150 (15000 buildings), it turned into a marketing failure: 1 year passed and nobody purchased it (except a customer who asked me to make the US buildings database, see below). It took about 20 hours to manually compile the list of cities with buildings over 100 meters, then the list of buildings from these cities, then I used import.io to automatically extract the details of each building. 15000+ buildings. Emporis blocks my IP for 2 days if I access more than 3000 pages in one day, so data extraction with import.io (which cannot change IP) was limited to 3000 buildings per day, taking about 1 hour daily for 6 days.

US Buildings Database – source: www.emporis.com – Made in November 2016 for a customer who, after seeing the Skyscrapers database above, asked me to make a similar database with all types of buildings in the USA. 160,000+ buildings. I had to run over 100 batches of at most 2000 buildings each, now using my partner’s universal scraper from my own computer, so I could change IP after each batch, re-running the blocked URLs again and again until I was able to get all buildings. 60 hours of work. Price: $600.
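The batch-and-retry routine described for this project can be sketched as below. This is not the actual universal scraper: `fetch` and `after_batch` are hypothetical callbacks (downloading one page, and changing IP between batches), and the demo uses a fake fetch function that "blocks" one URL on its first attempt.

```python
def scrape_in_batches(urls, fetch, batch_size=2000, max_rounds=10, after_batch=None):
    """Fetch all URLs in batches; URLs that fail are queued again for the next round."""
    results = {}
    pending = list(urls)
    for _ in range(max_rounds):
        if not pending:
            break
        failed = []
        for start in range(0, len(pending), batch_size):
            for url in pending[start:start + batch_size]:
                try:
                    results[url] = fetch(url)
                except Exception:
                    failed.append(url)  # blocked or broken, retry in the next round
            if after_batch is not None:
                after_batch()  # hypothetical hook, e.g. change IP here
        pending = failed
    return results, pending


# Demo with a fake fetch: "building/3" is blocked on the first attempt only.
attempts = {}

def fake_fetch(url):
    attempts[url] = attempts.get(url, 0) + 1
    if url.endswith("3") and attempts[url] == 1:
        raise RuntimeError("blocked")
    return f"data for {url}"

urls = [f"building/{i}" for i in range(10)]
results, unresolved = scrape_in_batches(urls, fake_fetch, batch_size=4)
print(len(results), "fetched,", len(unresolved), "unresolved")
```

The `max_rounds` limit is there so that a URL which is permanently broken does not keep the loop running forever; whatever is still failing at the end is returned for manual checking.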

Singapore Condo Database – source: www.singaporeexpats.com – Made for a customer in 2015. It took 3 hours, and I sold the database with 2809 condos for $140.50 SGD. Within one year several other people purchased it.

Singapore Condo Database II – source: www.propertyguru.com.sg – Made for a customer in 2016. Apparently an easy project, having plain links to all condos, it turned out to be impossible with import.io because of a damn CAPTCHA appearing randomly after 10-50 extracted pages. My programmer spent 2 weekends in Visual Studio making a custom scraper that allows me to input the CAPTCHA when needed and charged me $300 USD. I sold the database with 3176 condos for $317.60 SGD (about 240 USD), which left me at a loss, but profit came once other customers purchased it.

World countries database – source: The World Factbook – Made in 2017 out of personal interest, a database with an impressive 362 columns and only 268 rows. It took about 5 hours to write the XPath expressions for each column, and only 35 minutes to scrape the data.
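To show what "one XPath per column" means, here is a minimal sketch using Python's standard library. The page markup, the column names, the paths and the values are all invented for illustration and are not the real World Factbook structure; a real page is HTML rather than well-formed XML, so a proper HTML parser would be needed there.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed page fragment standing in for one country page.
PAGE = """
<html><body>
  <div id="geography"><span class="area">123 sq km</span></div>
  <div id="people"><span class="population">456</span></div>
</body></html>
"""

# One path per output column (hypothetical paths for the fragment above).
COLUMNS = {
    "Area": ".//div[@id='geography']/span[@class='area']",
    "Population": ".//div[@id='people']/span[@class='population']",
}

def extract_row(page_source, columns):
    """Return one database row: column name -> text found at that column's path."""
    root = ET.fromstring(page_source.strip())
    row = {}
    for name, path in columns.items():
        node = root.find(path)
        # A missing field becomes an empty cell instead of stopping the scrape.
        row[name] = node.text.strip() if node is not None and node.text else ""
    return row

row = extract_row(PAGE, COLUMNS)
print(row)
```

Writing the paths is the slow, manual part (once per column); running them over every page is the fast part, which matches the 5 hours versus 35 minutes split.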

Mobile Phones Database – source: GSMarena.com – Made in August 2016 out of personal interest. A simple project made with our universal scraper. During the first year it was purchased by over 10 people, which allowed me to provide FREE monthly updates, each scrape taking about 1 hour.

Australia car database – Made in June 2017 after a year of hiatus, because I wasn’t sure whether Australia could provide sufficient sales volume to cover my effort. Scraping was a headache because the source website uses anti-scraping features that block my partner’s universal scraper. I had to use another scraper, which was slow (12 seconds per page) and crashed frequently. It took 14 days to scrape all 90000+ cars; future updates are done by scraping only the last year of cars. Price: $450, with discounts offered for partial purchases. It had a happy outcome: during the first year over 10 people purchased it.

Chiptuning database – source: celtictuning.co.uk, br-performance.be, dyno-chiptuningfiles.com – Made for a customer in February 2018 (CelticTuning), a simple project that took about 2 hours; the price was $100. Several other people purchased it within a year.

Sulekha.xls – source: www.sulekha.com – A somewhat unusual scraping job: a one-time-use database for SMS and email marketing, rather than a saleable product containing all car models, all buildings, or all of something.

Postal code scraping – a customer gave me a list of postal codes, which I entered into www.streetdirectory.com to get the building name and street address (in Singapore every building has a unique postal code).

Flickr scraping – a customer downloaded a large number of car images from Flickr and realized that, to use them on his website, he needed to specify the author name, a link to the source page and a link to the Creative Commons license. I scraped this info for 223,000 images for 223 euro, at 0.6 seconds per page.

Used cars images – a customer asked me to scrape a used cars website to get the image URL alongside Make, Model and Year. It took only a FEW HOURS and I got over 100,000 car images, all in the same resolution. He told me to keep it private and not to publish or resell it on my website, so I am telling you only the idea. If anyone wants to scrape car images in this way, let me know which website to scrape!

I have done a few more databases, but either the customers told me NOT to publish them on my website, or they are in fields unrelated to the topics covered by my website, so even if published they wouldn’t get sales.

Solar System Database

Are you looking for a database of planets and satellites in Excel format with their facts and figures, for research or to make a website? I made a database for you, sourcing data from solarsystem.nasa.gov and offer it here for download.

Buy FULL database (free preview in below image):

Buy & download

Solar System Database - planets and satellites facts and figures

Description & history

Geography and astronomy are long-standing hobbies of mine. I first made a solar system database in Word around the year 2000, sourcing data from old 1970s atlases dating from the time my parents were in school, and another one in 2004 sourcing data from Encarta Encyclopedia 2002, including facts about planets only (no satellites).

In 2014 I made a new database, this time in Excel, sourcing data from the NASA website for the Sun, 8 planets, 5 dwarf planets and 21 satellites (2 satellites of Mars and 19 satellites big enough to be in hydrostatic equilibrium – over 400 km in diameter). It took me about 4 hours of manual copy-pasting.

I made it out of personal interest, to compare the facts of all planets and satellites at the same time, while the NASA website allows only 1 vs 1 comparison. I published it on my website for free download in case anyone else needed it.

Given the high number of people who downloaded it, in 2019 I made a larger Excel database, intended as a paid download, containing EVERY celestial body listed on the NASA website. This time I used Chrome extensions to speed up the work, which allowed me to collect in just 1 hour the data for the 203 celestial bodies listed below:

Sun (1): Sun.

Planets (8): Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune.

Dwarf Planets (5): Pluto, Ceres, Makemake, Haumea, Eris.

Earth’s Moon (1): Earth’s Moon.

Mars Moons (2): Deimos, Phobos.

Jupiter Moons (79): Adrastea, Aitne, Amalthea, Ananke, Aoede, Arche, Autonoe, Callirrhoe, Callisto, Carme, Carpo, Chaldene, Cyllene, Dia, Elara, Erinome, Euanthe, Eukelade, Euporie, Europa, Eurydome, Ganymede, Harpalyke, Hegemone, Helike, Hermippe, Herse, Himalia, Io, Iocaste, Isonoe, Jupiter LI, Jupiter LII, Kale, Kallichore, Kalyke, Kore, Leda, Lysithea, Megaclite, Metis, Mneme, Orthosie, Pasiphae, Pasithee, Praxidike, S/2003 J10, S/2003 J12, S/2003 J15, S/2003 J16, S/2003 J18, S/2003 J19, S/2003 J2, S/2003 J23, S/2003 J3, S/2003 J4, S/2003 J5, S/2003 J9, S/2011 J1, S/2011 J2, S/2016 J 1, S/2016 J2 (Valetudo), S/2017 J 1, S/2017 J2, S/2017 J3, S/2017 J4, S/2017 J5, S/2017 J6, S/2017 J7, S/2017 J8, S/2017 J9, S/2018 J1, Sinope, Sponde, Taygete, Thebe, Thelxinoe, Themisto, Thyone.

Saturn Moons (62): Aegaeon, Aegir, Albiorix, Anthe, Atlas, Bebhionn, Bergelmir, Bestla, Calypso, Daphnis, Dione, Enceladus, Epimetheus, Erriapus, Farbauti, Fenrir, Fornjot, Greip, Hati, Helene, Hyperion, Hyrrokkin, Iapetus, Ijiraq, Janus, Jarnsaxa, Kari, Kiviuq, Loge, Methone, Mimas, Mundilfari, Narvi, Paaliaq, Pallene, Pan, Pandora, Phoebe, Polydeuces, Prometheus, Rhea, S/2004 S12, S/2004 S13, S/2004 S17, S/2004 S7, S/2006 S1, S/2006 S3, S/2007 S2, S/2007 S3, S/2009 S1, Siarnaq, Skathi, Skoll, Surtur, Suttungr, Tarqeq, Tarvos, Telesto, Tethys, Thrymyr, Titan, Ymir.

Uranus Moons (27): Ariel, Belinda, Bianca, Caliban, Cordelia, Cressida, Cupid, Desdemona, Ferdinand, Francisco, Juliet, Mab, Margaret, Miranda, Oberon, Ophelia, Perdita, Portia, Prospero, Puck, Rosalind, Setebos, Stephano, Sycorax, Titania, Trinculo, Umbriel.

Neptune Moons (14): Despina, Galatea, Halimede, Hippocamp, Laomedeia, Larissa, Naiad, Nereid, Neso, Proteus, Psamathe, Sao, Thalassa, Triton.

Pluto Moons (5): Charon, Hydra, Kerberos, Nix, Styx.

According to the Wikipedia page List of natural satellites, 35 moons had been observed from Earth as of 1978. A large number of moons were discovered by the Voyager 1 and 2 spacecraft between 1979 and 1990, making a total of 63 moons. Starting from 1997 many small moons were discovered using Earth-based telescopes, reaching a total of 194 known moons in 2018 orbiting the 8 planets and 5 officially recognized dwarf planets. Additional moons have been discovered orbiting asteroids and trans-Neptunian objects.

solarsystem.nasa.gov does not include the moons of Haumea (2), Makemake (1) and Eris (1), dwarf planets that have not yet been visited by any spacecraft to gather precise information about them. Of the 190 moons listed on the NASA website and included in my Excel database, 152 are confirmed and have a “By the Numbers” page, the rest being provisional moons, less than 1 km in radius and not studied by any spacecraft.

Data fields included: Title, Description, Date of discovery, Discovered by, Average orbit distance, Mean orbit velocity, Orbit eccentricity, Equatorial inclination, Equatorial radius, Equatorial circumference, Volume, Density, Mass, Surface area, Surface gravity, Escape velocity, Effective temperature, Atmospheric constituents, Source URL. Some fields include 3 columns (metric, English, scientific), making a total of 42 columns.

Word document

I created this file out of my hobby of making book-style, printer-friendly documents in Microsoft Word, to see how many pages, words and characters there would be if the NASA Solar System website were printed as a book, which planets and satellites have the longest articles, etc.

Made in July 2019, it took 5 hours of constant copy-pasting from the NASA website and another 5 hours to add formatting (all headings use a style with automatic update, so you can adjust the formatting of one style and the whole document changes accordingly).

According to Word Count: 249 pages, 119,645 words, 709,086 characters (with spaces).

Note: I do not recommend printing it; do not waste 249 sheets of paper.

Solar System in depth

See also

solarsystemscope.com, a website showing a 3D model of not just the solar system but the whole galaxy!

Small bodies of the Solar System

Page updated in July 2019 for the Apollo 11 50th anniversary (first human on the Moon on 20 July 1969), expecting a large number of visitors.

World countries database

Are you looking for a database of countries in Excel format with their facts and figures, for research or to create a website? I made a database for you. It took a few hours to make a scraping script which extracts data from The World Factbook and creates a CSV file. It takes about 35 minutes to extract the 268 entries, and I can update it anytime you want!

Buy country database:

Countries database - facts and figures

The database includes 268 entries: sovereign countries, dependent territories, as well as the oceans, the World and the European Union.

The LITE version includes area and population for all countries, as well as full facts for the United States and United Kingdom.

The FULL version includes 362 facts, covering everything possible from Geography, People and Society, Government, Economy, Energy, Communications, Transportation, Military and Security, and Transnational Issues.

Contact me for custom packages (specific selection of columns)

List of entries in country database

World, Afghanistan, Akrotiri, Albania, Algeria, American Samoa, Andorra, Angola, Anguilla, Antarctica, Antigua and Barbuda, Arctic Ocean, Argentina, Armenia, Aruba, Ashmore and Cartier Islands, Atlantic Ocean, Australia, Austria, Azerbaijan, Bahamas, The, Bahrain, Baker Island, Bangladesh, Barbados, Belarus, Belgium, Belize, Benin, Bermuda, Bhutan, Bolivia, Bosnia and Herzegovina, Botswana, Bouvet Island, Brazil, British Indian Ocean Territory, British Virgin Islands, Brunei, Bulgaria, Burkina Faso, Burma, Burundi, Cabo Verde, Cambodia, Cameroon, Canada, Cayman Islands, Central African Republic, Chad, Chile, China, Christmas Island, Clipperton Island, Cocos (Keeling) Islands, Colombia, Comoros, Congo, Democratic Republic of the, Congo, Republic of the, Cook Islands, Coral Sea Islands, Costa Rica, Cote d’Ivoire, Croatia, Cuba, Curacao, Cyprus, Czechia, Denmark, Dhekelia, Djibouti, Dominica, Dominican Republic, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Estonia, Ethiopia, Falkland Islands (Islas Malvinas), Faroe Islands, Fiji, Finland, France, French Polynesia, French Southern and Antarctic Lands, Gabon, Gambia, The, Gaza Strip, Georgia, Germany, Ghana, Gibraltar, Greece, Greenland, Grenada, Guam, Guatemala, Guernsey, Guinea, Guinea-Bissau, Guyana, Haiti, Heard Island and McDonald Islands, Holy See (Vatican City), Honduras, Hong Kong, Howland Island, Hungary, Iceland, India, Indian Ocean, Indonesia, Iran, Iraq, Ireland, Isle of Man, Israel, Italy, Jamaica, Jan Mayen, Japan, Jarvis Island, Jersey, Johnston Atoll, Jordan, Kazakhstan, Kenya, Kingman Reef, Kiribati, Korea, North, Korea, South, Kosovo, Kuwait, Kyrgyzstan, Laos, Latvia, Lebanon, Lesotho, Liberia, Libya, Liechtenstein, Lithuania, Luxembourg, Macau, Macedonia, Madagascar, Malawi, Malaysia, Maldives, Mali, Malta, Marshall Islands, Mauritania, Mauritius, Mexico, Micronesia, Federated States of, Midway Islands, Moldova, Monaco, Mongolia, Montenegro, Montserrat, Morocco, Mozambique, Namibia, Nauru, 
Navassa Island, Nepal, Netherlands, New Caledonia, New Zealand, Nicaragua, Niger, Nigeria, Niue, Norfolk Island, Northern Mariana Islands, Norway, Oman, Pacific Ocean, Pakistan, Palau, Palmyra Atoll, Panama, Papua New Guinea, Paracel Islands, Paraguay, Peru, Philippines, Pitcairn Islands, Poland, Portugal, Puerto Rico, Qatar, Romania, Russia, Rwanda, Saint Barthelemy, Saint Helena, Ascension, and Tristan da Cunha, Saint Kitts and Nevis, Saint Lucia, Saint Martin, Saint Pierre and Miquelon, Saint Vincent and the Grenadines, Samoa, San Marino, Sao Tome and Principe, Saudi Arabia, Senegal, Serbia, Seychelles, Sierra Leone, Singapore, Sint Maarten, Slovakia, Slovenia, Solomon Islands, Somalia, South Africa, Southern Ocean, South Georgia and South Sandwich Islands, South Sudan, Spain, Spratly Islands, Sri Lanka, Sudan, Suriname, Svalbard, Swaziland, Sweden, Switzerland, Syria, Taiwan, Tajikistan, Tanzania, Thailand, Timor-Leste, Togo, Tokelau, Tonga, Trinidad and Tobago, Tunisia, Turkey, Turkmenistan, Turks and Caicos Islands, Tuvalu, Uganda, Ukraine, United Arab Emirates, United Kingdom, United States, United States Pacific Island Wildlife Refuges, Uruguay, Uzbekistan, Vanuatu, Venezuela, Vietnam, Virgin Islands, Wake Island, Wallis and Futuna, West Bank, Western Sahara, Yemen, Zambia, Zimbabwe, European Union.

World countries population

This table indicates the estimated population for EVERY year from 1960 to 2016. 264 countries or regions are included.

Source of data: World Bank.

Buy & download

World countries population

World cities database

NEW: in 2019 one of my programmer partners shared with me a HUGE world city database with a few million rows, with permission to sell it via my website.

Download FREE sample: World Cities Database SAMPLE (Germany).

Excel can display at most 1,048,576 rows, which cover cities from Andorra to Greece, but opening the file in Notepad shows that the database, ending with Zimbabwe, has 3,173,958 lines. Obviously not all are cities; most entries are actually villages. Each row includes latitude / longitude.
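One way to work around the Excel row limit is to split the big file into several smaller CSV files. The sketch below assumes the database is a UTF-8 CSV file with a header row, which may not match the real file; the function name and the output file naming are made up for illustration.

```python
import csv
import math

EXCEL_MAX_ROWS = 1_048_576  # rows per Excel sheet, header included


def split_csv(source_path, prefix, rows_per_file=EXCEL_MAX_ROWS - 1):
    """Split a CSV with a header row into parts that each fit in one Excel sheet."""
    paths = []
    out = None
    writer = None
    with open(source_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        for index, row in enumerate(reader):
            if index % rows_per_file == 0:
                # Start a new part and repeat the header in it.
                if out is not None:
                    out.close()
                path = f"{prefix}_{len(paths) + 1}.csv"
                out = open(path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                paths.append(path)
            writer.writerow(row)
    if out is not None:
        out.close()
    return paths


# How many parts the 3,173,958 lines would need at one header row per part.
parts_needed = math.ceil(3_173_958 / (EXCEL_MAX_ROWS - 1))
print(parts_needed)
```

Calling `split_csv("cities.csv", "cities_part")` (hypothetical file names) would write `cities_part_1.csv`, `cities_part_2.csv` and so on, each of which opens completely in Excel.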

Buy FULL database:

Buy & download

The original world city database made by Teoalida based on Encarta Encyclopedia

This is a handmade database that I started in 2003, sourcing data from Encarta Encyclopedia 2002, which I got from a friend. It was the ONLY recent geography encyclopedia I had in my hands, all the others being world atlas books from my parents’ childhood (1970s). I did not have an internet connection until 2005 to get other sources of data. Over the next years I improved the Word file using Encarta 2006 and 2009, completing all countries. Since 2016 I have offered an Excel version beside the Word file, and I also updated all countries to Encarta 2009.

Being an original product “Made by Teoalida”, you will NOT find this database elsewhere on internet.

Download FREE samples:
World Cities Database simple version .DOC
World Cities Database detailed version .DOC
World Cities Database detailed version .XLS

Buy FULL city database:

Buy & download

World city database

As of 2017, the world city database contains:

  • 294 cities with over 1,000,000 people from all countries.
  • 305 cities with 500,000 – 1,000,000 people from all countries.
  • 3085 cities with 100,000 – 500,000 people from all countries.
  • 4000+ cities with 20,000 – 100,000 people from selected countries that do not have too many cities: North Europe, Eastern Europe (east of Germany), former USSR countries, Korea, Middle East, South-East Asia, Africa.

If enough people buy the city database, I will offer an update adding cities with 20,000 – 100,000 people from North and South America, Western Europe, South Asia, Japan, China, Australia, etc., which may bring the number of cities in the database to about 15,000-20,000.

Countries included

191 independent countries: Afghanistan, Albania, Algeria, Andorra, Angola, Antigua and Barbuda, Argentina, Armenia, Australia, Austria, Azerbaijan, Bahrain, Bangladesh, Barbados, Belarus, Belgium, Belize, Benin, Bhutan, Bolivia, Bosnia, Botswana, Brazil, Brunei, Bulgaria, Burkina Faso, Burundi, Cambodia, Cameroon, Canada, Cape Verde, Central African Republic, Chad, Chile, China, Colombia, Comoros, Congo, Costa Rica, Côte d’Ivoire, Croatia, Cuba, Cyprus, Czech Republic, Democratic Republic of the Congo, Denmark, Djibouti, Dominica, Dominican Republic, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Estonia, Ethiopia, Federal Republic of Yugoslavia, Federated States of Micronesia, Federation of Saint Kitts and Nevis, Fiji Islands, Finland, Former Yugoslav Republic of Macedonia, France, Gabon, Georgia, Germany, Ghana, Greece, Grenada, Guatemala, Guinea, Guinea-Bissau, Guyana, Haiti, Honduras, Hungary, Iceland, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Jamaica, Japan, Jordan, Kazakhstan, Kenya, Kiribati, Kuwait, Kyrgyzstan, Laos, Latvia, Lebanon, Lesotho, Liberia, Libya, Liechtenstein, Lithuania, Luxembourg, Madagascar, Malawi, Malaysia, Maldives, Mali, Malta, Marshall Islands, Mauritania, Mauritius, Mexico, Moldova, Monaco, Mongolia, Morocco, Mozambique, Myanmar, Namibia, Nauru, Nepal, Netherlands, New Zealand, Nicaragua, Niger, Nigeria, North Korea, Norway, Oman, Pakistan, Palau, Panama, Papua New Guinea, Paraguay, Peru, Philippines, Poland, Portugal, Qatar, Romania, Russia, Rwanda, Saint Lucia, Saint Vincent and the Grenadines, Samoa, San Marino, Sao Tomé and Príncipe, Saudi Arabia, Senegal, Seychelles, Sierra Leone, Singapore, Syria, Slovakia, Slovenia, Solomon Islands, Somalia, South Africa, South Korea, Spain, Sri Lanka, Sudan, Suriname, Swaziland, Sweden, Switzerland, Tajikistan, Tanzania, Thailand, The Bahamas, The Gambia, Togo, Tonga, Trinidad and Tobago, Tunisia, Turkey, Turkmenistan, Tuvalu, Uganda, Ukraine, United Arab Emirates, United Kingdom, United States, Uruguay, Uzbekistan, Vanuatu, Vatican City, Venezuela, Vietnam, Yemen, Zambia, Zimbabwe.

Plus dependent territories.

Administrative divisions (states, regions, provinces, etc.) are included for 30 major countries; second-level administrative divisions are included only for Italy at the moment, but I can also add them for the United Kingdom, France, Spain, Greece and the United States if you want.

100,000+ cities represented on map

Cyan – 1,000,000+, Green – 500,000+, Yellow – 100,000+
The database itself does NOT contain GPS coordinates. On this map the city location is generated automatically by Google Fusion Tables, so some cities appear in the wrong location because they share a name with another city or have alternate spellings.

City database history

Geography is my oldest hobby: starting at around 6 years old I was studying geographic atlases and had already learned all the countries and capitals. Between 1998 and 2005 I did a few works in Word / Excel based on world atlas books or digital encyclopedias; I updated them and published them on my website for the first time in 2015.

2003-2005 – I compiled the list of cities of Europe, former USSR countries, the Middle East and Africa, using Encarta 2002. It has a limited number of cities shown on the map, so I was planning to include ALL cities shown on the map, meaning cities over 20,000 people, but for the biggest countries of Europe there would have been too many, so I included only cities over 100,000 people.

2005 – my family installed an internet connection. I discovered Wikipedia and no longer saw a purpose in making my own world cities database: the internet already provided up-to-date population numbers, and constantly updating my table would be a waste of effort. I left geography and moved on to other hobbies such as architecture, without finishing the list of World Cities as intended (to include all cities over 20,000 people from EVERY country).

2006-2007 – I included a few more countries: China, Japan, Mexico, etc., using Encarta 2006. This edition shows a lot more cities on the map and it would take a lifetime to include them all, so I decided to include only cities over 100,000 people. Then I left the city database abandoned.

2015 – given the success of one of my other hobbies, the Car Database, which turned into a business selling Excel databases to companies and web developers, I started wondering whether the World Cities hobby, if converted to Excel, might be useful to web developers for CSV and MySQL databases. I published the incomplete city database on my website for the first time, as a one-country SAMPLE, inviting people to contact me if they wanted the Word file for free, or the Excel conversion as a paid service.

Over the next months I completed the remaining countries from Asia, North and South America, including cities over 100,000 people, using Encarta 2009 (the last Encarta). I created 2 Word files, one for all countries including cities over 100,000 (60 pages) and one for the 20 biggest countries, including cities by province over 20,000 or 100,000 depending on the case (70 pages).

2016 August – the first person contacted me to ask for the city database. An Indian customer wanted a list of cities in Excel; I offered to convert the Word file to Excel, but he was not patient and asked me to give him what I currently had (the Word file). However, I started converting it to Excel to be ready for sale when the next customer came. I created 2 Word files, both containing all countries, the LITE one containing cities over 100,000 and the BIG one containing cities over 20,000 by province, plus an Excel file similar to the Word BIG file, and put them up for sale so people can buy directly (instead of contacting me to get the complete database). I also expanded the article to get more traffic from Google.

By November 2017, 5 people had purchased it, which is why I decided to offer an update adding more cities from specific countries.

Note: city classification is based on the icon shown in Encarta Encyclopedia 2009, and it does not always reflect the actual population indicated in the city article. For example, there are cities with the 1,000,000+ icon but a population of around 900,000, as well as cities with the 500,000+ icon but a population over 1 million. Most cities under 100,000 do not even have an article indicating the exact population. This was Encarta Encyclopedia, the only source of information available to me before I connected to the internet.

The number of cities in each country is nearly proportional to the country’s urban population.

This city database does NOT pay attention to administrative status, which varies from country to country, so making a single worldwide database of official “cities”, excluding towns, communes, villages and other places, poses major difficulties, especially since a few countries do not have any “city”.

For example: Romania officially has 103 municipalities and 217 towns, but my database includes 116 cities that have the 20,000+ population icon in Encarta Encyclopedia. Germany has 2058 “towns”, Italy has 7954 “communes” and Spain has 8122 “municipalities”; there is no “city” or “town” designation.

Another world city database

In November 2017 one customer gave me his world city database in exchange for a database made by me, with permission to resell it on my website.

This world city database contains 76,799 cities, sorted by 212 countries, 51 US states, 13 Canadian provinces and 4 UK countries. For other countries it does not contain administrative divisions. The database focuses primarily on the United States, as it contains 43,299 cities from the US compared with 33,500 cities from the rest of the world. Price: 1 dollar per 1000 cities.

Download FREE sample:

WorldCitiesDatabase-SAMPLE.xlsx

Buy FULL database:

Buy & download

City database per country

Government census websites may provide a complete list of the cities in their countries, for example http://censusindia.gov.in/Tables_Published/Admin_Units/admin.html – List of Towns in India – 5161 towns.

I am thinking about compiling a database of all countries, sourcing data from government websites. With ~200 countries in the world, this will be a huge effort. I would choose to make this database without population, because each country has a different census year. Anyone interested?

Romania city population .XLS – I originally made this in 1999 in Word, sourcing data from books, then remade it in 2006 in Excel format, sourcing data from the Wikipedia pages of each city (hours of work!). It contains all 320 cities of Romania with their population at every census from 1912 to 2011 (next census probably in 2020), except when they did not have city status in the census year. Why Romania? Because I was born here.

United States city population .XLS – includes 304 cities with over 100,000 inhabitants. Source of data: Wikipedia, copy-pasted into Excel and enhanced visually. Contains population at the 2010 census and the 2015 estimate, as well as area, density and GIS coordinates (latitude and longitude).

If you want this for other countries, please ask!

Other geography stuff

In 1998-2003 I also wrote lists of natural features, seas, rivers, etc., and astronomy-related stuff... all in Word. I started using Excel more than Word in 2003. Today I consider them useless, because such info exists for FREE on Wikipedia and it is not necessary to make my own databases.

Buildings database

I apologize to the owners of Emporis and SkyscraperCenter.

On 9 September 2019 I had an unusually large number of visitors on this page, and on 10 September my website was suspended for abuse. I contacted the hosting company and they told me that they had received an abuse report regarding the “Buildings Database” page and that I needed to delete this page in order to have the website unsuspended.

Message to the person who made the abuse report: why were you so RUDE as to complain to the hosting provider and get the WHOLE website suspended (which has over 2000 visitors per day) instead of contacting me nicely (via email or Live chat – I was ONLINE at the time you visited me) and asking for removal of ONE page (which barely gets 2-3 visitors per day) if it infringed your rights? That way I would also know which database was the problem.

This page provided an analysis of skyscraper construction over the last century based on 2 different databases, one from Emporis.com and one from SkyscraperCenter.com (both abandoned due to lack of sales). I don’t even know which website made the abuse report to my hosting company.

Customers interested in databases can look at my other databases available.


Music Database

Update October 2018: I made web scraping software to extract data from Apple Music at a speed of ~2 seconds per music album. I can provide Excel / CSV files of your favorite artists / bands in a matter of minutes, up to 10 artists FREE of charge. Watch the video!

If you are a professional looking to pay for a larger database (dozens, hundreds, thousands of artists), you are invited to discuss project requirements.

Original music database with songs rating

Shortly after connecting to the internet in 2005 (I was 16 years old), I started creating a database in Excel of the MP3 songs I downloaded, to review each song and give it a rating from 0 to 16, and to make tops of the best artists, best albums, best genres, etc., using complex mathematical formulas, to show friends exactly what music I like and how much.

This music database was NOT intended to contain every possible song released, but ONLY my favorite artists / bands plus a small selection of artists / bands representative of each region of the world and each music genre.

The database is useful for me to organize my own music. How useful it is for other people, I don’t know... your feedback is needed to make a better database for YOU and future customers!

Buy FULL music database

Buy & download

Free SAMPLE

Music Database

Some local friends, those who know me in real life or those I sent my music to via the internet, blamed me for listening to the worst music possible. What is the problem if I rarely listen to music from my own country? I can speak 4 languages and I listen to music in languages that they do not understand, as well as in languages that I do not understand myself, but that does not mean the music is garbage. I love collecting music sung in as many languages as possible and I do not always care about the lyrics.

My friends also said that this Excel Music Database is the craziest thing I have made, or the most useless thing they ever saw, and suggested that I STOP wasting time on such things.

Music database evolution & releases:
Jun 2007 edition – 2000 rated songs, 37 artists in top
Feb 2008 edition – 3000 rated songs, 57 artists in top
Aug 2009 edition – 4000 rated songs, 71 artists in top
Apr 2012 edition – 5000 rated songs, 92 artists in top
Late 2013 edition – 7523 total songs, 5467 rated songs, 97 artists in top
Jun 2016 edition – 8365 total songs, 6043 rated songs, 105 artists in top
Feb 2019 edition – 32078 total songs, 6839 rated songs, 110 artists in top

Note: up to the 2016 edition with 6000 rated songs, the database contained only artists from whom I had downloaded mp3 files, listened to them and rated at least a couple of their songs. The 2019 release also includes the best-selling artists, regardless of whether I downloaded their mp3 files and listened to them to add ratings. The top includes only artists with a minimum of 10 songs and at least half of their songs rated.
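The rule for entering the top (minimum 10 songs, at least half of them rated) can be sketched as below. The ranking by plain average rating is only an assumption for illustration; the real Excel formulas are more complex and are not reproduced here. The artist names and ratings are made up.

```python
from collections import defaultdict


def top_artists(songs, min_songs=10, min_rated_share=0.5):
    """Rank artists that have enough songs and enough rated songs.

    songs: list of (artist, rating) pairs, rating 0-16 or None if unrated.
    Ranking by average rating is an assumption, not the real formula.
    """
    by_artist = defaultdict(list)
    for artist, rating in songs:
        by_artist[artist].append(rating)

    top = []
    for artist, ratings in by_artist.items():
        rated = [r for r in ratings if r is not None]
        enough_songs = len(ratings) >= min_songs
        enough_rated = len(rated) >= len(ratings) * min_rated_share
        if enough_songs and enough_rated and rated:
            top.append((artist, sum(rated) / len(rated)))
    return sorted(top, key=lambda item: item[1], reverse=True)


songs = (
    [("A", 12)] * 6 + [("A", None)] * 4    # 10 songs, 6 rated: included
    + [("B", 16)] * 4 + [("B", None)] * 6  # 10 songs, only 4 rated: excluded
    + [("C", 15)] * 5                      # only 5 songs: excluded
    + [("D", 8)] * 12                      # 12 songs, all rated: included
)
print(top_artists(songs))
```

Without the two thresholds, an artist with a single highly rated song would jump to first place, which is what the rule is meant to prevent.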

How my hobby for music started

My hobby for music started in 1999. After we replaced our 486 computer (200 MB hard disk) with an AMD K6-2 333 MHz (4.3 GB hard disk), my dad brought some CDs with mp3 songs from friends, mostly pop and rock songs from the 1960s to the 1990s, and made a selection of songs according to his own preferences, unfortunately deleting many songs that I liked. Then he burned the selected songs onto 2 CDs. I had no right to decide what to listen to: only my parents put on music in our home, and for many years my dad’s song selection was the only music I listened to. I wonder, if I had not had these restrictive parents, would my music preferences be different today?

Out of my dad’s selection, I also made my own selection of a few dozen songs, which I listened to only when I was home alone.

After the 2003 revolution and the removal of the restrictions imposed by my family, I was able to listen to my music anytime I wanted. Soon I got bored of my few dozen songs and was desperate to get more music, so I started recording via TV tuner (ending up with lots of songs in bad quality). I wanted more songs from certain artists, so I went to music stores in the city, but my parents did not agree to pay money for music CDs.

In 2005 we connected to the internet, so I was able to download music freely for the first time, using the DC++ file sharing network (YouTube was not yet launched). In November 2004 I had accidentally turned on the TV while the Junior Eurovision Song Contest was on, which was won by Maria Isabel. So the first songs that I downloaded after connecting to the internet were Maria Isabel’s 2 albums, plus 2 more children’s artists discovered while looking for Maria Isabel (3+2 and Danna Paola), additional songs from artists I already had songs from (Aqua, Shakira, Shakin Stevens, Thalia), and the originals of about 100 songs I had recorded from TV in bad quality.

In 2006, while looking on YouTube for Danna Paola, I accidentally found a video featuring Tatiana, so I started looking for Tatiana’s music too, on ARES (file sharing software popular in Latin America) and through direct downloads (like MegaUpload). This way I also downloaded Fandango, Flans, Timbiriche and R.B.D, and got addicted to Mexican 1980s-1990s pop-rock. In 2008 I had 3000+ songs, of which 30% were from Mexico. Tatiana remained my all-time favorite even in 2013. At the same time I started watching Mexican TV shows and learned Spanish.

I was looking for more diversity, so from 2009 I also downloaded music from other Latin American countries, and got addicted to Brazilian country music as well as 3 big artists hosting children’s shows (Angelica, Eliana, Xuxa). This generated bad comments from my overseas friends (are you retarded? why do you listen to children’s music?), but this way, in just one year, I learned Portuguese to the level where I can understand the lyrics of any song. I also downloaded American country music (Alan Jackson, Garth Brooks, Shania Twain, Taylor Swift, etc), British, French, German and Italian pop, rock and folk (ABBA, Al Bano & Romina Power, Alizee, Andrea Berg, Ricchi e Poveri, etc), and Japanese and Chinese pop music (many small artists). I liked all of them, but none caused a long-term addiction until my 2013 discovery of Kyary Pamyu Pamyu. There is also music that I can’t tolerate: Arabic and Indian music, and most hip-hop.

The idea of creating an Excel music database

The idea of using Excel to make a table of songs and rate each song dates back to 2002. I added all my songs in WinAmp, clicked “generate HTML playlist”, copied the result into Excel, then added a numerical rating. This meant no complete discography of any artist, no year of release, etc.

In 2005, thanks to the internet connection and access to internet music stores and file sharing networks, I could get information about artists and complete discographies, with album names and release dates, so I decided it was time to start making a serious music database.

I did not intend to include ALL the songs on my computer, or to reach a certain number of songs in the database within a specified deadline. I just added artist by artist on a random basis, originally adding only my favorite artists (most of them having a short music career). Since 2008 I have paid attention to famous artists, adding to the database a selection of artists representative of every region of the world and every genre of music.

When I started the database in 2005, iTunes was the biggest music store and I could copy-paste a whole album’s table of songs with just a few clicks, so the columns in my database matched the columns in iTunes. The iTunes app was redesigned in 2010, so I had to copy song names one by one. The database also contains albums sourced from other websites if they are not available on iTunes, as well as names of mp3 files found on the internet (with possibly incorrect spelling).

In 4 years, the music database reached over 4000 rated songs, after which I continued to add new songs at a slower rate.

I published the music database on my website in 2010 with a simple download link. While my other childhood hobbies, such as the car database and the real estate databases, gained interest among professionals paying big $$ for a database and allowed me to make a living, the music database turned out to be one of the most USELESS things that I ever made!

In 2016 I released a new edition with 6000 songs, together with this long article. I made a “free purchase” button that requires visitors to enter an email address in order to download the files. Over 200 people downloaded it, and I emailed a couple of them asking whether my database met their needs and how they use it. Only 2-3 replied, saying that they needed an Excel spreadsheet for a school project. They were not even interested in music!

Starting in March 2018 I put it up for sale at $1, inviting people to contact me IF they want to download it for free. By the end of the year 10 people had paid $1 without any prior communication with me, meaning that people are willing to pay money for a music database. But what should the database contain to be usable for you?

A new era started in 2018: I discovered Spotify, a music streaming service where you can listen to full songs free of charge. I also used web scraping software for the first time, to get data faster from Apple Music, at a rate of 2 seconds per page (album). The Amazon Music Store may be bigger than Apple, and I found many artists on Amazon that do not exist on Apple, but due to inconsistencies between various albums on Amazon I need to spend extra time cleaning up the scraped data, so I prefer to source data from Apple Music unless it is missing my favorite artists.

I expanded the database with a few extra columns, such as Source of data and Record labels, and using the scraping software I quickly added thousands of new songs to the database (I am also able to create a custom music database of any artist / band of your choice). Then, using Spotify, I listened to the full songs and rated them without having to dig for mp3 files on torrents and other pirated music download websites.

Songs rating system

Since the Music Database was started in the era when I was fascinated by the base-16 numbering system, songs are rated with numbers ranging from 0 to 16. Originally lower values were better, but in 2016 I inverted the ratings, making higher values better. Total: 17 possible values, which is my birthday and my favorite number.

The rating is the sum of 4 categories, each having a value from 0 to 4.

Sound: I love instrumental diversity and guitars. Some rock and country songs can earn a 4 in this category, pop songs are around 1-3, while hip-hop songs get a 0.

Voice: I love a nice voice and diversity in the lyrics, but I don’t care about the content of the lyrics. Songs sung in languages unknown to me or in artificial languages can earn a 4 too. The rating drops if the lyrics contain too many repeated words, and if the song is purely instrumental the rating is 0.

Mix: I love songs which have a continuous and fast rhythm. Some dance songs can earn a 4 in this category. Most rock songs have a rating of 2-3, most pop songs 1-3, and slow songs or badly mixed songs get a 0.

Addiction: some songs attract me so much that I listen to them again and again for hours. They earn a 4 in this category; they are bubblegum dance, Japanese pop and songs from Latin American children’s shows (this is what attracts negative comments from my friends, that I listen to childish music, music for retarded people, etc). Rock and country, despite winning in other categories, bore me after a few listens, so they have a rating of 1-2, while louder songs like hard rock, which hurt my ears so much that I cannot listen to a song until its end, have a rating of 0.
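
The four categories can be modelled as a small record whose total is the 0 to 16 song rating. This is a sketch only: it assumes the overall rating is the plain sum of the four category values (which the 0 to 16 range suggests), and the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SongRating:
    """One song's rating, split into the four categories described above."""
    sound: int
    voice: int
    mix: int
    addiction: int

    def __post_init__(self):
        # Every category must be an integer value from 0 to 4.
        for name in ("sound", "voice", "mix", "addiction"):
            value = getattr(self, name)
            if not 0 <= value <= 4:
                raise ValueError(f"{name} must be between 0 and 4, got {value}")

    @property
    def total(self) -> int:
        # Assumption: the overall 0-16 rating is the sum of the categories.
        return self.sound + self.voice + self.mix + self.addiction


# Hypothetical song: strong sound and addiction, average mix.
song = SongRating(sound=4, voice=3, mix=2, addiction=4)
print(song.total)  # 13
```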

To rate a song, it is enough to listen to the 30-second preview on iTunes, but I prefer to rate only when I download full songs. The addiction rating is hard to decide initially, and sometimes I modify it after days or months. The 17 ratings are distributed like a Gaussian curve, but asymmetric: rating 8 has 10% of songs, rating 0 has 2% and rating 16 has 0.2%.

My everyday playlist is composed of songs rated from 12 to 16. I also include songs rated 8-11 temporarily, and keep them if their addiction rating is 3 or higher. This creates a playlist of about 20% of the songs included in the database.
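
The playlist rule can be sketched as a single function. It models only the long-term outcome (whether a song stays in the playlist), not the temporary trial period for songs rated 8-11; the function name is hypothetical.

```python
def stays_in_playlist(total_rating: int, addiction: int) -> bool:
    """Decide whether a song stays in the everyday playlist.

    total_rating is the overall 0-16 rating, addiction the 0-4 category value.
    """
    if total_rating >= 12:
        return True            # rated 12-16: always in the playlist
    if 8 <= total_rating <= 11:
        return addiction >= 3  # rated 8-11: kept only if addictive enough
    return False               # rated below 8: not in the playlist


print(stays_in_playlist(13, 2))  # True
print(stays_in_playlist(9, 3))   # True
print(stays_in_playlist(9, 1))   # False
```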

Artist ranking system

In 2005 I made a ranking based on the average rating of all songs of each artist. But this turned into a problem: the top places were occupied by small artists that produced just a few, but good, songs, while the most famous artists occupied the last places. It is natural that artists with a long career cannot make all their songs as good as their few best ones.

In 2006 I added a SCORE for each artist, calculated by a more complex formula. I added columns for the number of songs and the total value of songs. Song value is calculated like an inverted binary logarithm: 16 divided by the distance of the rating from the maximum of 16, so a song rated 0 has value 1, a song rated 8 has value 2, a song rated 12 has value 4 and a song rated 14 has value 8. The exceptions are the two best ratings (1 and 0 on the original lower-is-better scale, 15 and 16 after the inversion), which have values 12 and 16.
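
The song value mapping can be sketched as follows. It is an interpretation, not the spreadsheet formula itself: it assumes the examples use the current higher-is-better scale, so the value is 16 / (16 - rating), and that the two exceptions apply to the two best ratings (15 and 16 on the current scale), since a rating of 0 is already given value 1.

```python
def song_value(rating: int) -> float:
    """Map a 0-16 song rating (higher is better) to its song value."""
    if not 0 <= rating <= 16:
        raise ValueError(f"rating must be between 0 and 16, got {rating}")
    if rating == 16:
        return 16.0  # exception: the formula would divide by zero here
    if rating == 15:
        return 12.0  # exception: the formula would give 16 here
    return 16 / (16 - rating)


# Reproduces the examples from the text.
for rating in (0, 8, 12, 14, 15, 16):
    print(rating, song_value(rating))
```

Under this reading the value doubles each time the distance to the maximum rating halves, which is why a handful of top-rated songs can outweigh many average ones in the total.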

In 2008 I further improved the ranking by adding a multiplication factor for song diversity, calculated like this: the total value of songs divided by the number of songs, divided by the average song rating, then the result is square rooted and multiplied by 2, giving a factor between 1 and 1.5. Artists with diversity, meaning a few good songs among mostly bad songs, are helped by a higher factor than artists whose songs all have the same medium rating.

How the score is calculated: the average song rating (ranging from 0 to 16) multiplied by 4 (I can increase this multiplier to boost artists with one but good album, or reduce it to boost artists with a long career), plus the square root of the total value of songs (ranging from 4 for one-album rappers to 30 for Tatiana’s 20+ albums). The sum of these 2 is multiplied by the diversity factor between 1 and 1.5, then multiplied by 128 to get a nice-looking 4-digit score for all artists, varying from 2500 to 9000+. This numerical value has no other meaning than classifying artists in the top. Do not consider an artist with score 8000 to be two times better than an artist with score 4000.
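
The diversity factor and the final score can be sketched from the description above. The function and parameter names are invented, the inputs (number of songs, average rating, total value) are assumed to be already computed, and rejecting an average rating of 0 is a choice made for this sketch, since the text does not say how that case is handled.

```python
import math


def diversity_factor(total_value: float, n_songs: int, avg_rating: float) -> float:
    """2 * sqrt(total value / number of songs / average rating)."""
    if n_songs <= 0 or avg_rating <= 0:
        # Not covered by the description; rejected in this sketch.
        raise ValueError("need at least one song and a positive average rating")
    return 2 * math.sqrt(total_value / n_songs / avg_rating)


def artist_score(avg_rating: float, total_value: float, diversity: float,
                 rating_weight: float = 4) -> float:
    """(average rating * 4 + sqrt(total value)) * diversity * 128."""
    return (avg_rating * rating_weight + math.sqrt(total_value)) * diversity * 128


# Hypothetical artist: 10 songs, all rated 8, so each song has value 2.
n_songs, avg_rating, total_value = 10, 8.0, 20.0
diversity = diversity_factor(total_value, n_songs, avg_rating)
print(diversity)  # 1.0
print(round(artist_score(avg_rating, total_value, diversity)))  # 4668
```

The `rating_weight` parameter corresponds to the adjustable multiplier of 4 mentioned in the text: raising it favors artists with few but good songs, lowering it favors long careers.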

Digital cameras database

If you are looking for a database of digital camera specifications in Excel format to create a website, use in a repair shop or anything similar, I created an Excel database for you, using scraping software to extract data from digicamdb.com.

Download SAMPLE: Digital Cameras Database.xls
(LITE package includes URL, image, brand, model, year).

Buy FULL database + 1 year of FREE updates:

Buy & download

Price in Singapore Dollar (1 SGD = 0.7 USD)

Description

Database created in January 2018 (3679 models) using our “universal” web scraping software. While my other projects, such as the Car Database and the Mobile Phones Database, get a constant flow of sales, which encourages me to update them regularly, the Digital Cameras Database sells poorly. I update it when someone asks for an update, each update taking about 30 minutes.

Updated January 2019 (3705 models).

Looking for TVs, laptops, or other kinds of electronics? Please recommend similar websites to scrape data from and I will create more databases!

Digital camera brands included

Acer, AgfaPhoto, BenQ, Canon, Casio, Concord, Contax, Epson, Fujifilm, GE, HP, Jenoptik, JVC, Kodak, Konica, Konica-Minolta, Kyocera, Leica, Minolta, Minox, Nikon, Nokia, Olympus, Panasonic, Pentax, Praktica, Ricoh, Rollei, Samsung, Sanyo, Sigma, Sony, Toshiba, Vivitar, Yakumo.

Digital camera specifications included

URL 100%, Image URL 100%, Brand 100%, Model 100%, Also known as 3.97%, Megapixels 50.96%, Effective megapixels 49.04%, Total megapixels 49.04%, Sensor size 100%, Sensor type 100%, Sensor resolution 100%, Max. image resolution 100%, Crop factor 100%, Optical zoom 100%, Digital zoom 100%, ISO 100%, RAW support 9.92%, Manual focus 96.17%, Normal focus range 80.84%, Macro focus range 79.18%, Focal length (35mm equiv.) 86.90%, Aperture priority 98.72%, Max aperture 100%, Max. aperture (35mm equiv.) 100%, Depth of field 99.54%, Metering 100%, Exposure Compensation 100%, Shutter priority 100%, Min. shutter speed 92.25%, Max. shutter speed 92.42%, Built-in flash 98.94%, External flash 98.23%, Viewfinder 99.92%, White balance presets 92.96%, Screen size 98.15%, Screen resolution 87.66%, Video capture 9.92%, Max. video resolution 9.76%, Storage types 98.48%, USB, HDMI 9.95%, Wireless 9.95%, GPS, Battery 97.96%, Weight 95.95%, Dimensions 97.34%, Year 100%.

How the Excel camera database can be used

Filter the digital camera database and get a list of cameras having specific features.
Filter the digital camera table by year and make a list of features introduced in each year.
Analyze the data and compile statistics on the best cameras in each range.
Convert the Excel spreadsheet to CSV or MySQL and create your own digital camera comparison website or mobile app.
Etc…
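
The first two uses can be sketched with Python's standard csv module, assuming the spreadsheet has been exported to CSV. The column names come from the specification list above; the rows below are placeholders rather than real database entries, and the "Yes" / "No" cell values are an assumption about how features are stored.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows standing in for a CSV export of the spreadsheet.
SAMPLE_CSV = """Brand,Model,Year,RAW support
BrandA,Model 1,2005,Yes
BrandA,Model 2,2009,No
BrandB,Model 3,2009,Yes
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_by_feature(rows, column, wanted):
    """Keep only the rows whose value in `column` equals `wanted`."""
    return [row for row in rows if row.get(column, "").strip() == wanted]


def group_models_by_year(rows):
    """Build a {year: ["Brand Model", ...]} mapping."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[int(row["Year"])].append(f'{row["Brand"]} {row["Model"]}')
    return dict(by_year)


rows = load_rows(SAMPLE_CSV)
raw_cameras = filter_by_feature(rows, "RAW support", "Yes")
print([row["Model"] for row in raw_cameras])
print(group_models_by_year(rows))
```

For a real export, `load_rows` could read the text of the exported file instead of `SAMPLE_CSV`.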

Digital cameras specifications database