Travel data has its own vocabulary, and mixing up the terms causes real confusion in projects. This A–Z reference from TravelScrape defines the words travel data and scraping teams use every day — clearly, in one place.
These are the core terms you'll meet across travel data scraping — from metrics like ADR and RevPAR to scraping concepts like proxies, rendering and rate limiting. Each definition is written to stand alone, so you can look up a single term or read the whole set.
Travel data & scraping terms (A–Z)
| Term | Definition |
|---|---|
| ADR | Average Daily Rate, calculated as room revenue divided by the number of rooms sold over a period. A core hotel revenue metric. |
| Anti-bot system | Technology OTAs use to detect and block automated traffic — captchas, fingerprinting, behavioural analysis and rate limiting. |
| API | Application Programming Interface — a sanctioned way for software to request data directly from a provider. |
| Availability | Whether a room, seat or product is bookable for a given date. |
| Backfill | Collecting historical data after the fact to build a complete time series. |
| Batch scraping | Collecting data on a fixed schedule (e.g. nightly) rather than continuously in real time. |
| Bot | An automated program that performs tasks online, such as visiting and reading web pages. |
| Captcha | A challenge designed to tell humans from bots, often triggered when a site suspects automation. |
| Crawl | Systematically visiting many pages of a site, usually to discover URLs to scrape. |
| CSS selector | A rule that targets an element on a page so a scraper knows where a value sits. |
| Currency normalisation | Converting prices to a common currency so they can be compared fairly. |
| Data pipeline | The end-to-end flow that collects, cleans, stores and delivers data. |
| De-duplication | Removing repeated records so each observation appears once. |
| Dynamic pricing | Adjusting prices automatically based on demand, competition and other signals. |
| Fingerprinting | Identifying a visitor by browser and device characteristics, used to detect bots. |
| GDS | Global Distribution System — networks like Amadeus and Sabre that distribute travel inventory to agents. |
| Geo-targeting | Sending requests so they appear to come from a specific country or city, since OTAs can show different prices by location. |
| Headless browser | A real browser run without a visible window, used to render JavaScript-heavy pages. |
| HTTP 403 | A 'Forbidden' response, often meaning an IP or request has been blocked. |
| HTTP 429 | A 'Too Many Requests' response, indicating rate limiting. |
| Ingestion | The step where collected data enters your storage and validation layer. |
| IP ban | Blocking requests from a specific IP address that a site has flagged. |
| JSON | A common structured data format used to deliver scraped data. |
| Metasearch | Sites like Kayak or Skyscanner that compare prices across multiple OTAs and suppliers. |
| Occupancy rate | The percentage of available rooms occupied over a period. |
| OTA | Online Travel Agency — a site that sells travel inventory online, e.g. Booking.com or Expedia. |
| Parsing | Reading raw page content and extracting the specific values you need. |
| Proxy | An intermediary server used to route requests through different IP addresses to avoid blocking. |
| Rate limiting | A site's restriction on how many requests a client (usually identified by IP address) can make in a time window. |
| Rate parity | Keeping a hotel's price consistent across all the channels it is sold on. |
| Real-time scraping | Collecting data on demand, the moment it's needed, rather than on a schedule. |
| Rendering | Running a page's JavaScript so dynamic content (like prices) appears before extraction. |
| Residential proxy | A proxy that routes requests through a real consumer IP address, which is typically harder for sites to detect than a datacentre IP. |
| RevPAR | Revenue Per Available Room, calculated as ADR multiplied by occupancy rate (equivalently, room revenue divided by available rooms). A key hotel performance metric. |
| Scraper | Software that automatically reads web pages and extracts structured data. |
| Selector | A rule (CSS or XPath) that tells a scraper where a piece of data sits on a page. |
| Throttling | Deliberately slowing requests to stay within a site's limits. |
| User agent | A string sent with each request that identifies the browser and device making it; scrapers often rotate it so their traffic resembles a mix of ordinary visitors. |
| Validation | Checking collected data for completeness and correctness before it's used. |
| XPath | A path expression used to locate elements within a page's structure. |
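
The three hotel metrics in the table (ADR, occupancy rate and RevPAR) are linked by simple arithmetic, which a short worked example can make concrete. The sketch below is illustrative only: the function names and the sample figures (12,000 in room revenue, 80 rooms sold, 100 rooms available) are assumptions made for this example, not TravelScrape data or code.

```python
import math


def adr(room_revenue: float, rooms_sold: int) -> float:
    """Average Daily Rate: room revenue divided by rooms sold."""
    if rooms_sold <= 0:
        raise ValueError("rooms_sold must be positive")
    return room_revenue / rooms_sold


def occupancy_rate(rooms_sold: int, rooms_available: int) -> float:
    """Occupancy rate as a fraction between 0 and 1."""
    if rooms_available <= 0:
        raise ValueError("rooms_available must be positive")
    return rooms_sold / rooms_available


def revpar(room_revenue: float, rooms_available: int) -> float:
    """Revenue Per Available Room: room revenue divided by available rooms."""
    if rooms_available <= 0:
        raise ValueError("rooms_available must be positive")
    return room_revenue / rooms_available


# Hypothetical figures for one hotel over one night.
room_revenue = 12_000.0
rooms_sold = 80
rooms_available = 100

hotel_adr = adr(room_revenue, rooms_sold)
hotel_occupancy = occupancy_rate(rooms_sold, rooms_available)
hotel_revpar = revpar(room_revenue, rooms_available)

print(f"ADR: {hotel_adr:.2f}")
print(f"Occupancy: {hotel_occupancy:.1%}")
print(f"RevPAR: {hotel_revpar:.2f}")

# RevPAR should match ADR multiplied by occupancy rate (allowing for float rounding).
assert math.isclose(hotel_revpar, hotel_adr * hotel_occupancy)
```

Computing RevPAR both ways is a cheap validation check when metrics arrive from different sources: if the two figures disagree, one of the inputs is probably defined differently (for example, rooms available excluding out-of-service rooms).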
Want these terms in context?
Definitions are most useful alongside real workflows. See how the metrics and concepts above appear in practice in 'What Is OTA Scraping?', 'How to Structure Travel Data' and 'Common Scraping Errors'. Need a term we haven't covered? The TravelScrape team is happy to explain.